Sentence Decomposition Task

From GM-RKB

A Sentence Decomposition Task is a sentence processing task that involves breaking down complex sentences into simpler, independent clauses or phrases without losing the original meaning or context.

  • Context:
  • Example(s):
    • SentDecomp("Upon the occurrence of casualty damage to the Leased Premises, the Lessor shall have the obligation to promptly commence repairs and to diligently pursue such repairs to completion in a timely manner until all damage has been fully remediated.")

      =>

      "Casualty damage may occur to the Leased Premises (1). If casualty damage occurs, the Lessor shall have certain obligations (2). The Lessor shall promptly commence repairs (3). The Lessor shall diligently pursue such repairs to completion (4). The repairs shall be completed in a timely manner (5). The Lessor's obligations continue until all damage has been fully remediated (6)."

    • SentDecomp("Notwithstanding any other provision of this agreement, the tenant shall not make any alterations to the premises without the prior written consent of the landlord, which consent shall not be unreasonably withheld or delayed.")

      =>

      "The tenant is restricted from making alterations to the premises (1). This restriction applies despite any other agreement provisions (2). Landlord's prior written consent is required for any alterations (3). The landlord's consent cannot be unreasonably withheld or delayed (4)."

    • ...
  • Counter-Example(s):
  • See: Clause Extraction, Long Contract Sentence, Information Extraction.
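The numbered output format used in the examples above can be turned into a list of separate propositions. The following is a minimal sketch, assuming each simple sentence ends with a parenthesized index followed by a period, as in "... (1)."; the function name split_numbered_propositions is hypothetical and not part of any existing library.

```python
import re
from typing import List, Tuple

# Assumed output format: each proposition ends with "(<index>)." as in the examples.
_PROPOSITION = re.compile(r"(.+?)\s*\((\d+)\)\.", re.DOTALL)


def split_numbered_propositions(decomposed: str) -> List[Tuple[int, str]]:
    """Split a decomposition string into (index, proposition) pairs."""
    return [
        (int(index), text.strip())
        for text, index in _PROPOSITION.findall(decomposed)
    ]


decomposed = (
    "The tenant is restricted from making alterations to the premises (1). "
    "This restriction applies despite any other agreement provisions (2). "
    "Landlord's prior written consent is required for any alterations (3). "
    "The landlord's consent cannot be unreasonably withheld or delayed (4)."
)

for index, proposition in split_numbered_propositions(decomposed):
    print(index, proposition)
```

Keeping the index with each proposition makes it possible to trace every simple sentence back to its position in the decomposition.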


References

2023

  • (Fan et al., 2023) ⇒ Yunlong Fan, Bin Li, Yikemaiti Sataer, Miao Gao, Chuanqi Shi, Siyi Cao, and Zhiqiang Gao. (2023). “Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences.” Applied Sciences, 13(16). https://doi.org/10.3390/app13169412
    • ABSTRACT: Most natural-language-processing (NLP) tasks suffer performance degradation when encountering long complex sentences, such as semantic parsing, syntactic parsing, machine translation, and text summarization. Previous works addressed the issue with the intuition of decomposing complex sentences and linking simple ones, such as rhetorical-structure-theory (RST)-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), simple sentence decomposition (SSD), etc. However, these works are not applicable for semantic parsing such as abstract meaning representation (AMR) parsing and semantic dependency parsing due to misalignments with semantic relations and unavailabilities to preserve the original semantics. Following the same intuition and avoiding the deficiencies of previous works, we propose a novel framework, hierarchical clause annotation (HCA), for capturing clausal structures of complex sentences, based on the linguistic research of clause hierarchy. With the HCA framework, we annotated a large HCA corpus to explore the potentialities of integrating HCA structural features into semantic parsing with complex sentences. Moreover, we decomposed HCA into two subtasks, i.e., clause segmentation and clause parsing, and provide neural baseline models for more-silver annotations. In evaluating the proposed models on our manually annotated HCA dataset, the performances of clause segmentation and parsing resulted in 91.3% F1-scores and 88.5% Parseval scores, respectively. Due to the same model architectures employed, the performance differences of the clause/discourse segmentation and parsing subtasks was reflected in our HCA corpus and compared discourse corpora, where our sentences contained more segment units and fewer interrelations than those in the compared corpora.

2013

  • (Bast & Haussmann, 2013) ⇒ Hannah Bast, and Elmar Haussmann. (2013). “Open Information Extraction via Contextual Sentence Decomposition.” In: 2013 IEEE Seventh International Conference on Semantic Computing, pp. 154-159. IEEE.
    • NOTES:
      1. Contextual Sentence Decomposition (CSD) is a technique that decomposes a sentence into its basic constituents, called "contexts", which are arranged in a tree structure to capture the semantic relationships between different parts of the sentence.
      2. The Sentence Constituent Identification (SCI) phase of CSD identifies the basic building blocks of contexts in a sentence, focusing on relative clauses, enumeration items, and other constituents that typically contain separate facts not directly related to the rest of the sentence.
      3. The Sentence Constituent Recombination (SCR) phase of CSD recursively combines the constituents identified by the SCI phase to form the final contexts, following a set of rules to determine whether the constituents should be combined using a cross-product (CONC) or a union (ENUM) operation.
      4. CSD can be implemented using the output of a constituent parser, by applying a small set of manually created rules to derive an SCI tree from the parse tree, which can then be used to generate the final contexts.
      5. The CSD approach enables the extraction of more accurate, minimal, and high-coverage facts compared to other Open Information Extraction systems, as it explicitly considers the semantic relationships between different parts of the sentence during the decomposition process.
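
The recombination step described in note 3 can be illustrated with a small sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: the Node class, the "LEAF" node kind, and the contexts function are hypothetical names, the tree is built by hand rather than derived from a constituent parse, and only the CONC (cross-product) and ENUM (union) operations named in the notes are modeled.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List


@dataclass
class Node:
    """Hypothetical SCI tree node: kind is "LEAF", "CONC", or "ENUM"."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def contexts(node: Node) -> List[str]:
    """Recursively combine constituents into contexts."""
    if node.kind == "LEAF":
        return [node.text]
    child_contexts = [contexts(child) for child in node.children]
    if node.kind == "ENUM":
        # Union: each enumeration item contributes its own contexts.
        return [ctx for group in child_contexts for ctx in group]
    if node.kind == "CONC":
        # Cross-product: pick one context per child and concatenate them.
        return [" ".join(parts) for parts in product(*child_contexts)]
    raise ValueError(f"unknown node kind: {node.kind}")


# Hand-built tree for a fragment of the first example sentence on this page.
tree = Node("CONC", children=[
    Node("LEAF", "The Lessor shall"),
    Node("ENUM", children=[
        Node("LEAF", "promptly commence repairs"),
        Node("LEAF", "diligently pursue such repairs to completion"),
    ]),
])

for ctx in contexts(tree):
    print(ctx)
# The Lessor shall promptly commence repairs
# The Lessor shall diligently pursue such repairs to completion
```

In this sketch the enumeration under a shared prefix yields one context per enumerated item, which is how a single complex sentence expands into several simpler ones.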