SemEval-1 Task 4

From GM-RKB

See: SemEval Task, Relation Recognition from Text Task.



References

2007

  • http://nlp.cs.swarthmore.edu/semeval/tasks/task04/description.shtml
    • There is growing interest in the task of classifying semantic relations between pairs of words. However, many different classification schemes have been used, which makes it difficult to compare the various classification algorithms. We will create a benchmark dataset and evaluation task that will enable researchers to compare their algorithms.

      Rosario and Hearst (2001) classify noun-compounds from the medical domain, using a set of 13 classes that describe the semantic relation between the head noun and the modifier in a given noun-compound. Rosario et al. (2002) classify noun-compounds using a multi-level hierarchy of semantic relations, with 15 classes at the top level. Nastase and Szpakowicz (2003) present a two-level hierarchy for classifying noun-modifier relations in general domain text, with 5 classes at the top and 30 classes at the bottom. Their class scheme and dataset have been used by other researchers (Turney and Littman, 2005; Turney, 2005; Nastase et al., 2006). Moldovan et al. (2004) use a 35-class scheme to classify relations in noun phrases. The same scheme has been applied to noun compounds (Girju et al., 2005). Chklovski and Pantel (2004) use a 5-class scheme, designed specifically for characterizing verb-verb semantic relations. Stephens et al. (2001) use a 17-class scheme created for relations between genes. Lapata (2002) uses a 2-class scheme for classifying relations in nominalizations.

      Algorithms for classifying semantic relations have potential applications in Information Retrieval, Information Extraction, Summarization, Machine Translation, Question Answering, Paraphrasing, Recognizing Textual Entailment, Thesaurus Construction, Semantic Network Construction, Word Sense Disambiguation, and Language Modeling. As the techniques for semantic relation classification mature, some of these applications are being tested. Tatu and Moldovan (2005) applied the 35-class scheme of Moldovan et al. (2004) to the PASCAL Recognizing Textual Entailment (RTE) challenge, obtaining significant improvement in a state-of-the-art algorithm.

      There is no consensus on schemes for classifying semantic relations, and it seems unlikely that any single scheme could be useful for all applications. For example, the gene-gene relation scheme of Stephens et al. (2001) includes relations such as "X phosphorylates Y", which are not very useful for general domain text. Even if we focus on general domain text, the verb-verb relations of Chklovski and Pantel (2004) are unlike the noun-modifier relations of Nastase and Szpakowicz (2003) or the noun phrase relations of Moldovan et al. (2004).

      We will create a benchmark dataset for evaluating semantic relation classification algorithms, embracing several different existing classification schemes, instead of attempting the daunting chore of creating a single unified standard classification scheme. We will treat each semantic relation separately, as a single two-class (positive/negative) classification task, rather than taking a whole N-class scheme of relations as an N-class classification task (Nastase and Szpakowicz, 2003).

      To constrain the scope of the task, we have chosen a specific application for semantic relation classification, relational search (Cafarella et al., 2006). We describe this application in Section 2. The application we envision is a kind of search engine that can answer queries such as "list all X such that X causes asthma" (Girju, 2001). Given this application, we have decided to focus on semantic relations between nominals (i.e., nouns and base noun phrases, excluding named entities).

      The dataset for the task will consist of annotated sentences. We will select a sample of relation classes from several different classification schemes and then gather sentences from the Web using a search engine. We will manually mark up the sentences, indicating the nominals and their relations. Algorithms will be evaluated by their average classification performance over all of the sampled relations, but we will also be able to see whether some relations are more difficult to classify than others, and whether some algorithms are best suited for certain types of relations.
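
The evaluation setup described above (each relation scored as its own positive/negative task, then averaged over relations) can be sketched roughly as follows. This is an illustrative sketch only, not the task's official scorer: the description says "average classification performance" without naming a metric, so accuracy is assumed here, and the relation names, toy sentences, and the `always_positive` baseline are all hypothetical.

```python
from collections import defaultdict


def per_relation_accuracy(examples, predict):
    """Score each relation as an independent two-class task.

    examples: iterable of (relation, sentence, gold) where gold is a bool
              (True = the sentence expresses the relation).
    predict:  callable (relation, sentence) -> bool.
    Accuracy is an assumed metric; the source does not specify one.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for relation, sentence, gold in examples:
        total[relation] += 1
        if predict(relation, sentence) == gold:
            correct[relation] += 1
    return {relation: correct[relation] / total[relation] for relation in total}


def macro_average(scores):
    """Unweighted mean over relations, so each relation counts equally."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: relation names and sentences are made up.
examples = [
    ("cause", "smoke causes asthma", True),
    ("cause", "the cup holds tea", False),
    ("part_whole", "the wheel of the car", True),
    ("part_whole", "the engine of the car", True),
]


def always_positive(relation, sentence):
    """Trivial baseline that labels every sentence as a positive example."""
    return True


scores = per_relation_accuracy(examples, always_positive)
overall = macro_average(scores)
```

Keeping the per-relation scores alongside the average makes it possible to see which relations are harder to classify, which the description lists as a secondary goal of the benchmark.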
