- (Beigman Klebanov & Shamir, 2005) ⇒ Beata Beigman Klebanov, Eli Shamir. (2005). “Guidelines for Annotation of Concept Mention Patterns.” Technical Report 2005–8, Leibniz Center for Research in Computer Science, The Hebrew University of Jerusalem, Israel.
- It is associated with (Klebanov & Shamir, 2006) ⇒ Beata Beigman Klebanov and Eli Shamir. (2006). “Reader-based Exploration of Lexical Cohesion.” In: Language Resources and Evaluation, 40(2). doi:10.1007/s10579-006-9004-6
- This document contains technical material related to an experiment on anchoring concept mentions. It is made available to the community in order to facilitate replication and detailed criticism. A description and analysis of the experiment are currently being prepared for publication; if the reader intends to use the following material, please contact the authors for an up-to-date reference for citation.
- Sections 1 to 4 are the guidelines given to the subjects. Section 5 shows the example text given to subjects for trial annotation. Section 6 reproduces a summary circulated among the subjects after reviewing the trial annotations. Section 7 contains guidelines for the validation experiment.
- Graeme Hirst writes:
- “... it is becoming clear that while the knowledge used in interpreting natural language is broad, the reasoning is shallow. Although we can’t yet characterize it precisely, it seems to be pretty much limited to reasoning about quite simple commonsense knowledge: knowledge of kinds, of associations, of typical situations, and even typical utterances.” (Hirst, 2000).
- It is this shallow and broad commonsense knowledge that we want to tap into. We envision it as a web of concept interconnections. This web can be glimpsed through texts people write, which is why we think it possible to annotate texts for “projections” of the concept web.
- For every concept first mentioned in the text, the annotator asks herself/himself which previously mentioned concepts help the easy accommodation of the current concept into the evolving story, if indeed it is easily accommodated, based on the commonsense knowledge as it is perceived by the annotator. We call this phenomenon anchoring.
- This treats a text as a whole where certain things are surprising, but other ones easily fit within an emerging picture once observed, striking the delicate balance between “Be relevant” (say things that have some relation to what is being discussed), “Be informative” (do not tell people what they already know) and “Be clear” (do not say things in an obscure way).
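As a rough illustration of anchoring, one annotator's judgments could be stored as a mapping from each item to the earlier items that help accommodate it. This is only a sketch: the `Annotation` class, its method names, and the sample words and judgments are all hypothetical and do not come from the report.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """Anchoring judgments of one annotator for one text (hypothetical structure)."""

    items: list  # words of the text, in order of appearance
    # anchors[i] holds positions of earlier items that help accommodate items[i]
    anchors: dict = field(default_factory=dict)

    def anchor(self, item: int, to: int) -> None:
        """Record that items[item] is anchored to the earlier items[to]."""
        if not 0 <= to < item < len(self.items):
            raise ValueError("an anchor must be a previously mentioned item")
        self.anchors.setdefault(item, set()).add(to)

    def unanchored(self) -> list:
        """Positions the annotator did not anchor to anything earlier."""
        return [i for i in range(len(self.items)) if i not in self.anchors]


# Invented judgments, for illustration only.
ann = Annotation(["magazine", "published", "article", "editor"])
ann.anchor(2, 0)  # "article" is accommodated by "magazine"
ann.anchor(3, 0)  # "editor" is accommodated by "magazine" ...
ann.anchor(3, 2)  # ... and by "article"
```

An item may have several anchors or none; the first item of a text can never be anchored, since nothing precedes it.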
2 Concepts vs. Words
- While words constitute concept mentions, a concept is construed as a sub-web, composed of the words that mention it and the connections to concepts that anchor it and are anchored by it.
- A word can be used in more than one concept mention simultaneously. For example, the words “Scientific” and “American” combine to form a mention of the magazine, and each also mentions a concept on its own. Thus, subsequent mentions of magazine, scientist and national would be anchored to Scientific American, Scientific and American, respectively.
- If the same word is repeated in the text, we assume it mentions the same concept as its previous appearance in the text, so we are not trying to anchor it to any other concept. This is a simplification, since a word can be used in a different sense when it is repeated (as in The Central Bank is located on the river bank). We conjecture that such cases would be rare.
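The two conventions above can be sketched in a few lines. The function names, the tuple-of-positions representation of a mention, and the case-insensitive comparison of repeated words are assumptions made for this illustration; the guidelines do not specify any of them.

```python
def first_mentions(words):
    """Positions of words that need an anchoring decision.

    A repeated word is assumed to mention the same concept as before, so only
    first occurrences are returned. Case is ignored (an assumption).
    """
    seen = set()
    positions = []
    for i, word in enumerate(words):
        key = word.lower()
        if key not in seen:
            seen.add(key)
            positions.append(i)
    return positions


def mentions_using(position, mentions):
    """All concept mentions that include the word at the given position."""
    return [m for m in mentions if position in m]


# The simplification in action: the second "bank" is skipped even though it
# is used in a different sense than "Bank".
sentence = "The Central Bank is located on the river bank".split()
to_annotate = first_mentions(sentence)

# One word, several mentions: a mention is a tuple of word positions.
title = ["Scientific", "American"]
mentions = [(0, 1), (0,), (1,)]  # the magazine, "Scientific", "American"
```

Here `to_annotate` omits the second “the” and the second “bank”, and `mentions_using(0, mentions)` returns both the magazine mention and the single-word mention of “Scientific”.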
|2005 GuidelinesForAnnotOfConcMentPat||Beata Beigman Klebanov|
|Guidelines for Annotation of Concept Mention Patterns||http://www.cs.huji.ac.il/~beata/2005-TR-BBK-ES.pdf||2005|