2004 AttributeBasedAndValueBasedClustering

Subject Headings: Term Clustering Algorithm, Lexicon Construction Task.

Quotes

Abstract

In most research on concept acquisition from corpora, concepts are modeled as vectors of relations extracted from syntactic structures. In the case of modifiers, these relations often specify values of attributes, as in (attr red); this is unlike what is typically proposed in theories of knowledge representation, where concepts are defined in terms of their attributes (e.g., color). We compared models of concepts based on values with models based on attributes, using lexical clustering as the basis for comparison. We find that attribute-based models work better than value-based ones, and result in shorter descriptions; but that mixed models including both the best attributes and the best values work best of all.

1 Introduction

In most recent research on concept acquisition from corpora (e.g., for lexicon construction), concepts are viewed as vectors of relations, or properties, extracted from syntactic structures (Grefenstette, 1993; Lin, 1998; Curran and Moens, 2002; Kilgarriff, 2003, and many others). These properties often specify values of attributes such as color, shape, or size: for example, the vector used by Lin (1998) for the concept dog includes the property (dog adj-mod brown). (We will use the term values here to refer to any modifier.) To our knowledge, however, no attempt has been made by computational linguists to use the attributes themselves in such vectors: i.e., to learn that the description of the concept dog includes elements such as (dog color) or (dog size). This is surprising when considering that most models of concepts in the AI literature are based on such attributes (Brachman and Levesque, 1985).

Two problems need to be addressed when trying to identify concept attributes. The first problem is that values are easier to extract. We found, however, that patterns like the X of the dog, already used in (Berland and Charniak, 1999; Poesio et al., 2002) to find part-of relations (using techniques derived from those used in (Hearst, 1998; Caraballo, 1999) to find hyponymy relations) are quite effective at finding attributes. A second problem might be that instances of such patterns are less frequent than those used to extract values, even in large corpora such as the British National Corpus (BNC). But this problem, as well, is less serious when using the Web as a corpus (Kilgarriff and Schuetze, 2003; Keller and Lapata, 2003; Markert et al., submitted).
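The pattern-based idea above can be illustrated with a minimal sketch. It assumes a plain regular expression over raw text for the pattern "the X of the C"; the function name `extract_attributes` and the sample sentences are hypothetical and are not taken from the paper, which queried the Web rather than a local string.

```python
import re
from collections import Counter


def extract_attributes(text, concept):
    """Count candidate attributes X matched by 'the X of the <concept>'.

    Hypothetical helper for illustration only; it does no parsing or
    filtering, so every single word in the X slot counts as a candidate.
    """
    pattern = re.compile(
        r"\bthe\s+(\w+)\s+of\s+the\s+" + re.escape(concept) + r"\b",
        re.IGNORECASE,
    )
    return Counter(m.group(1).lower() for m in pattern.finditer(text))


# Invented sample text, standing in for corpus or Web snippets.
sample = (
    "The color of the dog was brown. She measured the size of the dog. "
    "The color of the dog surprised everyone, unlike the color of the cat."
)

dog_attributes = extract_attributes(sample, "dog")
print(dog_attributes.most_common())
```

A surface pattern like this also matches parts and other relations (for example "the tail of the dog"), so its output is best read as a list of attribute candidates rather than confirmed attributes.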

We report on two experiments whose goal was to test whether identifying attributes leads to better lexical descriptions of concepts. We do this by comparing the results obtained by using attributes or more general modifiers – which we will simply call values – as elements of concept vectors used to identify concept similarities via clustering. In Section 2, we discuss how Web data were used to build attribute-based and value-based concept vectors, and our clustering and evaluation methods. In Section 3, we discuss a first experiment using the set of concepts used in (Lund and Burgess, 1996). In Section 4, we discuss a second experiment using 214 concepts from WordNet (Fellbaum, 1998). In Section 5 we return to the notion of attribute.
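How concept vectors can expose concept similarities is sketched below. This is an illustration under stated assumptions, not the authors' setup: the counts are invented, cosine similarity is one common choice of measure, and the hypothetical helper `most_similar` does a nearest-neighbour lookup rather than the clustering used in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented attribute counts per concept (assumption, for illustration only).
vectors = {
    "dog": {"color": 3, "size": 2, "owner": 4},
    "cat": {"color": 2, "size": 1, "owner": 3},
    "car": {"color": 2, "price": 5, "engine": 4},
    "truck": {"size": 1, "price": 3, "engine": 5},
}


def most_similar(concept, vectors):
    """Return the other concept whose vector is closest to `concept`."""
    others = (c for c in vectors if c != concept)
    return max(others, key=lambda c: cosine(vectors[concept], vectors[c]))


for concept in vectors:
    print(concept, "->", most_similar(concept, vectors))
```

Replacing the attribute keys with modifier keys such as "brown" or "big" gives the value-based counterpart, so the same comparison code can serve both kinds of description.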

6 Conclusions

Simple text patterns were used to automatically extract both basic value-based and attribute-based concept descriptions for clustering purposes. Our preliminary results suggest, first of all, that when large amounts of data such as the Web are accessed, these simple patterns may be sufficient to compute descriptions rich enough to discriminate quite well, at least with small sets of concepts belonging to clearly distinct classes. Secondly, we found that even though attributes are fewer than values, attribute-based descriptions need not be as long as value-based ones to achieve as good or better results. Finally, we found that the best descriptions included both attributes and more general properties. We plan to extend this work both by refining our notion of attribute and by using more sophisticated patterns working off the output of a parser.

References

  • M. Berland and Eugene Charniak. (1999). Finding parts in very large corpora. In: Proceedings of the 37th ACL, pages 57–64, University of Maryland.
  • Ronald J. Brachman and H. J. Levesque, editors. (1985). Readings in Knowledge Representation. Morgan Kaufmann, California.
  • S. A. Caraballo. (1999). Automatic construction of a hypernym-labeled noun hierarchy from text. In: Proceedings of the 37th ACL.
  • D. J. Cook and L. B. Holder. (2000). Graph-based data mining. IEEE Intelligent Systems, 15(2), 32-41.
  • J. R. Curran and M. Moens. (2002). Improvements in automatic thesaurus extraction. In: Proceedings of the ACL Workshop on Unsupervised Lexical Acquisition, pages 59–66.
  • R. M. W. Dixon. (1991). A New Approach to English Grammar, on Semantic Principles. Clarendon Press, Oxford.
  • C. Fellbaum, editor. (1998). WordNet: An electronic lexical database. The MIT Press.
  • D. H. Fisher. (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2:139–172.
  • Gregory Grefenstette. (1993). SEXTANT: Extracting semantics from raw text implementation details. Heuristics: The Journal of Knowledge Engineering.
  • N. Guarino. (1992). Concepts, attributes and arbitrary relations: some linguistic and ontological criteria for structuring knowledge base. Data and Knowledge Engineering, 8, pages 249–261.
  • V. Hatzivassiloglou and Kathleen R. McKeown. (1993). Towards the Automatic Identification of Adjectival Scales: clustering adjectives according to meaning. In: Proceedings of the 31st ACL, pages 172–182.
  • Marti Hearst. (1998). Automated discovery of WordNet relations. In C. Fellbaum, editor, WordNet: An Electronic Lexical Database. MIT Press.
  • G. Karypis. (2002). CLUTO: A clustering toolkit. Technical Report 02-017, University of Minnesota. Available at URL: http://www-users.cs.umn.edu/~karypis/cluto/.
  • F. Keller and M. Lapata. (2003). Using the Web to obtain frequencies for unseen bigrams. Computational Linguistics, 29(3).
  • A. Kilgarriff and H. Schuetze. (2003). Introduction to the special issue of Computational Linguistics on the web as a corpus. Computational Linguistics.
  • A. Kilgarriff. (2003). Thesauruses for Natural Language Processing. In: Proceedings of the IEEE 2003 International Conference on Natural Language Processing and Knowledge Engineering (NLPKE' 03), Beijing.
  • S. Laurence and E. Margolis. (1999). Concepts and Cognitive Science. In E. Margolis and S. Laurence, editors, Concepts: Core Readings. Cambridge, MA., Bradford Books/MIT Press, pages 3-81.
  • Dekang Lin. (1998). Automatic retrieval and clustering of similar words. In: Proceedings of COLING-ACL, 768-774.
  • K. Lund and C. Burgess. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, & Computers, 28, 203–208.
  • Christopher D. Manning and H. Schuetze. (1999). Foundations of Statistical NLP. MIT Press.
  • K. Markert, M. Nissim, and N. Modjeska. (2004). Comparing Knowledge Sources for Nominal Anaphora Resolution. Submitted.
  • Fernando Pereira, N. Tishby, and L. Lee. (1993). Distributional clustering of English words. In: Proceedings of the 31st ACL, pages 183–190, Columbus, Ohio.
  • M. Poesio and A. Almuhareb. (2004). Feature-based vs. Property-based KR: An Empirical Perspective. Submitted.
  • M. Poesio, T. Ishikawa, S. Walde, and R. Vieira. (2002). Acquiring lexical knowledge for anaphora resolution. In: Proceedings of LREC, Las Palmas, June.
  • M. Poesio. (2003). Associative descriptions and salience. In: Proceedings of the EACL Workshop on Computational Treatments of Anaphora, Budapest.
  • James Pustejovsky. (1991). The generative lexicon. Computational Linguistics, 17(4), pages 409–441.
  • J. A. Swets. (1969). Effectiveness of Information Retrieval Methods. American Documentation, 20, pages 72–89.
  • W. A. Woods. (1975). What’s in a link: Foundations for semantic networks. In Daniel G. Bobrow and Alan Michael Collins, editors, Representation and Understanding: Studies in Cognitive Science, pages 35–82. Academic Press, New York.

Page: 2004 AttributeBasedAndValueBasedClustering
Author: Abdulrahman Almuhareb, Massimo Poesio
Title: Attribute-based and Value-based Clustering: An evaluation
Title URL: http://acl.ldc.upenn.edu/acl2004/emnlp/pdf/Almuhareb.pdf