1997 UseOfRelationMatchingInIR

From GM-RKB

Subject Headings: Relation Recognition, Information Retrieval

Notes

Cited By

Quotes

Abstract

Matching with manually identified relations

Two indexing systems that make explicit use of relations are Farradane's (1950, 1952 and 1967) relational classification system and the SYNTOL model (Gardin, 1965; Levy, 1967).

Farradane (1967) used nine types of relations: Concurrence Relation, Equivalence Relation, Distinctness Relation, Self-activity Relation, Dimensional Relation, Reaction Relation, Association Relation, Appurtenance Relation, Functional Dependence Relation.

The SYNTOL project used four main types of relations that were subdivided into finer relations (Gardin, 1965):

  • Coordinative (Formal)
  • Consecutive (Dynamic)
  • Predicative
  • Associative (Intrinsic)
    • Active (Agent, Patient)
    • Inactive (Qualification, Inclusion)
    • Circumstantial (Location, Means, Goal, Sign)

A discussion of the theoretical issues related to the use of syntagmatic relations in indexing languages can be found in Green (1995).

The Aberystwyth Index Languages Test (Keen, 1973) found only a small improvement in retrieval precision for a minority of queries (13%) using Farradane’s relations compared with not using relations.

Matching with automatically identified relations

Lu (1990) investigated the use of case relation matching using a small test database of abstracts. Case relations are the semantic relations that hold between a verb and the other constituents of a sentence (Fillmore, 1968; Somers, 1987). In the example sentence "Harry loves Sally", the case relation experiencer holds between Harry and love, and the case relation patient holds between love and Sally. The verb love is said to assign the case role of experiencer to the noun phrase Harry and the case role of patient to Sally. Using a tree-matching method for matching relations, Lu obtained worse results than with vector-based keyword matching. The tree-matching method used is probably not optimal for information retrieval, and the results may not reflect the potential of relation matching.
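To make the notion of case relations concrete, the following is a minimal sketch and not Lu's tree-matching method. It assumes hand-written case frames (no parser is involved), and the reversed sentence "Sally loves Harry" and the helper names keywords and case_relations are invented for illustration. It shows that two sentences with identical keywords can carry different case relations.

```python
# Hand-written case frames (assumed input, not parser output):
# a frame is (verb, {case role: filler}).
harry_loves_sally = ("love", {"experiencer": "Harry", "patient": "Sally"})
sally_loves_harry = ("love", {"experiencer": "Sally", "patient": "Harry"})


def keywords(frame):
    """Keyword view: the verb plus all role fillers, with roles discarded."""
    verb, roles = frame
    return {verb, *roles.values()}


def case_relations(frame):
    """Case-relation view: one (role, verb, filler) triple per case role."""
    verb, roles = frame
    return {(role, verb, filler) for role, filler in roles.items()}


# Both sentences contain the same three keywords ...
same_keywords = keywords(harry_loves_sally) == keywords(sally_loves_harry)
# ... but the case roles assigned by the verb differ.
same_relations = case_relations(harry_loves_sally) == case_relations(sally_loves_harry)
```

Here same_keywords is True while same_relations is False, which is the distinction that relation matching tries to exploit.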

Finally, Liu (1997) investigated what I call partial relation matching. Instead of trying to match the whole concept-relation-concept triple (i.e. both concepts as well as the relation between them), he sought to match individual concepts together with the semantic role that the concept has in the sentence. In other words, instead of trying to find matches for "term1 ->(relation)-> term2", his system sought to find matches for "term1 ->(relation)" and "(relation)-> term2" separately. Liu used case roles and the vector-space retrieval model, and was able to obtain positive results only for long queries (abstracts that are used as queries).
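The difference between full and partial relation matching can be sketched as follows. This is an illustration under stated assumptions, not Liu's implementation: it uses plain set intersection instead of the vector-space model, and the Triple type, the function names, and the example triples are all hypothetical.

```python
from typing import NamedTuple, Set, Tuple


class Triple(NamedTuple):
    """A concept-relation-concept triple: term1 ->(relation)-> term2."""
    term1: str
    relation: str
    term2: str


def full_matches(query: Set[Triple], doc: Set[Triple]) -> Set[Triple]:
    """Full matching: both concepts and the relation must all agree."""
    return query & doc


def halves(triples: Set[Triple]) -> Set[Tuple[str, str, str]]:
    """Split each triple into "term1 ->(relation)" and "(relation)-> term2"."""
    parts = set()
    for t in triples:
        parts.add(("left", t.term1, t.relation))   # term1 ->(relation)
        parts.add(("right", t.relation, t.term2))  # (relation)-> term2
    return parts


def partial_matches(query: Set[Triple], doc: Set[Triple]) -> Set[Tuple[str, str, str]]:
    """Partial matching: each half of a triple is matched separately."""
    return halves(query) & halves(doc)


# Invented example data (not taken from any of the cited studies).
query = {Triple("smoking", "cause", "cancer")}
doc = {
    Triple("smoking", "cause", "bronchitis"),
    Triple("asbestos", "cause", "cancer"),
}

full = full_matches(query, doc)
partial = partial_matches(query, doc)
```

With this data, full is empty because no document triple agrees on all three parts, whereas partial contains both halves of the query triple, each matched by a different document triple. Partial matching is therefore more permissive than full triple matching.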

References

  • Asher, R.E. (Ed.). (1994). The encyclopedia of language and linguistics. Oxford: Pergamon Press.
  • Austin, D. (1984). PRECIS: A manual of concept analysis and subject indexing. (2nd ed.). London: British Library, Bibliographic Services Division.
  • Berrut, C. (1990). Indexing medical reports: the RIME approach. Information Processing & Management, 26(1), 93-109.
  • Croft, W. B. (1986). Boolean queries and term dependencies in probabilistic retrieval models. Journal of the American Society for Information Science, 37(2), 71-77.
  • Croft, W. B., Turtle, H. R., & Lewis, D. D. (1991). The Use of Phrases and Structured Queries in Information Retrieval. In A. Bookstein, Y. Chiaramella, G. Salton, & V.V. Raghavan (Eds.), SIGIR '91: Proceedings of the Fourteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval (pp. 32-45). New York: ACM Press.
  • Dillon, M., & Gray, A. S. (1983). FASIT: A fully automatic syntactically based indexing system. Journal of the American Society for Information Science, 34(2), 99-108.
  • Fagan, J. L. (1989). The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval. Journal of the American Society for Information Science, 40(2), 115-132.
  • Farradane, J.E.L. (1950). A scientific theory of classification and indexing and its practical applications. Journal of Documentation, 6(2), 83-99.
  • Farradane, J.E.L. (1952). A scientific theory of classification and indexing: further considerations. Journal of Documentation, 8(2), 73-92.
  • Farradane, J.E.L. (1967). Concept organization for information retrieval. Information Storage and Retrieval, 3(4), 297-314.
  • Fillmore, C. J. (1968). The case for case. In E. Bach & R. T. Harms (Eds.), Universals in Linguistic Theory (pp. 1-88). New York: Holt, Rinehart and Winston.
  • Fox, E.A. (1983). Characterization of two new experimental collections in computer and information science containing textual and bibliographic concepts (Report No. TR83-561). Ithaca, NY: Department of Computer Science, Cornell University.
  • Gardin, J.-C. (1965). SYNTOL. New Brunswick, NJ: Graduate School of Library Service, Rutgers, The State University.
  • Gay, L.S., & Croft, W.B. (1990). Interpreting nominal compounds for information retrieval. Information Processing & Management, 26(1), 21-38.
  • Green, R. (1995). Syntagmatic relationships in index languages: A reassessment. Library Quarterly, 65(4), 365-385.
  • Jones, L. P., deBessonet, C., & Kundu, S. (1988). ALLOY: An amalgamation of expert, linguistic and statistical indexing methods. In Y. Chiaramella (Ed.), 11th International Conference on Research & Development in Information Retrieval (pp. 191-199). New York: ACM.
  • Keen, E. M. (1973). The Aberystwyth index languages test. Journal of Documentation, 29(1), 1-35.
  • Khoo, C. S.G. (1995). Automatic identification of causal relations in text and their use for improving precision in information retrieval (Doctoral dissertation, Syracuse University, 1995).
  • Kishore, J. (1986). Colon Classification: Enumerated & expanded schedules along with theoretical formulations. New Delhi: Ess Ess Publications.
  • Levy, F. (1967). On the relative nature of relational factors in classifications. Information Storage & Retrieval, 3(4), 315-329.
  • Liddy, E. D., & Myaeng, S. H. (1993). DR-LINK's linguistic-conceptual approach to document detection. In D.K. Harman (Ed.), The First Text REtrieval Conference (TREC-1) (NIST Special Publication 500-207, pp. 1-20). Gaithersburg, MD: National Institute of Standards and Technology.
  • Liu, G.Z. (1997). Semantic vector space model: Implementation and evaluation. Journal of the American Society for Information Science, 48(5), 395-417.
  • Longman dictionary of contemporary English. (1987). 2nd ed. Harlow, Essex: Longman.
  • Lu, X. (1990). An application of case relations to document retrieval (Doctoral dissertation, University of Western Ontario, 1990). Dissertation Abstracts International, 52-10, 3464A.
  • Marega, R., & Pazienza, M.T. (1994). CoDHIR: An information retrieval system based on semantic document representation. Journal of Information Science, 20(6), 399-412.
  • Mauldin, M.L. (1991). Retrieval performance in FERRET: A conceptual information retrieval system. In A. Bookstein, Y. Chiaramella, G. Salton, & V.V. Raghavan (Eds.), SIGIR '91: Proceedings of the Fourteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval (pp. 347-355). New York: ACM Press.
  • Metzler, D. P., & Haas, S. W. (1989). The Constituent Object Parser: Syntactic structure matching for information retrieval. ACM Transactions on Information Systems, 7(3), 292-316.
  • Metzler, D. P., Haas, S. W., Cosic, C. L., & Weise, C. A. (1990). Conjunction, ellipsis, and other discontinuous constituents in the constituent object parser. Information Processing & Management, 26(1), 53-71.
  • Metzler, D. P., Haas, S. W., Cosic, C. L., & Wheeler, L.H. (1989). Constituent object parsing for information retrieval and similar text processing problems. Journal of the American Society for Information Science, 40(6), 398-423.
  • Myaeng, S. H., & Liddy, E.D. (1993). Information retrieval with semantic representation of texts. In Proceedings of the 2nd Annual Symposium on Document Analysis and Information Retrieval (pp. 201-215).
  • Myaeng, S. H., Khoo, C., & Li, M. (1994). Linguistic processing of text for a large-scale conceptual information retrieval system. In Tepfenhart, W.M., Dick, J.P., & Sowa, J.F. (Eds.), Conceptual Structures: Current Practices: Second International Conference on Conceptual Structures, ICCS '94 (pp. 69-83). Berlin: Springer-Verlag.
  • Nishida, F., & Takamatsu, S. (1982). Structured-information extraction from patent-claim sentences. Information Processing & Management, 18(1), 1-13.
  • Ranganathan, S.R. (1965). The Colon Classification. New Brunswick, N.J.: Graduate School of Library Service, Rutgers, the State University.
  • Rau, L. (1987). Knowledge organization and access in a conceptual information system. Information Processing & Management, 23(4), 269-283.
  • Rau, L.F., Jacobs, P. S., & Zernik, U. (1989). Information extraction and text summarization using linguistic knowledge acquisition. Information Processing & Management, 25(4), 419-428.
  • Roget's international thesaurus. (1962). 3rd ed. New York: Thomas Y. Crowell Company.
  • Ruge, G., Schwarz, C., & Warner, A.J. (1991). Effectiveness and efficiency in natural language processing for large amounts of text. Journal of the American Society for Information Science, 42(6), 450-456.
  • Salton, G., & McGill, M.J. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.
  • Salton, G., Buckley, C., & Smith, M. (1990). On the application of syntactic methodologies in automatic text analysis. Information Processing & Management, 26(1), 73-92.
  • Salton, G., Yang, C.S., & Yu, C.T. (1975). A theory of term importance in automatic text analysis. Journal of the American Society for Information Science, 26(1), 33-44.
  • Schwarz, C. (1990a). Automatic syntactic analysis of free text. Journal of the American Society for Information Science, 41(6), 408-417.
  • Schwarz, C. (1990b). Content based text handling. Information Processing & Management, 26(2), 219-226.
  • Sheridan, P., & Smeaton, A.F. (1992). The application of morpho-syntactic language processing to effective phrase matching. Information Processing & Management, 28(3), 349-369.
  • Smeaton, A.F. (1990). Natural language processing and information retrieval. Information Processing & Management, 26(1), 19-20.
  • Smeaton, A.F., & van Rijsbergen, C.J. (1988). Experiments on incorporating syntactic processing of user queries into a document retrieval strategy. In Y. Chiaramella (Ed.), 11th International Conference on Research & Development in Information Retrieval (pp. 31-51). New York: ACM.
  • Smeaton, A.F., O'Donnell, R., & Kelledy, F. (1995). Indexing structures derived from syntax in TREC-3: System description. In D.K. Harman (Ed.), Overview of the Third Text REtrieval Conference (TREC-3) (NIST Special Publication 500-225, pp. 55-67). Gaithersburg, MD: National Institute of Standards and Technology.
  • Somers, H.L. (1987). Valency and case in computational linguistics. Edinburgh: Edinburgh University Press.
  • Spärck Jones, K. (1997, Feb. 10). Summary performance comparisons TREC-2, TREC-3, TREC-4, TREC-5 [Postscript file]. In TREC-5 Proceedings. Available: http://www-nlpir.nist.gov/TREC/trec5.papers/sparckjones.ps (visited 3 July 1997).
  • Strzalkowski, T., Carballo, J.P., & Marinescu, M. (1995). Natural language information retrieval: TREC-3 report. In D.K. Harman (Ed.), Overview of the Third Text REtrieval Conference (TREC-3) (NIST Special Publication 500-225, pp. 39-53). Gaithersburg, MD: National Institute of Standards and Technology.


1997 UseOfRelationMatchingInIR
  • Author: Christopher Soo-Guan Khoo
  • Title: The Use of Relation Matching in Information Retrieval
  • Journal: LIBRES: Library and Information Science Research Electronic Journal
  • URL: http://libres.curtin.edu.au/libre7n2/khoo.htm
  • Year: 1997