1975 AVectorSpaceModelForAutomaticIndexing


Subject Headings: TF-IDF, Information Retrieval Algorithm, Document Word Vector, SMART System.

In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; in these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstrating the usefulness of the model.
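The idea of space density can be illustrated with a small sketch. The abstract does not give a formula, so the code below assumes that density is the average pairwise cosine similarity between document term vectors; the function names `cosine`, `space_density`, and `drop_term` and the toy collection are hypothetical and are not taken from the paper. Under that assumption, a lower density means the documents lie farther apart, and two candidate vocabularies can be compared by the density each one produces.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


def space_density(doc_vectors):
    """Assumed density measure: mean cosine similarity over all document pairs."""
    pairs = list(combinations(doc_vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def drop_term(doc_vectors, k):
    """Return the collection re-indexed without term k (a smaller vocabulary)."""
    return [vec[:k] + vec[k + 1:] for vec in doc_vectors]


# Toy collection (hypothetical): term 0 occurs in every document,
# terms 1 to 3 each occur in exactly one document.
docs = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

density_full = space_density(docs)
density_reduced = space_density(drop_term(docs, 0))
print(density_full, density_reduced)
```

In this toy collection, removing the term shared by every document spreads the documents apart, so the reduced vocabulary yields the lower density. This only illustrates the kind of comparison the abstract describes; the paper's own density computation and vocabulary selection procedure may differ.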


  • 1. Salton, G. Automatic Information Organization and Retrieval. McGraw-Hill, 1968.
  • 2. Salton, G., and Yang, C.S. On the specification of term values in automatic indexing. J. Documen. 29, 4 (Dec. 1973), 351-372.
  • 3. Spärck Jones, K. A statistical interpretation of term specificity and its application to retrieval. J. Documen. 28, 1 (March 1972), 11-20.
  • 4. Williamson, R.E. Real-time document retrieval. Ph.D. Th., Computer Sci. Dep., Cornell U., June 1974.
  • 5. Wong, A. An investigation of the effects of different indexing methods on the document space configuration. Sci. Rep. ISR-22, Computer Sci. Dep., Cornell U., Section II, Nov. 1974.
  • 6. Salton, G. Theory of Indexing. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1975.
  • 7. Salton, G., Yang, C.S., and Yu, C.T. Contribution to the theory of indexing. Proceedings of IFIP Congress 74, Stockholm, August 1974. American Elsevier, New York, 1974.


Author: Gerard M. Salton, A. Wong, C. Yang
title: A Vector Space Model for Automatic Indexing
journal: Communications of the ACM
titleUrl: http://www.scils.rutgers.edu/~muresan/IR/Docs/Articles/cacmSalton1975.pdf
doi: 10.1145/361219.361220
year: 1975