Subject Headings: Document Classification Task, Document Classification Algorithm.
- A number of techniques have been studied for the automatic assignment of controlled subject headings and classifications from free indexing. These techniques involve the automatic manipulation and truncation of the free-index phrases assigned to a document and the use of a manually-constructed thesaurus and automatically-generated dictionaries, together with statistical ranking and weighting methods. These are based on the use of a statistically-generated ‘adhesion coefficient’ which reflects the degree of association between the free-indexing terms, the controlled subject headings, and the classifications. By the analysis of a large sample of manually-indexed documents, the system generates dictionaries of free-language and controlled-language terms together with their associated classifications and adhesion coefficients. Having learnt from the manually-indexed documents, the system uses these dictionaries in the subsequent automatic classification procedure. The accuracy and cost-effectiveness of the automatically-assigned subject headings and classifications have been compared with those of the manual system. The results were encouraging and the costs comparable to those of a manual system.
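The learn-then-classify procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the adhesion coefficient, so the sketch assumes a simple stand-in: the fraction of training documents containing a truncated free-index term that were also manually assigned a given subject heading. The function names (`truncate`, `learn_dictionary`, `classify`), the six-character truncation length, and the sum-of-coefficients ranking are all hypothetical choices made for illustration, not the original system's method.

```python
from collections import Counter, defaultdict


def truncate(word, length=6):
    """Crude truncation of a free-index word (the length is an assumed parameter)."""
    return word.lower()[:length]


def stems_of(free_phrases, length=6):
    """Split free-index phrases into words and return the set of truncated forms."""
    return {truncate(w, length) for phrase in free_phrases for w in phrase.split()}


def learn_dictionary(training_docs, length=6):
    """Build a dictionary: truncated term -> {heading: assumed adhesion coefficient}.

    training_docs is an iterable of (free_phrases, manual_headings) pairs.
    The coefficient here is count(term and heading) / count(term), which is
    an assumption standing in for the unspecified original statistic.
    """
    term_counts = Counter()
    pair_counts = defaultdict(Counter)
    for free_phrases, headings in training_docs:
        for stem in stems_of(free_phrases, length):
            term_counts[stem] += 1
            for heading in set(headings):
                pair_counts[stem][heading] += 1
    return {
        stem: {h: c / term_counts[stem] for h, c in heading_counts.items()}
        for stem, heading_counts in pair_counts.items()
    }


def classify(free_phrases, dictionary, length=6, top_n=3):
    """Rank headings for a new document by summing coefficients over its terms."""
    scores = Counter()
    for stem in stems_of(free_phrases, length):
        for heading, coeff in dictionary.get(stem, {}).items():
            scores[heading] += coeff
    # Sort by descending score, then alphabetically, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [heading for heading, _ in ranked[:top_n]]
```

A call such as `classify(["corrosion control"], learn_dictionary(docs))` returns the headings most strongly associated with the document's truncated terms. A fuller system of the kind the abstract describes would also consult the manually-constructed thesaurus and apply weighting, both of which this sketch omits.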