2004 CorpusPatternAnalysis

Subject Headings: Corpus Pattern Analysis (CPA)

Notes

Cited By

2013

Quotes

Abstract

Evidence from large corpora shows striking patterns of word use in natural language, the details of which are only now beginning to be adequately recognized and studied. These patterns of usage can be analysed and applied in lexicography as a way of deciding what counts as a lexical meaning distinction and of showing how different meanings are associated with different uses of a word. This has major implications for dictionaries, as well as for lexicons used in computational natural language processing, but lexicography has been slow to respond to the challenges presented by the data. After a discussion of different kinds of corpus evidence and analytic procedures in corpus lexicography, the paper presents a new project of corpus-driven lexicographic analysis of English.

1. Introduction

Corpus Pattern Analysis (CPA) is a new technique for mapping meaning onto words in text. It is based on the Theory of Norms and Exploitations (TNE, see Hanks forthcoming (a) and (b)). TNE in turn is a theory that owes much to the work of Sinclair and Halliday on the lexicon (e.g. Sinclair 1966, 1987, 1991; Halliday 1966), to the Cobuild project in lexical computing (Sinclair, Hanks, et al. 1987), and to the Hector project (Atkins 1993; Hanks 1994). Some recent work in American linguistics (Jackendoff 2002) has complained about the excessive 'syntactocentrism' of American linguistics in the 20th century. TNE offers a lexicocentric approach, with opportunities for synthesis, which will go some way towards redressing the balance.

The focus of the analysis is on the prototypical syntagmatic patterns with which words in use are associated. Patterns for verbs and patterns for nouns are different in kind. Noun patterns consist of a number of corpus-derived gnomic statements, into which the most significant collocates are grouped and incorporated. Verb patterns consist not only of the basic 'argument structure' or 'valency structure' of each verb (typically with semantic values stated for each of the elements), but also of subvalency features, where relevant, such as the presence or absence of a determiner in noun phrases constituting a direct object. For example, the meaning of take place is quite different from the meaning of take his place. The possessive determiner makes all the difference to the meaning.
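The take place / take his place contrast can be illustrated with a minimal Python sketch. It is not the CPA pattern format or any part of the project's tooling: the names `VerbPattern`, `PATTERNS`, `determiner_class` and `match`, the possessive word list, and the two glosses are all assumptions made here for illustration. The sketch only shows how a subvalency feature (the determiner of the direct object) can select between two meanings of the same verb and noun.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, non-exhaustive list of possessive determiners.
POSSESSIVES = {"my", "your", "his", "her", "its", "our", "their"}


@dataclass(frozen=True)
class VerbPattern:
    """Illustrative pattern: a verb, its direct-object noun, and a subvalency feature."""
    verb: str                  # verb lemma
    obj: str                   # head noun of the direct object
    determiner: Optional[str]  # "possessive", or None for no determiner
    gloss: str                 # informal gloss (an assumption, not taken from CPA)


# Two hypothetical patterns that differ only in the determiner feature.
PATTERNS = [
    VerbPattern("take", "place", None, "happen, occur"),
    VerbPattern("take", "place", "possessive", "occupy one's position"),
]


def determiner_class(det: Optional[str]) -> Optional[str]:
    """Map a surface determiner to a coarse class; None means no determiner."""
    if det is None:
        return None
    return "possessive" if det.lower() in POSSESSIVES else "other"


def match(verb: str, det: Optional[str], obj: str) -> Optional[str]:
    """Return the gloss of the first pattern matching verb, determiner class and object."""
    det_class = determiner_class(det)
    for pattern in PATTERNS:
        if (pattern.verb, pattern.obj, pattern.determiner) == (verb, obj, det_class):
            return pattern.gloss
    return None  # no pattern in this toy inventory covers the input


print(match("take", None, "place"))
print(match("take", "his", "place"))
```

Under these assumptions the two calls select different glosses solely because of the possessive determiner. An input such as `match("take", "the", "place")` returns `None`, since this toy inventory has no pattern for it.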

References

Patrick Hanks. (2004). "Corpus Pattern Analysis."