2004 NamedEntityDiscUsingCompNewsArt


Subject Headings: Named Entity Recognition Algorithm.




  • In this paper we describe a way to discover Named Entities by using the distribution of words in news articles. Named Entity recognition is an important task for today's natural language applications, but it still suffers from data sparseness. We used an observation that a Named Entity is likely to appear synchronously in several news articles, whereas a common noun is less likely. Exploiting this characteristic, we successfully obtained rare Named Entities with 90% accuracy just by comparing time series distributions of a word in two newspapers. Although the achieved recall is not sufficient yet, we believe that this method can be used to strengthen the lexical knowledge of a Named Entity tagger.
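
The idea in the abstract, comparing the time series of a word's occurrences in two newspapers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`daily_counts`, `cosine_similarity`, `synchronicity`), the whitespace tokenization, the toy articles, and in particular the use of cosine similarity as the comparison measure, since the abstract does not say which measure the authors use.

```python
from collections import Counter
from math import sqrt


def daily_counts(articles, word):
    """Count occurrences of `word` per day.

    `articles` is an iterable of (day, text) pairs (hypothetical input format).
    Tokenization is a plain lowercase whitespace split, chosen for brevity.
    """
    counts = Counter()
    target = word.lower()
    for day, text in articles:
        n = text.lower().split().count(target)
        if n:
            counts[day] += n
    return counts


def cosine_similarity(a, b):
    """Cosine similarity of two sparse day -> count vectors (assumed measure)."""
    dot = sum(a[d] * b[d] for d in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # the word never appears in at least one newspaper
    return dot / (norm_a * norm_b)


def synchronicity(paper_a, paper_b, word):
    """Score how synchronously `word` appears in the two newspapers."""
    return cosine_similarity(daily_counts(paper_a, word),
                             daily_counts(paper_b, word))


# Toy data (invented): one (day, text) pair per article.
paper_a = [
    (1, "the government met today"),
    (2, "kursk sank in the barents sea"),
    (3, "the government replied"),
    (4, "markets were calm"),
]
paper_b = [
    (1, "weather was mild"),
    (2, "the submarine kursk was lost"),
    (3, "the government spoke"),
    (4, "the government voted"),
]

for w in ("kursk", "government"):
    print(w, round(synchronicity(paper_a, paper_b, w), 3))
```

In this toy data the name "kursk" appears on the same single day in both newspapers and scores close to 1, while the common noun "government" is spread over different days and scores close to 0.5. A real system would need a score threshold and a frequency filter, neither of which can be inferred from the abstract.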


Key: 2004 NamedEntityDiscUsingCompNewsArt
Author: Yusuke Shinyama, Satoshi Sekine
title: Named Entity Discovery Using Comparable News Articles
journal: Proceedings of the 20th International Conference on Computational Linguistics
titleUrl: http://acl.ldc.upenn.edu/coling2004/MAIN/pdf/122-120.pdf
doi: 10.3115/1220355.1220477
year: 2004