- See: Reference Work, Data-Driven NER Task, Gazetteer-based Term Recognition, Term Gazetteer Construction Task.
- (Smith et al., 2008) ⇒ Larry Smith, Lorraine K. Tanabe, Rie J. Ando, Cheng-Ju Kuo, I-Fang Chung, Chun-Nan Hsu, Yu-Shi Lin, Roman Klinger, Christoph M. Friedrich, Kuzman Ganchev, Manabu Torii, Hongfang Liu, Barry Haddow, Craig A. Struble, Richard J. Povinelli, Andreas Vlachos, William A. Baumgartner, Lawrence Hunter, Bob Carpenter, Richard TH Tsai, Hong-Jie Dai, Feng Liu, Yifei Chen, Chengjie Sun, Sophia Katrenko, Pieter Adriaans, Christian Blaschke, Rafael Torres, Mariana Neves, Preslav Nakov, Anna Divoli, Manuel Maña-López, Jacinto Mata, and W. John Wilbur. (2008). “Overview of BioCreative II Gene Mention Recognition.” In: Genome Biology, 9(Suppl 2). doi:10.1186/gb-2008-9-s2-s2
- QUOTE: NER seeks to identify the words and phrases in text that reference entities in a given category, such as people, places, or companies, or in this application genes and proteins. NER is frequently accomplished with B-I-O tagging, which classifies each token as being at the beginning of the named entity (B), continuing the entity (I), or outside of any entity to be tagged (O). There are several lexical resources (sources of information about words) commonly used in solving the NER problem. A gazetteer is a list of names belonging to a particular category, such as places, persons, companies, genes, and so on. A lexicon is a source of information about different forms or grammatical properties of words. A thesaurus is a source of information indicating words with similar and/or related meanings. Systems in the BioCreative I challenge were classified as open if they used lexical resources, particularly gazetteers, and otherwise closed. A commonly used lexical resource is the Unified Medical Language System (UMLS), a controlled vocabulary of biomedical terminology maintained by the US National Library of Medicine.
- (Kozareva, 2006) ⇒ Zornitsa Kozareva. (2006). “Bootstrapping Named Entity Recognition with Automatically Generated Gazetteer Lists.” In: Proceedings of EACL (EACL 2006).
- (Mikheev et al., 1999) ⇒ Andrei Mikheev, Marc Moens, and Claire Grover. (1999). “Named Entity Recognition Without Gazetteers.” In: Proceedings of the Ninth Conference on European Chapter of the Association for Computational Linguistics. doi:10.3115/977035.977037
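
The Smith et al. (2008) quote above describes a gazetteer as a list of names in a category and B-I-O tagging as a per-token labeling scheme. A minimal sketch of how the two can fit together follows. It assumes whitespace tokenization, case-insensitive matching, and greedy longest-match lookup. The function names `build_gazetteer` and `bio_tag` and the sample gene names are hypothetical illustrations, not taken from any cited system.

```python
from typing import Iterable, List, Set, Tuple


def build_gazetteer(names: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Store each name as a tuple of lowercased whitespace tokens."""
    return {tuple(name.lower().split()) for name in names if name.strip()}


def bio_tag(tokens: List[str], gazetteer: Set[Tuple[str, ...]]) -> List[str]:
    """Return one B/I/O tag per token using greedy longest-match lookup."""
    tags = ["O"] * len(tokens)
    # Longest entry bounds how many tokens a candidate span can cover.
    max_len = max((len(entry) for entry in gazetteer), default=0)
    lowered = [token.lower() for token in tokens]
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in gazetteer:
                matched = n
                break
        if matched:
            tags[i] = "B"  # first token of the entity
            for j in range(i + 1, i + matched):
                tags[j] = "I"  # continuation tokens
            i += matched
        else:
            i += 1  # token stays "O" (outside any entity)
    return tags


# Illustrative gazetteer entries (assumed, not from the cited papers).
gene_gazetteer = build_gazetteer(["BRCA1", "tumor necrosis factor", "p53"])
sentence = "Mutations in BRCA1 alter tumor necrosis factor signaling".split()
print(list(zip(sentence, bio_tag(sentence, gene_gazetteer))))
```

Pure lookup of this kind misses names absent from the list and cannot resolve ambiguous strings, so NER systems typically use gazetteer matches as one feature among several rather than as the final tagger.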