Open main menu


Stanford Named Entity Recognizer System



    • CRFClassifier is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names, or gene and protein names. The software provides a general (arbitrary order) implementation of linear chain Conditional Random Field (CRF) sequence models, of the sort pioneered by Lafferty, McCallum, and Pereira (2001), coupled with well-engineered feature extractors for Named Entity Recognition. Included are a good 3 class (PERSON, ORGANIZATION, LOCATION) named entity recognizer for English (in versions with and without additional distributional similarity features) and another pair of models trained on the CoNLL 2003 English training data. The distributional similarity features improve performance but the models require considerably more memory.