- (Tablan, 2010) ⇒ Valentin Tablan. (2010). “Toward Portable Information Extraction.” PhD Thesis, University of Sheffield.
- In this thesis we present work which aims to reduce the effort required for creating new Information Extraction (IE) systems, or for adapting existing ones to new purposes. This makes IE techniques more cost-effective in areas where they are currently unsuitable, such as applications where the data volume is small, or where the information needs are changing frequently.
- To support these dynamic needs of IE systems, we created JAPE, a formalism for specifying transduction rules over annotation graphs, and we implemented an execution environment for applying such rules. The features that distinguish JAPE from other pattern-matching frameworks, such as regular expressions, include the use of a graph structure as input, and hierarchical matching based on ontologies.
- In support of machine learning systems, we also developed OLLIE, an environment for computer-assisted collaborative annotation. It incorporates an algorithm for bi-directional translation between textual annotations and feature vectors that supports, in a generic fashion, the application of machine learning methods to any text annotation problem, including IE.
|2010 TowardsPortablaInformationExtraction||Valentin Tablan||Toward Portable Information Extraction||http://nlp.shef.ac.uk/Completed PhD Projects/tablan.pdf||2010|
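
The abstract mentions a bi-directional translation between textual annotations and feature vectors. As a rough illustration of the general idea only, the sketch below converts token-span annotations into per-token BIO class labels (the kind of target a learner would be trained on) and back again. It is not the algorithm from the thesis: the `Annotation` class, the function names, and the BIO encoding itself are assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical token-span annotation (not a GATE or OLLIE type)."""
    start: int   # index of the first token covered
    end: int     # index one past the last token covered
    label: str   # annotation type, e.g. "Person"


def annotations_to_tags(n_tokens, annotations):
    """Forward direction: spans -> one BIO class label per token."""
    tags = ["O"] * n_tokens
    for ann in annotations:
        tags[ann.start] = "B-" + ann.label
        for i in range(ann.start + 1, ann.end):
            tags[i] = "I-" + ann.label
    return tags


def tags_to_annotations(tags):
    """Backward direction: per-token BIO labels -> spans."""
    annotations = []
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            annotations.append(Annotation(start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return annotations


tokens = "Valentin Tablan studied in Sheffield".split()
gold = [Annotation(0, 2, "Person"), Annotation(4, 5, "Location")]
tags = annotations_to_tags(len(tokens), gold)
recovered = tags_to_annotations(tags)
```

In this sketch the round trip is lossless for non-overlapping spans; overlapping or nested annotations would need a richer encoding, which is outside what this example assumes.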