Information Extraction Algorithm
An Information Extraction Algorithm is a data processing algorithm that can be applied by an information extraction system (to solve an information extraction task).
- AKA: IE Algorithm, Information Extraction from Text Algorithm.
 - Context:
- It can range from being an Information Extraction from Tables Algorithm to being an Information Extraction from Text Algorithm to being an Information Extraction from Images Algorithm.
 - It can range from being a Heuristic IE Algorithm to being a Data-Driven IE Algorithm (such as an Unsupervised IE Algorithm, Semi-Supervised IE Algorithm, or Fully-Supervised IE Algorithm).
 - It can be supported by:
- an Information Retrieval Algorithm.
 - a Syntactic Analysis Algorithm.
 - a Lexical Semantic Analysis Algorithm, such as an Entity Mention Recognition Algorithm, Entity Mention Coreference Resolution Algorithm, Entity Mention Normalization Algorithm, or Semantic Relation Mention Recognition Algorithm.
 - a Semantic Relation Recognition Algorithm (e.g. a Semantic Relation Mention Recognition Algorithm).
 - a Duplicate Record Detection Algorithm, to identify records with redundant information.
 - a Record Canonicalization Algorithm, to create a single non-redundant record.
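
As a rough illustration of how a Heuristic IE Algorithm might be combined with the duplicate record detection and record canonicalization steps listed above, the following minimal sketch extracts (organization, location) records with a single hand-written pattern. Everything in it is an assumption made for illustration: the pattern, the suffix list, the sample sentences, and the function names `extract_records`, `normalize`, `group_duplicates`, and `canonicalize` are hypothetical and do not come from any of the cited systems.

```python
import re
from collections import defaultdict

# Hypothetical hand-written extraction pattern (an assumption for this sketch):
# "<Organization> is headquartered|based in <Location>"
PATTERN = re.compile(
    r"(?P<org>[A-Z][\w&.]*(?: [A-Z][\w&.]*)*)"
    r" is (?:headquartered|based) in "
    r"(?P<loc>[A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)


def extract_records(text):
    """Heuristic extraction: return (organization, location) pairs matched by PATTERN."""
    return [(m.group("org"), m.group("loc")) for m in PATTERN.finditer(text)]


def normalize(name):
    """Crude comparison key: lowercase, strip punctuation and a few corporate suffixes."""
    key = re.sub(r"[^\w\s]", "", name.lower())
    key = re.sub(r"\b(inc|corp|corporation|ltd|co)\b", "", key)
    return " ".join(key.split())


def group_duplicates(records):
    """Duplicate record detection: group records that share the same normalized key."""
    groups = defaultdict(list)
    for org, loc in records:
        groups[(normalize(org), normalize(loc))].append((org, loc))
    return list(groups.values())


def canonicalize(records):
    """Record canonicalization: keep one record per duplicate group.

    The record with the longest organization string is kept, which is an
    arbitrary tie-breaking choice made for this sketch.
    """
    return [max(group, key=lambda r: len(r[0])) for group in group_duplicates(records)]


text = (
    "Acme Corp. is headquartered in Springfield. "
    "Reports say Acme Corp is based in Springfield. "
    "Globex is based in New York."
)
records = extract_records(text)
canonical = canonicalize(records)
print(records)
print(canonical)
```

On the sample text, the two surface forms "Acme Corp." and "Acme Corp" fall into the same duplicate group and are reduced to a single record. A data-driven algorithm would typically learn the pattern or the matching function from data instead of hard-coding them.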
 
 
 - Example(s):
- Information Extraction from Text Algorithms, such as: Snowball, AutoSlog, TextRunner Algorithm, KnowItAll Algorithm.
 - any Terminology Extraction Algorithm.
 - any unified IE Algorithm(?) as proposed by (McCallum & Jensen, 2003).
 - …
 
 - Counter-Example(s):
 - See: Relation Recognition from Text Algorithm.
 
References
2008
- (Sarawagi, 2008) ⇒ Sunita Sarawagi. (2008). “Information Extraction.” In: Foundations and Trends in Databases, 1(3). doi:10.1561/1900000003
 
2007
- (McCallum, 2007) ⇒ Andrew McCallum. (2007). “Information Extraction.” In: Introduction to Natural Language Processing, CMPSCI 585, Fall (2007).
 
2006
- (Chang et al., 2006) ⇒ C. H. Chang, M. Kayed, M. R. Girgis, and K. Shaalan. (2006). “A Survey of Web Information Extraction Systems.” In: IEEE Transactions On Knowledge and Data Engineering, 18(10).
 
2005
- (Agichtein, 2005) ⇒ Eugene Agichtein. (2005). “Scaling Information Extraction to Large Document Collections.” In: IEEE Data Eng. Bull., 28(4).
 
2003
- (McCallum & Jensen, 2003) ⇒ Andrew McCallum, and David Jensen. (2003). “A Note on the Unification of Information Extraction and Data Mining using Conditional-Probability, Relational Models.” In: Proceedings of the IJCAI03 Workshop on Learning Statistical Models from Relational Data.
- 1) DM begins from a populated DB, unaware of where the data came from, or its inherent errors and uncertainties.
 - 2) IE is unaware of emerging patterns and regularities in the DB.
 
 
2002
- (Laender et al., 2002) ⇒ Alberto H. F. Laender, Berthier A. Ribeiro-Neto, Altigran S. da Silva, and Juliana S. Teixeira. (2002). “A Brief Survey of Web Data Extraction Tools.” In: SIGMOD Record, 31(2). doi:10.1145/565117.565137
 
2001
- (Yangarber, 2001) ⇒ R. Yangarber. (2001). “Scenario Customization for Information Extraction.” PhD Thesis, New York University.
 
1999
- (Soderland, 1999) ⇒ Stephen Soderland. (1999). “Learning Information Extraction Rules for Semi-structured and Free Text.” In: Machine Learning, 34(1-3):233–272.
 
1997
- (Kushmerick, 1997) ⇒ Nicholas Kushmerick. (1997). “Wrapper Induction for Information Extraction.” Ph.D. Thesis, Dept of Computer Science & Engineering, Univ of Washington. Technical Report UW-CSE-97-11-04.
 
1996
- (Strzalkowski & Wang, 1996) ⇒ Tomek Strzalkowski, and Jin Wang. (1996). “A self-learning universal concept spotter.” In: Proceedings of 16th International Conference on Computational Linguistics (COLING-96), Copenhagen, August 1996.
 
1993
- (Riloff, 1993) ⇒ Ellen Riloff. (1993). “Automatically Constructing a Dictionary for Information Extraction Tasks.” In: Proceedings of AAAI-93.
 - (Cardie, 1993) ⇒ Claire Cardie. (1993). “A Case-based Approach to Knowledge Acquisition for Domain-Specific Sentence Analysis.” In: Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93).