- (Mansuri & Sarawagi, 2006) ⇒ Imran R. Mansuri, Sunita Sarawagi. (2006). “A System for Integrating Unstructured Data into Relational Databases.” In: Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE 2006). doi:10.1109/ICDE.2006.83
Subject Headings: Entity Mention Normalization Algorithm.
- "Record matching techniques are primarily relevant to structured records, but can be extended to semi-structured records (such as paper citations and street addresses) by including text-segmentation as a pre-processing step [1, 17]."
- In this paper we present a system for automatically integrating unstructured text into a multi-relational database using state-of-the-art statistical models for structure extraction and matching. We show how to extend current high-performing models, Conditional Random Fields and their semi-Markov counterparts, to effectively exploit a variety of recognition clues available in a database of entities, thereby significantly reducing the dependence on manually labeled training data. Our system is designed to load unstructured records into columns spread across multiple tables in the database while resolving the relationship of the extracted text with existing column values, and preserving the cardinality and link constraints of the database. We show how to combine the inference algorithms of statistical models with the database-imposed constraints for optimal data integration.
|2006 ASysForIntegUnstrDataIntoRelDBs||Imran R. Mansuri|
|A System for Integrating Unstructured Data into Relational Databases||Proceedings of the 22nd IEEE International Conference on Data Engineering||http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.73.8098&rep=rep1&type=pdf||10.1109/ICDE.2006.83||2006|