2011 WebInformationExtractionUsingMa


(Satpal et al., 2011) ⇒ Sandeepkumar Satpal, Sahely Bhadra, Sundararajan Sellamanickam, Rajeev Rastogi, and Prithviraj Sen. (2011). “Web Information Extraction Using Markov Logic Networks.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020615

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Algorithms; experimentation; information extraction; machine learned models; Markov logic networks; miscellaneous; performance; probabilistic models

Abstract

In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages and sites. We use Markov Logic Networks (MLNs) to capture both content and structural features in a single unified framework, and this enables us to perform more accurate inference. MLNs allow us to model a wide range of rich structural features like proximity, precedence, alignment, and contiguity, using first-order clauses. We show that inference in our information extraction scenario reduces to solving an instance of the maximum weight subgraph problem. We develop specialized procedures for solving the maximum subgraph variants that are far more efficient than previously proposed inference methods for MLNs that solve variants of MAX-SAT. Experiments with real-life datasets demonstrate the effectiveness of our MLN-based approach compared to existing state-of-the-art extraction methods.
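The scoring idea behind an MLN can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the standard MLN view that a labeling's unnormalized log-score is the sum of the weights of the ground clauses it satisfies. The nodes, labels, clause weights, and the single precedence clause ("a name node comes before an address node") are hypothetical, and the exhaustive search stands in for the paper's specialized maximum weight subgraph procedures, which are not reproduced here.

```python
from itertools import product

# Hypothetical text nodes of one page, in document order.
NODES = ["Acme Bistro", "12 Main St", "555-0100"]
LABELS = ["name", "address", "phone", "none"]

# Hypothetical content clauses: (node index, label) -> weight.
# Each ground clause "label(node) = l" is satisfied when the node gets label l.
CONTENT_W = {
    (0, "name"): 1.5,
    (1, "address"): 2.0,
    (1, "name"): 0.5,
    (2, "phone"): 2.5,
}

# Hypothetical structural clause, grounded for every ordered pair (i, j):
#   label(i) = name AND label(j) = address  =>  i precedes j
PRECEDENCE_W = 1.0


def score(assignment):
    """Sum the weights of all ground clauses satisfied by the assignment."""
    total = sum(w for (i, label), w in CONTENT_W.items()
                if assignment[i] == label)
    n = len(assignment)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            antecedent = assignment[i] == "name" and assignment[j] == "address"
            # An implication is satisfied if its antecedent is false
            # or its consequent (i precedes j) is true.
            if not antecedent or i < j:
                total += PRECEDENCE_W
    return total


def map_labeling(num_nodes):
    """Brute-force MAP inference: try every labeling, keep the best score."""
    return max(product(LABELS, repeat=num_nodes), key=score)


best = map_labeling(len(NODES))
print(dict(zip(NODES, best)), score(best))
```

Brute force is exponential in the number of nodes and is only workable for a toy page like this one. The abstract's point is that this inference step can instead be cast as a maximum weight subgraph problem and solved far more efficiently.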

