2009 NamedEntityMiningfromClickthrou
- (Xu et al., 2009) ⇒ Gu Xu, Shuang-Hong Yang, and Hang Li. (2009). “Named Entity Mining from Click-through Data Using Weakly Supervised Latent Dirichlet Allocation.” In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009). doi:10.1145/1557019.1557165
Subject Headings:
Notes
- Categories and Subject Descriptors: H.2.8 Database Management: Data Mining — Log Mining; H.3.3 Information Storage and Retrieval: Information Search and Retrieval — Query formulation.
- General Terms: Algorithms, Experimentation.
Cited By
Quotes
Author Keywords
Named Entity Recognition, Search Log Mining, Topic Model, Web Mining
Abstract
This paper addresses Named Entity Mining (NEM), in which we mine knowledge about named entities such as movies, games, and books from a huge amount of data. NEM is potentially useful in many applications including web search, online advertisement, and recommender system. There are three challenges for the task: finding suitable data source, coping with the ambiguities of named entity classes, and incorporating necessary human supervision into the mining process. This paper proposes conducting NEM by using click-through data collected at a web search engine, employing a topic model that generates the click-through data, and learning the topic model by weak supervision from humans. Specifically, it characterizes each named entity by its associated queries and URLs in the click-through data. It uses the topic model to resolve ambiguities of named entity classes by representing the classes as topics. It employs a method, referred to as Weakly Supervised Latent Dirichlet Allocation (WS-LDA), to accurately learn the topic model with partially labeled named entities. Experiments on a large scale click-through data containing over 1.5 billion query-URL pairs show that the proposed approach can conduct very accurate NEM and significantly outperforms the baseline.
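Editor's illustration (not part of the quoted abstract): the step in which each named entity is characterized by its associated queries and URLs might be sketched as below. This is a toy sketch under stated assumptions, not the authors' implementation. The function name `build_entity_documents`, the `#` placeholder for the entity, the seed entity list, the substring matching, and the `example.com` URLs are all hypothetical choices made for illustration. The topic model itself (WS-LDA) is not implemented here; the code only prepares the per-entity feature counts that such a model could take as input.

```python
from collections import Counter, defaultdict


def build_entity_documents(click_log, entities):
    """Group click-through records by the named entity found in each query.

    click_log: iterable of (query, url) pairs (toy stand-in for a search log).
    entities:  iterable of known entity strings (hypothetical seed list).

    Returns {entity: Counter}, where each counted feature is either
    ("context", query with the entity replaced by '#') or ("url", clicked URL).
    """
    docs = defaultdict(Counter)
    for query, url in click_log:
        q = query.lower()
        for entity in entities:
            e = entity.lower()
            # Assumption: crude substring matching; a real system would
            # need proper tokenization and entity boundary detection.
            if e in q:
                # Replace the entity with '#' and normalize whitespace.
                context = " ".join(q.replace(e, "#", 1).split())
                docs[entity][("context", context)] += 1
                docs[entity][("url", url)] += 1
    return dict(docs)


# Toy click-through log; all queries and URLs are made up.
click_log = [
    ("harry potter trailer", "movies.example.com"),
    ("harry potter book review", "books.example.com"),
    ("harry potter trailer", "movies.example.com"),
    ("halo cheats", "games.example.com"),
]
docs = build_entity_documents(click_log, ["harry potter", "halo"])
```

In this toy log, "harry potter" collects both movie-like and book-like contexts and URLs. That mixture is the kind of class ambiguity the abstract says the topic model is meant to resolve by representing classes as topics.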
References
| | Author | title | journal | doi | year |
|---|---|---|---|---|---|
| 2009 NamedEntityMiningfromClickthrou | Gu Xu, Shuang-Hong Yang, Hang Li | Named Entity Mining from Click-through Data Using Weakly Supervised Latent Dirichlet Allocation | KDD-2009 Proceedings | 10.1145/1557019.1557165 | 2009 |