2017 LightweightMultilingualEntityEx


Subject Headings: Text Analytics.

Notes

Cited By

Quotes

Abstract

Text analytics systems often rely heavily on detecting and linking entity mentions in documents to knowledge bases for downstream applications such as sentiment analysis, question answering and recommender systems. A major challenge for this task is to accurately detect entities in new languages with limited labeled resources. In this paper we present an accurate and lightweight[1], multilingual named entity recognition (NER) and linking (NEL) system. The contributions of this paper are three-fold: 1) lightweight named entity recognition with competitive accuracy; 2) candidate entity retrieval that uses search click-log data and entity embeddings to achieve high precision with a low memory footprint; and 3) efficient entity disambiguation. Our system achieves state-of-the-art performance on TAC KBP 2013 multilingual data and on English AIDA-CoNLL data.
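
As a rough illustration of the second and third contributions, the sketch below retrieves candidate entities for a mention from an alias table and then picks one by combining a prior with embedding similarity. It is not the paper's implementation: the count table (a stand-in for click-log statistics), the three-dimensional entity vectors, the function names and the mixing weight `alpha` are all hypothetical, chosen only to make the idea concrete.

```python
import math
from collections import defaultdict

# Hypothetical (mention, entity) counts standing in for click-log statistics.
CLICK_COUNTS = {
    ("jaguar", "Jaguar_Cars"): 70,
    ("jaguar", "Jaguar_(animal)"): 30,
    ("paris", "Paris"): 95,
    ("paris", "Paris_Hilton"): 5,
}

# Hypothetical toy entity embeddings (real ones would be learned and much larger).
ENTITY_VECS = {
    "Jaguar_Cars": [0.9, 0.1, 0.0],
    "Jaguar_(animal)": [0.0, 0.2, 0.9],
    "Paris": [0.1, 0.9, 0.1],
    "Paris_Hilton": [0.7, 0.3, 0.1],
}


def build_alias_table(click_counts):
    """Normalize raw counts into a prior P(entity | mention)."""
    totals = defaultdict(int)
    for (mention, _entity), count in click_counts.items():
        totals[mention] += count
    table = defaultdict(dict)
    for (mention, entity), count in click_counts.items():
        table[mention][entity] = count / totals[mention]
    return table


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link(mention, context_vec, table, entity_vecs, alpha=0.5):
    """Return the best-scoring candidate entity, or None if there are no candidates.

    The score mixes the mention prior with the similarity between the
    candidate's embedding and a vector summarizing the mention's context.
    """
    candidates = table.get(mention.lower(), {})
    if not candidates:
        return None

    def score(entity):
        prior = candidates[entity]
        similarity = cosine(context_vec, entity_vecs[entity])
        return alpha * prior + (1 - alpha) * similarity

    return max(candidates, key=score)


alias_table = build_alias_table(CLICK_COUNTS)
wildlife_context = [0.0, 0.1, 1.0]  # hypothetical context vector
print(link("Jaguar", wildlife_context, alias_table, ENTITY_VECS))
```

Keeping only a mention-to-candidate table plus one small vector per entity is one plausible way to keep memory use low. How the actual system stores and scores candidates is not stated in the abstract.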


References

  • 1. R. Al-Rfou, V. Kulkarni, B. Perozzi, and S. Skiena. Polyglot-NER: Massive Multilingual Named Entity Recognition. In Proc. ICDM, 2015. doi:10.1137/1.9781611974010.66
  • 2. A. Alhelbawy and R. Gaizauskas. Collective Named Entity Disambiguation Using Graph Ranking and Clique Partitioning Approaches. In Proc. COLING, 2014.
  • 3. S. Austin, R. Schwartz, and P. Placeway. The Forward-backward Search Algorithm. In Proc. ICASSP, 1991. doi:10.1109/ICASSp.1991.150435
  • 4. Roi Blanco, Giuseppe Ottaviano, Edgar Meij, Fast and Space-Efficient Entity Linking for Queries, Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, February 02-06, 2015, Shanghai, China doi:10.1145/2684822.2685317
  • 5. R. Bunescu and M. Pasca. Using Encyclopedic Knowledge for Named Entity Disambiguation. In Proc. EACL, 2006.
  • 6. Diego Ceccarelli, Claudio Lucchese, Salvatore Orlando, Raffaele Perego, Salvatore Trani, Learning Relatedness Measures for Entity Linking, Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, October 27-November 01, 2013, San Francisco, California, USA doi:10.1145/2505515.2505711
  • 7. W. Che, M. Wang, C. D. Manning, and T. Liu. Named Entity Recognition with Bilingual Constraints. In Proc. HLT-NAACL, 2013.
  • 8. X. Cheng and D. Roth. Relational Inference for Wikification. In Proc. EMNLP, 2013.
  • 9. A. Chisholm and B. Hachey. Entity Disambiguation with Web Links. Trans. of the ACL, 3:145--156, 2015.
  • 10. S. Cucerzan. Large-scale Named Entity Disambiguation based on Wikipedia Data. In Proc. EMNLP, 2007.
  • 11. Bhavana Dalvi, Einat Minkov, Partha P. Talukdar, William W. Cohen, Automatic Gloss Finding for a Knowledge Base Using Ontological Constraints, Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, February 02-06, 2015, Shanghai, China doi:10.1145/2684822.2685288
  • 12. Nemanja Djuric, Hao Wu, Vladan Radosavljevic, Mihajlo Grbovic, Narayan Bhamidipati, Hierarchical Neural Language Models for Joint Representation of Streaming Documents and their Content, Proceedings of the 24th International Conference on World Wide Web, May 18-22, 2015, Florence, Italy doi:10.1145/2736277.2741643
  • 13. G. Durrett and D. Klein. A Joint Model for Entity Analysis: Coreference, Typing, and Linking. Trans. of the ACL, 2:477--490, 2014.
  • 14. Peter Elias, Efficient Storage and Retrieval by Content and Address of Static Files, Journal of the ACM (JACM), v.21 n.2, p.246-260, April 1974 doi:10.1145/321812.321820
  • 15. A. Fahrni, B. Heinzerling, T. Göckel, and M. Strube. HITS' Monolingual and Cross-lingual Entity Linking System at TAC 2013. In Proc. TAC, 2013.
  • 16. N. Fernandez Garcia, J. Arias Fisteus, and L. Sanchez Fernandez. Comparative Evaluation of Link-based Approaches for Candidate Ranking in Link-to-wikipedia Systems. Journal of Artificial Intelligence Research, 49:733--773, 2014.
  • 17. Jenny Rose Finkel, Trond Grenager, Christopher Manning, Incorporating Non-local Information Into Information Extraction Systems by Gibbs Sampling, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, p.363-370, June 25-30, 2005, Ann Arbor, Michigan doi:10.3115/1219840.1219885
  • 18. B. J. Frey and D. Dueck. Clustering by Passing Messages Between Data Points. Science, 315(5814):972--976, 2007. doi:10.1126/science.1136800
  • 19. Octavian-Eugen Ganea, Marina Ganea, Aurelien Lucchi, Carsten Eickhoff, Thomas Hofmann, Probabilistic Bag-Of-Hyperlinks Model for Entity Linking, Proceedings of the 25th International Conference on World Wide Web, April 11-15, 2016, Montréal, Québec, Canada doi:10.1145/2872427.2882988
  • 20. Zhaochen Guo, Denilson Barbosa, Robust Entity Linking via Random Walks, Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, November 03-07, 2014, Shanghai, China doi:10.1145/2661829.2661887
  • 21. B. Hachey, W. Radford, and J. R. Curran. Graph-based Named Entity Linking with Wikipedia. In Proc. WISE, 2011. doi:10.1007/978-3-642-24434-6_16
  • 22. D. Hakkani-Tür et al. Probabilistic Enrichment of Knowledge Graph Entities for Relation Detection in Conversational Understanding. In Proc. INTERSPEECH, 2014.
  • 23. Xianpei Han, Le Sun, Jun Zhao, Collective Entity Linking in Web Text: A Graph-based Method, Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, July 24-28, 2011, Beijing, China doi:10.1145/2009916.2010019
  • 24. Z. He et al. Learning Entity Representation for Entity Disambiguation. In Proc. ACL, 2013.
  • 25. J. Hoffart et al. Robust Disambiguation of Named Entities in Text. In Proc. EMNLP, 2011.
  • 26. H. Ji, J. Nothman, and B. Hachey. Overview of TAC-KBP2014 Entity Discovery and Linking Tasks. In Proc. TAC, 2014.
  • 27. Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti, Collective Annotation of Wikipedia Entities in Web Text, Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, June 28-July 01, 2009, Paris, France doi:10.1145/1557019.1557073
  • 28. J. Lafferty, A. McCallum, and F. Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proc. ICML, 2001.
  • 29. G. Lample et al. Neural Architectures for Named Entity Recognition. arXiv preprint arXiv:1603.01360, 2016.
  • 30. Q. Le and T. Mikolov. Distributed Representations of Sentences and Documents. In Proc. ICML, 2014.
  • 31. X. Ling, S. Singh, and D. Weld. Design Challenges for Entity Linking. Trans. of the ACL, 3:315--328, 2015.
  • 32. G. Luo, X. Huang, C.-Y. Lin, and Z. Nie. Joint Named Entity Recognition and Disambiguation. In Proc. EMNLP, 2015.
  • 33. X. Ma and E. Hovy. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF. arXiv preprint arXiv:1603.01354, 2016.
  • 34. E. Meij, K. Balog, and D. Odijk. Entity Linking and Retrieval Tutorial. http://ejmeij.github.io/entity-linking-and-retrieval-tutorial/, 2014.
  • 35. Y. Merhav et al. Basis Technology at TAC 2013 Entity Linking. In Proc. TAC, 2013.
  • 36. T. Mikolov et al. Distributed Representations of Words and Phrases and their Compositionality. In Proc. NIPS, 2013.
  • 37. N. Okazaki. CRFsuite: A Fast Implementation of Conditional Random Fields (CRFs). http://www.chokkan.org/software/crfsuite/, 2007.
  • 38. N. Okazaki and J. Nocedal. libLBFGS: A Library of Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS). http://www.chokkan.org/software/liblbfgs, 2010.
  • 39. A. Passos, V. Kumar, and A. McCallum. Lexicon Infused Phrase Embeddings for Named Entity Resolution. arXiv preprint arXiv:1404.5367, 2014.
  • 40. F. Pedregosa et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825--2830, 2011.
  • 41. D. Rao, P. McNamee, and M. Dredze. Entity Linking: Finding Extracted Entities in a Knowledge Base. In Multi-source, Multilingual Information Extraction and Summarization, pages 93--115. Springer, 2013.
  • 42. L. Ratinov and D. Roth. Design Challenges and Misconceptions in Named Entity Recognition. In Proc. CoNLL, 2009. doi:10.3115/1596374.1596399
  • 43. (Roth et al., 2014) ⇒ D. Roth, H. Ji, M.-W. Chang, and T. Cassidy. Wikification and Beyond: The Challenges of Entity and Concept Grounding. In Proc. ACL, 2014.
  • 44. Wei Shen, Jianyong Wang, Ping Luo, Min Wang, Linking Named Entities in Tweets with Knowledge Base via User Interest Modeling, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 11-14, 2013, Chicago, Illinois, USA doi:10.1145/2487575.2487686
  • 45. M. Shirakawa et al. Entity Disambiguation based on a Probabilistic Taxonomy. Technical Report MSR-TR-2011-125, Microsoft Research, 2011.
  • 46. Avirup Sil, Alexander Yates, Re-ranking for Joint Named-entity Recognition and Linking, Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, October 27-November 01, 2013, San Francisco, California, USA doi:10.1145/2505515.2505601
  • 47. M. Speriosu, N. Sudan, S. Upadhyay, and J. Baldridge. Twitter Polarity Classification with Label Propagation over Lexical Links and the Follower Graph. In Proc. EMNLP, 2011.
  • 48. J. Suzuki and H. Isozaki. Semi-supervised Sequential Labeling and Segmentation Using Giga-word Scale Unlabeled Data. In Proc. ACL-HLT, 2008.
  • 49. Partha Pratim Talukdar, Koby Crammer, New Regularized Algorithms for Transductive Learning, Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II, September 07-11, 2009, Bled, Slovenia doi:10.1007/978-3-642-04174-7_29
  • 50. Erik F. Tjong Kim Sang, Fien De Meulder, Introduction to the CoNLL-2003 Shared Task: Language-independent Named Entity Recognition, Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, p.142-147, May 31, 2003, Edmonton, Canada doi:10.3115/1119176.1119195
  • 51. M. Yu, S. Wang, C. Zhu, and T. Zhao. Semi-supervised Learning for Word Sense Disambiguation Using Parallel Corpora. In Proc. FSKD, 2011.
  • 52. Y. Zhou et al. Resolving Surface Forms to Wikipedia Topics. In Proc. COLING, 2010.
  • 53. Erik F. Tjong Kim Sang, Introduction to the CoNLL-2002 Shared Task: Language-independent Named Entity Recognition, Proceedings of the 6th Conference on Natural Language Learning, p.1-4, August 31, 2002 doi:10.3115/1118853.1118877
  • 54. E. F. Tjong Kim Sang. Introduction to the CoNLL-2002 Shared Task: Language-independent Named Entity Recognition. In Proc. CoNLL, 2002.


  • Page: 2017 LightweightMultilingualEntityEx
  • Author: Aasish Pappu, Roi Blanco, Yashar Mehdad, Amanda Stent, Kapil Thadani
  • Title: Lightweight Multilingual Entity Extraction and Linking
  • DOI: 10.1145/3018661.3018724
  • Year: 2017
  1. By lightweight, we mean easily extensible to additional languages, with a low memory footprint, and fast.