2005 TextMiningAndNLP


  • This paper provides an introduction to this special issue of SIGKDD Explorations devoted to Natural Language Processing and Text Mining.
  • What the field needs now is a sober scientific assessment of which linguistic concepts and NLP techniques are beneficial for which text mining applications. This would involve a clear classification of the various linguistic concepts that might be of use (e.g. part-of-speech, grammatical role, phrasal parsing) and the various technologies for getting at these concepts (e.g. full parsing vs. shallow parsing vs. heuristics to get at role information), as well as a classification of text mining applications and of properties of text and corpora (collections of text data). It would further involve innovative experimental designs and new approaches to evaluation. Finally, it would require some hard work comparing various techniques on a wide range of application types and corpora.