2010 SemanticContAccessUsingDomIndepNLPOntol

From GM-RKB

Subject Headings:

Notes

Quotes

Abstract

  • We present a lightweight, user-centered approach for document navigation and analysis that is based on an ontology of text mining results. This allows us to bring the result of existing text mining pipelines directly to end users. Our approach is domain-independent and relies on existing NLP analysis tasks such as automatic multi-document summarization, clustering, question-answering, and opinion mining. Users can interactively trigger semantic processing services for tasks such as analyzing product reviews, daily news, or other document sets.

1. Introduction

  • In this paper, we focus on user-driven NLP for managing large amounts of textual information. That is, NLP analysis is explicitly requested by a user for a certain task at hand, not pre-computed on a server. To access the results of these analyses, we propose an NLP ontology that focuses on the tasks (like summarization or opinion analysis), rather than the domain (like news, biology, or software engineering).
  • Table 1. Main concepts in the NLP ontology and their definition

| Concept | Definition |
|---|---|
| Document | Set of source URIs containing information in natural language (e.g., news articles, product reviews, blog posts) |
| Content | Natural language text appearing either in a source document or generated as a result from text mining pipelines |
| DocContent | Natural language text appearing in a source document |
| Summary | NLP analysis artifact derived through applying specific algorithms to a set of input documents with optional contextual information |
| SingleSummary | An automatically generated summary of a single input document |
| ShortSumm | A keyphrase-like automatically generated summary indicating the major topics of a single document |
| ClassicalSingleSumm | An essay-like text of user-configurable length containing the most salient information of the source document |
| MultiSummary | An automatically generated summary of a set of input documents |
| ClassicalMultiSumm | An essay-like text of user-configurable length containing the most important (common) topics appearing in all source documents |
| FocusedSumm | An essay-like text of user-configurable length that addresses a specific user context (e.g., concrete questions the user needs to be addressed by the summary, or another reference document in order to find related content) |
| ConstrastiveSumm | Multi-document summarization method that generates (a) the commonalities (shared topics) across all input documents and (b) content specific to a single or subset of documents (contrasts) |
| Chain | Single- or cross-document coreference chain (NLP analysis artifact) |
| Chunk | Specific content fragments generated or manipulated by NLP analysis (e.g., noun phrases, verb, |
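
The concepts in Table 1 read like a small task-oriented taxonomy. The following is a minimal sketch of how such a taxonomy could be held in memory and queried. The parent links are an assumption inferred from the concept names and definitions (for example, that `ShortSumm` specializes `SingleSummary`); the table does not state subclass relations explicitly, and the helper names `ancestors`, `is_a`, and `descendants` are hypothetical, not part of the authors' system.

```python
from typing import Dict, List, Optional

# Concept names are copied from Table 1 as they appear there.
# ASSUMPTION: the parent links below are inferred from names and
# definitions only; the source does not spell out the hierarchy.
PARENT: Dict[str, Optional[str]] = {
    "Document": None,
    "Content": None,
    "DocContent": "Content",
    "Summary": None,
    "SingleSummary": "Summary",
    "ShortSumm": "SingleSummary",
    "ClassicalSingleSumm": "SingleSummary",
    "MultiSummary": "Summary",
    "ClassicalMultiSumm": "MultiSummary",
    "FocusedSumm": "MultiSummary",
    "ConstrastiveSumm": "MultiSummary",
    "Chain": None,
    "Chunk": None,
}


def ancestors(concept: str) -> List[str]:
    """Return the assumed super-concepts of `concept`, nearest first."""
    chain: List[str] = []
    parent = PARENT[concept]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def is_a(concept: str, candidate: str) -> bool:
    """True if `concept` equals `candidate` or specializes it."""
    return concept == candidate or candidate in ancestors(concept)


def descendants(concept: str) -> List[str]:
    """Return all concepts that specialize `concept`, sorted by name."""
    return sorted(c for c in PARENT if c != concept and is_a(c, concept))


if __name__ == "__main__":
    print(ancestors("ShortSumm"))
    print(descendants("MultiSummary"))
```

Because the concepts describe tasks rather than domains, a lookup such as `is_a("FocusedSumm", "Summary")` would apply equally to news articles or product reviews under these assumptions; a real deployment would more likely express the same structure in an ontology language than in a Python dictionary.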


| Page | Author | Title | URL |
|---|---|---|---|
| 2010 SemanticContAccessUsingDomIndepNLPOntol | René Witte, Ralf Krestel | Semantic Content Access using Domain-Independent NLP Ontologies | http://www.l3s.de/web/upload/documents/1/nldb2010-ontonlp.pdf |