2013 USIAnswersNaturalLanguageQuesti

From GM-RKB

Subject Headings: NLIDB

Notes

Cited By

Quotes

Abstract

The paper reports on progress towards the goal of offering easy access to enterprise data to a large number of business users, most of whom are not familiar with the specific syntax or semantics of the underlying data sources. Additional complications come from the nature of the data, which is both structured and unstructured. The proposed solution allows users to express questions in natural language, makes apparent the system's interpretation of the query, and allows easy query adjustment and reformulation. The application is in use by more than 1500 users from Siemens Energy. We evaluate our approach on a data set consisting of fleet data.

1. Introduction

Today's enterprises need to make decisions based on analyzing massive and heterogeneous data sources. More and more aspects of business are driven by data, and as a result more and more business users need access to data. Offering easy access to the right data to diverse business users is of growing importance. There are several challenges that must be overcome to meet this goal. One is the sheer volume: enterprise data is predicted to grow by 800 percent in the next five years. A second is that the biggest part (80 percent) is stored in documents, most of them lacking informative metadata or semantic tags (beyond date, size, and author) that might help in accessing them. A third challenge comes from the need to offer access to this data to different types of users, most of whom are not familiar with the underlying syntax or semantics of the data. Unified Service Intelligence (USI) is a project of Siemens Corporation, Corporate Technologies and Siemens Energy focused on generating actionable insight from large bodies of data in the energy service domain. USI Answers, the focus of this paper, is a sub-project of USI, focused specifically on offering easy and reliable natural language access to the large bodies of data that are used in the planning and delivery of service by Siemens Energy. The focus is on detecting and responding to events and trends more efficiently and enabling new business models.

2. Related Work

Natural Language Understanding (NLU) has long been a goal of AI. Considered an AI-complete task, it consists of mapping a natural language sentence into a complete, unambiguous, formal meaning representation expressed in a formal language that supports other tasks such as automated reasoning or question answering.

Natural language access to databases (NLIDB) is an NLU task where the target language is a structured query language (e.g. SQL). NLIDB has been around for a long time, starting with the LUNAR system (Woods 1970). Early NLIDB systems took a mainly hand-built, syntax-based approach (Woods 1970; Warren and Pereira 1982; Dowding et al. 1993; Bos et al. 1996), which proved to be not only labor-intensive but also brittle. A number of learning approaches were developed (Zelle and Mooney 1996; Miller et al. 1996), followed more recently by (Kate, Wong, and Mooney 2005; Kate and Mooney 2006; Zettlemoyer and Collins 2005; Wong and Mooney 2006; 2007) and (Lu et al. 2008). With two exceptions, (Miller et al. 1996) and (Zettlemoyer and Collins 2005), they all adopted a semantics-driven approach. Academic question answering systems showed great promise: (Gunning et al. 2012) showed that domain experts with little training and no knowledge of the underlying knowledge base can use such systems to answer complex questions in scientific domains like Chemistry, Biology, and Physics.
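To make the NLIDB task and the brittleness of hand-built approaches concrete, the following is a minimal sketch of a single hand-written question pattern mapped to SQL. It is not the USI Answers method or any cited system; the `units` table, its columns, its rows, and the function names are all hypothetical and invented for illustration.

```python
import re
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE units (name TEXT, type TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO units VALUES (?, ?, ?)",
        [
            ("U1", "gas turbine", "Germany"),
            ("U2", "steam turbine", "Germany"),
            ("U3", "gas turbine", "China"),
        ],
    )
    return conn

# One hand-built pattern: "How many <type>s are in <country>?"
PATTERN = re.compile(
    r"^how many (?P<type>.+?)s? are in (?P<country>\w+)\??$",
    re.IGNORECASE,
)

def question_to_sql(question):
    """Map a question to (sql, params), or None if no pattern applies."""
    m = PATTERN.match(question.strip())
    if m is None:
        return None
    sql = "SELECT COUNT(*) FROM units WHERE type = ? AND country = ?"
    return sql, (m.group("type").lower(), m.group("country"))

def answer(conn, question):
    """Run the generated query; None means the question was not understood."""
    parsed = question_to_sql(question)
    if parsed is None:
        return None
    sql, params = parsed
    return conn.execute(sql, params).fetchone()[0]

if __name__ == "__main__":
    conn = build_demo_db()
    print(answer(conn, "How many gas turbines are in Germany?"))
    # A paraphrase with the same meaning falls outside the pattern.
    print(answer(conn, "Count the gas turbines located in Germany"))
```

The second question has the same meaning as the first but matches no rule, so the sketch returns `None`. Covering such paraphrases by hand means adding a rule per phrasing, which is where the labor cost and brittleness noted above come from.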

Recently there has been an emerging interest from the industry sector in having computer systems not only analyze the vast amount of relevant information (Ferrucci et al. 2010), but also provide an intuitive user interface for posing questions in natural language in an interactive dialogue manner (Sonntag 2009; Waltinger, Breuing, and Wachsmuth 2012). Several industrial applications of question answering have raised the interest in and awareness of question answering as an effective way to interact with a system: IBM Watson's Jeopardy challenge (Ferrucci et al. 2010) showed that open-domain QA can be done accurately and at scale. Wolfram Alpha's[1] computational knowledge engine, centered around Mathematica, is one source behind Apple's Siri[2], which has proven a successful interaction medium for mobile devices.

References

Dan Tecuci, Ulli Waltinger, Mihaela Olteanu, Vlad Mocanu, and Sean Sullivan. (2013). "USI Answers: Natural Language Question Answering Over (Semi-) Structured Industry Data."