- (Fayyad et al., 1996d) ⇒ Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth. (1996). “From Data Mining to Knowledge Discovery in Databases.” In: AI Magazine, 17(3).
Subject Headings: Data Mining Discipline.
- Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future research directions in the field.
Data Mining and KDD
- Historically, the notion of finding useful patterns in data has been given a variety of names, including data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing. The term data mining has mostly been used by statisticians, data analysts, and the management information systems (MIS) communities. It has also gained popularity in the database field. The phrase knowledge discovery in databases was coined at the first KDD workshop in 1989 (Piatetsky-Shapiro 1991) to emphasize that knowledge is the end product of a data-driven discovery. It has been popularized in the AI and machine-learning fields.
- In our view, KDD refers to the overall process of discovering useful knowledge from data, and data mining refers to a particular step in this process. Data mining is the application of specific algorithms for extracting patterns from data. The distinction between the KDD process and the data-mining step (within the process) is a central point of this article. The additional steps in the KDD process, such as data preparation, data selection, data cleaning, incorporation of appropriate prior knowledge, and proper interpretation of the results of mining, are essential to ensure that useful knowledge is derived from the data. Blind application of data-mining methods (rightly criticized as data dredging in the statistical literature) can be a dangerous activity, easily leading to the discovery of meaningless and invalid patterns.
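The distinction drawn above, with data mining as one step inside a larger KDD process, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the article: the toy purchase records, the function names (`select`, `clean`, `mine`, `interpret`, `kdd`), the choice of item-pair counting as the mining algorithm, and the `min_support` threshold standing in for prior knowledge.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: (customer_id, purchased items).
# None marks a missing basket; some item names are inconsistently formatted.
RAW = [
    (1, ["bread", "milk"]),
    (2, ["bread", "milk", "eggs"]),
    (3, None),
    (4, ["Bread ", "MILK"]),
    (5, ["eggs"]),
    (6, ["bread", "milk"]),
]


def select(records):
    """Data selection: keep only the field relevant to the task (the basket)."""
    return [basket for _, basket in records]


def clean(baskets):
    """Data cleaning: drop missing baskets and normalize item names."""
    return [
        frozenset(item.strip().lower() for item in basket)
        for basket in baskets
        if basket is not None
    ]


def mine(baskets):
    """Data-mining step: count how often each pair of items co-occurs."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def interpret(counts, n_baskets, min_support):
    """Interpretation: keep only pairs whose support reaches the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in counts.items()
        if count / n_baskets >= min_support
    }


def kdd(records, min_support=0.5):
    """Overall process: selection, cleaning, mining, then interpretation."""
    baskets = clean(select(records))
    return interpret(mine(baskets), len(baskets), min_support)


if __name__ == "__main__":
    print(kdd(RAW))
```

In this sketch `mine` is the only function that extracts patterns. Calling it directly on the selected but uncleaned baskets would fail on the missing record, and skipping `interpret` would report every co-occurring pair regardless of how rarely it appears, which is a small-scale version of the blind application the authors warn against.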
The Interdisciplinary Nature of KDD
- Related AI research fields include machine discovery, which targets the discovery of empirical laws from observation and experimentation (Shrager and Langley 1990) (see Kloesgen and Zytkow for a glossary of terms common to KDD and machine discovery), and causal modeling for the inference of causal models from data (Spirtes, Glymour, and Scheines 1993). ...
|Citation key|First author|Title|Venue|URL|Year|
|1996 FromDataMiningToKDInDBs|Usama M. Fayyad|From Data Mining to Knowledge Discovery in Databases|AI Magazine|https://www.aaai.org/aitopics/assets/PDF/AIMag17-03-2-article.pdf|1996|