Machine Learning Research Task

A Machine Learning Research Task is a research task that addresses a machine learning research question (one that investigates how machines can improve their performance over time).



  • (Wikipedia, 2015) ⇒ Retrieved:2015-7-18.
    • As a scientific endeavour, machine learning grew out of the quest for artificial intelligence. Already in the early days of AI as an academic discipline, some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed "neural networks"; these were mostly perceptrons and other models that were later found to be reinventions of the generalized linear models of statistics. Probabilistic reasoning was also employed, especially in automated medical diagnosis.

      However, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. Probabilistic systems were plagued by theoretical and practical problems of data acquisition and representation. By 1980, expert systems had come to dominate AI, and statistics was out of favor. Work on symbolic/knowledge-based learning did continue within AI, leading to inductive logic programming, but the more statistical line of research was now outside the field of AI proper, in pattern recognition and information retrieval. Neural networks research had been abandoned by AI and computer science around the same time. This line, too, was continued outside the AI/CS field, as "connectionism", by researchers from other disciplines including Hopfield, Rumelhart and Hinton. Their main success came in the mid-1980s with the reinvention of backpropagation.

      Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics and probability theory. It also benefited from the increasing availability of digitized information, and the possibility to distribute that via the internet.

      Machine learning and data mining often employ the same methods and overlap significantly. They can be roughly distinguished as follows:

      • Machine learning focuses on prediction, based on known properties learned from the training data.
      • Data mining focuses on the discovery of (previously) unknown properties in the data. This is the analysis step of Knowledge Discovery in Databases.
      The two areas overlap in many ways: data mining uses many machine learning methods, but often with a slightly different goal in mind. On the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to reproduce known knowledge, while in Knowledge Discovery and Data Mining (KDD) the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data.

      Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.
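
The contrast between minimizing training loss and minimizing loss on unseen samples can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the quoted source: the toy data generator `make_data`, the squared-error loss, and the two hypothetical learners (`fit_memorizer`, which stores the training set verbatim, and `fit_line`, which fits a least-squares line).

```python
def mse(predict, examples):
    """Mean squared error of a predictor over (x, y) examples."""
    return sum((predict(x) - y) ** 2 for x, y in examples) / len(examples)

def fit_line(examples):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(examples):
    """Stores every training pair; falls back to the mean label otherwise."""
    table = dict(examples)
    default = sum(y for _, y in examples) / len(examples)
    return lambda x: table.get(x, default)

def make_data(xs):
    """Toy data (assumed): y = 2x + 1 with deterministic alternating noise."""
    return [(x, 2.0 * x + 1.0 + (0.5 if i % 2 == 0 else -0.5))
            for i, x in enumerate(xs)]

train = make_data([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
unseen = make_data([0.5, 1.5, 2.5, 3.5, 4.5, 5.5])

line = fit_line(train)
memorizer = fit_memorizer(train)

# Training loss is what an optimizer drives down;
# unseen loss is what machine learning cares about.
train_loss_line = mse(line, train)
train_loss_memorizer = mse(memorizer, train)
unseen_loss_line = mse(line, unseen)
unseen_loss_memorizer = mse(memorizer, unseen)

print("line:      train", train_loss_line, "unseen", unseen_loss_line)
print("memorizer: train", train_loss_memorizer, "unseen", unseen_loss_memorizer)
```

In this sketch the memorizer reaches zero training loss, which is optimal from a pure optimization standpoint, yet the fitted line has the lower loss on the unseen samples.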


    • Machine Learning is a discipline dedicated to the design and study of artificial learning systems, particularly systems that learn from examples.
  • (Friedman, 2009) ⇒ Jerome H. Friedman
    • QUOTE: One such area that is receiving considerable recent attention is machine learning ("neural networks"). Here one has a system under study that responds to a set of simultaneous input signals. The response is characterized by a set of output signals. The goal is to learn the relationship between the inputs and the outputs in the most general way possible. This exercise generally has two purposes: prediction and understanding. With prediction one is given a set of input values and wishes to predict or forecast likely values of the corresponding outputs without having to actually run the system. Sometimes prediction is the only purpose. Often, however, one wishes to use the derived relationship to gain understanding of how the system works. Such knowledge is often useful in its own right, for example in science, or it may be used to help improve the characteristics of the system, as in industrial or engineering applications.
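
The two purposes named in the quote, prediction and understanding, can be sketched with a small example. All names and numbers here are assumptions for illustration, not Friedman's method: `system` is a hypothetical system with two input signals, and `fit_linear` is a plain gradient-descent fit of a linear model with a squared-error loss.

```python
def system(x1, x2):
    """Hypothetical system under study: responds to x1 and ignores x2."""
    return 3.0 * x1 + 0.5

# Observed input signals and the corresponding output signals.
inputs = [(x1, x2) for x1 in (-1.0, 0.0, 1.0) for x2 in (-1.0, 0.0, 1.0)]
outputs = [system(x1, x2) for x1, x2 in inputs]

def fit_linear(inputs, outputs, lr=0.1, steps=500):
    """Fit y ~ w1*x1 + w2*x2 + b by gradient descent on mean squared error."""
    w1 = w2 = b = 0.0
    n = len(inputs)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in zip(inputs, outputs):
            err = (w1 * x1 + w2 * x2 + b) - y
            g1 += 2.0 * err * x1 / n
            g2 += 2.0 * err * x2 / n
            gb += 2.0 * err / n
        w1 -= lr * g1
        w2 -= lr * g2
        b -= lr * gb
    return w1, w2, b

w1, w2, b = fit_linear(inputs, outputs)

def predict(x1, x2):
    """Prediction: estimate the output without running the system."""
    return w1 * x1 + w2 * x2 + b

prediction = predict(2.0, 5.0)

# Understanding: inspect the learned relationship itself.
print("prediction for (2.0, 5.0):", prediction)
print("learned weights:", w1, w2, "bias:", b)
```

Here the fitted weights serve the "understanding" purpose: a weight near zero on `x2` suggests that the second input has little influence on this toy system, while `predict` serves the "prediction" purpose.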


  • (Mitchell, 2006) ⇒ Tom M. Mitchell. (2006). "The Discipline of Machine Learning." Machine Learning Department technical report CMU-ML-06-108, Carnegie Mellon University.
    • QUOTE: A Machine Learning research[er] asks "How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
    • QUOTE: Whereas Computer Science has focused primarily on how to manually program computers, Machine Learning focuses on the question of how to get computers to program themselves (from experience plus some initial structure).
    • QUOTE: Whereas Statistics has focused primarily on what conclusions can be inferred from data, Machine Learning incorporates additional questions about what computational architectures and algorithms can be used to most effectively capture, store, index, retrieve and merge these data, how multiple learning subtasks can be orchestrated in a larger system, and questions of computational tractability.