# 1997 MachineLearning

- (Mitchell, 1997) ⇒ Tom M. Mitchell. (1997). “Machine Learning.” McGraw-Hill. ISBN:0070428077

**Subject Headings:** Machine Learning Textbook.

## Notes

- Book Website: http://cs.cmu.edu/~tom/mlbook.html
- It can cover ML topics, such as: Target Function, Lazy Learner, Eager Learner, Supervised Learning Algorithm, Concept Learning Algorithm, General-to-Specific Ordering Algorithm, Decision Tree Learning Algorithm, Artificial Neural Network, Hypothesis Evaluation, Bayesian Learning Algorithm, Computational Learning Theory, Instance-based Learning Algorithm, Genetic Algorithm, Rule Learning Algorithm, Analytical Learning Algorithm, Reinforcement Learning Algorithm, Regression Algorithm, Residual, Kernel Function.

## Cited By

- (Bishop, 2007) ⇒ Christopher M. Bishop. (2006). “Pattern Recognition and Machine Learning.” Springer, Information Science and Statistics.

## Quotes

### Table of Contents

- 1. Introduction
- 2. Concept Learning and the General-to-Specific Ordering
- 3. Decision Tree Learning
- 4. Artificial Neural Networks
- 5. Evaluating Hypotheses
- 6. Bayesian Learning
- 7. Computational Learning Theory
- 8. Instance-based Learning
- 9. Genetic Algorithms
- 10. Learning Sets of Rules
- 11. Analytical Learning
- 12. Combining Inductive and Analytical Learning
- 13. Reinforcement Learning

### 1.2.2 Choosing the Target Function.

The next design choice is to determine exactly what type of knowledge will be learned and how this will be used by the performance program. … Let us call this target function [math]\displaystyle{ V }[/math] and again use the notation [math]\displaystyle{ V : B \rightarrow \mathbb{R} }[/math] to denote that [math]\displaystyle{ V }[/math] maps any legal board state from the set [math]\displaystyle{ B }[/math] to some real value. We intend for this target function [math]\displaystyle{ V }[/math] to assign higher scores to better board states … Thus, we have reduced the learning task in this case to the problem of discovering an *operational description of the ideal target function V*. It may be very difficult in general to learn such an operational form of [math]\displaystyle{ V }[/math] perfectly. In fact, we often expect learning algorithms to acquire only some *approximation* to the target function, and for this reason the process of learning the target function is often called *function approximation*. In the current discussion we will use the symbol [math]\displaystyle{ \hat{V} }[/math] to refer to the function that is actually learned by our program, to distinguish it from the ideal target function [math]\displaystyle{ V }[/math].
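As an illustration that is not part of the quoted text, the distinction between the ideal target function and the learned approximation can be sketched in a few lines of Python. Everything specific here is an assumption made for the example: boards are reduced to two hypothetical numeric features, the learned function is taken to be linear in those features, and the weights are fitted with a simple LMS-style update. The ideal `V` is written out only so that training values can be generated; in a real task it would be unknown.

```python
# Hypothetical ideal target function V: maps a "board" (two numeric features)
# to a real value. Known here only so we can generate training values.
def V(board):
    x1, x2 = board
    return 3.0 * x1 - 2.0 * x2 + 1.0


def fit_lms(examples, lr=0.01, epochs=2000):
    """Fit weights of a linear approximation with an LMS-style update."""
    w = [0.0, 0.0, 0.0]  # bias, weight for x1, weight for x2
    for _ in range(epochs):
        for (x1, x2), target in examples:
            prediction = w[0] + w[1] * x1 + w[2] * x2
            error = target - prediction
            # Move each weight in proportion to the error and its feature.
            w[0] += lr * error
            w[1] += lr * error * x1
            w[2] += lr * error * x2
    return w


def make_v_hat(weights):
    """Return the learned function V_hat defined by the fitted weights."""
    w0, w1, w2 = weights

    def v_hat(board):
        x1, x2 = board
        return w0 + w1 * x1 + w2 * x2

    return v_hat


# Training examples: (board, training value) pairs on a small grid.
boards = [(x1, x2) for x1 in range(4) for x2 in range(4)]
examples = [(b, V(b)) for b in boards]

v_hat = make_v_hat(fit_lms(examples))
print(V((2, 1)), v_hat((2, 1)))  # the two values should nearly coincide
```

Because the assumed target is itself linear and noise-free, the approximation can become essentially exact here; with a richer target, `v_hat` would remain only an approximation of `V`.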

### 8.2.3

Much of the literature on nearest-neighbor methods and weighted local regression uses a terminology that has arisen from the field of statistical pattern recognition....

*Regression* means approximating a real-valued target function. *Residual* is the error [math]\displaystyle{ \hat{f}(x) - f(x) }[/math] in approximating the target function. *Kernel function* is the function of distance that is used to determine the weight of each training example. In other words, the kernel function is the function [math]\displaystyle{ K }[/math] such that [math]\displaystyle{ w_i = K(d(x_i, x_q)) }[/math].
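The three terms can be made concrete with a small Python sketch, offered as an illustration and not as part of the quoted text. The choices below are assumptions: `d` is Euclidean distance, `K` is a Gaussian kernel with a hypothetical width `sigma`, the prediction is a kernel-weighted average of the training values, and `f` is a made-up target used only to compute a residual.

```python
import math


def d(x_i, x_q):
    """Euclidean distance between a training instance and the query."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_q)))


def K(distance, sigma=1.0):
    """Gaussian kernel (an assumed choice): weight decays with distance."""
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


def kernel_weights(train_x, x_q, sigma=1.0):
    """w_i = K(d(x_i, x_q)) for every training instance x_i."""
    return [K(d(x_i, x_q), sigma) for x_i in train_x]


def f_hat(train_x, train_y, x_q, sigma=1.0):
    """Kernel-weighted average of training values at the query point."""
    w = kernel_weights(train_x, x_q, sigma)
    return sum(w_i * y_i for w_i, y_i in zip(w, train_y)) / sum(w)


def f(x):
    """Hypothetical real-valued target function."""
    return x[0] ** 2


train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [f(x) for x in train_x]
x_q = (1.5,)

weights = kernel_weights(train_x, x_q)
residual = f_hat(train_x, train_y, x_q) - f(x_q)  # f_hat(x) - f(x)
print(weights, residual)
```

Training instances closer to `x_q` receive larger weights, and the residual is simply the gap between the approximation and the target at the query point.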

### 8.3 Locally Weighted Regression

### 8.6 Remarks on Lazy and Eager Learning

In this chapter we considered three *lazy* learning methods: the *k*-Nearest Neighbor algorithm, locally weighted regression, and case-based reasoning. We call these methods lazy because they defer the decision of how to generalize beyond the training data until each new query instance is encountered. We also discussed one *eager* learning method: the method for learning radial basis function networks. We call this method eager because it generalizes beyond the training data before observing the new query, committing at training time to the network structure and weights that define its approximation to the target function. In this same sense, every other algorithm discussed elsewhere in this book (e.g., Backpropagation, C4.5) is an eager learning algorithm.

- Lazy methods may consider the query instance [math]\displaystyle{ x_q }[/math] when deciding how to generalize beyond the training data [math]\displaystyle{ D }[/math].
- Eager methods cannot. By the time they observe the query instance [math]\displaystyle{ x_q }[/math], they have already chosen their (global) approximation to the target function.

…

The key point in the above paragraph is that a lazy learner has the option of (implicitly) representing the target function by a combination of many local approximations, whereas an eager learner must commit at training time to a single global approximation. The distinction between eager and lazy learning is thus related to the distinction between global and local approximations to the target function.
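The contrast can be sketched in Python as an illustration outside the quoted text. The class names `LazyKNN` and `EagerLine`, the one-dimensional data, and the V-shaped target are all hypothetical. The eager learner here is a plain least-squares line, chosen for brevity in place of the radial basis function network discussed in the chapter.

```python
class LazyKNN:
    """Lazy learner: stores the data and generalizes only at query time."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))  # no generalization happens here
        return self

    def predict(self, x_q):
        # The query x_q determines which training examples are used.
        nearest = sorted(self.data, key=lambda p: abs(p[0] - x_q))[: self.k]
        return sum(y for _, y in nearest) / len(nearest)


class EagerLine:
    """Eager learner: commits to one global line at training time."""

    def fit(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x_q):
        # The same global approximation is used for every query.
        return self.intercept + self.slope * x_q


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [abs(x - 2.0) for x in xs]  # hypothetical V-shaped target

lazy = LazyKNN(k=1).fit(xs, ys)
eager = EagerLine().fit(xs, ys)

for x_q in (0.0, 2.0, 4.0):
    print(x_q, lazy.predict(x_q), eager.predict(x_q))
```

On this data the single global line is nearly flat and misses the V shape, while the lazy learner follows it by forming a separate local approximation around each query.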

## References


| | Author | title | titleUrl | year |
|---|---|---|---|---|
| 1997 MachineLearning | Tom M. Mitchell | Machine Learning | http://www.cs.cmu.edu/~tom/mlbook.html | 1997 |