2009 TheElementsOfStatisticalLearning

From GM-RKB

Subject Headings: Statistical Learning, Supervised Learning, Linear Methods for Regression, Linear Methods for Classification, Basis Expansion and Regularization, Kernel Method, Model Assessment, Model Selection, Model Inference and Model Averaging, Additive Model, Boosting, Additive Tree, Artificial Neural Network, Support Vector Machines, Flexible Discriminants, Prototype Method, k-Nearest Neighbor Algorithm, Unsupervised Learning

Notes

Cited By

2013

Quotes

Book Overview

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting --- the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates.

Table of Contents

  • 1 Introduction
  • 2 Overview of Supervised Learning
  • 3 Linear Methods for Regression
  • 4 Linear Methods for Classification
  • 5 Basis Expansions and Regularization
  • 6 Kernel Methods
  • 7 Model Assessment and Selection
  • 8 Model Inference and Averaging
  • 9 Additive Models, Trees, and Related Methods https://www.stat.auckland.ac.nz/~yee/784/files/ch09AdditiveModelsTrees.pdf
  • 10 Boosting and Additive Trees
  • 11 Neural Networks
  • 12 Support Vector Machines and Flexible Discriminants
  • 13 Prototype Methods and Nearest-Neighbors
  • 14 Unsupervised Learning

2 Overview of Supervised Learning

3 Linear Methods for Regression

4 Linear Methods for Classification

5 Basis Expansions and Regularization

6 Kernel Methods

7 Model Assessment and Selection

8 Model Inference and Averaging

9 Additive Models, Trees, and Related Methods

In this chapter we begin our discussion of some specific methods for supervised learning. These techniques each assume a (different) structured form for the unknown regression function, and by doing so they finesse the curse of dimensionality. Of course, they pay the possible price of misspecifying the model, and so in each case there is a tradeoff that has to be made. They take off where Chapters 3–6 left off. We describe five related techniques: generalized additive models, trees, multivariate adaptive regression splines, the patient rule induction method, and hierarchical mixtures of experts.

9.1 Generalized Additive Models

Regression models play an important role in many data analyses, providing prediction and classification rules, and data analytic tools for understanding the importance of different inputs.

Although attractively simple, the traditional linear model often fails in these situations: in real life, effects are often not linear. In earlier chapters we described techniques that used predefined basis functions to achieve nonlinearities. This section describes more automatic flexible statistical methods that may be used to identify and characterize nonlinear regression effects. These methods are called “generalized additive models.”
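As a rough illustration of the idea, an additive model replaces the linear predictor with a sum of smooth one-variable functions, E(Y | X) = alpha + f_1(X_1) + ... + f_p(X_p). The sketch below fits such a model by backfitting, cycling over the inputs and smoothing the partial residuals against each one in turn. It is a minimal sketch and not the book's implementation: the function names `kernel_smooth` and `fit_additive`, the Gaussian kernel smoother, the `bandwidth` value, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def kernel_smooth(x, r, bandwidth=0.3):
    """Gaussian-kernel weighted average of r, evaluated at each point of x.

    (Hypothetical helper; any scatterplot smoother could be used instead.)
    """
    diffs = (x[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    return weights @ r / weights.sum(axis=1)


def fit_additive(X, y, n_iter=20, bandwidth=0.3):
    """Backfitting sketch for y ~ alpha + sum_j f_j(X[:, j])."""
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))  # f[:, j] holds f_j evaluated at the training points
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove the intercept and every other component.
            partial = y - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = kernel_smooth(X[:, j], partial, bandwidth)
            # Centre each component so the intercept stays identifiable.
            f[:, j] -= f[:, j].mean()
    return alpha, f


# Synthetic data (an assumption): one sinusoidal and one quadratic effect.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 2))
y = np.sin(2 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, size=300)

alpha, f = fit_additive(X, y)
additive_fit = alpha + f.sum(axis=1)

# Ordinary least-squares linear fit on the same inputs, for comparison.
A = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_fit = A @ beta

mse_additive = np.mean((y - additive_fit) ** 2)
mse_linear = np.mean((y - linear_fit) ** 2)
print(f"training MSE, additive: {mse_additive:.3f}  linear: {mse_linear:.3f}")
```

On data like these, where the effects are nonlinear, the additive fit should track the response much more closely than the linear fit. Plotting each column of `f` against its input gives a picture of the estimated shape of that input's effect.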

10 Boosting and Additive Trees

11 Neural Networks

12 Support Vector Machines and Flexible Discriminants

13 Prototype Methods and Nearest-Neighbors

14 Unsupervised Learning


2009 TheElementsOfStatisticalLearning
Author: Jerome H. Friedman, Trevor Hastie
Title: The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd edition)
Title URL: http://www-stat.stanford.edu/~hastie/Papers/ESLII.pdf