2012 DesignPrinciplesofMassiveRobust


Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Most data mining research is concerned with building high-quality classification models in isolation. In massive production systems, however, the ability to monitor and maintain performance over time while growing in size and scope is equally important. Many external factors may degrade classification performance including changes in data distribution, noise or bias in the source data, and the evolution of the system itself. A well-functioning system must gracefully handle all of these. This paper lays out a set of design principles for large-scale autonomous data mining systems and then demonstrates our application of these principles within the m6d automated ad targeting system. We demonstrate a comprehensive set of quality control processes that allow us to monitor and maintain thousands of distinct classification models automatically, and to add new models, take on new data, and correct poorly-performing models without manual intervention or system disruption.

References


Author: Foster Provost, Brian Dalessandro, Claudia Perlich, Ori Stitelman, Troy Raeder
Title: Design Principles of Massive, Robust Prediction Systems
DOI: 10.1145/2339530.2339740
Year: 2012