2015 MonitoringLeastSquaresModelsofD

From GM-RKB
Jump to navigation Jump to search

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Least squares regression is widely used to understand and predict data behavior in many fields. As data evolves, regression models must be recomputed, and indeed much work has focused on quick, efficient and accurate computation of linear regression models. In distributed streaming settings, however, periodically recomputing the global model is wasteful: communicating new observations or model updates is required even when the model is, in practice, unchanged. This is prohibitive in many settings, such as in wireless sensor networks, or when the number of nodes is very large. The alternative, monitoring prediction accuracy, is not always sufficient: in some settings, for example, we are interested in the model's coefficients, rather than its predictions. We propose the first monitoring algorithm for multivariate regression models of distributed data streams that guarantees a bounded model error. It maintains an accurate estimate using a fraction of the communication by recomputing only when the precomputed model is sufficiently far from the (hypothetical) current global model. When the global model is stable, no communication is needed.

Experiments on real and synthetic datasets show that our approach reduces communication by up to two orders of magnitude while providing an accurate estimate of the current global model in all nodes.

References

;

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2015 MonitoringLeastSquaresModelsofDAssaf Schuster
Moshe Gabel
Daniel Keren
Monitoring Least Squares Models of Distributed Streams10.1145/2783258.27833492015