Outlier Detection Task
- AKA: Anomaly Detection.
- It can range from being a Univariate Outlier Detection Task to being a Multivariate Outlier Detection Task (such as a bivariate outlier detection task).
- It can range from being a Numerical Outlier Detection Task to being a Categorical Outlier Detection Task.
- It can range from being an Unordered-Data Outlier Detection Task to being a Sequential-Data Outlier Detection Task (such as a temporal outlier detection task).
- It can range from being an I.I.D. Outlier Detection Task to being a Non-I.I.D. Outlier Detection Task.
- It can be solved by an Outlier Detection System (that applies an outlier detection algorithm).
- It can support an Outlier Mining Task (to find interesting outliers).
- See: Pattern Detection Task, Cost-Benefit Matrix, Distance Measure, Statistical Process Control.
- (Aggarwal, 2013) ⇒ Charu C. Aggarwal. (2013). “Outlier Analysis.” Springer Publishing Company, Incorporated. ISBN:1461463955, 9781461463955. doi:10.1007/978-1-4614-6396-2
- (Hauskrecht et al., 2013) ⇒ Milos Hauskrecht, Iyad Batal, Michal Valko, Shyam Visweswaran, Gregory F Cooper, and Gilles Clermont. (2013). “Outlier Detection for Patient Monitoring and Alerting.” In: Journal of Biomedical Informatics, 46(1). doi:10.1016/j.jbi.2012.08.004
- (Chandola et al., 2012) ⇒ Varun Chandola, Arindam Banerjee, and Vipin Kumar. (2012). “Anomaly Detection for Discrete Sequences: A Survey.” In: IEEE Transactions on Knowledge and Data Engineering, 24(5). doi:10.1109/TKDE.2010.235
- (Chandola et al., 2009) ⇒ Varun Chandola, Arindam Banerjee, and Vipin Kumar. (2009). “Anomaly Detection: A Survey.” In: ACM Computing Surveys, 41(3). doi:10.1145/1541880.1541882
- QUOTE: Anomaly detection refers to the problem of finding patterns in data that do not conform to expected behavior. These non-conforming patterns are often referred to as anomalies, outliers, discordant observations, exceptions, aberrations, surprises, peculiarities or contaminants in different application domains. Of these, anomalies and outliers are two terms used most commonly in the context of anomaly detection; sometimes interchangeably. Anomaly detection finds extensive use in a wide variety of applications such as fraud detection for credit cards, insurance or health care, intrusion detection for cyber-security, fault detection in safety critical systems, and military surveillance for enemy activities.
- (Ben-Gal, 2005) ⇒ Irad E. Ben-Gal. (2005). “Outlier Detection.” In: Maimon O. and Rokach L. (Eds.) “Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers.” Kluwer Academic Publishers. ISBN:0387244352.
- ABSTRACT: Outlier detection is a primary step in many data-mining applications. We present several methods for outlier detection, while distinguishing between univariate vs. multivariate techniques and parametric vs. nonparametric procedures. In presence of outliers, special attention should be taken to assure the robustness of the used estimators. Outlier detection for data mining is often based on distance measures, clustering and spatial methods.
- (Hodge & Austin, 2004) ⇒ Victoria Hodge, and Jim Austin. (2004). “A Survey of Outlier Detection Methodologies.” In: Artificial Intelligence Review, 22(2). doi:10.1023/B:AIRE.0000045502.10941.a9
- (Markou & Singh, 2003) ⇒ Markos Markou, and Sameer Singh. (2003). “Novelty Detection: A Review — part 1: Statistical Approaches.” In: Signal processing, 83(12).
- QUOTE: Novelty detection is the identification of new or unknown data or signal that a machine learning system is not aware of during training. Novelty detection is one of the fundamental requirements of a good classification or identification system since sometimes the test data contains information about objects that were not known at the time of training the model. In this paper we provide state-of-the-art review in the area of novelty detection based on statistical approaches.
- (Rousseeuw & Leroy, 2003) ⇒ Peter J. Rousseeuw, and Annick M. Leroy. (2003). “Robust Regression and Outlier Detection.” Wiley-IEEE. ISBN:0471488550
- (Breunig et al., 2000) ⇒ Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander. (2000). “LOF: identifying density-based local outliers.” In: Proceedings of ACM SIGMOD International Conference on Management of Data (SIGMOD 2000). doi:10.1145/335191.335388
- (Knorr & Ng, 1998) ⇒ E. Knorr, and Raymond Ng. (1998). “Algorithms for Mining Distance-based Outliers in Large Data Sets.” In: Proceedings of the 24th International Conference on Very Large Databases (VLDB 1998).
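
The univariate-to-multivariate range noted in the context list can be illustrated with a minimal sketch. It is not taken from any of the cited works: the function names `iqr_outliers` and `knn_distance_scores`, the fence multiplier `k=1.5`, and the neighbour count `k=2` are assumptions chosen for illustration. The first function applies a simple interquartile-range fence to a single numeric variable; the second scores multivariate points by the Euclidean distance to their k-th nearest neighbour, one common flavour of distance-based detection.

```python
import math
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Univariate sketch: return values outside [Q1 - k*IQR, Q3 + k*IQR].

    The multiplier k=1.5 is a conventional default, assumed here.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def knn_distance_scores(points, k=2):
    """Multivariate sketch: score each point by the Euclidean distance
    to its k-th nearest neighbour (a larger score means more outlying).

    Brute-force O(n^2) comparison; requires len(points) > k.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


if __name__ == "__main__":
    readings = [10, 12, 11, 13, 12, 11, 95]
    print(iqr_outliers(readings))

    cloud = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    scores = knn_distance_scores(cloud, k=2)
    print(max(zip(scores, cloud)))  # highest-scoring point
```

Both functions assume numerical, unordered, i.i.d.-style data; categorical or sequential variants of the task would need a different distance measure or model.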