Outlier Detection (OD) Task


An Outlier Detection (OD) Task is a detection task that identifies the observations in a data set that deviate from the majority of the data.

  • (Yang, Zhou et al., 2021) ⇒ Jingkang Yang, Kaiyang Zhou, Yixuan Li, and Ziwei Liu. (2021). “Generalized Out-of-Distribution Detection: A Survey.” arXiv preprint arXiv:2110.11334
    • QUOTE: 2021 GeneralizedOODDetection.png
      Figure 2: Exemplar problem settings for tasks under generalized OOD detection framework. Tags on test images refer to model’s expected predictions. (a) In sensory anomaly detection, test images with covariate shift will be considered as OOD. No semantic shift occurs in this setting. (b) In semantic anomaly detection and one-class novelty detection, normality/ID images belong to one class. Test images with semantic shift will be considered as OOD. No covariate shift occurs in this setting. (c) In multi-class novelty detection, ID images belong to multiple classes. Test images with semantic shift will be considered as OOD. No covariate shift occurs in this setting. (d) Open set recognition is identical to multi-class novelty detection in the task of detection, with the only difference that open set recognition further requires in-distribution classification. (e) Out-of-distribution detection is a super-category that covers semantic AD, one-class ND, multi-class ND, and open-set recognition, which canonically aims to detect test samples with semantic shift without losing the ID classification accuracy. (f) Outlier detection does not follow a train-test scheme. All observations are provided. It fits in the generalized OOD detection framework by defining the majority distribution as ID. Outliers can have any distribution shift from the majority samples.
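
Setting (f) above, in which all observations are available at once and the majority distribution is treated as ID, can be illustrated with a small sketch. The median/MAD rule, the function name `mad_outliers`, the 3.5 threshold, and the sample data are all assumptions chosen for illustration; they are not taken from the survey, which does not prescribe a specific method.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Flag values far from the majority using a modified z-score.

    Hypothetical helper: the median stands in for the "majority"
    (in-distribution) location and the median absolute deviation (MAD)
    for its spread. There is no train/test split; all observations
    are scored together.
    """
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = statistics.median(abs_dev)
    if mad == 0:
        # Assumed fallback: with no spread in the majority,
        # any value that differs from the median is flagged.
        return [v != med for v in values]
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # under a normal distribution.
    return [0.6745 * d / mad > threshold for d in abs_dev]

data = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0]
flags = mad_outliers(data)
```

Here only the last observation (25.0) is flagged, because it lies far from the cluster around 10 that forms the majority.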

  • (Knorr & Ng, 1998) ⇒ E. Knorr and Raymond Ng. (1998). “Algorithms for Mining Distance-based Outliers in Large Data Sets.” In: Proceedings of the 24th International Conference on Very Large Data Bases (VLDB 1998).
    • NOTES: It defines outliers as those data points (vectors) whose values differ from those of the rest of the data set.
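
The distance-based notion in Knorr & Ng (1998) is usually stated as follows: an object is a DB(p, D)-outlier if at least a fraction p of the objects in the data set lie farther than distance D from it. A minimal sketch of that test is given below as a naive quadratic scan, not the optimized algorithms of the paper. The function name `db_outliers`, the use of Euclidean distance, and the sample points are assumptions for illustration.

```python
import math

def db_outliers(points, p, d):
    """Return indices of DB(p, d)-outliers using a naive O(n^2) scan.

    Hypothetical helper: a point is reported when at least a fraction p
    of all n points lie farther than distance d from it.
    Euclidean distance is assumed.
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count the other points that are farther than d from point i.
        far = sum(
            1
            for j, b in enumerate(points)
            if j != i and math.dist(a, b) > d
        )
        if far >= p * n:
            outliers.append(i)
    return outliers

points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
result = db_outliers(points, p=0.75, d=1.0)
```

With these values only the point at (5.0, 5.0) is reported: four of the five points (80%) lie farther than 1.0 from it, while each clustered point has just one distant neighbour.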