# 2011 AlgorithmsforSpeedingUpDistance

- (Bhaduri et al., 2011) ⇒ Kanishka Bhaduri, Bryan L. Matthews, and Chris R. Giannella. (2011). “Algorithms for Speeding Up Distance-based Outlier Detection.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011) Journal. ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020554

**Subject Headings:**

## Notes

## Cited By

- http://scholar.google.com/scholar?q=%222011%22+Algorithms+for+Speeding+Up+Distance-based+Outlier+Detection
- http://dl.acm.org/citation.cfm?id=2020408.2020554&preflayout=flat#citedby

## Quotes

### Author Keywords

- Algorithms; data mining; distributed computing; experimentation; nearest neighbor; outlier detection; performance

### Abstract

The problem of distance-based outlier detection is difficult to solve efficiently in very large datasets because of potential quadratic time complexity. We address this problem and develop sequential and distributed algorithms that are significantly more efficient than state-of-the-art methods while still guaranteeing the same outliers. By combining simple but effective indexing and disk block accessing techniques, we have developed a sequential algorithm iOrca that is up to an order-of-magnitude faster than the state-of-the-art. The indexing scheme is based on sorting the data points in order of increasing distance from a fixed reference point and then accessing those points based on this sorted order. To speed up the basic outlier detection technique, we develop two distributed algorithms (DOoR and iDOoR) for modern distributed multi-core clusters of machines, connected on a ring topology. The first algorithmpasses data blocks from each machine around the ring, incrementally updating the nearest neighbors of the points passed. By maintaining a cutoff threshold, it is able to prune a large number of points in a distributed fashion. The second distributed algorithm extends this basic idea with the indexing scheme discussed earlier. In our experiments, both distributed algorithms exhibit significant improvements compared to the state-of-the-art distributed method [13].

## References

;

Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year | |
---|---|---|---|---|---|---|---|---|---|---|

2011 AlgorithmsforSpeedingUpDistance | Kanishka Bhaduri Bryan L. Matthews Chris R. Giannella | Algorithms for Speeding Up Distance-based Outlier Detection | 10.1145/2020408.2020554 | 2011 |

Author | Kanishka Bhaduri +, Bryan L. Matthews + and Chris R. Giannella + |

conference | KDD-2011 + |

doi | 10.1145/2020408.2020554 + |

proceedings | Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining + |

title | Algorithms for Speeding Up Distance-based Outlier Detection + |

year | 2011 + |