- (Chierichetti et al., 2014) ⇒ Flavio Chierichetti, Nilesh Dalvi, and Ravi Kumar. (2014). “Correlation Clustering in MapReduce.” In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2014). ISBN:978-1-4503-2956-9 doi:10.1145/2623330.2623743
Correlation clustering is a basic primitive in the data miner's toolkit, with applications ranging from entity matching to social network analysis. The goal in correlation clustering is, given a graph with signed edges, to partition the nodes into clusters so as to minimize the number of disagreements. In this paper we obtain a new algorithm for correlation clustering. Our algorithm is easily implementable in computational models such as MapReduce and streaming, and runs in a small number of rounds. In addition, we show that our algorithm obtains an almost 3-approximation to the optimal correlation clustering. Experiments on huge graphs demonstrate the scalability of our algorithm and its applicability to data mining problems.
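The objective named in the abstract, the number of disagreements, can be illustrated with a minimal sketch. It is not the paper's algorithm; it only evaluates a given clustering. It assumes a complete signed graph in which every pair not listed as positive is treated as negative. The function name `count_disagreements` and the toy graph are hypothetical, chosen for illustration.

```python
from itertools import combinations


def count_disagreements(nodes, positive_edges, clusters):
    """Count the disagreements of a clustering on a signed graph.

    Assumption (not from the paper): the graph is complete, and any
    pair of nodes not listed in positive_edges carries a negative sign.
    A disagreement is a positive edge whose endpoints are in different
    clusters, or a negative edge whose endpoints share a cluster.
    """
    positive = {frozenset(edge) for edge in positive_edges}
    # Map each node to the index of the cluster that contains it.
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}

    disagreements = 0
    for u, v in combinations(nodes, 2):
        same_cluster = label[u] == label[v]
        is_positive = frozenset((u, v)) in positive
        if same_cluster != is_positive:
            disagreements += 1
    return disagreements


# Toy example: a path a-b-c-d of positive edges; all other pairs negative.
nodes = ["a", "b", "c", "d"]
positive_edges = [("a", "b"), ("b", "c"), ("c", "d")]

one_cluster = count_disagreements(nodes, positive_edges, [{"a", "b", "c", "d"}])
two_pairs = count_disagreements(nodes, positive_edges, [{"a", "b"}, {"c", "d"}])
singletons = count_disagreements(nodes, positive_edges, [{"a"}, {"b"}, {"c"}, {"d"}])
```

On this toy graph, splitting the path into two pairs cuts only the positive edge between `b` and `c`, whereas a single cluster pays for every negative pair and singletons pay for every positive edge.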
|title||Correlation Clustering in MapReduce|
|author||Flavio Chierichetti, Nilesh Dalvi, and Ravi Kumar|
|proceedings||Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining|
|doi||10.1145/2623330.2623743|
|year||2014|