2008 ColibriFastMiningofLargeStatica

From GM-RKB
Jump to navigation Jump to search

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Low-rank approximations of the adjacency matrix of a graph are essential in finding patterns (such as communities) and detecting anomalies. Additionally, it is desirable to track the low-rank structure as the graph evolves over time, efficiently and within limited storage. Real graphs typically have thousands or millions of nodes, but are usually very sparse. However, standard decompositions such as SVD do not preserve sparsity. This has led to the development of methods such as CUR and CMD, which seek a non-orthogonal basis by sampling the columns and/or rows of the sparse matrix.

However, these approaches will typically produce overcomplete bases, which wastes both space and time. In this paper we propose the family of Colibri methods to deal with these challenges. Our version for static graphs, Colibri-S, iteratively finds a non-redundant basis and we prove that it has no loss of accuracy compared to the best competitors (CUR and CMD), while achieving significant savings in space and time: on real data, Colibri-S requires much less space and is orders of magnitude faster (in proportion to the square of the number of non-redundant columns). Additionally, we propose an efficient update algorithm for dynamic, time-evolving graphs, Colibri-D. Our evaluation on a large, real network traffic dataset shows that Colibri-D is over 100 times faster than the best published competitor (CMD).

References

,

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2008 ColibriFastMiningofLargeStaticaChristos Faloutsos
Philip S. Yu
Jimeng Sun
Hanghang Tong
Spiros Papadimitriou
Colibri: Fast Mining of Large Static and Dynamic Graphs10.1145/1401890.1401973