2001 SPADEAnEfficientAlgorithmforMin

From GM-RKB

Subject Headings: Frequent Pattern Mining, Sequential Pattern Mining.

Notes

Cited By

Quotes

Author Keywords

Abstract

In this paper we present SPADE, a new algorithm for fast discovery of Sequential Patterns. The existing solutions to this problem make repeated database scans, and use complex hash structures which have poor locality. SPADE utilizes combinatorial properties to decompose the original problem into smaller sub-problems, that can be independently solved in main-memory using efficient lattice search techniques, and using simple join operations. All sequences are discovered in only three database scans. Experiments show that SPADE outperforms the best previous algorithm by a factor of two, and by an order of magnitude with some pre-processed data. It also has linear scalability with respect to the number of input-sequences, and a number of other database parameters. Finally, we discuss how the results of sequence mining can be applied in a real application domain.
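The abstract mentions "simple join operations" without describing them. The following is a minimal sketch of one such join, assuming a vertical layout in which each item maps to a list of (sequence id, event id) pairs. The function names, the toy database, and the restriction to a single "a followed later by b" join are illustrative assumptions, not the paper's own code.

```python
from collections import defaultdict


def build_idlists(database):
    """Turn {sid: [(eid, items), ...]} into {item: sorted [(sid, eid), ...]}."""
    idlists = defaultdict(list)
    for sid, events in database.items():
        for eid, items in events:
            for item in items:
                idlists[item].append((sid, eid))
    return {item: sorted(pairs) for item, pairs in idlists.items()}


def support(idlist):
    """Support is the number of distinct input sequences in the id-list."""
    return len({sid for sid, _ in idlist})


def temporal_join(idlist_a, idlist_b):
    """Id-list of the pattern 'a, then later b' (hypothetical helper).

    Keeps each occurrence of b that comes strictly after the earliest
    occurrence of a within the same input sequence.
    """
    first_a = {}
    for sid, eid in idlist_a:
        first_a[sid] = min(eid, first_a.get(sid, eid))
    return [(sid, eid) for sid, eid in idlist_b
            if sid in first_a and eid > first_a[sid]]


# Toy database (made up for illustration): sid -> list of (eid, itemset).
database = {
    1: [(10, {"A", "B"}), (20, {"C"}), (30, {"B"})],
    2: [(10, {"B"}), (20, {"A"})],
    3: [(10, {"A"}), (15, {"B"})],
}

idlists = build_idlists(database)
a_then_b = temporal_join(idlists["A"], idlists["B"])
print(a_then_b, support(a_then_b))
```

Because the join needs only the two id-lists, a candidate's support can be computed in memory without rescanning the database, which is the property the abstract emphasizes.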



References

2001 SPADEAnEfficientAlgorithmforMin
Author: Mohammed J. Zaki
Title: SPADE: An Efficient Algorithm for Mining Frequent Sequences
DOI: 10.1023/A:1007652502315
Year: 2001