2004 ParallelBuddyPrima


Subject Headings: Frequent Itemset, Closed Frequent Itemset.

Notes

Cited By

Quotes

Author Keywords

Parallel data mining, Association mining, top-down approach, Candidate distribution.

Abstract

Frequent itemset mining is essential for the discovery of association rules, strong rules, episodes, and minimal keys. This paper describes a parallel approach to association mining, based on the Buddy Prima algorithm, that combines the bottom-up and top-down approaches. Apriori, the widely used association mining technique, uses a breadth-first, bottom-up search and performs well only when the frequent itemsets are short. Top-down algorithms are better suited to long frequent itemsets. The PRIMA representation consumes less memory because each transaction is replaced with the product of the prime numbers assigned to its items, and it reduces the time taken to determine the support count of an itemset. A candidate distribution technique is adopted to handle large datasets with large itemsets. The performance of the algorithm is compared with that of existing algorithms and the results are tabulated. The proposed algorithm reduces time and data complexity. Experimental results on the Microsoft Anonymous Data show that the parallel approach outperforms the existing algorithms by a factor of approximately two.
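
The prime-product encoding mentioned in the abstract can be illustrated with a minimal sketch. It assumes each distinct item is assigned its own prime and each transaction is stored as the product of its items' primes, so an itemset is contained in a transaction exactly when the itemset's product divides the transaction's product. The function names (`first_primes`, `encode_transactions`, `support`) and the order in which primes are assigned are illustrative assumptions, not taken from the paper, and the sketch is sequential rather than parallel.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (adequate for small item catalogs)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode_transactions(transactions):
    """Map each item to a prime (assumed order: sorted item names) and
    replace each transaction with the product of its items' primes."""
    items = sorted({item for t in transactions for item in t})
    prime_of = dict(zip(items, first_primes(len(items))))
    encoded = [prod(prime_of[i] for i in set(t)) for t in transactions]
    return prime_of, encoded


def support(itemset, prime_of, encoded):
    """Count transactions whose product is divisible by the itemset's product."""
    key = prod(prime_of[i] for i in set(itemset))
    return sum(1 for value in encoded if value % key == 0)


transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c", "d"}]
prime_of, encoded = encode_transactions(transactions)
print(encoded)                                 # [30, 10, 21, 210]
print(support({"a", "c"}, prime_of, encoded))  # 3
```

With this encoding a support count needs one modulo operation per transaction instead of a subset test. Python integers are arbitrary precision, so the products here cannot overflow; a fixed-width implementation would have to handle long transactions separately.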


References

  • [1] R. Agrawal, T. Imielinski, and A. Swami. "Mining association rules between sets of items in large databases", SIGMOD, May 1993.
  • [2] R. Agrawal and J. C. Shafer. “Parallel Mining of association rules: Design, Implementation and Experience”. IBM Research Report RJ10004, Feb. 1996.
  • [3] J. Han, Y. Fu. “Discovery of multiple-level association rules from large databases”. In 21st VLDB, Sept. 1995.
  • [4] R. Agrawal and R. Srikant. "Fast algorithms for mining association rules in large databases”, In: Proceedings. 20th VLDB, Sept. 1994.
  • [5] Burdick, D., Calimlim, M., Gehrke, J., "MAFIA :A Maximal Frequent Itemset Algorithm for Transactional databases", In Intl. Conference on Data Engineering 2001.
  • [6] Lin D I., Kedem, Z M., "Pincer Search: A new algorithm for discovering the maximum frequent set", Intl Conference on Extending database technology, 1998.
  • [7] A. Savasere, E. Omiecinski, and S. Navathe. “An efficient algorithm for mining association rules in large databases”. In: Proceedings. 21st VLDB, Sept. 1995.
  • [8] H. Toivonen. “Discovery of frequent patterns in large data collections”. Technical Report A-1996-5 of the Department of Computer Science, University of Helsinki, Finland, 1996.
  • [9] J. S. Park, M.-S. Chen and P. S. Yu. "Efficient Parallel Data Mining for Association rules", IBM Research Report, RC 20156, August 1995.

  • [10] J. S. Park, M.-S. Chen and P. S.Yu. "An Effective Hash based Algorithm for Association rules". Proceedings of ACM SIGMOD, May, 1995. IBM Research Report, RC 20156, August 1995.
  • [11] Zheng, Z., Kohavi, R., and Mason, L. "Real world performance of association rule algorithm" , In: Proceedings. 7-th International Conference on Knowledge Discovery and Data Mining. 2001.
  • [12] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A. I. Verkamo. “Finding interesting rules from large sets of discovered association rules”. In: Proceedings. Third International Conference on Information and Knowledge Management, Nov. 1994.
  • [13] A. Mueller. “Fast sequential and parallel algorithms for association rule mining: A comparison”. Technical Report No. CS-TR-3515 of CS Department, University of Maryland-College Park.
  • [14] R. Srikant and R. Agrawal. “Mining generalized association rules”. In 21st VLDB, Sept. 1995.
  • [15] Website: www.cs.helsinki.fi/u/goethals/


2004 ParallelBuddyPrima
Author: S. N. Sivanandam, D. Sumathi, T. Hamsapriya, K. Babu
Title: Parallel Buddy Prima – A Hybrid Parallel Frequent itemset mining algorithm for very large databases
Title URL: http://www.acadjournal.com/2004/v13/part6/p3/par buddy.pdf