# Apriori Algorithm

An Apriori Algorithm is an iterative association rule learning algorithm that computes frequent itemsets. Iteration [math]i[/math] computes all frequent i-itemsets (itemsets with [math]i[/math] elements), and each iteration consists of a candidate generation step, a candidate counting step, and a candidate selection step.

**Context:**
- Its Apriori Candidate Generation Step generates all candidate i-itemsets.
  - It can generate large numbers of subsets.
- Its Apriori Candidate Counting Step counts the support of all candidate i-itemsets.
  - It uses a breadth-first search and a tree structure to count candidates efficiently.
- Its Apriori Candidate Selection Step selects the itemsets whose support meets the required minimum threshold.
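The three steps above can be sketched in Python. This is a minimal illustration on sets of items, not a definitive implementation: the function name `apriori`, the `frozenset` representation of itemsets, and the absolute-count `min_support` threshold are assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return all frequent itemsets (as frozensets) with their support counts."""
    transactions = [frozenset(t) for t in transactions]
    # Iteration 1: count and select frequent 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {c: n for c, n in counts.items() if n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets, keeping only
        # unions of size k whose every (k-1)-subset is frequent.
        prev = list(frequent)
        candidates = set()
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                if len(union) == k and all(
                    frozenset(s) in frequent for s in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Candidate counting: one scan of the transaction database.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        # Candidate selection: keep candidates meeting the support threshold.
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result
```

For example, on the transactions `[{"a","b","c"}, {"a","b"}, {"a","c"}, {"b","c"}]` with `min_support=2`, every single item and every pair is frequent, while `{"a","b","c"}` is generated as a candidate but rejected because it occurs only once.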
**See:** Association Rule Learning Task; Frequent-Itemset Mining Algorithm; Association Rule; Basket Analysis; Constraint-Based Mining; Frequent Itemset; Frequent Pattern.

## References

### 2011

- (Toivonen, 2011a) ⇒ Hannu Toivonen. (2011). “Apriori Algorithm.” In: (Sammut & Webb, 2011) p.40

### 2009

- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Apriori_algorithm
- In computer science and data mining, **Apriori** is a classic algorithm for learning association rules. Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of a website frequentation). Other algorithms are designed for finding association rules in data having no transactions (Winepi and Minepi), or having no timestamps (DNA sequencing).
- As is common in association rule mining, given a set of *itemsets* (for instance, sets of retail transactions, each listing individual items purchased), the algorithm attempts to find subsets which are common to at least a minimum number C of the itemsets. Apriori uses a "bottom up" approach, where frequent subsets are extended one item at a time (a step known as *candidate generation*), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.
- Apriori uses breadth-first search and a tree structure to count candidate item sets efficiently. It generates candidate item sets of length [math]k[/math] from item sets of length [math]k-1[/math]. Then it prunes the candidates which have an infrequent sub-pattern. According to the downward closure lemma, the candidate set contains all frequent [math]k[/math]-length item sets. After that, it scans the transaction database to determine frequent item sets among the candidates.
- Apriori, while historically significant, suffers from a number of inefficiencies or trade-offs, which have spawned other algorithms. Candidate generation generates large numbers of subsets (the algorithm attempts to load up the candidate set with as many as possible before each scan). Bottom-up subset exploration (essentially a breadth-first traversal of the subset lattice) finds any maximal subset S only after all [math]2^{|S|}-1[/math] of its proper subsets.
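The length-[math]k[/math] candidate generation and downward-closure pruning described above can be sketched as follows. The function name `generate_candidates` and the `frozenset` representation are assumptions made for this illustration.

```python
from itertools import combinations

def generate_candidates(frequent_k_minus_1, k):
    """Join frequent (k-1)-itemsets into k-itemset candidates, then prune
    any candidate with an infrequent (k-1)-subset (downward closure lemma)."""
    candidates = set()
    items = list(frequent_k_minus_1)
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            union = items[i] | items[j]  # join step
            if len(union) == k and all(
                frozenset(sub) in frequent_k_minus_1
                for sub in combinations(union, k - 1)
            ):  # prune step: every (k-1)-subset must itself be frequent
                candidates.add(union)
    return candidates
```

For example, if the frequent 2-itemsets are `{a,b}`, `{a,c}`, and `{b,c}`, the join proposes `{a,b,c}` and the prune keeps it; if `{b,c}` were infrequent, `{a,b,c}` would be discarded without ever scanning the database.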

- (Wu & Kumar, 2009) ⇒ Xindong Wu, and Vipin Kumar, editors. (2009). “The Top Ten Algorithms in Data Mining.” Chapman & Hall. ISBN:1420089641

### 1994

- (Agrawal & Srikant, 1994) ⇒ Rakesh Agrawal, and Ramakrishnan Srikant. (1994). “Fast Algorithms for Mining Association Rules.” In: Proceedings of VLDB 1994. ISBN:1-55860-153-8
- (Mannila et al., 1994) ⇒ Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo. (1994). “Efficient Algorithms for Discovering Association Rules.” In: Proceedings of the AAAI Workshop on Knowledge Discovery in Databases (KDD-94).

### 1993

- (Agrawal et al., 1993) ⇒ Rakesh Agrawal, Tomasz Imieliński, and Arun Swami. (1993). “Mining Association Rules Between Sets of Items in Large Databases.” In: Proceedings of ACM SIGMOD Conference (SIGMOD 1993). doi:10.1145/170035.170072