Support Measure


A Support Measure is an association rule measure of significance that estimates how often an itemset appears in a dataset.



References

2018

  • (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Association_rule_learning#Support Retrieved:2018-10-7.
    • Support is an indication of how frequently the itemset appears in the dataset.

      The support of [math]\displaystyle{ X }[/math] with respect to [math]\displaystyle{ T }[/math] is defined as the proportion of transactions [math]\displaystyle{ t }[/math] in the dataset that contain the itemset [math]\displaystyle{ X }[/math].

      [math]\displaystyle{ \mathrm{supp}(X) = \frac{|\{t \in T; X \subseteq t\}|}{|T|} }[/math]

      In the example dataset, the itemset [math]\displaystyle{ X=\{\mathrm{beer, diapers}\} }[/math] has a support of [math]\displaystyle{ 1/5=0.2 }[/math] since it occurs in 20% of all transactions (1 out of 5 transactions). The argument of [math]\displaystyle{ \mathrm{supp}() }[/math] is a set of preconditions, and thus becomes more restrictive as it grows (instead of more inclusive).
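The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `support` and the five-transaction dataset are assumptions made for this example, with the dataset chosen only so that {beer, diapers} occurs in 1 of 5 transactions, as in the quoted text.

```python
from fractions import Fraction


def support(itemset, transactions):
    """Return the fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        raise ValueError("transactions must be non-empty")
    itemset = frozenset(itemset)
    # Count transactions t with itemset ⊆ t.
    hits = sum(1 for t in transactions if itemset <= set(t))
    return Fraction(hits, len(transactions))


# Hypothetical toy dataset (an assumption for illustration only).
transactions = [
    {"milk", "bread"},
    {"butter"},
    {"beer", "diapers"},
    {"milk", "bread", "butter"},
    {"bread"},
]

print(support({"beer", "diapers"}, transactions))  # 1/5
print(support({"bread"}, transactions))            # 3/5
print(support({"milk", "bread"}, transactions))    # 2/5
```

Adding an item to the itemset can only keep the support the same or lower it, which is the sense in which the argument of supp() becomes more restrictive as it grows: here {milk, bread} has support 2/5, while {bread} alone has 3/5.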

2011

  • (Han, Pei & Kamber, 2011) ⇒ Jiawei Han, Jian Pei, and Micheline Kamber (2011). "Data mining: concepts and techniques" (PDF). Elsevier. ISBN 978-0-12-381479-1
    • QUOTE: Let [math]\displaystyle{ I = \{I_1 , I_2 , \cdots , I_m\} }[/math] be an itemset. Let [math]\displaystyle{ D }[/math], the task-relevant data, be a set of database transactions where each transaction [math]\displaystyle{ T }[/math] is a nonempty itemset such that [math]\displaystyle{ T \subseteq I }[/math]. Each transaction is associated with an identifier, called a TID. Let [math]\displaystyle{ A }[/math] be a set of items. A transaction [math]\displaystyle{ T }[/math] is said to contain A if [math]\displaystyle{ A \subseteq T }[/math]. An association rule is an implication of the form [math]\displaystyle{ A \Rightarrow B }[/math], where [math]\displaystyle{ A \subset I,\; B \subset I,\; A \neq \emptyset,\; B \neq \emptyset }[/math], and [math]\displaystyle{ A \cap B = \emptyset }[/math]. The rule [math]\displaystyle{ A \Rightarrow B }[/math] holds in the transaction set [math]\displaystyle{ D }[/math] with support [math]\displaystyle{ s }[/math], where [math]\displaystyle{ s }[/math] is the percentage of transactions in [math]\displaystyle{ D }[/math] that contain [math]\displaystyle{ A \cup B }[/math] (i.e., the union of sets [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] say, or, both [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math]). This is taken to be the probability, [math]\displaystyle{ P(A \cup B) }[/math]. The rule [math]\displaystyle{ A \Rightarrow B }[/math] has confidence [math]\displaystyle{ c }[/math] in the transaction set [math]\displaystyle{ D }[/math], where [math]\displaystyle{ c }[/math] is the percentage of transactions in [math]\displaystyle{ D }[/math] containing [math]\displaystyle{ A }[/math] that also contain [math]\displaystyle{ B }[/math]. This is taken to be the conditional probability, [math]\displaystyle{ P(B|A) }[/math]. That is,

      [math]\displaystyle{ \mathrm{support}(A \Rightarrow B) = P(A \cup B) \quad\quad }[/math] (6.2).
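As a hedged sketch of the quoted definitions, the following Python computes the support and confidence of a rule A ⇒ B. Note that P(A ∪ B) here denotes the probability that a transaction contains every item of both A and B, not either one. The function names `rule_support` and `rule_confidence` and the dataset `D` are assumptions made for this illustration; they do not come from the cited book.

```python
from fractions import Fraction


def _count_containing(items, transactions):
    """Number of transactions T with items ⊆ T."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= set(t))


def rule_support(antecedent, consequent, transactions):
    """Fraction of transactions containing all items of A and B, i.e. P(A ∪ B)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return Fraction(_count_containing(both, transactions), len(transactions))


def rule_confidence(antecedent, consequent, transactions):
    """Fraction of transactions containing A that also contain B, i.e. P(B|A)."""
    n_a = _count_containing(antecedent, transactions)
    if n_a == 0:
        raise ValueError("antecedent does not occur in any transaction")
    both = frozenset(antecedent) | frozenset(consequent)
    return Fraction(_count_containing(both, transactions), n_a)


# Hypothetical transaction set D (an assumption for illustration only).
D = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"milk"},
    {"bread"},
    {"butter"},
]

print(rule_support({"milk"}, {"bread"}, D))     # 2/5
print(rule_confidence({"milk"}, {"bread"}, D))  # 2/3
```

In this toy data, 2 of 5 transactions contain both milk and bread, so the rule milk ⇒ bread has support 2/5; 3 transactions contain milk, and 2 of those also contain bread, so its confidence is 2/3.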

2008

where [math]\displaystyle{ c_{XY} }[/math] represents the number of transactions which contain all items in [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] , and [math]\displaystyle{ m }[/math] is the number of transactions in the database.

2005

1993