A Confusion Matrix is a Square Matrix that represents the Count of a Classification Function's Predictions with respect to the Actuals on some Labeled Learning Record Set.
- Context:
- Example:
- The matrix below is a confusion matrix for a classification task with three (c=3) output classes: A, B and C. The test set used to evaluate the algorithm contained 100 cases with a distribution of 30 As, 35 Bs and 35 Cs. A perfect classifier would make predictions only along the diagonal, but the results below show that the algorithm was correct on only (20+25+24)/100 = 69% of the cases. The matrix also shows that the classifier often predicts C for an actual A (11 errors) and A for an actual C (9 errors).
| ACTUAL/PREDICTED | A | B | C | SUM |
| A | 20 | 2 | 11 | 34 |
| B | 2 | 25 | 1 | 28 |
| C | 9 | 5 | 24 | 38 |
| SUM | 31 | 32 | 36 | 100 |
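The diagonal count and the most frequent confusions in the table above can be read off programmatically. The following is a minimal sketch, assuming the cell counts are copied from the table with rows as actual classes and columns as predicted classes; the names `labels`, `matrix`, `correct` and `top_confusions` are illustrative only.

```python
# Rows are actual classes, columns are predicted classes (cells copied from the table).
labels = ["A", "B", "C"]
matrix = [
    [20, 2, 11],  # actual A
    [2, 25, 1],   # actual B
    [9, 5, 24],   # actual C
]

# Correct predictions lie on the diagonal.
correct = sum(matrix[i][i] for i in range(len(labels)))

# Every off-diagonal cell is a confusion: (actual, predicted, count).
confusions = [
    (labels[i], labels[j], matrix[i][j])
    for i in range(len(labels))
    for j in range(len(labels))
    if i != j
]

# Keep the two largest off-diagonal cells.
top_confusions = sorted(confusions, key=lambda c: c[2], reverse=True)[:2]

print(correct)         # 69
print(top_confusions)  # [('A', 'C', 11), ('C', 'A', 9)]
```

Ranking the off-diagonal cells is one simple way to surface the class pairs a classifier mixes up most often.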
References
- (Wikipedia, 2009) http://en.wikipedia.org/wiki/Confusion_matrix
- In the field of artificial intelligence, a confusion matrix is a visualization tool typically used in supervised learning (in unsupervised learning it is typically called a matching matrix). Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class. One benefit of a confusion matrix is that it is easy to see if the system is confusing two classes (i.e. commonly mislabelling one as another).
- When a data set is unbalanced (when the number of samples in different classes varies greatly), the error rate of a classifier is not representative of its true performance. For example, if there are 990 samples from class A and only 10 samples from class B, the classifier can easily be biased towards class A. If the classifier classifies all the samples as class A, the accuracy will be 99%. This is not a good indication of the classifier's true performance: it has a 100% recognition rate for class A but a 0% recognition rate for class B.
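The unbalanced example above can be reproduced with a small sketch that builds the confusion matrix from lists of actual and predicted labels (rows are actual classes, columns are predicted classes). The helper `confusion_matrix` and the variable names are hypothetical, written here for illustration rather than taken from any library.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a square list-of-lists: rows = actual class, columns = predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


labels = ["A", "B"]
actual = ["A"] * 990 + ["B"] * 10   # 990 samples of class A, 10 of class B
predicted = ["A"] * 1000            # a classifier that always answers "A"

matrix = confusion_matrix(actual, predicted, labels)

# Overall accuracy: diagonal count divided by the number of samples.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)

# Per-class recognition rate: diagonal cell divided by its row total.
recognition = {
    label: matrix[i][i] / sum(matrix[i]) for i, label in enumerate(labels)
}

print(matrix)       # [[990, 0], [10, 0]]
print(accuracy)     # 0.99
print(recognition)  # {'A': 1.0, 'B': 0.0}
```

The per-class rates expose what the single accuracy figure hides: every class B sample is misclassified.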
1998
- (Kohavi & Provost, 1998) => Ron Kohavi, and Foster Provost. (1998). "Glossary of Terms." In: Machine Learning 30(2-3).
- Confusion matrix: A matrix showing the predicted and actual classifications. A confusion matrix is of size LxL, where L is the number of different label values. The following confusion matrix is for L=2:
| actual \ predicted | negative | positive |
| Negative | a | b |
| Positive | c | d |
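Several commonly used rates follow directly from the four cells a, b, c and d. The sketch below assumes the layout above (a = actual negative predicted negative, b = actual negative predicted positive, c = actual positive predicted negative, d = actual positive predicted positive). The function name `binary_rates` and the example counts are hypothetical, chosen only for illustration.

```python
def binary_rates(a, b, c, d):
    """Compute standard rates from the cells of a 2x2 confusion matrix."""

    def ratio(numerator, denominator):
        # Return None when a rate is undefined (empty row or column).
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(a + d, a + b + c + d),
        "true_positive_rate": ratio(d, c + d),   # also called recall
        "false_positive_rate": ratio(b, a + b),
        "precision": ratio(d, b + d),
    }


# Hypothetical counts, not taken from the glossary.
rates = binary_rates(a=50, b=10, c=5, d=35)
for name, value in rates.items():
    print(name, value)
```

Returning `None` for an undefined rate is one possible design choice; it avoids a division-by-zero error when, for instance, the classifier never predicts the positive class.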