k-Nearest Neighbor (kNN) Algorithm
From GM-RKB
A k-Nearest Neighbor (kNN) Algorithm is a search algorithm that can be implemented by a kNN system to solve a kNN task (that locates the most similar neighbor items according to some distance function).
- Context:
- It can range from being a k-Nearest Neighbor Search Algorithm to being a k-Nearest Neighbor Prediction Algorithm.
- It can range from being a k-Nearest Neighbor Classification Algorithm to being a k-Nearest Neighbor Regression Algorithm.
- It can solve a k-Nearest Neighbor Search Task.
- It can be implemented by a Nearest Neighbor System (to solve a nearest neighbor task).
- It can be a Nearest Neighbor Search Algorithm, to solve a Nearest Neighbor Search Task.
- It can range from being a Nearest Neighbor Classification Algorithm to being a Nearest Neighbor Regression Algorithm.
- It can (often) be used as an Instance-based Learning Algorithm.
- Example(s):
- Counter-Example(s):
- See: Similarity Search Algorithm, Radial Search Algorithm, Case-based Reasoning Algorithm, Neighbor Relationship, Non-Parametric Statistics, k-Means.
References
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/k-nearest_neighbors_algorithm Retrieved:2017-8-30.
- In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression: :* In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor. :* In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors. k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms. Both for classification and regression, it can be useful to assign weight to the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the distance to the neighbor. ^{[1]}
The neighbors are taken from a set of objects for which the class (for k-NN classification) or the object property value (for k-NN regression) is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.
A peculiarity of the k-NN algorithm is that it is sensitive to the local structure of the data.The algorithm is not to be confused with k-means, another popular machine learning technique.
- In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression: :* In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor. :* In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors. k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. The k-NN algorithm is among the simplest of all machine learning algorithms. Both for classification and regression, it can be useful to assign weight to the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the distance to the neighbor. ^{[1]}
- ↑ This scheme is a generalization of linear interpolation.
2017
- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Nearest_neighbour_algorithm Retrieved:2017-8-30.
- The nearest neighbour algorithm was one of the first algorithms used to determine a solution to the travelling salesman problem. In it, the salesman starts at a random city and repeatedly visits the nearest city until all have been visited. It quickly yields a short tour, but usually not the optimal one.
2011
- (Keogh, 2011b) ⇒ Eamonn Keogh. (2011). “Nearest Neighbor." In: (Sammut & Webb, 2011) p.714
2009
- (Wu & Kumar, 2009) ⇒ Xindong Wu, and Vipin Kumar, editors. (2009). The Top Ten Algorithms in Data Mining. Chapman & Hall. ISBN: 1420089641
2006
- (Zezula et al., 2006) ⇒ Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, and Michal Batko. “Similarity Search: The Metric Space Approach."
- QUOTE: the nearest neighbor query
NN(q) = x x ∈ X, ∀y ∈ X, d(q,x) ≤ d(q,y)
- QUOTE: k-nearest neighbor query
k-NN(q,k) = A A ⊆ X, |A| = k ∀x ∈ A, y ∈ X – A, d(q,x) ≤ d(q,y)
1997
- (Mitchell, 1997) ⇒ Tom M. Mitchell. (1997). “Machine Learning." McGraw-Hill.
- QUOTE: Section 8.6 Remarks on Lazy and Eager Learning: In this chapter we considered three lazy learning methods: the k-Nearest Neighbor algorithm, locally weighted regression, and case-based reasoning.
1991
- (Aha et al., 1991) ⇒ D. W. Aha, D. Kibler, and M. K. Albert. (1991). “Instance-based learning algorithms." In: Machine Learning, 6(1).
1973
- (Duda & Hart, 1973) ⇒ Richard O. Duda, and Peter E. Hart. (1973). “Pattern Classification and Scene Analysis." John Wiley & Sons, New York, NY.