# Spectral Clustering Algorithm

A Spectral Clustering Algorithm is a clustering algorithm that performs dimensionality reduction (for clustering in fewer dimensions) by means of the spectrum of the affinity matrix.

**Context:**
- It can be applied by a Spectral Clustering System (that solves a Spectral Clustering Task).

**Example(s):**
- the Shi-Malik *Normalized Cuts* algorithm.
- the Meila-Shi algorithm.

**Counter-Example(s):**

**See:** Kernel Principal Component Analysis; k-Way Spectral Clustering.

## References

### 2011

- (Sammut & Webb, 2011) ⇒ Claude Sammut, and Geoffrey I. Webb (editors). (2011). “Spectral Clustering.” In: Encyclopedia of Machine Learning, p. 907.
- (Wikipedia - Spectral Clustering, 2011-Jun-13) ⇒ http://en.wikipedia.org/wiki/Cluster_analysis#Spectral_clustering
- QUOTE: Given a set of data points A, the similarity matrix may be defined as a matrix [math]\displaystyle{ S }[/math] where [math]\displaystyle{ S_{ij} }[/math] represents a measure of the similarity between points [math]\displaystyle{ i, j\in A }[/math]. Spectral clustering techniques make use of the spectrum of the similarity matrix of the data to perform dimensionality reduction for clustering in fewer dimensions. One such technique is the *Normalized Cuts algorithm* by Shi-Malik, commonly used for image segmentation. It partitions points into two sets [math]\displaystyle{ (S_1,S_2) }[/math] based on the eigenvector [math]\displaystyle{ v }[/math] corresponding to the second-smallest eigenvalue of the Laplacian matrix [math]\displaystyle{ L = I - D^{-1/2}SD^{-1/2} }[/math] of [math]\displaystyle{ S }[/math], where [math]\displaystyle{ D }[/math] is the diagonal matrix [math]\displaystyle{ D_{ii} = \sum_{j} S_{ij} }[/math]. This partitioning may be done in various ways, such as by taking the median [math]\displaystyle{ m }[/math] of the components in [math]\displaystyle{ v }[/math], and placing all points whose component in [math]\displaystyle{ v }[/math] is greater than [math]\displaystyle{ m }[/math] in [math]\displaystyle{ S_1 }[/math], and the rest in [math]\displaystyle{ S_2 }[/math]. The algorithm can be used for hierarchical clustering by repeatedly partitioning the subsets in this fashion. A related algorithm is the *Meila-Shi algorithm*, which takes the eigenvectors corresponding to the [math]\displaystyle{ k }[/math] largest eigenvalues of the matrix [math]\displaystyle{ P = SD^{-1} }[/math] for some [math]\displaystyle{ k }[/math], and then invokes another algorithm (e.g. *k*-means) to cluster points by their respective [math]\displaystyle{ k }[/math] components in these eigenvectors.
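The Shi-Malik bipartition step described in the quote can be sketched in NumPy; this is an illustrative sketch (the function name is hypothetical), following the quote's recipe of taking the eigenvector of the second-smallest eigenvalue of [math]\displaystyle{ L = I - D^{-1/2}SD^{-1/2} }[/math] and splitting at its median:

```python
import numpy as np

def normalized_cuts_bipartition(S):
    """Bipartition points as in the Shi-Malik normalized-cuts step.

    S : (n, n) symmetric similarity matrix with positive row sums.
    Returns a boolean array: True for points placed in S_1.
    """
    d = S.sum(axis=1)                        # D_ii = sum_j S_ij
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Normalized Laplacian L = I - D^{-1/2} S D^{-1/2}
    L = np.eye(len(S)) - D_inv_sqrt @ S @ D_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    v = eigvecs[:, 1]                        # second-smallest eigenvalue's eigenvector
    m = np.median(v)                         # split at the median, as in the text
    return v > m
```

Applied recursively to each resulting subset, this yields the hierarchical clustering mentioned in the quote.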


### 2003

- (Bengio et al., 2003b) ⇒ Yoshua Bengio, Jean-François Paiement, Pascal Vincent, Olivier Delalleau, Nicolas Le Roux, Marie Ouimet. (2003). “Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering.” In: Advances in Neural Information Processing Systems 16 (NIPS 2003).

### 2001

- (Ng, Jordan & Weiss, 2001) ⇒ Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. (2001). “On Spectral Clustering: Analysis and an Algorithm.” In: Advances in Neural Information Processing Systems 14 (NIPS 2001).
- QUOTE: Despite many empirical successes of spectral clustering methods - algorithms that cluster points using eigenvectors of matrices derived from the data - there are several unresolved issues. First, there are a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this paper, we present a simple spectral clustering algorithm that can be implemented using a few lines of Matlab.
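The family of methods the abstract describes, clustering points via the top eigenvectors of a normalized affinity matrix and then running *k*-means on them, can be sketched as follows. This is a minimal NumPy illustration in that style, not the paper's exact procedure: the Gaussian width `sigma` and the farthest-point k-means initialization are assumed choices.

```python
import numpy as np

def njw_spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Sketch of spectral clustering in the style of Ng, Jordan & Weiss (2001).

    X : (n, d) array of data points; k : number of clusters.
    `sigma` and the k-means details below are illustrative assumptions.
    """
    # 1. Gaussian affinity matrix with zero diagonal.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # 2. Symmetrically normalized affinity L = D^{-1/2} A D^{-1/2}.
    d = A.sum(axis=1)
    L = A / np.sqrt(np.outer(d, d))
    # 3. Stack the k largest eigenvectors as columns, renormalize each row.
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    Y = vecs[:, -k:]
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    # 4. Cluster the rows of Y with a small Lloyd-style k-means,
    #    seeded by farthest-point initialization for stability.
    centers = [Y[0]]
    for _ in range(1, k):
        dist = ((Y[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1).min(1)
        centers.append(Y[dist.argmax()])
    C = np.array(centers)
    for _ in range(n_iter):
        labels = ((Y[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        C = np.array([Y[labels == j].mean(0) for j in range(k)])
    return labels
```

Each data point is thus represented by the corresponding row of the eigenvector matrix, and ordinary *k*-means clusters those rows.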