High-Dimensionality Clustering Algorithm

References

(Agrawal et al., 1999) ⇒ Rakesh Agrawal, Johannes Ernst Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan. (1999). “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications.” US Patent 6,003,029.
- Emerging data mining applications place special requirements on clustering techniques, such as the ability to handle high dimensionality, assimilation of cluster descriptions by users, description minimization, and scalability and usability. Regarding high dimensionality of data clustering, an object typically has dozens of attributes in which the domains of the attributes are large. Clusters formed in a high-dimensional data space are not likely to be meaningful clusters because the expected average density of points anywhere in the high-dimensional data space is low. The requirement for high dimensionality in a data mining application is conventionally addressed by requiring a user to specify the subspace for cluster analysis.
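
The density argument in the passage above can be illustrated with a small sketch. It is not taken from the patent: the function names, the equal-width grid of 10 intervals per attribute, and the uniformly random data are all assumptions chosen for illustration. The idea is that splitting each of d attributes into k intervals yields k^d cells, so a fixed number of points is spread ever more thinly as d grows.

```python
import random
from collections import Counter


def expected_points_per_cell(n_points, n_dims, bins_per_dim):
    """Average points per grid cell when every axis is split into
    bins_per_dim equal-width intervals (hypothetical helper)."""
    return n_points / (bins_per_dim ** n_dims)


def max_cell_count(n_points, n_dims, bins_per_dim, seed=0):
    """Count of the most populated cell for points drawn uniformly
    from the unit hypercube (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_points):
        # Map each coordinate in [0, 1) to the index of its interval.
        cell = tuple(
            min(int(rng.random() * bins_per_dim), bins_per_dim - 1)
            for _ in range(n_dims)
        )
        counts[cell] += 1
    return max(counts.values())


if __name__ == "__main__":
    for d in (1, 2, 5, 10, 20):
        print(
            d,
            expected_points_per_cell(10_000, d, 10),
            max_cell_count(10_000, d, 10),
        )
```

With 10,000 points and 10 intervals per attribute, the average occupancy is 100 points per cell in two dimensions but only one point per million cells in ten dimensions, which is one way to see why full-space clusters tend not to be meaningful and why analysis is restricted to lower-dimensional subspaces.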