2005 SurveyOfClustering

From GM-RKB

Subject Headings: Clustering Algorithm, Cluster Validation, Artificial Neural Network, Proximity, Self-Organizing Feature Map.

Notes

Cited By

Quotes

Index Terms

Adaptive resonance theory (ART), clustering, clustering algorithm, cluster validation, neural networks, proximity, self-organizing feature map (SOFM).

Abstract

Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, proximity measures and cluster validation, are also discussed.

Introduction

We are living in a world full of data. Every day, people encounter a large amount of information and store or represent it as data, for further analysis and management. One of the vital means in dealing with these data is to classify or group them into a set of categories or clusters. Actually, as one of the most primitive activities of human beings [14], classification plays an important and indispensable role in the long history of human development. In order to learn a new object or understand a new phenomenon, people always try to seek the features that can describe it, and further compare it with other known objects or phenomena, based on the similarity or dissimilarity, generalized as proximity, according to some certain standards or rules. Basically, classification systems are either supervised or unsupervised, depending on whether they assign new inputs to one of a finite number of discrete supervised classes or unsupervised categories, respectively [38], [60], [75]. In supervised classification, the mapping from a set of input data vectors $\mathbf{x} \in \mathbb{R}^d$ (where $d$ is the input space dimensionality) to a finite set of discrete class labels $y \in \{1, \dots, C\}$ (where $C$ is the total number of class types) is modeled in terms of some mathematical function $y = y(\mathbf{x}, \mathbf{w})$, where $\mathbf{w}$ is a vector of adjustable parameters. The values of these parameters are determined (optimized) by an inductive learning algorithm (also termed an inducer), whose aim is to minimize an empirical risk functional (related to an inductive principle) on a finite data set of input–output examples $\{(\mathbf{x}_i, y_i),\ i = 1, \dots, N\}$, where $N$ is the finite cardinality of the available representative data set [38], [60], [167]. When the inducer reaches convergence or terminates, an induced classifier is generated [167].
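The "empirical risk functional" mentioned above has a standard form worth spelling out; the following is a reconstruction using a generic loss function $L$, not a formula quoted from the paper:

```latex
% Empirical risk over the N training examples, for a generic loss L
% (e.g., the 0-1 loss for classification); w collects the adjustable
% parameters of the classifier y(x, w).
R_{\mathrm{emp}}(\mathbf{w}) \;=\; \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i,\; y(\mathbf{x}_i, \mathbf{w})\bigr)
```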

In unsupervised classification, called clustering or exploratory data analysis, no labeled data are available [88], [150]. The goal of clustering is to separate a finite unlabeled data set into a finite and discrete set of “natural,” hidden data structures, rather than provide an accurate characterization of unobserved samples generated from the same probability distribution [23], [60]. This can make the task of clustering fall outside of the framework of unsupervised predictive learning problems, such as vector quantization [60] (see Section II-C), probability density function estimation [38], [60] (see Section II-D), and entropy maximization [99]. It is noteworthy that clustering differs from multidimensional scaling (perceptual maps), whose goal is to depict all the evaluated objects in a way that minimizes the topographical distortion while using as few dimensions as possible. Also note that, in practice, many (predictive) vector quantizers are also used for (nonpredictive) clustering analysis [60].

Nonpredictive clustering is a subjective process in nature, which precludes an absolute judgment as to the relative efficacy of all clustering techniques [23], [152]. As pointed out by Backer and Jain [17], “in cluster analysis a group of objects is split up into a number of more or less homogeneous subgroups on the basis of an often subjectively chosen measure of similarity (i.e., chosen subjectively based on its ability to create “interesting” clusters), such that the similarity between objects within a subgroup is larger than the similarity between objects belonging to different subgroups”.

Clustering algorithms partition data into a certain number of clusters (groups, subsets, or categories). There is no universally agreed-upon definition of a cluster [88]. Most researchers describe a cluster by considering the internal homogeneity and the external separation [111], [124], [150], i.e., patterns in the same cluster should be similar to each other, while patterns in different clusters should not. Both the similarity and the dissimilarity should be examinable in a clear and meaningful way. Here, we give some simple mathematical descriptions of several types of clustering, based on the descriptions in [124].
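The excerpt does not reproduce those descriptions, but the most basic of them, a hard $K$-partition of a data set $X = \{\mathbf{x}_1, \dots, \mathbf{x}_N\}$, is commonly stated as below; this is a reconstruction in the spirit of [124], not a verbatim quote:

```latex
% Hard partitional clustering: C = {C_1, ..., C_K} is a K-partition of X iff
\begin{align*}
  & C_i \neq \emptyset, \quad i = 1, \dots, K; \\
  & \bigcup\nolimits_{i=1}^{K} C_i = X; \\
  & C_i \cap C_j = \emptyset, \quad i, j = 1, \dots, K, \; i \neq j.
\end{align*}
```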

Fig. 1 depicts the procedure of cluster analysis with four basic steps.

  1. Feature selection or extraction. As pointed out by Jain et al. [151], [152] and Bishop [38], feature selection chooses distinguishing features from a set of candidates, while feature extraction utilizes some transformations to generate useful and novel features from the original ones. Both are very crucial to the effectiveness of clustering applications. Elegant selection of features can greatly decrease the workload and simplify the subsequent design process. Generally, ideal features should be of use in distinguishing patterns belonging to different clusters, immune to noise, and easy to extract and interpret. We elaborate on feature extraction in Section II-L, in the context of data visualization and dimensionality reduction. More information on feature selection can be found in [38], [151], and [250].
  2. Clustering algorithm design or selection. This step is usually combined with the selection of a corresponding proximity measure and the construction of a criterion function. Patterns are grouped according to whether they resemble each other. Obviously, the proximity measure directly affects the formation of the resulting clusters. Almost all clustering algorithms are explicitly or implicitly connected to some definition of proximity measure. Some algorithms even work directly on the proximity matrix, as defined in Section II-A. Once a proximity measure is chosen, the construction of a clustering criterion function makes the partition of clusters an optimization problem, which is well defined mathematically and has rich solutions in the literature (a code sketch of this pipeline follows the list). Clustering is ubiquitous, and a wealth of clustering algorithms has been developed to solve different problems in specific fields. However, no clustering algorithm can be universally used to solve all problems: “it has been very difficult to develop a unified framework for reasoning about it (clustering) at a technical level, and profoundly diverse approaches to clustering” abound [166], a difficulty formalized there as an impossibility theorem. Therefore, it is important to carefully investigate the characteristics of the problem at hand, in order to select or design an appropriate clustering strategy.
  3. Cluster validation. Given a data set, each clustering algorithm can always generate a division, no matter whether the structure exists or not. Moreover, different approaches usually lead to different clusters; and even for the same algorithm, parameter identification or the presentation order of input patterns may affect the final results. Therefore, effective evaluation standards and criteria are important to provide the users with a degree of confidence for the clustering results derived from the used algorithms. These assessments should be objective and have no preferences to any algorithm. Also, they should be useful for answering questions like how many clusters are hidden in the data, whether the clusters obtained are meaningful or just an artifact of the algorithms, or why we choose some algorithm instead of another. Generally, there are three categories of testing criteria: external indices, internal indices, and relative indices. These are defined on three types of clustering structures, known as partitional clustering, hierarchical clustering, and individual clusters [150]. Tests for the situation where no clustering structure exists in the data are also considered [110], but seldom used, since users are confident of the presence of clusters. External indices are based on some prespecified structure, which is the reflection of prior information on the data, and used as a standard to validate the clustering solutions. Internal tests are not dependent on external information (prior knowledge). On the contrary, they examine the clustering structure directly from the original data. Relative criteria place the emphasis on the comparison of different clustering structures, in order to provide a reference, to decide which one may best reveal the characteristics of the objects. We will not survey the topic in depth and refer interested readers to [74], [110], and [150]. However, we will cover more details on how to determine the number of clusters in Section II-M. Some more recent discussion can be found in [22], [37], [121], [180], and [181]. Approaches for fuzzy clustering validity are reported in [71], [104], [123], and [220].
  4. Results interpretation. The ultimate goal of clustering is to provide users with meaningful insights from the original data, so that they can effectively solve the problems encountered. Experts in the relevant fields interpret the data partition. Further analyses, and even experiments, may be required to guarantee the reliability of the extracted knowledge.
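To make the four steps concrete, the following is a minimal end-to-end sketch, assuming NumPy and scikit-learn are available; the synthetic data, K-means, and the silhouette index are illustrative stand-ins for whichever features, algorithm, and validity index a given application calls for:

```python
# A minimal sketch of the four-step procedure above (not prescribed by the
# survey): standardization, K-means, internal-index validation, reporting.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Step 1: feature selection/extraction -- here, just standardize raw features.
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
X = StandardScaler().fit_transform(X)

# Step 2: algorithm design/selection -- K-means couples a Euclidean proximity
# measure with a squared-error criterion, turning partitioning into a
# well-defined optimization problem.
# Step 3: cluster validation -- the silhouette (an internal index) scores each
# candidate number of clusters.
best_k, best_score = None, -1.0
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score

# Step 4: results interpretation -- report the selected partition.
print(f"selected k = {best_k}, silhouette = {best_score:.2f}")
```

Step 4 is deliberately thin here: in practice, the resulting partition would be handed to domain experts for interpretation, as the text emphasizes.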

Clustering Algorithms

Hierarchical Algorithms

Noticing the restriction of centroid-based hierarchical clustering (HC), which is unable to identify arbitrary cluster shapes, Guha, Rastogi, and Shim developed an HC algorithm, called CURE, to explore more sophisticated cluster shapes [116]. The crucial feature of CURE lies in its use of a set of well-scattered points to represent each cluster, which makes it possible to find rich cluster shapes other than hyperspheres and avoids both the chaining effect [88] of the minimum linkage method and the centroid method's tendency to favor clusters of similar sizes. These representative points are further shrunk toward the cluster centroid according to an adjustable parameter, in order to weaken the effects of outliers. CURE utilizes a random sampling (and partitioning) strategy to reduce computational complexity. Guha et al. also proposed another agglomerative HC algorithm, ROCK, to group data with qualitative attributes [117]. They used a novel measure, “link,” to describe the relation between a pair of objects and their common neighbors. As in CURE, a random sampling strategy is used to handle large data sets. Chameleon is constructed from graph theory and will be discussed in Section II-E.
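The representative-point idea at the heart of CURE can be sketched as follows, assuming NumPy; the greedy farthest-point selection and the parameter defaults c and alpha are illustrative choices, not the authors' exact implementation:

```python
# A minimal sketch of CURE's cluster representation: pick c well-scattered
# points of a cluster, then shrink them toward the centroid by alpha.
import numpy as np

def cure_representatives(points, c=4, alpha=0.3):
    """Return c well-scattered points of one cluster, shrunk toward the
    centroid by the fraction alpha to weaken the effect of outliers."""
    centroid = points.mean(axis=0)
    # Start from the point farthest from the centroid, then repeatedly add
    # the point farthest from all representatives chosen so far.
    reps = [points[np.argmax(np.linalg.norm(points - centroid, axis=1))]]
    while len(reps) < min(c, len(points)):
        dists = np.min(
            [np.linalg.norm(points - r, axis=1) for r in reps], axis=0)
        reps.append(points[np.argmax(dists)])
    # Shrinking toward the centroid dampens outliers at the cluster boundary.
    return np.array([r + alpha * (centroid - r) for r in reps])

# Example: representatives of a ring-shaped cluster stay near its boundary,
# so non-hyperspherical shapes can be matched during cluster merging.
rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 200)
ring = np.column_stack([np.cos(theta), np.sin(theta)])
print(cure_representatives(ring + rng.normal(0, 0.05, (200, 2))))
```

With alpha near 1 the representatives collapse to the centroid, recovering centroid-like behavior; with alpha near 0 they remain at the cluster boundary, approaching the all-points behavior that causes chaining.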

References

  • F. Abascal and A. Valencia, “Clustering of proximal sequence space for the identification of protein families,” Bioinformatics, vol. 18, pp. 908–921, 2002.
  • C. Aggarwal and P. Yu, “Redefining clustering for high-dimensional applications,” IEEE Trans. Knowl. Data Eng., vol. 14, no. 2, pp. 210–225, Feb. 2002.
  • Rakesh Agrawal, J. Gehrke, D. Gunopulos, and Prabhakar Raghavan, “Automatic subspace clustering of high dimensional data for data mining applications,” In: Proceedings of ACM SIGMOD International Conference Management of Data, 1998, pp. 94–105.
  • Hirotugu Akaike, “A new look at the statistical model identification,” IEEE Trans. Autom. Control, vol. AC-19, no. 6, pp. 716–722, Dec. 1974.
  • A. Alizadeh et al., “Distinct types of diffuse large B-cell Lymphoma identified by gene expression profiling,” Nature, vol. 403, pp. 503–511, 2000.
  • U. Alon, N. Barkai, D. Notterman, K. Gish, S. Ybarra, D. Mack, and A. Levine, “Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays,” Proceedings of Nat. Acad. Sci. USA, pp. 6745–6750, 1999.
  • C. Alpert and A. Kahng, “Multi-way partitioning via spacefilling curves and dynamic programming,” In: Proceedings of 31st ACM/IEEE Design Automation Conf., 1994, pp. 652–657.
  • C. Alpert and A. Kahng, “Recent directions in netlist partitioning: A survey,” VLSI J., vol. 19, pp. 1–81, 1995.
  • K. Al-Sultan, “A Tabu search approach to the clustering problem,” Pattern Recognit., vol. 28, no. 9, pp. 1443–1451, 1995.
  • S. Altschul et al., “Gapped BLAST and PSI-BLAST: A new generation of protein database search programs,” Nucleic Acids Res., vol. 25, pp. 3389–3402, 1997.
  • S. Altschul et al., “Basic local alignment search tool,” J. Molec. Biol., vol. 215, pp. 403–410, 1990.
  • G. Anagnostopoulos and M. Georgiopoulos, “Hypersphere ART and ARTMAP for unsupervised and supervised incremental learning,” In: Proceedings of IEEE-INNS-ENNS International Joint Conference Neural Networks (IJCNN’00), vol. 6, Como, Italy, 2000, pp. 59–64.
  • G. Anagnostopoulos and M. Georgiopoulos, “Ellipsoid ART and ARTMAP for incremental unsupervised and supervised learning,” In: Proceedings of IEEE-INNS-ENNS International Joint Conference Neural Networks (IJCNN’01), vol. 2, Washington, DC, 2001, pp. 1221–1226.
  • M. Anderberg, Cluster Analysis for Applications. New York: Academic, 1973.
  • G. Babu and M. Murty, “A near-optimal initial seed value selection in K-means algorithm using a genetic algorithm,” Pattern Recognit. Lett., vol. 14, no. 10, pp. 763–769, 1993.
  • G. Babu and M. Murty, “Clustering with evolution strategies,” Pattern Recognit., vol. 27, no. 2, pp. 321–329, 1994.
  • E. Backer and A. Jain, “A clustering performance measure based on fuzzy set decomposition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-3, no. 1, pp. 66–75, Jan. 1981.
  • P. Baldi and S. Brunak, Bioinformatics: The Machine Learning Approach, 2nd ed. Cambridge, MA: MIT Press, 2001.
  • P. Baldi and K. Hornik, “Neural networks and principal component analysis: Learning from examples without local minima,” Neural Netw., vol. 2, pp. 53–58, 1989.
  • P. Baldi and A. Long, “A Bayesian framework for the analysis of microarray expression data: Regularized t-test and statistical inferences of gene changes,” Bioinformatics, vol. 17, pp. 509–519, 2001.
  • G. Ball and D. Hall, “A clustering technique for summarizing multivariate data,” Behav. Sci., vol. 12, pp. 153–155, 1967.
  • S. Bandyopadhyay and U. Maulik, “Nonparametric genetic clustering: Comparison of validity indices,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 31, no. 1, pp. 120–125, Feb. 2001.
  • A. Baraldi and E. Alpaydin, “Constructive feedforward ART clustering networks — Part I and II,” IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 645–677, May 2002.
  • A. Baraldi and P. Blonda, “A survey of fuzzy clustering algorithms for pattern recognition — Part I and II,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 29, no. 6, pp. 778–801, Dec. 1999.
  • A. Baraldi and L. Schenato, “Soft-to-hard model transition in clustering: A review,” Tech. Rep. TR-99-010, 1999.
  • D. Barbará and P. Chen, “Using the fractal dimension to cluster datasets,” In: Proceedings of 6th ACM SIGKDD International Conference Knowledge Discovery and Data Mining, 2000, pp. 260–264.
  • M. Belkin and P. Niyogi, “Laplacian eigenmaps and spectral techniques for embedding and clustering,” in Advances in Neural Information Processing Systems, Thomas G. Dietterich, S. Becker, and Zoubin Ghahramani, Eds. Cambridge, MA: MIT Press, 2002, vol. 14.
  • R. Bellman, Adaptive Control Processes: A Guided Tour. Princeton, NJ: Princeton Univ. Press, 1961.
  • A. Ben-Dor, R. Shamir, and Z. Yakhini, “Clustering gene expression patterns,” J. Comput. Biol., vol. 6, pp. 281–297, 1999.
  • Y. Bengio, “Markovian models for sequential data,” Neural Comput. Surv., vol. 2, pp. 129–162, 1999.
  • A. Ben-Hur, D. Horn, H. Siegelmann, and Vladimir N. Vapnik, “Support vector clustering,” J. Mach. Learn. Res., vol. 2, pp. 125–137, 2001.
  • A. Ben-Hur, D. Horn, H. Siegelmann, and Vladimir N. Vapnik, “A support vector clustering method,” In: Proceedings of International Conference Pattern Recognition, vol. 2, 2000, pp. 2724–2727.
  • P. Berkhin. (2001) Survey of clustering data mining techniques. [Online]. Available: http://www.accrue.com/products/rp_cluster_review.pdf http://citeseer.nj.nec.com/berkhin02survey.html
  • K. Beyer, Jonathan Goldstein, R. Ramakrishnan, and U. Shaft, “When is nearest neighbor meaningful,” In: Proceedings of 7th International Conference Database Theory, 1999, pp. 217–235.
  • J. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
  • J. Bezdek and R. Hathaway, “Numerical convergence and interpretation of the fuzzy c-shells clustering algorithms,” IEEE Trans. Neural Netw., vol. 3, no. 5, pp. 787–793, Sep. 1992.
  • J. Bezdek and N. Pal, “Some new indexes of cluster validity,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 28, no. 3, pp. 301–315, Jun. 1998.
  • Christopher M. Bishop, Neural Networks for Pattern Recognition. New York: Oxford Univ. Press, 1995.
  • L. Bobrowski and J. Bezdek, “c-Means clustering with the l₁ and l∞ norms,” IEEE Trans. Syst., Man, Cybern., vol. 21, no. 3, pp. 545–554, May-Jun. 1991.
  • H. Bock, “Probabilistic models in cluster analysis,” Comput. Statist. Data Anal., vol. 23, pp. 5–28, 1996.
  • E. Bolten, A. Schliep, S. Schneckener, D. Schomburg, and R. Schrader, “Clustering protein sequences — Structure prediction by transitive homology,” Bioinformatics, vol. 17, pp. 935–941, 2001.
  • N. Boujemaa, “Generalized competitive clustering for image segmentation,” In: Proceedings of 19th International Meeting North American Fuzzy Information Processing Soc. (NAFIPS’00), Atlanta, GA, 2000, pp. 133–137.
  • P. Bradley and Usama M. Fayyad, “Refining initial points for K-means clustering,” In: Proceedings of 15th International Conference Machine Learning, 1998, pp. 91–99.
  • P. Bradley, Usama M. Fayyad, and C. Reina, “Scaling clustering algorithms to large databases,” In: Proceedings of 4th International Conference Knowledge Discovery and Data Mining (KDD’98), 1998, pp. 9–15.
  • P. Bradley, Usama M. Fayyad, and C. Reina, “Clustering very large databases using EM mixture models,” In: Proceedings of 15th International Conference Pattern Recognition, vol. 2, 2000, pp. 76–80.
  • D. Brown and C. Huntley, “A practical application of simulated annealing to clustering,” Pattern Recognit., vol. 25, no. 4, pp. 401–412, 1992.
  • C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining Knowl. Discov., vol. 2, pp. 121–167, 1998.
  • J. Burke, D. Davison, and W. Hide, “d2 Cluster: A validated method for clustering EST and full-length cDNA sequences,” Genome Res., vol. 9, pp. 1135–1142, 1999.
  • I. Cadez, S. Gaffney, and Padhraic Smyth, “A general probabilistic framework for clustering individuals and objects,” In: Proceedings of 6th ACM SIGKDD International Conference Knowledge Discovery and Data Mining, 2000, pp. 140–149.
  • G. Carpenter and S. Grossberg, “A massively parallel architecture for a self-organizing neural pattern recognition machine,” Comput. Vis. Graph. Image Process., vol. 37, pp. 54–115, 1987.
  • G. Carpenter and S. Grossberg, “ART2: Self-organization of stable category recognition codes for analog input patterns,” Appl. Opt., vol. 26, no. 23, pp. 4919–4930, 1987.
  • G. Carpenter and S. Grossberg, “The ART of adaptive pattern recognition by a self-organizing neural network,” IEEE Computer, vol. 21, no. 3, pp. 77–88, Mar. 1988.
  • G. Carpenter and S. Grossberg, “ART3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures,” Neural Netw., vol. 3, no. 2, pp. 129–152, 1990.
  • G. Carpenter, S. Grossberg, N. Markuzon, J. Reynolds, and D. Rosen, “Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps,” IEEE Trans. Neural Netw., vol. 3, no. 5, pp. 698–713, 1992.
  • G. Carpenter, S. Grossberg, and J. Reynolds, “ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network,” Neural Netw., vol. 4, no. 5, pp. 169–181, 1991.
  • G. Carpenter, S. Grossberg, and D. Rosen, “Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system,” Neural Netw., vol. 4, pp. 759–771, 1991.
  • G. Celeux and G. Govaert, “A classification EM algorithm for clustering and two stochastic versions,” Comput. Statist. Data Anal., vol. 14, pp. 315–332, 1992.
  • P. Cheeseman and J. Stutz, “Bayesian classification (AutoClass): Theory and results,” in Advances in Knowledge Discovery and Data Mining, Usama M. Fayyad, G. Piatetsky-Shapiro, Padhraic Smyth, and R. Uthurusamy, Eds. Menlo Park, CA: AAAI Press, 1996, pp. 153–180.
  • V. Cherkassky and F. Mulier, Learning From Data: Concepts, Theory, and Methods. New York: Wiley, 1998.
  • J. Cherng and M. Lo, “A hypergraph based clustering algorithm for spatial data sets,” In: Proceedings of IEEE International Conference Data Mining (ICDM’01), 2001, pp. 83–90.
  • J. Chiang and P. Hao, “A new kernel-based fuzzy clustering approach: Support vector clustering with cell growing,” IEEE Trans. Fuzzy Syst., vol. 11, no. 4, pp. 518–527, Aug. 2003.
  • C. Chinrungrueng and C. Séquin, “Optimal adaptive K-means algorithm with dynamic adjustment of learning rate,” IEEE Trans. Neural Netw., vol. 6, no. 1, pp. 157–169, Jan. 1995.
  • S. Chu and J. Roddick, “A clustering algorithm using the Tabu search approach with simulated annealing,” in Data Mining II — Proceedings of Second International Conference on Data Mining Methods and Databases, N. Ebecken and C. Brebbia, Eds., Cambridge, U.K., 2000, pp. 515–523.
  • I. H. G. S. Consortium, “Initial sequencing and analysis of the human genome,” Nature, vol. 409, pp. 860–921, 2001.
  • J. Corchado and C. Fyfe, “A comparison of kernel methods for instantiating case based reasoning systems,” Comput. Inf. Syst., vol. 7, pp. 29–42, 2000.
  • M. Cowgill, R. Harvey, and L. Watson, “A genetic algorithm approach to cluster analysis,” Comput. Math. Appl., vol. 37, pp. 99–108, 1999.
  • C. Cummings and D. Relman, “Using DNA microarray to study host-microbe interactions,” Genomics, vol. 6, no. 5, pp. 513–525, 2000.
  • E. Dahlhaus, “Parallel algorithms for hierarchical clustering and applications to split decomposition and parity graph recognition,” J. Algorithms, vol. 36, no. 2, pp. 205–240, 2000.
  • R. Davé, “Adaptive fuzzy c-shells clustering and detection of ellipses,” IEEE Trans. Neural Netw., vol. 3, no. 5, pp. 643–662, Sep. 1992.
  • R. Davé and R. Krishnapuram, “Robust clustering methods: A unified view,” IEEE Trans. Fuzzy Syst., vol. 5, no. 2, pp. 270–293, May 1997.
  • M. Delgado, A. Skármeta, and H. Barberá, “A Tabu search approach to the fuzzy clustering problem,” In: Proceedings of 6th IEEE International Conference Fuzzy Systems, vol. 1, 1997, pp. 125–130.
  • D. Dembélé and P. Kastner, “Fuzzy c-means method for clustering microarray data,” Bioinformatics, vol. 19, no. 8, pp. 973–980, 2003.
  • R. Dubes, “Cluster analysis and related issues,” in Handbook of Pattern Recognition and Computer Vision, C. Chen, L. Pau, and P. Wang, Eds. Singapore: World Scientific, 1993, pp. 3–32.
  • R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001.
  • J. Dunn, “A fuzzy relative of the ISODATA process and its use in detecting compact well separated clusters,” J. Cybern., vol. 3, no. 3, pp. 32–57, 1974.
  • B. Duran and P. Odell, Cluster Analysis: A Survey. New York: Springer-Verlag, 1974.
  • R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge, U.K.: Cambridge Univ. Press, 1998.
  • M. Eisen and Peter F. Brown, “DNA arrays for analysis of gene expression,” Methods Enzymol., vol. 303, pp. 179–205, 1999.
  • M. Eisen, P. Spellman, Peter F. Brown, and D. Botstein, “Cluster analysis and display of genome-wide expression patterns,” In: Proceedings of Nat. Acad. Sci. USA, vol. 95, 1998, pp. 14863–14868.
  • Y. El-Sonbaty and M. Ismail, “Fuzzy clustering for symbolic data,” IEEE Trans. Fuzzy Syst., vol. 6, no. 2, pp. 195–204, May 1998.
  • T. Eltoft and R. deFigueiredo, “A new neural network for cluster-detection-and-labeling,” IEEE Trans. Neural Netw., vol. 9, no. 5, pp. 1021–1035, Sep. 1998.
  • A. Enright and C. Ouzounis, “GeneRAGE: A robust algorithm for sequence clustering and domain detection,” Bioinformatics, vol. 16, pp. 451–457, 2000.
  • S. Eschrich, J. Ke, L. Hall, and D. Goldgof, “Fast accurate fuzzy clustering through data reduction,” IEEE Trans. Fuzzy Syst., vol. 11, no. 2, pp. 262–270, Apr. 2003.
  • Martin Ester, H. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” In: Proceedings of 2nd International Conference Knowledge Discovery and Data Mining (KDD’96), 1996, pp. 226–231.
  • V. Estivill-Castro and I. Lee, “AMOEBA: Hierarchical clustering based on spatial proximity using Delaunay diagram,” In: Proceedings of 9th International Symp. Spatial Data Handling (SDH’99), Beijing, China, 1999, pp. 7a.26–7a.41.
  • V. Estivill-Castro and J. Yang, “A fast and robust general purpose clustering algorithm,” In: Proceedings of 6th Pacific Rim International Conference Artificial Intelligence (PRICAI’00), R. Mizoguchi and J. Slaney, Eds., Melbourne, Australia, 2000, pp. 208–218.
  • B. Everitt, S. Landau, and M. Leese, Cluster Analysis. London: Arnold, 2001.
  • D. Fasulo, “An analysis of recent work on clustering algorithms,” Dept. Comput. Sci. Eng., Univ. Washington, Seattle, WA, Tech. Rep. 01-03-02, 1999.
  • M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 3, pp. 381–396, Mar. 2002.
  • D. Fisher, “Knowledge acquisition via incremental conceptual clustering,” Mach. Learn., vol. 2, pp. 139–172, 1987.
  • R. Fisher, “The use of multiple measurements in taxonomic problems,” Annu. Eugenics, pt. II, vol. 7, pp. 179–188, 1936.
  • D. Fogel, “An introduction to simulated evolutionary optimization,” IEEE Trans. Neural Netw., vol. 5, no. 1, pp. 3–14, Jan. 1994.
  • E. Forgy, “Cluster analysis of multivariate data: Efficiency vs. interpretability of classifications,” Biometrics, vol. 21, pp. 768–780, 1965.
  • C. Fraley and A. Raftery, “MCLUST: Software for model-based cluster analysis,” J. Classificat., vol. 16, pp. 297–306, 1999.
  • C. Fraley and A. Raftery, “Model-based clustering, discriminant analysis, and density estimation,” J. Amer. Statist. Assoc., vol. 97, pp. 611–631, 2002.
  • Jerome H. Friedman, “Exploratory projection pursuit,” J. Amer. Statist. Assoc., vol. 82, pp. 249–266, 1987.
  • H. Frigui and R. Krishnapuram, “A robust competitive clustering algorithm with applications in computer vision,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 5, pp. 450–465, May 1999.
  • B. Fritzke. (1997) Some competitive learning methods. [Online]. Available: http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/JavaPaper
  • B. Gabrys and A. Bargiela, “General fuzzy min-max neural network for clustering and classification,” IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 769–783, May 2000.
  • Venkatesh Ganti, R. Ramakrishnan, J. Gehrke, A. Powell, and J. French, “Clustering large datasets in arbitrary metric spaces,” In: Proceedings of 15th International Conference Data Engineering, 1999, pp. 502–511.
  • I. Gath and A. Geva, “Unsupervised optimal fuzzy clustering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 7, pp. 773–781, Jul. 1989.
  • GenBank Release Notes 144.0.
  • A. Geva, “Hierarchical unsupervised fuzzy clustering,” IEEE Trans. Fuzzy Syst., vol. 7, no. 6, pp. 723–733, Dec. 1999.
  • D. Ghosh and A. Chinnaiyan, “Mixture modeling of gene expression data from microarray experiments,” Bioinformatics, vol. 18, no. 2, pp. 275–286, 2002.
  • A. Ghozeil and D. Fogel, “Discovering patterns in spatial data using evolutionary programming,” In: Proceedings of 1st Annu. Conference Genetic Programming, 1996, pp. 512–520.
  • M. Girolami, “Mercer kernel based clustering in feature space,” IEEE Trans. Neural Netw., vol. 13, no. 3, pp. 780–784, May 2002.
  • F. Glover, “Tabu search, part I,” ORSA J. Comput., vol. 1, no. 3, pp. 190–206, 1989.
  • T. Golub, D. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. Mesirov, H. Coller, M. Loh, J. Downing, M. Caligiuri, C. Bloomfield, and E. Lander, “Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring,” Science, vol. 286, pp. 531–537, 1999.
  • A. Gordon, “Cluster validation,” in Data Science, Classification, and Related Methods, C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H. Bock, and Y. Bada, Eds. New York: Springer-Verlag, 1998, pp. 22–39.
  • A. Gordon, Classification, 2nd ed. London, U.K.: Chapman & Hall, 1999.
  • J. Gower, “A general coefficient of similarity and some of its properties,” Biometrics, vol. 27, pp. 857–872, 1971.
  • S. Grossberg, “Adaptive pattern recognition and universal encoding II: Feedback, expectation, olfaction, and illusions,” Biol. Cybern., vol. 23, pp. 187–202, 1976.
  • P. Grünwald, P. Kontkanen, P. Myllymäki, T. Silander, and H. Tirri, “Minimum encoding approaches for predictive modeling,” In: Proceedings of 14th International Conference Uncertainty in AI (UAI’98), 1998, pp. 183–192.
  • X. Guan and L. Du, “Domain identification by clustering sequence alignments,” Bioinformatics, vol. 14, pp. 783–788, 1998.
  • Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim, “CURE: An efficient clustering algorithm for large databases,” In: Proceedings of ACM SIGMOD International Conference Management of Data (SIGMOD 1998), 1998, pp. 73–84.
  • S. Guha, R. Rastogi, and K. Shim, “ROCK: A robust clustering algorithm for categorical attributes,” Inf. Syst., vol. 25, no. 5, pp. 345–366, 2000.
  • S. Gupta, K. Rao, and V. Bhatnagar, “K-means clustering algorithm for categorical attributes,” In: Proceedings of 1st International Conference Data Warehousing and Knowledge Discovery (DaWaK’99), Florence, Italy, 1999, pp. 203–208.
  • V. Guralnik and G. Karypis, “A scalable algorithm for clustering sequential data,” In: Proceedings of 1st IEEE International Conference Data Mining (ICDM’01), 2001, pp. 179–186.
  • D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge, U.K.: Cambridge Univ. Press, 1997.
  • M. Halkidi, Y. Batistakis, and M. Vazirgiannis, “Cluster validity methods: Part I & II,” SIGMOD Record, vol. 31, no. 2–3, 2002.
  • L. Hall, I. Özyurt, and J. Bezdek, “Clustering with a genetically optimized approach,” IEEE Trans. Evol. Comput., vol. 3, no. 2, pp. 103–112, 1999.
  • R. Hammah and J. Curran, “Validity measures for the fuzzy cluster analysis of orientations,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1467–1472, Dec. 2000.
  • P. Hansen and B. Jaumard, “Cluster analysis and mathematical programming,” Math. Program., vol. 79, pp. 191–215, 1997.
  • P. Hansen and N. Mladenović, “J-means: A new local search heuristic for minimum sum of squares clustering,” Pattern Recognit., vol. 34, pp. 405–413, 2001.
  • F. Harary, Graph Theory. Reading, MA: Addison-Wesley, 1969.
  • J. Hartigan, Clustering Algorithms. New York: Wiley, 1975.
  • E. Hartuv and R. Shamir, “A clustering algorithm based on graph connectivity,” Inf. Process. Lett., vol. 76, pp. 175–181, 2000.
  • R. Hathaway and J. Bezdek, “Fuzzy c-means clustering of incomplete data,” IEEE Trans. Syst., Man, Cybern., vol. 31, no. 5, pp. 735–744, 2001.
  • R. Hathaway, J. Bezdek, and Y. Hu, “Generalized fuzzy c-means clustering strategies using L norm distances,” IEEE Trans. Fuzzy Syst., vol. 8, no. 5, pp. 576–582, Oct. 2000.
  • B. Hay, G. Wets, and K. Vanhoof, “Clustering navigation patterns on a website using a sequence alignment method,” In: Proceedings of Intelligent Techniques for Web Personalization: 17th International Joint Conference Artificial Intelligence, 2001, pp. 1–6.
  • S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1999.
  • Q. He, “A review of clustering algorithms as applied to IR,” Univ. Illinois at Urbana-Champaign, Tech. Rep. UIUCLIS-1999/6+IRG, 1999.
  • M. Healy, T. Caudell, and S. Smith, “A neural architecture for pattern sequence verification through inferencing,” IEEE Trans. Neural Netw., vol. 4, no. 1, pp. 9–20, Jan. 1993.
  • A. Hinneburg and D. Keim, “An efficient approach to clustering in large multimedia databases with noise,” In: Proceedings of 4th International Conference Knowledge Discovery and Data Mining (KDD’98), 1998, pp. 58–65.
  • A. Hinneburg and D. Keim, “Optimal grid-clustering: Toward breaking the curse of dimensionality in high-dimensional clustering,” In: Proceedings of 25th VLDB Conf., 1999, pp. 506–517.
  • F. Hoeppner, “Fuzzy shell clustering algorithms in image processing: Fuzzy c-rectangular and 2-rectangular shells,” IEEE Trans. Fuzzy Syst., vol. 5, no. 4, pp. 599–613, Nov. 1997.
  • J. Hoey, “Clustering contextual facial display sequences,” In: Proceedings of 5th IEEE International Conference Automatic Face and Gesture Recognition (FGR’02), 2002, pp. 354–359.
  • T. Hofmann and J. Buhmann, “Pairwise data clustering by deterministic annealing,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 1, pp. 1–14, Jan. 1997.
  • J. Holland, Adaption in Natural and Artificial Systems. Ann Arbor,MI: Univ. Michigan Press, 1975.
  • F. Höppner, F. Klawonn, and R. Kruse, Fuzzy Cluster Analysis: Methods for Classification, Data Analysis, and Image Recognition. New York: Wiley, 1999.
  • Z. Huang, “Extensions to the K-means algorithm for clustering large data sets with categorical values,” Data Mining Knowl. Discov., vol. 2, pp. 283–304, 1998.
  • J. Huang, M. Georgiopoulos, and G. Heileman, “Fuzzy ART properties,” Neural Netw., vol. 8, no. 2, pp. 203–213, 1995.
  • P. Huber, “Projection pursuit,” Ann. Statist., vol. 13, no. 2, pp. 435–475, 1985.
  • R. Hughey and A. Krogh, “Hidden Markov models for sequence analysis: Extension and analysis of the basic method,” CABIOS, vol. 12, no. 2, pp. 95–107, 1996.
  • M. Hung and D. Yang, “An efficient fuzzy c-means clustering algorithm,” In: Proceedings of IEEE International Conference Data Mining, 2001, pp. 225–232.
  • L. Hunt and J. Jorgensen, “Mixture model clustering using the MULTIMIX program,” Australia and New Zealand J. Statist., vol. 41, pp. 153–171, 1999.
  • J. Hwang, J. Vlontzos, and S. Kung, “A systolic neural network architecture for hidden Markov models,” IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 12, pp. 1967–1979, Dec. 1989.
  • A. Hyvärinen, “Survey of independent component analysis,” Neural Comput. Surv., vol. 2, pp. 94–128, 1999.
  • A. Jain and R. Dubes, Algorithms for Clustering Data. Englewood Cliffs, NJ: Prentice-Hall, 1988.
  • A. Jain, R. Duin, and J. Mao, “Statistical pattern recognition: A review,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 1, pp. 4–37, 2000.
  • A. Jain, M. Murty, and P. Flynn, “Data clustering: A review,” ACM Comput. Surv., vol. 31, no. 3, pp. 264–323, 1999.
  • D. Jiang, C. Tang, and A. Zhang, “Cluster analysis for gene expression data: A survey,” IEEE Trans. Knowl. Data Eng., vol. 16, no. 11, pp. 1370–1386, Nov. 2004.
  • C. Jutten and J. Herault, “Blind separation of sources, Part I: An adaptive algorithms based on neuromimetic architecture,” Signal Process., vol. 24, no. 1, pp. 1–10, 1991.
  • T. Kanungo, D. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Wu, “An efficient K-means clustering algorithm: Analysis and implementation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 881–892, Jul. 2000.
  • N. Karayiannis, “A methodology for constructing fuzzy algorithms for learning vector quantization,” IEEE Trans. Neural Netw., vol. 8, no. 3, pp. 505–518, May 1997.
  • N. Karayiannis, J. Bezdek, N. Pal, R. Hathaway, and P. Pai, “Repairs to GLVQ: A new family of competitive learning schemes,” IEEE Trans. Neural Netw., vol. 7, no. 5, pp. 1062–1071, Sep. 1996.
  • J. Karhunen, E. Oja, L. Wang, R. Vigário, and J. Joutsensalo, “A class of neural networks for independent component analysis,” IEEE Trans. Neural Netw., vol. 8, no. 3, pp. 486–504, May 1997.
  • G. Karypis, E. Han, and Vipin Kumar, “Chameleon: Hierarchical clustering using dynamic modeling,” IEEE Computer, vol. 32, no. 8, pp. 68–75, Aug. 1999.
  • R. Kothari and D. Pitts, “On finding the number of clusters,” Pattern Recognit. Lett., vol. 20, pp. 405–416, 1999.
  • L. Kaufman and P. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis: Wiley, 1990.
  • W. Kent and A. Zahler, “Conservation, regulation, synteny, and introns in a large-scale C. Briggsae — C. elegans genomic alignment,” Genome Res., vol. 10, pp. 1115–1125, 2000.
  • P. Kersten, “Implementation issues in the fuzzy c-medians clustering algorithm,” In: Proceedings of 6th IEEE International Conference Fuzzy Systems, vol. 2, 1997, pp. 957–962.
  • J. Khan, J. Wei, M. Ringnér, L. Saal, M. Ladanyi, F. Westermann, F. Berthold, M. Schwab, C. Antonescu, C. Peterson, and P. Meltzer, “Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks,” Nature Med., vol. 7, no. 6, pp. 673–679, 2001.
  • S. Kirkpatrick, C. Gelatt, and M. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, 1983.
  • J. Kleinberg, “An impossibility theorem for clustering,” In: Proceedings of 2002 Conference Advances in Neural Information Processing Systems, vol. 15, 2002, pp. 463–470.
  • Ron Kohavi, “A study of cross-validation and bootstrap for accuracy estimation and model selection,” In: Proceedings of 14th International Joint Conference Artificial Intelligence, 1995, pp. 338–345.
  • T. Kohonen, “The self-organizing map,” Proceedings of IEEE, vol. 78, no. 9, pp. 1464–1480, Sep. 1990.
  • T. Kohonen, Self-Organizing Maps, 3rd ed. New York: Springer-Verlag, 2001.
  • T. Kohonen, S. Kaski, K. Lagus, J. Salojärvi, J. Honkela, V. Paatero, and A. Saarela, “Self organization of a massive document collection,” IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 574–585, May 2000.
  • E. Kolatch. (2001) Clustering algorithms for spatial databases: A survey. [Online]. Available: http://citeseer.nj.nec.com/436843.html
  • J. Kolen and T. Hutcheson, “Reducing the time complexity of the fuzzy c-means algorithm,” IEEE Trans. Fuzzy Syst., vol. 10, no. 2, pp. 263–267, Apr. 2002.
  • K. Krishna and M. Murty, “Genetic K-means algorithm,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 29, no. 3, pp. 433–439, Jun. 1999.
  • R. Krishnapuram, H. Frigui, and O. Nasraoui, “Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation — Part I and II,” IEEE Trans. Fuzzy Syst., vol. 3, no. 1, pp. 29–60, Feb. 1995.
  • R. Krishnapuram and J. Keller, “A possibilistic approach to clustering,” IEEE Trans. Fuzzy Syst., vol. 1, no. 2, pp. 98–110, Apr. 1993.
  • R. Krishnapuram, O. Nasraoui, and H. Frigui, “The fuzzy c spherical shells algorithm: A new approach,” IEEE Trans. Neural Netw., vol. 3, no. 5, pp. 663–671, Sep. 1992.
  • A. Krogh, M. Brown, I. Mian, K. Sjölander, and D. Haussler, “Hidden Markov models in computational biology: Applications to protein modeling,” J. Molec. Biol., vol. 235, pp. 1501–1531, 1994.
  • G. Lance and W. Williams, “A general theory of classification sorting strategies: 1. Hierarchical systems,” Comput. J., vol. 9, pp. 373–380, 1967.
  • M. Law and J. Kwok, “Rival penalized competitive learning for modelbased sequence clustering,” In: Proceedings of 15th International Conference Pattern Recognition, vol. 2, 2000, pp. 195–198.
  • Y. Leung, J. Zhang, and Z. Xu, “Clustering by scale-space filtering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1396–1410, Dec. 2000.
  • E. Levine and E. Domany, “Resampling method for unsupervised estimation of cluster validity,” Neural Comput., vol. 13, pp. 2573–2593, 2001.
  • C. Li and G. Biswas, “Temporal pattern generation using hidden Markov model based unsupervised classification,” in Advances in Intelligent Data Analysis. ser. Lecture Notes in Computer Science, D. Hand, K. Kok, and M. Berthold, Eds. New York: Springer-Verlag, 1999, vol. 1642.
  • C. Li and G. Biswas, “Unsupervised learning with mixed numeric and nominal data,” IEEE Trans. Knowl. Data Eng., vol. 14, no. 4, pp. 673–690, Jul.-Aug. 2002.
  • C. Li, H. Garcia-Molina, and G. Wiederhold, “Clustering for approximate similarity search in high-dimensional spaces,” IEEE Trans. Knowl. Data Eng., vol. 14, no. 4, pp. 792–808, Jul.-Aug. 2002.
  • W. Li, L. Jaroszewski, and A. Godzik, “Clustering of highly homologous sequences to reduce the size of large protein databases,” Bioinformatics, vol. 17, pp. 282–283, 2001.
  • A. Likas, N. Vlassis, and J. Verbeek, “The global K-means clustering algorithm,” Pattern Recognit., vol. 36, no. 2, pp. 451–461, 2003.
  • S. Lin and B. Kernighan, “An effective heuristic algorithm for the traveling salesman problem,” Operat. Res., vol. 21, pp. 498–516, 1973.
  • R. Lipshutz, S. Fodor, T. Gingeras, and D. Lockhart, “High density synthetic oligonucleotide arrays,” Nature Genetics, vol. 21, pp. 20–24, 1999.
  • G. Liu, Introduction to Combinatorial Mathematics. New York: McGraw-Hill, 1968.
  • J. Lozano and P. Larrañaga, “Applying genetic algorithms to search for the best hierarchical clustering of a dataset,” Pattern Recognit. Lett., vol. 20, pp. 911–918, 1999.
  • J. MacQueen, “Some methods for classification and analysis of multivariate observations,” In: Proceedings of 5th Berkeley Symp., vol. 1, 1967, pp. 281–297.
  • S. C. Madeira and A. L. Oliveira, “Biclustering algorithms for biological data analysis: A survey,” IEEE/ACM Trans. Computat. Biol. Bioinformatics, vol. 1, no. 1, pp. 24–45, Jan. 2004.
  • Y. Man and I. Gath, “Detection and separation of ring-shaped clusters using fuzzy clustering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 16, no. 8, pp. 855–861, Aug. 1994.
  • J. Mao and A. Jain, “A self-organizing network for hyperellipsoidal clustering (HEC),” IEEE Trans. Neural Netw., vol. 7, no. 1, pp. 16–29, Jan. 1996.
  • U. Maulik and S. Bandyopadhyay, “Genetic algorithm-based clustering technique,” Pattern Recognit., vol. 33, pp. 1455–1465, 2000.
  • G. McLachlan and T. Krishnan, The EM Algorithm and Extensions. New York: Wiley, 1997.
  • G. McLachlan and D. Peel, Finite Mixture Models. New York: Wiley, 2000.
  • G. McLachlan, D. Peel, K. Basford, and P. Adams, “The EMMIX software for the fitting of mixtures of normal and t-components,” J. Statist. Software, vol. 4, 1999.
  • C. Miller, J. Gurd, and A. Brass, “A RAPID algorithm for sequence database comparisons: Application to the identification of vector contamination in the EMBL databases,” Bioinformatics, vol. 15, pp. 111–121, 1999.
  • R. Miller et al., “A comprehensive approach to clustering of expressed human gene sequence: The sequence tag alignment and consensus knowledge base,” Genome Res., vol. 9, pp. 1143–1155, 1999.
  • W. Miller, “Comparison of genomic DNA sequences: Solved and unsolved problems,” Bioinformatics, vol. 17, pp. 391–397, 2001.
  • G. Milligan and M. Cooper, “An examination of procedures for determining the number of clusters in a data set,” Psychometrika, vol. 50, pp. 159–179, 1985.
  • R. Mollineda and E. Vidal, “A relative approach to hierarchical clustering,” in Pattern Recognition and Applications, Frontiers in Artificial Intelligence and Applications, M. Torres and A. Sanfeliu, Eds. Amsterdam, The Netherlands: IOS Press, 2000, vol. 56, pp. 19–28.
  • B. Moore, “ART1 and pattern clustering,” In: Proceedings of 1988 Connectionist Models Summer School, 1989, pp. 174–185.
  • S. Moore, “Making chips to probe genes,” IEEE Spectr., vol. 38, no. 3, pp. 54–60, Mar. 2001.
  • Y. Moreau, F. Smet, G. Thijs, K. Marchal, and B. Moor, “Functional bioinformatics of microarray data: From expression to regulation,” Proceedings of IEEE, vol. 90, no. 11, pp. 1722–1743, Nov. 2002.
  • T. Morzy, M. Wojciechowski, and M. Zakrzewicz, “Pattern-oriented hierarchical clustering,” In: Proceedings of 3rd East Eur. Conference Advances in Databases and Information Systems, 1999, pp. 179–190.
  • S. Mulder and D. Wunsch, “Million city traveling salesman problem solution by divide and conquer clustering with adaptive resonance neural networks,” Neural Netw., vol. 16, pp. 827–832, 2003.
  • K. Müller, S. Mika, G. Rätsch, K. Tsuda, and Bernhard Schölkopf, “An introduction to kernel-based learning algorithms,” IEEE Trans. Neural Netw., vol. 12, no. 2, pp. 181–201, Mar. 2001.
  • F. Murtagh, “A survey of recent advances in hierarchical clustering algorithms,” Comput. J., vol. 26, no. 4, pp. 354–359, 1983.
  • F. Murtagh and M. Berry, “Overcoming the curse of dimensionality in clustering by means of the wavelet transform,” Comput. J., vol. 43, no. 2, pp. 107–120, 2000.
  • S. Needleman and C. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” J. Molec. Biol., vol. 48, pp. 443–453, 1970.
  • R. Ng and Jiawei Han, “CLARANS: A method for clustering objects for spatial data mining,” IEEE Trans. Knowl. Data Eng., vol. 14, no. 5, pp. 1003–1016, Sep.-Oct. 2002.
  • T. Oates, L. Firoiu, and P. Cohen, “Using dynamic time warping to bootstrap HMM-based clustering of time series,” in Sequence Learning. ser. LNAI 1828, R. Sun and C. Giles, Eds. Berlin, Germany: Springer-Verlag, 2000, pp. 35–52.
  • E. Oja, “Principal components minor components, and linear neural networks,” Neural Netw., vol. 5, pp. 927–935, 1992.
  • J. Oliver, R. Baxter, and C. Wallace, “Unsupervised learning using MML,” In: Proceedings of 13th International Conference Machine Learning (ICML’96), L. Saitta, Ed., 1996, pp. 364–372.
  • C. Olson, “Parallel algorithms for hierarchical clustering,” Parallel Comput., vol. 21, pp. 1313–1325, 1995.
  • C. Ordonez and E. Omiecinski, “Efficient disk-based K-means clustering for relational databases,” IEEE Trans. Knowl. Data Eng., vol. 16, no. 8, pp. 909–921, Aug. 2004.
  • L. Owsley, L. Atlas, and G. Bernard, “Self-organizing feature maps and hidden Markov models for machine-tool monitoring,” IEEE Trans. Signal Process., vol. 45, no. 11, pp. 2787–2798, Nov. 1997.
  • N. Pal and J. Bezdek, “On cluster validity for the fuzzy c-means model,” IEEE Trans. Fuzzy Syst., vol. 3, no. 3, pp. 370–379, Aug. 1995.
  • N. Pal, J. Bezdek, and E. Tsao, “Generalized clustering networks and Kohonen’s self-organizing scheme,” IEEE Trans. Neural Netw., vol. 4, no. 4, pp. 549–557, Jul. 1993.
  • G. Patanè and M. Russo, “The enhanced-LBG algorithm,” Neural Netw., vol. 14, no. 9, pp. 1219–1237, 2001.
  • G. Patanè and M. Russo, “Fully automatic clustering system,” IEEE Trans. Neural Netw., vol. 13, no. 6, pp. 1285–1298, Nov. 2002.
  • W. Pearson, “Improved tools for biological sequence comparison,” Proceedings of Nat. Acad. Sci., vol. 85, pp. 2444–2448, 1988.
  • D. Peel and G. McLachlan, “Robust mixture modeling using the t-distribution,” Statist. Comput., vol. 10, pp. 339–348, 2000.
  • D. Pelleg and A. Moore, “X-means: Extending K-means with efficient estimation of the number of clusters,” In: Proceedings of 17th International Conference Machine Learning (ICML’00), 2000, pp. 727–734.
  • J. Peña, J. Lozano, and P. Larrañaga, “An empirical comparison of four initialization methods for the K-means algorithm,” Pattern Recognit. Lett., vol. 20, pp. 1027–1040, 1999.
  • C. Pizzuti and D. Talia, “P-AutoClass: Scalable parallel clustering for mining large data sets,” IEEE Trans. Knowl. Data Eng., vol. 15, no. 3, pp. 629–641, May-Jun. 2003.
  • L. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
  • R. Herwig, A. Poustka, C. Müller, C. Bull, H. Lehrach, and J. O’Brien, “Large-scale clustering of cDNA-fingerprinting data,” Genome Res., vol. 9, pp. 1093–1105, 1999.
  • A. Rauber, J. Paralic, and E. Pampalk, “Empirical evaluation of clustering algorithms,” J. Inf. Org. Sci., vol. 24, no. 2, pp. 195–209, 2000.
  • S. Ridella, S. Rovetta, and R. Zunino, “Plastic algorithm for adaptive vector quantization,” Neural Comput. Appl., vol. 7, pp. 37–51, 1998.
  • J. Rissanen, “Fisher information and stochastic complexity,” IEEE Trans. Inf. Theory, vol. 42, no. 1, pp. 40–47, Jan. 1996.
  • K. Rose, “Deterministic annealing for clustering, compression, classification, regression, and related optimization problems,” Proceedings of IEEE, vol. 86, no. 11, pp. 2210–2239, Nov. 1998.
  • S. Roweis and L. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323–2326, 2000.
  • D. Sankoff and J. Kruskal, Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Stanford, CA: CSLI Publications, 1999.
  • O. Sasson, N. Linial, and M. Linial, “The metric space of proteins — Comparative study of clustering algorithms,” Bioinformatics, vol. 18, pp. s14–s21, 2002.
  • U. Scherf, D. Ross, M. Waltham, L. Smith, J. Lee, L. Tanabe, K. Kohn, W. Reinhold, T. Myers, D. Andrews, D. Scudiero, M. Eisen, E. Sausville, Y. Pommier, D. Botstein, Peter F. Brown, and J. Weinstein, “A gene expression database for the molecular pharmacology of cancer,” Nature Genetics, vol. 24, no. 3, pp. 236–244, 2000.
  • P. Scheunders, “A comparison of clustering algorithms applied to color image quantization,” Pattern Recognit. Lett., vol. 18, pp. 1379–1384, 1997.
  • Bernhard Schölkopf and Alexander J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge, MA: MIT Press, 2002.
  • Bernhard Schölkopf, Alexander J. Smola, and K. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Computat., vol. 10, no. 5, pp. 1299–1319, 1998.
  • G. Schwarz, “Estimating the dimension of a model,” Ann. Statist., vol. 6, no. 2, pp. 461–464, 1978.
  • G. Scott, D. Clark, and T. Pham, “A genetic clustering algorithm guided by a descent algorithm,” In: Proceedings of Congr. Evolutionary Computation, vol. 2, Piscataway, NJ, 2001, pp. 734–740.
  • P. Sebastiani, M. Ramoni, and P. Cohen, “Sequence learning via Bayesian clustering by dynamics,” in Sequence Learning. ser. LNAI 1828, R. Sun and C. Giles, Eds. Berlin, Germany: Springer-Verlag, 2000, pp. 11–34.
  • S. Selim and K. Alsultan, “A simulated annealing algorithm for the clustering problems,” Pattern Recognit., vol. 24, no. 10, pp. 1003–1008, 1991.
  • R. Shamir and R. Sharan, “Algorithmic approaches to clustering gene expression data,” in Current Topics in Computational Molecular Biology, T. Jiang, T. Smith, Y. Xu, and M. Zhang, Eds. Cambridge, MA: MIT Press, 2002, pp. 269–300.
  • R. Sharan and R. Shamir, “CLICK: A clustering algorithm with applications to gene expression analysis,” In: Proceedings of 8th International Conference Intelligent Systems for Molecular Biology, 2000, pp. 307–316.
  • G. Sheikholeslami, S. Chatterjee, and A. Zhang, “WaveCluster: A multiresolution clustering approach for very large spatial databases,” In: Proceedings of 24th VLDB Conf., 1998, pp. 428–439.
  • P. Simpson, “Fuzzy min-max neural networks — Part 2: Clustering,” IEEE Trans. Fuzzy Syst., vol. 1, no. 1, pp. 32–45, Feb. 1993.
  • J. Sklansky and W. Siedlecki, “Large-scale feature selection,” in Handbook of Pattern Recognition and Computer Vision, C. Chen, L. Pau, and P. Wang, Eds. Singapore: World Scientific, 1993, pp. 61–124.
  • T. Smith and M. Waterman, “New stratigraphic correlation techniques,” J. Geology, vol. 88, pp. 451–457, 1980.
  • Padhraic Smyth, “Clustering using Monte Carlo cross-validation,” In: Proceedings of 2nd International Conference Knowledge Discovery and Data Mining, 1996, pp. 126–133.
  • Padhraic Smyth, “Clustering sequences with hidden Markov models,” in Advances in Neural Information Processing, M. Mozer, Michael I. Jordan, and T. Petsche, Eds. Cambridge, MA: MIT Press, 1997, vol. 9, pp. 648–654.
  • Padhraic Smyth, “Model selection for probabilistic clustering using cross validated likelihood,” Statist. Comput., vol. 10, pp. 63–72, 1998.
  • Padhraic Smyth, “Probabilistic model-based clustering of multivariate and sequential data,” In: Proceedings of 7th International Workshop on Artificial Intelligence and Statistics, 1999, pp. 299–304.
  • P. Sneath, “The application of computers to taxonomy,” J. Gen. Microbiol., vol. 17, pp. 201–226, 1957.
  • P. Somervuo and T. Kohonen, “Clustering and visualization of large protein sequence databases by means of an extension of the self-organizing map,” in LNAI 1967, 2000, pp. 76–85.
  • T. Sorensen, “A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyzes of the vegetation on Danish commons,” Biologiske Skrifter, vol. 5, pp. 1–34, 1948.
  • H. Späth, Cluster Analysis Algorithms. Chichester, U.K.: Ellis Horwood, 1980.
  • P. Spellman, G. Sherlock, M. Ma, V. Iyer, K. Anders, M. Eisen, Peter F. Brown, D. Botstein, and B. Futcher, “Comprehensive identification of cell cycle-regulated genes of the Yeast Saccharomyces Cerevisiae by microarray hybridization,” Mol. Biol. Cell, vol. 9, pp. 3273–3297, 1998.
  • M. Steinbach, G. Karypis, and Vipin Kumar, “A comparison of document clustering techniques,” Tech. Rep. 00-034, Univ. Minnesota, Minneapolis, 2000.
  • K. Stoffel and A. Belkoniene, “Parallel K-means clustering for large data sets,” In: Proceedings of EuroPar’99 Parallel Processing, 1999, pp. 1451–1454.
  • M. Su and H. Chang, “Fast self-organizing feature map algorithm,” IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 721–733, May 2000.
  • M. Su and C. Chou, “A modified version of the K-means algorithm with a distance based on cluster symmetry,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6, pp. 674–680, Jun. 2001.
  • R. Sun and C. Giles, “Sequence learning: Paradigms, algorithms, and applications,” in LNAI 1828. Berlin, Germany: Springer-Verlag, 2000.
  • C. Sung and H. Jin, “A Tabu-search-based heuristic for clustering,” Pattern Recognit., vol. 33, pp. 849–858, 2000.
  • SWISS-PROT Protein Knowledgebase Release 45.0 Statistics.
  • P. Tamayo, D. Slonim, J. Mesirov, Q. Zhu, S. Kitareewan, E. Dmitrovsky, E. Lander, and T. Golub, “Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation,” Proceedings of Nat. Acad. Sci., pp. 2907–2912, 1999.
  • S. Tavazoie, J. Hughes, M. Campbell, R. Cho, and G. Church, “Systematic determination of genetic network architecture,” Nature Genetics, vol. 22, pp. 281–285, 1999.
  • Joshua B. Tenenbaum, V. Silva, and J. Langford, “A global geometric framework for nonlinear dimensionality reduction,” Science, vol. 290, pp. 2319–2323, 2000.
  • Robert Tibshirani, Trevor Hastie, M. Eisen, D. Ross, D. Botstein, and Peter F. Brown, “Clustering methods for the analysis of DNA microarray data,” Dept. Statist., Stanford Univ., Stanford, CA, Tech. Rep.
  • Robert Tibshirani and K. Knight, “The covariance inflation criterion for adaptive model selection,” J. Roy. Statist. Soc. B, vol. 61, pp. 529–546, 1999.
  • L. Tseng and S. Yang, “A genetic approach to the automatic clustering problem,” Pattern Recognit., vol. 34, pp. 415–424, 2001.
  • Vladimir N. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
  • J. Venter et al., “The sequence of the human genome,” Science, vol. 291, pp. 1304–1351, 2001.
  • J. Vesanto and E. Alhoniemi, “Clustering of the self-organizing map,” IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 586–600, May 2000.
  • K. Wagstaff, S. Rogers, and S. Schroedl, “Constrained K-means clustering with background knowledge,” In: Proceedings of 8th International Conference Machine Learning, 2001, pp. 577–584.
  • C. Wallace and D. Dowe, “Intrinsic classification by MML — The SNOB program,” In: Proceedings of 7th Australian Joint Conference Artificial Intelligence, 1994, pp. 37–44.
  • H. Wang, W. Wang, J. Yang, and P. Yu, “Clustering by pattern similarity in large data sets,” In: Proceedings of ACM SIGMOD International Conference Management of Data, 2002, pp. 394–405.
  • C. Wei, Y. Lee, and C. Hsu, “Empirical comparison of fast clustering algorithms for large data sets,” In: Proceedings of 33rd Hawaii International Conference System Sciences, Maui, HI, 2000, pp. 1–10.
  • J. Williamson, “Gaussian ARTMAP: A neural network for fast incremental learning of noisy multidimensional maps,” Neural Netw., vol. 9, no. 5, pp. 881–897, 1996.
  • M. Windham and A. Culter, “Information ratios for validating mixture analysis,” J. Amer. Statist. Assoc., vol. 87, pp. 1188–1192, 1992.
  • S. Wu, A. W.-C. Liew, H. Yan, and M. Yang, “Cluster analysis of gene expression data based on self-splitting and merging competitive learning,” IEEE Trans. Inf. Technol. Biomed., vol. 8, no. 1, pp. 5–15, Jan. 2004.
  • D. Wunsch, “An optoelectronic learning machine: Invention, experimentation, analysis of first hardware implementation of the ART1 neural network,” Ph.D. dissertation, Univ. Washington, Seattle, WA, 1991.
  • D. Wunsch, T. Caudell, C. Capps, R. Marks, and R. Falk, “An optoelectronic implementation of the adaptive resonance neural network,” IEEE Trans. Neural Netw., vol. 4, no. 4, pp. 673–684, Jul. 1993.
  • Y. Xiong and D. Yeung, “Mixtures of ARMA models for model-based time series clustering,” In: Proceedings of IEEE International Conference Data Mining, 2002, pp. 717–720.
  • R. Xu, G. Anagnostopoulos, and D. Wunsch, “Tissue classification through analysis of gene expression data using a new family of ART architectures,” In: Proceedings of International Joint Conference Neural Networks (IJCNN’02), vol. 1, 2002, pp. 300–304.
  • Y. Xu, V. Olman, and D. Xu, “Clustering gene expression data using graph-theoretic approach: An application of minimum spanning trees,” Bioinformatics, vol. 18, no. 4, pp. 536–545, 2002.
  • R. Yager, “Intelligent control of the hierarchical agglomerative clustering process,” IEEE Trans. Syst., Man, Cybern., vol. 30, no. 6, pp. 835–845, 2000.
  • R. Yager and D. Filev, “Approximate clustering via the mountain method,” IEEE Trans. Syst., Man, Cybern., vol. 24, no. 8, pp. 1279–1284, 1994.
  • K. Yeung, D. Haynor, and W. Ruzzo, “Validating clustering for gene expression data,” Bioinformatics, vol. 17, no. 4, pp. 309–318, 2001.
  • F. Young and R. Hamer, Multidimensional Scaling: History, Theory, and Applications. Hillsdale, NJ: Lawrence Erlbaum, 1987.
  • L. Zadeh, “Fuzzy sets,” Inf. Control, vol. 8, pp. 338–353, 1965.
  • J. Zhang and Y. Leung, “Improved possibilistic C-means clustering algorithms,” IEEE Trans. Fuzzy Syst., vol. 12, no. 2, pp. 209–217, Apr. 2004.
  • T. Zhang, R. Ramakrishnan, and M. Livny, “BIRCH: An efficient data clustering method for very large databases,” In: Proceedings of ACM SIGMOD Conference Management of Data, 1996, pp. 103–114.
  • Y. Zhang and Z. Liu, “Self-splitting competitive learning: A new on-line clustering paradigm,” IEEE Trans. Neural Networks, vol. 13, no. 2, pp. 369–380, Mar. 2002.
  • X. Zhuang, Y. Huang, K. Palaniappan, and Y. Zhao, “Gaussian mixture density modeling, decomposition, and applications,” IEEE Trans. Image Process., vol. 5, no. 9, pp. 1293–1302, Sep. 1996.


Author: Rui Xu, Donald Wunsch II
Title: Survey of Clustering Algorithms
Journal: IEEE Transactions on Neural Networks
URL: http://axon.cs.byu.edu/Dan/678/papers/Cluster/Xu.pdf
DOI: 10.1109/TNN.2005.845141
Year: 2005