2002 UnsupervisedFeatSelUsingFeatureSim

From GM-RKB

Subject Headings: Feature Selection Algorithm.

Notes

Cited By

Quotes

Abstract

  • In this article, we describe an unsupervised feature selection algorithm suitable for data sets, large in both dimension and size. The method is based on measuring similarity between features whereby redundancy therein is removed. This does not need any search and, therefore, is fast. A new feature similarity measure, called maximum information compression index, is introduced. The algorithm is generic in nature and has the capability of multiscale representation of data sets. The superiority of the algorithm, in terms of speed and performance, is established extensively over various real-life data sets of different sizes and dimensions. It is also demonstrated how redundancy and information loss in feature selection can be quantified with an entropy measure.
  • … both the proposed clustering algorithm and the newly introduced feature similarity measure are geared toward two goals: minimizing the information loss (in terms of second-order statistics) incurred in the process of feature reduction and minimizing the redundancy present in the reduced feature subset.

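The following is a minimal, illustrative Python sketch of the kind of procedure the abstract describes: features are compared pairwise with the maximum information compression index (the smallest eigenvalue of their 2×2 covariance matrix), and a kNN-style grouping over features keeps one representative per group while discarding its most similar (most redundant) neighbours. The function names, the fixed scale parameter `k`, and the omission of the paper's error-controlled adjustment of `k` are simplifications made for this sketch, not the authors' exact algorithm.

```python
import numpy as np

def maximum_information_compression_index(x, y):
    """Smallest eigenvalue of the 2x2 covariance matrix of features x and y.

    This similarity measure (lambda_2 in the paper) is zero when the two
    features are perfectly linearly dependent and grows as they become
    less correlated.
    """
    cov = np.cov(x, y)                    # 2x2 covariance matrix of the pair
    eigenvalues = np.linalg.eigvalsh(cov) # ascending order
    return eigenvalues[0]

def select_features(X, k):
    """Greedy kNN-style feature clustering sketch (columns of X are features).

    For every retained feature, find its k most similar features, keep the
    feature whose k-th neighbour dissimilarity is smallest (the most compact
    group), and discard that feature's k neighbours as redundant.
    """
    n_features = X.shape[1]
    remaining = list(range(n_features))
    selected = []

    # Pairwise feature dissimilarities (no search over feature subsets).
    dissim = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(i + 1, n_features):
            d = maximum_information_compression_index(X[:, i], X[:, j])
            dissim[i, j] = dissim[j, i] = d

    while remaining:
        k_eff = min(k, len(remaining) - 1)
        if k_eff == 0:
            selected.extend(remaining)
            break
        sub = dissim[np.ix_(remaining, remaining)]
        # k-th nearest-neighbour dissimilarity per feature
        # (column 0 after sorting is the zero self-distance).
        knn_dist = np.sort(sub, axis=1)[:, k_eff]
        best = int(np.argmin(knn_dist))
        best_feature = remaining[best]
        selected.append(best_feature)
        # Drop the chosen feature's k_eff most similar features as redundant.
        neighbours = np.argsort(sub[best])[1:k_eff + 1]
        to_remove = {remaining[i] for i in neighbours}
        remaining = [f for f in remaining
                     if f != best_feature and f not in to_remove]
    return selected
```

A typical call would be `select_features(X, k=5)` on a data matrix `X` whose rows are samples and whose columns are features; larger `k` removes more features per step and so yields a coarser (multiscale) representation, smaller `k` a finer one.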

(Mitra et al., 2002) ⇒ Pabitra Mitra, C. A. Murthy, and Sankar K. Pal. (2002). "Unsupervised Feature Selection Using Feature Similarity." In: IEEE Transactions on Pattern Analysis and Machine Intelligence. doi:10.1109/34.990133