Distribution Free Decomposition of Multivariate Data

Dorin Comaniciu and Peter Meer

Department of Electrical and Computer Engineering
Rutgers University, Piscataway, NJ 08855, USA

We present a practical approach to nonparametric cluster analysis of large data sets. The number of clusters and the cluster centers are derived automatically by mode seeking with the mean shift procedure on a reduced set of points randomly selected from the data. The cluster boundaries are delineated using a k-nearest neighbor technique. The proposed algorithm is stable and efficient: a 10,000-point data set is decomposed in only a few seconds. Complex clustering examples and applications are discussed, and convergence of the gradient ascent mean shift procedure is demonstrated for arbitrary distribution and cardinality of the data.
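
The three stages named in the abstract (random subsampling, mean shift mode seeking, k-nearest neighbor delineation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the kernel, the window size, or how nearby modes are merged, so the flat kernel, the fixed `bandwidth`, the greedy merge at `bandwidth / 2`, the majority vote, and all function and parameter names below are assumptions made only for illustration.

```python
import numpy as np

def mean_shift_modes(sample, bandwidth, max_iter=100, tol=1e-3):
    """Shift every sample point uphill to a density mode (flat kernel, assumed)."""
    modes = sample.copy()
    for i in range(len(modes)):
        x = modes[i]
        for _ in range(max_iter):
            # Mean of the sample points inside the window centred at x.
            near = sample[np.linalg.norm(sample - x, axis=1) <= bandwidth]
            if len(near) == 0:
                break
            new_x = near.mean(axis=0)
            if np.linalg.norm(new_x - x) < tol:
                break
            x = new_x
        modes[i] = x
    return modes

def group_modes(modes, merge_dist):
    """Greedily merge converged points closer than merge_dist (assumed rule)."""
    centers = []
    labels = np.empty(len(modes), dtype=int)
    for i, m in enumerate(modes):
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) < merge_dist:
                labels[i] = j
                break
        else:
            labels[i] = len(centers)
            centers.append(m)
    return np.array(centers), labels

def knn_assign(data, sample, sample_labels, k):
    """Label each data point by majority vote of its k nearest sample points."""
    dist = np.linalg.norm(data[:, None, :] - sample[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = sample_labels[nearest]
    n_clusters = int(sample_labels.max()) + 1
    return np.array([np.bincount(v, minlength=n_clusters).argmax() for v in votes])

def decompose(data, n_sample=200, bandwidth=1.0, k=5, seed=0):
    """Subsample, find modes on the subsample, then label the full data set."""
    rng = np.random.default_rng(seed)
    size = min(n_sample, len(data))
    sample = data[rng.choice(len(data), size=size, replace=False)]
    modes = mean_shift_modes(sample, bandwidth)
    centers, sample_labels = group_modes(modes, merge_dist=bandwidth / 2)
    labels = knn_assign(data, sample, sample_labels, k)
    return centers, labels
```

Calling `decompose(data, bandwidth=h)` on an `(n, d)` float array returns the estimated cluster centers and one label per data point. The number of clusters is not passed in; it follows from how many distinct modes the subsample converges to, which is the property the abstract emphasizes. The choice of `bandwidth` strongly affects that count and is left to the caller in this sketch.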

Appeared as invited paper in Pattern Analysis and Applications, 2, 22-30, 1999.
Extended version of the paper from 2nd International Workshop on Statistical Techniques in Pattern Recognition, Sydney, Australia, August 1998. Proceedings: Advances in Pattern Recognition. A. Amin, D. Dori, P. Pudil, H. Freeman (Eds), Lecture Notes in Computer Science 1451, Springer, 602-610.