Distribution Free Decomposition of Multivariate Data
Dorin Comaniciu and Peter Meer
Department of Electrical and Computer Engineering
Rutgers University, Piscataway, NJ 08855, USA
We present a practical approach to nonparametric cluster analysis of
large data sets. The number of clusters and the cluster centers are
automatically derived by mode seeking with the mean shift procedure on
a reduced set of points randomly selected from the data. The cluster
boundaries are delineated using a k-nearest neighbor technique. The
proposed algorithm is stable and efficient: a 10,000-point data set
is decomposed in only a few seconds. Complex clustering examples
and applications are discussed, and convergence of the gradient ascent
mean shift procedure is demonstrated for arbitrary distribution and
cardinality of the data.
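
The pipeline summarized in the abstract (random subsampling, mean shift mode seeking, k-nearest neighbor labeling) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a flat (uniform) kernel of fixed radius, a greedy merge of nearby modes, and a majority vote among the k nearest sampled points. All function names and parameters (mean_shift_modes, merge_modes, knn_labels, decompose, radius, n_sample, k) are hypothetical and chosen only for illustration.

```python
import numpy as np

def mean_shift_modes(sample, radius, tol=1e-4, max_iter=200):
    """Shift each sample point uphill until it settles (flat kernel, assumed)."""
    modes = sample.copy()
    for i in range(len(modes)):
        x = modes[i]
        for _ in range(max_iter):
            # Sample points inside the sphere of the given radius around x.
            inside = np.linalg.norm(sample - x, axis=1) <= radius
            if not inside.any():
                break
            new_x = sample[inside].mean(axis=0)  # local mean
            if np.linalg.norm(new_x - x) < tol:  # mean shift vector ~ 0
                break
            x = new_x
        modes[i] = x
    return modes

def merge_modes(modes, radius):
    """Greedily group converged points closer than radius (an assumption)."""
    centers = []
    labels = np.empty(len(modes), dtype=int)
    for i, m in enumerate(modes):
        for j, c in enumerate(centers):
            if np.linalg.norm(m - c) < radius:
                labels[i] = j
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels

def knn_labels(data, sample, sample_labels, k):
    """Label every data point by majority vote of its k nearest sampled points."""
    dist = np.linalg.norm(data[:, None, :] - sample[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = sample_labels[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])

def decompose(data, n_sample, radius, k, seed=0):
    """Subsample, find modes on the subsample, then label the full data set."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(data), size=n_sample, replace=False)
    sample = data[idx]
    modes = mean_shift_modes(sample, radius)
    centers, sample_labels = merge_modes(modes, radius)
    return centers, knn_labels(data, sample, sample_labels, k)

# Usage on synthetic data: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(500, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(500, 2))
data = np.vstack([blob_a, blob_b])
centers, labels = decompose(data, n_sample=100, radius=2.0, k=5)
print(len(centers), "cluster centers found")
```

In this sketch the number of clusters is not supplied in advance; it falls out of how many distinct modes survive the merge step. The radius plays the role of the analysis resolution, and the values used here are arbitrary choices for the synthetic data.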
Appeared as invited paper in
Pattern Analysis and Applications, 2, 22-30, 1999.
Extended version of the paper from
2nd International Workshop on Statistical Techniques
in Pattern Recognition, Sydney, Australia, August 1998.
Proceedings:
Advances in Pattern Recognition.
A. Amin, D. Dori, P. Pudil, H. Freeman (Eds),
Lecture Notes in Computer Science 1451, Springer, 602-610.