Projection based Robust Estimators for Computer Vision
Ph.D. Thesis Haifeng Chen
Abstract
Robust regression is the generic name for techniques which estimate
regression models in the presence of a significant number
of data points not belonging to that model, i.e., outliers.
In most computer vision applications, where the data is only rarely
homogeneous, tolerance to outliers is a necessary condition for
satisfactory performance.
Often multiple structures are present in the data, for example, when
analyzing a dynamic scene taken by a moving camera. Relative to the
structure of interest, the inliers of the other structures are called
``structured outliers''. The purpose of this thesis is two-fold.
First, we analyze the conditions necessary to achieve robust behavior
when solving computer vision tasks. Second, we develop a new family
of robust regression techniques which exhibit performance
superior to that of current methods.
The theory of robust estimators was developed in statistics over the
last thirty years, and its two main classes of methods,
M-estimators and least median of squares (LMedS), were successfully
applied to many vision problems. However, some of the ``most'' robust
techniques used in the vision community,
the Hough transform and RANSAC, originated within that community and
were developed independently to meet
the specific needs of processing visual data.
We show that all four classes of robust estimation techniques can be
regarded as particular cases of M-estimators with auxiliary scale.
The robust techniques imported from statistics
differ from those developed by the vision community
in the way the scale is obtained.
For the former the scale is estimated from the data,
while for the latter its value is set a priori.
In most cases, the success of a robust method depends
on the reliability of the scale.
Whenever the employed scale is not correct,
the robust regression may fail to recover the inlier structure.
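The role of the scale can be illustrated on the simplest regression
problem, estimating a location. The sketch below is not taken from the
thesis: the function names, the toy data, the grid search and the
choice of a zero-one loss and a biweight loss are all assumptions made
for illustration. It minimizes the same M-estimator criterion three
times, once with a scale fixed a priori, once with a scale estimated
from the data (normalized median absolute deviation), and once with a
deliberately wrong scale.

```python
from statistics import median

def zero_one_loss(u):
    """Loss that only counts points outside the band |u| <= 1."""
    return 0.0 if abs(u) <= 1.0 else 1.0

def biweight_loss(u):
    """Bounded, smooth loss with the shape of the Tukey biweight."""
    return 1.0 - (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 1.0

def mad_scale(data):
    """Scale estimated from the data: normalized median absolute deviation."""
    med = median(data)
    return 1.4826 * median(abs(x - med) for x in data)

def m_estimate_location(data, scale, loss, step=0.1):
    """Minimize sum(loss((x - c) / scale)) over a grid of candidates c.

    The grid search is an illustrative stand-in for a real optimizer.
    """
    lo, hi = min(data), max(data)
    count = int(round((hi - lo) / step)) + 1
    candidates = [round(lo + step * k, 6) for k in range(count)]
    return min(candidates,
               key=lambda c: sum(loss((x - c) / scale) for x in data))

# Hypothetical toy data: five inliers near 10, four outliers near 30.
inliers = [9.8, 9.9, 10.0, 10.1, 10.2]
outliers = [30.0, 31.0, 32.0, 33.0]
data = inliers + outliers

# Scale set a priori, with a zero-one loss.
est_fixed = m_estimate_location(data, 0.25, zero_one_loss)
# Scale estimated from the data, with a smooth bounded loss.
est_data = m_estimate_location(data, mad_scale(data), biweight_loss)
# Same loss, but with a scale that is far too large.
est_wrong = m_estimate_location(data, 100.0, biweight_loss)

print(est_fixed, est_data, est_wrong)
```

With a reasonable scale both variants land on the inlier cluster. With
the oversized scale the bounded loss behaves almost like least squares
on this data, and the estimate is pulled far away from the inliers.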
One main contribution of the thesis is a new family of robust
regression techniques whose performance depends much less
on the accuracy of additional information such as the scale. We
reinterpret M-estimation within the projection pursuit
framework and show how the need for a reliable scale estimate
can be avoided. Projection pursuit seeks ``interesting''
low-dimensional projections of multidimensional data.
The informative value of a projection is measured with a projection
index,
which is a scalar function of the probability distribution
estimated from the projected data.
The ``best'' projection corresponds to an extremum of the projection
index.
To obtain this extremum, the probabilistic sampling technique customary
in some robust regression methods is replaced with a simplex based
multidimensional direct search. The technique, called the projection
based M-estimator (pbM-estimator), provides
a satisfactory inlier/outlier dichotomy for a wider range of
contaminations than the traditional M-estimators.
The pbM-estimator has been tested in several experiments with both
synthetic and real data, and gave satisfactory results in spite of
not using any prior knowledge about the data.
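The projection pursuit idea can be sketched for fitting a line to 2D
points. The code below is a minimal illustration under stated
assumptions, not the pbM-estimator itself: the projection index is
taken to be the height of the highest mode of a kernel density
estimate of the projected data, the bandwidth is derived from the
median absolute deviation of the projections, and an exhaustive search
over the angle replaces the simplex based direct search because the
toy problem has a single free parameter. All names and the data are
hypothetical.

```python
import math
from statistics import median

def projection_index(points, angle):
    """Return (index, mode) for the projection onto (cos angle, sin angle).

    The index is the maximum of an Epanechnikov kernel density estimate
    of the projected points; the mode is the projection where it occurs.
    """
    c, s = math.cos(angle), math.sin(angle)
    z = [c * x + s * y for x, y in points]
    n = len(z)
    med = median(z)
    mad = median(abs(v - med) for v in z)
    h = max(n ** (-0.2) * mad, 1e-6)  # assumed bandwidth rule, floored

    def density(t):
        total = 0.0
        for v in z:
            u = (t - v) / h
            if abs(u) < 1.0:
                total += 0.75 * (1.0 - u * u)
        return total / (n * h)

    mode = max(z, key=density)  # evaluate the density at the data only
    return density(mode), mode

def fit_line(points, steps=180):
    """Search the angle grid for the direction with the largest index."""
    best_index, best_angle, best_alpha = -1.0, 0.0, 0.0
    for k in range(steps):
        angle = math.pi * k / steps
        index, mode = projection_index(points, angle)
        if index > best_index:
            best_index, best_angle, best_alpha = index, angle, mode
    return best_angle, best_alpha

# Hypothetical data: ten noisy points on y = x + 1 and eight outliers.
line_points = [(float(x), x + 1.0 + (0.05 if x % 2 else -0.05))
               for x in range(10)]
stray_points = [(1.0, 8.0), (2.0, -3.0), (3.0, 9.0), (5.0, 0.0),
                (6.0, 12.0), (7.0, 1.0), (8.0, -2.0), (9.0, 4.0)]

angle, alpha = fit_line(line_points + stray_points)
# The recovered line is cos(angle) * x + sin(angle) * y = alpha.
print(math.degrees(angle), alpha)
```

No scale is supplied by the user in this sketch: the bandwidth is
recomputed from the projected data for every candidate direction.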
A new robust information fusion method is developed based on a
nonparametric method for detecting the modes of a multivariate
probability distribution, the mean shift procedure.
The method recovers from data with uncertainties the number and
characteristics of the sources which generated the data.
The uncertainties of the measurements are encoded as the bandwidths of
the mean shift, which yields much more robust behavior.
The robust fusion technique is successfully applied to solve the
model-based object recognition problem.
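A one-dimensional sketch may help to picture the fusion step. It
assumes a common form of variable-bandwidth mean shift with Gaussian
kernels, in which each measurement carries its own standard deviation
as bandwidth; the exact formulation used in the thesis may differ, and
the function names, tolerances and data below are hypothetical.

```python
import math

def mean_shift_mode(start, measurements, tol=1e-9, max_iter=500):
    """Climb to a mode of the density sum_i N(x; m_i, sigma_i^2).

    Each measurement is a (value, sigma) pair, and sigma acts as the
    bandwidth of that measurement.
    """
    x = start
    for _ in range(max_iter):
        num = 0.0
        den = 0.0
        for m, sigma in measurements:
            w = math.exp(-0.5 * ((x - m) / sigma) ** 2) / sigma
            num += w * m / sigma ** 2
            den += w / sigma ** 2
        new_x = num / den
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x

def fuse(measurements, merge_tol=0.05):
    """Run mean shift from every measurement and merge nearby modes."""
    modes = sorted(mean_shift_mode(m, measurements)
                   for m, _ in measurements)
    groups = [[modes[0]]]
    for v in modes[1:]:
        if v - groups[-1][-1] <= merge_tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]

# Hypothetical measurements (value, standard deviation) from two sources.
measurements = [(-0.1, 0.2), (0.0, 0.1), (0.1, 0.2),
                (4.9, 0.2), (5.0, 0.1), (5.1, 0.2)]

sources = fuse(measurements)
print(len(sources), sources)
```

The number of sources is not given in advance: it is the number of
distinct modes reached, and more certain measurements (smaller sigma)
pull the mode location more strongly.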
Robust regression in the presence of multiple structures is obtained
when the robust fusion module is integrated
with the pbM-estimator. A number of data subsets are randomly selected
based on the density of the data distribution and a probabilistic
region growing procedure. The pbM-estimator is applied to each subset
to extract the inliers, from which the model parameters and their
covariance matrix are estimated.
By performing the robust information fusion, the parameters of each
structure are extracted and the
measurements are segmented.
Several computer vision applications are provided to show the
effectiveness of our algorithm.
The thesis contains 150 pages.