540:691 SEMINAR IN INDUSTRIAL & SYSTEMS ENGINEERING

 

Challenge of Dimensionality in
Model Selection and Classification

Dr. Jianqing Fan
Princeton University

Abstract:

Model selection and classification using high-dimensional features arise frequently in contemporary statistical studies, such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. We first demonstrate that, even for the independence classification rule, classification using all the features can be as poor as random guessing because of noise accumulation in estimating population centroids in a high-dimensional feature space. In fact, we further demonstrate that almost all linear discriminants can perform as poorly as random guessing. It is therefore of paramount importance to select a subset of important features for high-dimensional classification, which leads to the Features Annealed Independence Rules (FAIR). The connections with the sure independence screening (SIS) and iterative SIS (ISIS) of Fan and Lv (2007) in model selection will be elucidated and extended. The choice of the optimal number of features, or equivalently the threshold value of the test statistics, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and convincingly demonstrate the advantage of our new classification procedure.


TUESDAY, February 19, 2008
SEMINAR 5:00 - 6:00 pm
CoRE – Lecture Hall


*Refreshments will be served in the IE lounge area at 4:30 pm, prior to the seminar.

Speaker is hosted by Dr. Hoang Pham

Tel: 732-445-5471, Email: hopham@rci.rutgers.edu


 





 


