Classification on a Dimension Reduced Subspace
SCRUCCA, Luca
2011
Abstract
We consider the problem of classifying a future observation of a categorical response variable Y at a given value of a p-dimensional vector X of predictors. This is the typical supervised learning scenario. Sufficient dimension reduction (SDR) methods aim to replace X with a lower-dimensional function R(X) with no loss of information. In regression or classification problems we seek a reduced feature set R(X) such that Y is independent of X given R(X). In the classification context, this conditional independence statement implies that no discriminatory information is lost if classifiers are restricted to R(X). Among the several methods proposed for SDR, one of the most popular is sliced inverse regression (SIR; Li, 1991). SIR is a moment-based SDR method which gains insight into the SDR subspace of a regression through the first inverse conditional moment, E(X | Y). Since SIR does not assume any distribution for the predictors, either marginally or conditionally on the response, no straightforward method is available for prediction or classification. An extension of SIR that uses finite mixtures of Gaussian densities to approximate the conditional distribution of the predictors has recently been proposed by Scrucca (2011). In this contribution we investigate its behaviour for classification purposes. We find that the proposed approach has two main advantages: (i) observations can be graphically represented on the estimated reduced projection subspace; (ii) the dimension of such a subspace is naturally selected by minimizing an estimate of the classification error.
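
A brief formalization may help fix notation. The following is standard SDR/SIR background consistent with the abstract, not a quotation from the paper. The reduction sought is a linear projection R(x) = B'x satisfying

    Y \perp\!\!\!\perp X \mid B^{\top} X, \qquad B \in \mathbb{R}^{p \times d},\ d \le p,

and SIR estimates span(B) from the first inverse conditional moment E(X | Y) by solving the generalized eigenproblem

    M \beta_j = \lambda_j \Sigma \beta_j, \qquad
    M = \operatorname{Cov}\{\operatorname{E}(X \mid Y)\}, \qquad
    \Sigma = \operatorname{Cov}(X),

retaining the d leading eigenvectors. For a categorical Y with K classes, at most min(K - 1, p) eigenvalues are nonzero, which bounds the useful dimension of the reduced subspace.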
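
To make the pipeline concrete, below is a minimal, illustrative Python/NumPy sketch of SIR with a categorical response, where the classes play the role of the slices. This is not the author's implementation (the paper's proposal additionally models the predictors within each class by a finite mixture of Gaussians); the function name sir_directions and the synthetic demo are ours.

    import numpy as np

    def sir_directions(X, y):
        """SIR for a categorical response: the classes act as the slices.

        Returns eigenvalues and basis vectors of the estimated SDR
        subspace, solving M b = lambda Sigma b with M = Cov(E[X | Y])
        via whitening.
        """
        n, p = X.shape
        Xc = X - X.mean(axis=0)
        Sigma = Xc.T @ Xc / n                    # marginal covariance
        M = np.zeros((p, p))                     # between-class covariance
        for g in np.unique(y):
            idx = y == g
            m = Xc[idx].mean(axis=0)             # centered class mean
            M += idx.mean() * np.outer(m, m)     # weighted by class share
        d, V = np.linalg.eigh(Sigma)
        W = V @ np.diag(d ** -0.5) @ V.T         # Sigma^{-1/2}
        evals, U = np.linalg.eigh(W @ M @ W)     # symmetric eigenproblem
        order = np.argsort(evals)[::-1]
        return evals[order], W @ U[:, order]     # back-transform directions

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Two classes separated along one direction in 5-dimensional space
        X0 = rng.normal(0.0, 1.0, size=(100, 5))
        X1 = rng.normal(0.0, 1.0, size=(100, 5)) + np.array([2, 0, 0, 0, 0])
        X = np.vstack([X0, X1])
        y = np.repeat([0, 1], 100)
        evals, B = sir_directions(X, y)
        print(evals[:2])   # with two classes, one dominant eigenvalue is expected

On the projected data Z = X B_d, classification can then proceed by fitting a class-conditional density estimate (per the abstract, finite mixtures of Gaussians) and applying Bayes' rule; the dimension d is chosen to minimize an estimate of the classification error, for example by cross-validation.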