Dimension reduction for model-based clustering via mixtures of multivariate t-distributions
SCRUCCA, Luca
2013
Abstract
We introduce a dimension reduction method for model-based clustering obtained from a finite mixture of t-distributions. This approach is based on existing work on reducing dimensionality in the case of finite Gaussian mixtures. The method relies on identifying a reduced subspace of the data by considering the extent to which group means and group covariances vary. This subspace contains linear combinations of the original data, which are ordered by importance via the associated eigenvalues. Observations can be projected onto the subspace and the resulting set of variables captures most of the clustering structure available in the data. The approach is illustrated using simulated and real data, where it outperforms its Gaussian analogue.
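The idea of ordering directions by how much group means and group covariances vary can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes hard cluster labels and sample moments in place of parameters estimated from a fitted t-mixture, and it assumes a kernel matrix of the form used in the Gaussian-mixture setting (variation in means plus variation in covariances, each scaled by the marginal covariance). The function name `clustering_subspace` and the toy data are hypothetical.

```python
import numpy as np


def clustering_subspace(X, labels):
    """Directions ordered by variation in group means and group covariances.

    Assumptions (not taken from the paper): hard labels and sample moments
    replace fitted mixture parameters, and the kernel below is an assumed form.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]

    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False, bias=True)  # marginal covariance

    props, means, covs = [], [], []
    for g in np.unique(labels):
        Xg = X[labels == g]
        props.append(len(Xg) / n)
        means.append(Xg.mean(axis=0))
        covs.append(np.cov(Xg, rowvar=False, bias=True))

    # Between-group covariance and pooled within-group covariance
    Sigma_B = sum(pi * np.outer(m - mu, m - mu) for pi, m in zip(props, means))
    Sigma_W = sum(pi * S for pi, S in zip(props, covs))

    # Assumed kernel: variation in means plus variation in covariances
    Sigma_inv = np.linalg.inv(Sigma)
    M = Sigma_B @ Sigma_inv @ Sigma_B
    for pi, S in zip(props, covs):
        D = S - Sigma_W
        M = M + pi * (D @ Sigma_inv @ D)

    # Generalized eigenproblem M v = l Sigma v, solved through Sigma = L L^T
    L = np.linalg.cholesky(Sigma)
    L_inv = np.linalg.inv(L)
    C = L_inv @ M @ L_inv.T
    evals, U = np.linalg.eigh((C + C.T) / 2)  # symmetrize against round-off

    order = np.argsort(evals)[::-1]  # most important direction first
    evals = evals[order]
    V = L_inv.T @ U[:, order]
    V = V / np.linalg.norm(V, axis=0)  # unit-length basis vectors
    return evals, V


# Toy data: two groups in five dimensions, separated along the first coordinate
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
labels = np.repeat([0, 1], 200)
X[labels == 1, 0] += 6.0

evals, V = clustering_subspace(X, labels)
Z = (X - X.mean(axis=0)) @ V[:, :2]  # observations projected onto two directions
```

In this toy example the leading direction should align closely with the first coordinate, since that is the only coordinate along which the group means differ. In practice the labels and moments would come from the fitted mixture model rather than being known in advance.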