Visualization of model-based clustering structures
SCRUCCA, Luca
2007
Abstract
We consider the problem of dimension reduction for model-based clustering. Continuous data are often assumed to be generated by a finite mixture of multivariate normal components. The proposed method reduces dimensionality by identifying a set of linear combinations of the original features. The estimated directions depend on the fitted mixture model and are ordered by importance, as quantified by the associated eigenvalues. Observations may then be projected onto this reduced subspace, yielding summary plots that visualize the clustering structure. These plots can be particularly appealing for high-dimensional data with noisy structure.
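
The abstract does not spell out how the directions are constructed, so the following is only a minimal sketch of the general idea, not the proposed method itself. It assumes a simplified kernel built from the variation of the component means alone, solves a generalized eigenproblem against the marginal covariance, and projects the data onto the leading directions. The function name `mixture_directions`, the simulated data, and the use of known labels as a stand-in for a fitted mixture model are all illustrative assumptions.

```python
import numpy as np


def mixture_directions(X, weights, means):
    """Hypothetical helper: directions solving M v = lambda * S v.

    M is the weighted between-component covariance of the mixture means and
    S is the marginal sample covariance. Using mean variation only is a
    simplifying assumption made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)

    S = np.cov(X, rowvar=False)            # marginal covariance, p x p
    centre = weights @ means               # overall mixture mean
    D = means - centre                     # centred component means, K x p
    M = (D * weights[:, None]).T @ D       # between-component kernel, p x p

    # Whiten with S^{-1/2} so the generalized problem becomes a symmetric one.
    s_vals, s_vecs = np.linalg.eigh(S)
    S_inv_sqrt = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    evals, evecs = np.linalg.eigh(S_inv_sqrt @ M @ S_inv_sqrt)

    order = np.argsort(evals)[::-1]        # largest eigenvalue first
    evals = evals[order]
    basis = S_inv_sqrt @ evecs[:, order]   # map back to the original features
    basis /= np.linalg.norm(basis, axis=0) # unit-length directions
    return evals, basis


# Simulated example: three spherical normal components in 10 dimensions,
# separated only along the first two features.
rng = np.random.default_rng(0)
p, n_per = 10, 200
true_means = np.zeros((3, p))
true_means[1, 0] = 4.0
true_means[2, 1] = 4.0
labels = np.repeat(np.arange(3), n_per)
X = true_means[labels] + rng.normal(size=(3 * n_per, p))

# Stand-in for a fitted mixture model: proportions and means computed from
# the simulated labels (an assumption; in practice they come from the fit).
weights = np.bincount(labels) / labels.size
means = np.vstack([X[labels == k].mean(axis=0) for k in range(3)])

evals, basis = mixture_directions(X, weights, means)

# Project the centred observations onto the two leading directions.
Z = (X - X.mean(axis=0)) @ basis[:, :2]
```

A scatter plot of the two columns of `Z`, colored by cluster, would give the kind of summary plot the abstract describes. With three components, the mean-based kernel in this sketch has rank at most two, so only the first two eigenvalues are meaningfully above zero.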