Visualization of model-based clustering structures

Scrucca, Luca

We consider the problem of dimension reduction for model-based clustering. Continuous data are often assumed to be generated by a finite mixture of multivariate normal components. The proposed method reduces dimensionality by identifying a set of linear combinations of the original features. The estimated directions depend on the fitted mixture model and are ordered by importance, as quantified by the associated eigenvalues. Observations may then be projected onto the reduced subspace, providing summary plots that visualize the clustering structure. These plots can be particularly appealing for high-dimensional data and noisy structure.
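
To illustrate the idea of eigenvalue-ordered directions followed by projection, the following is a minimal Python sketch under stated assumptions, not the method proposed here. It assumes that component memberships from an already fitted mixture model are available as hard labels, and it uses only the variation among component means, solving a generalized eigenproblem against the marginal covariance. The names `mixture_directions` and `project` are hypothetical, and the criterion used by the proposed method may differ (for instance, it may also account for differences in component covariances).

```python
import numpy as np


def mixture_directions(X, labels):
    """Hypothetical helper: directions ordered by eigenvalue (largest first).

    Assumption: `labels` are hard component memberships taken from a
    mixture model that has already been fitted elsewhere.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, p = X.shape
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False, bias=True)  # marginal covariance

    # Between-component covariance of the means, weighted by the
    # empirical mixing proportions.
    between = np.zeros((p, p))
    for k in np.unique(labels):
        Xk = X[labels == k]
        d = Xk.mean(axis=0) - mu
        between += (len(Xk) / n) * np.outer(d, d)

    # Solve between @ v = lambda * sigma @ v by whitening with sigma^(-1/2).
    w, U = np.linalg.eigh(sigma)
    whiten = U @ np.diag(1.0 / np.sqrt(w)) @ U.T
    evals, evecs = np.linalg.eigh(whiten @ between @ whiten)

    order = np.argsort(evals)[::-1]           # most important direction first
    evals = evals[order]
    V = whiten @ evecs[:, order]              # back to the original feature scale
    V /= np.linalg.norm(V, axis=0)            # unit-length directions
    return evals, V


def project(X, V, d=2):
    """Project centered observations onto the first d directions."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) @ V[:, :d]


if __name__ == "__main__":
    # Synthetic example: two components separated along the first feature.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(size=(200, 5)),
                   rng.normal(size=(200, 5)) + [6, 0, 0, 0, 0]])
    labels = np.repeat([0, 1], 200)
    evals, V = mixture_directions(X, labels)
    Z = project(X, V, d=2)  # coordinates for a 2-D summary plot
    print(evals.round(3))
```

With G components the between-means matrix has rank at most G - 1, so this simplified criterion yields at most G - 1 informative directions; the remaining eigenvalues are numerically zero.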