Genetic algorithms for subset selection in model-based clustering

SCRUCCA, Luca
2010

Abstract

Model-based clustering assumes that the observed data can be represented by a finite mixture model, in which each cluster corresponds to a parametric distribution. For multivariate continuous data, the Gaussian distribution is often employed. Identifying the subset of relevant clustering variables reduces the number of unknown parameters, thus yielding more efficient estimation, clearer interpretation and, often, better clustering partitions. This paper discusses variable (feature) selection for model-based clustering. The problem of subset selection is recast as a model comparison problem, and the Bayesian information criterion (BIC) is used to approximate Bayes factors. The potentially vast solution space is searched with genetic algorithms, stochastic search algorithms that use techniques and concepts inspired by evolutionary biology and natural selection.
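The search described in the abstract can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes each candidate subset is encoded as a binary inclusion mask and evolved with tournament selection, single-point crossover, bit-flip mutation and elitism; these operators, the function name `ga_subset_search` and all parameter values are illustrative assumptions. The fitness `toy_fitness` is a hypothetical stand-in: in the setting of the abstract it would be a BIC-based model comparison criterion computed from fitted mixture models.

```python
import random


def ga_subset_search(n_vars, fitness, pop_size=30, generations=40,
                     crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search binary inclusion masks of length n_vars, maximising fitness(mask)."""
    rng = random.Random(seed)
    # Random initial population of inclusion masks (1 = variable is selected).
    pop = [tuple(rng.randint(0, 1) for _ in range(n_vars))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Binary tournament: draw two distinct individuals, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if n_vars > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_vars - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            # Bit-flip mutation: each bit flips with probability mutation_rate.
            child = tuple(1 - g if rng.random() < mutation_rate else g
                          for g in child)
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)


# Hypothetical stand-in for a BIC-based criterion: variables 0, 2 and 5 are
# treated as "relevant"; including any other variable is penalised.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    if not any(mask):
        return float("-inf")  # an empty subset cannot be clustered
    return sum(10.0 if j in RELEVANT else -3.0
               for j, g in enumerate(mask) if g)


best_mask, best_score = ga_subset_search(8, toy_fitness)
print(best_mask, best_score)
```

Because the best mask is carried into each new generation, the best fitness never decreases across generations. With a real criterion, each fitness evaluation would require fitting mixture models, so caching the fitness of masks already visited would be a natural refinement.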
ISBN: 978-88-6129-566-7

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/172893