On the implementation of a parallel algorithm for variable selection in model-based clustering

SCRUCCA, Luca
2013

Abstract

Variable selection in model-based clustering is often used to improve cluster identification. However, the available algorithms must search a large space of candidate variable subsets and can therefore be time-consuming. Following the recent surge of interest in distributed processing, this contribution discusses the implementation of a parallel algorithm for variable selection in R. A simulation study was conducted to assess the performance of the proposed parallel algorithm. The results show that the speedup obtained follows Amdahl's law, which gives the speedup achievable when using multiple processors.
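
For readers unfamiliar with Amdahl's law, the following is a minimal sketch of the standard formula, S(p) = 1 / ((1 - f) + f / p), where f is the fraction of the run time that can be parallelized and p is the number of processors. The function name and the value f = 0.9 are assumptions chosen for illustration only; they are not taken from the paper or its simulation study.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: int) -> float:
    """Theoretical speedup under Amdahl's law.

    parallel_fraction: share of the serial run time that can be parallelized (0..1).
    n_processors: number of processors used (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    # Hypothetical parallel fraction, for illustration only.
    f = 0.9
    for p in (1, 2, 4, 8, 16):
        print(f"{p:2d} processors: speedup {amdahl_speedup(f, p):.2f}")
```

Because the serial part of the work is not reduced by adding processors, the speedup is bounded above by 1 / (1 - f), so the gain from each additional processor diminishes.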
ISBN: 9788867871179

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1155893