On the implementation of a parallel algorithm for variable selection in model-based clustering
SCRUCCA, Luca
2013
Abstract
Variable selection in model-based clustering is often used to improve cluster identification. However, the available algorithms must operate on a large search space and can therefore be time-consuming. Following the recent surge of interest in distributed processing, in this contribution we discuss the implementation of a parallel algorithm for variable selection in R. We conducted a simulation study to assess the performance of the proposed parallel variable selection algorithm. The results show that the speedup obtained follows the well-known Amdahl's Law for the speedup achievable when using multiple processors.
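For reference, Amdahl's Law is commonly stated as below. The symbols are notation assumed here for illustration and do not come from the abstract: f is the fraction of the running time that can be parallelized, p is the number of processors, and S(p) is the resulting speedup.

\[
S(p) = \frac{1}{(1 - f) + \dfrac{f}{p}}, \qquad \lim_{p \to \infty} S(p) = \frac{1}{1 - f}
\]

As a purely hypothetical illustration (not a result from the paper), with f = 0.9 and p = 4 the formula gives S(4) = 1 / (0.1 + 0.225) ≈ 3.08, and no number of processors could push the speedup beyond 1 / 0.1 = 10.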