On the implementation of a parallel algorithm for variable selection in model-based clustering

Scrucca, Luca

Variable selection in model-based clustering is often used to improve cluster identification. However, available algorithms must explore a large search space and can therefore be time-consuming. Following the recent surge of interest in distributed processing, in this contribution we discuss the implementation in R of a parallel algorithm for variable selection. We conducted a simulation study to assess the performance of the proposed parallel variable selection algorithm. The results show that the observed speedup follows the well-known Amdahl's Law for the speedup achievable when using multiple processors.
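
For reference, Amdahl's Law can be written as below. The notation is an assumption made here for illustration and is not taken from the paper: f denotes the fraction of the running time that can be parallelized and p the number of processors. The numerical values in the example are likewise purely illustrative and are not results from the simulation study.

```latex
% Amdahl's Law: speedup S(p) with p processors when a fraction f
% of the work is parallelizable (f and p are illustrative symbols).
\[
  S(p) = \frac{1}{(1 - f) + \dfrac{f}{p}},
  \qquad
  \lim_{p \to \infty} S(p) = \frac{1}{1 - f}.
\]
% Illustrative example: f = 0.9 and p = 4 give
\[
  S(4) = \frac{1}{0.1 + 0.9/4} = \frac{1}{0.325} \approx 3.08,
\]
% while the upper bound for f = 0.9 is 1 / (1 - 0.9) = 10.
```

Under this form the serial fraction 1 - f bounds the achievable speedup, so adding processors yields diminishing returns.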