A transformation-based approach to Gaussian mixture density estimation for bounded data

Scrucca L.
2019

Abstract

Finite mixtures of Gaussian distributions provide a flexible semiparametric methodology for density estimation when the continuous variables under investigation are unbounded. However, in practical applications, variables may be partially bounded (e.g., taking nonnegative values) or completely bounded (e.g., taking values in the unit interval). In these cases, the standard Gaussian finite mixture model assigns nonzero density to every real value, including values outside the range where the variables are defined, which can result in severe bias. In this paper, we propose a transformation-based approach to Gaussian mixture modelling for bounded variables. The basic idea is to carry out density estimation not on the original data but on appropriately transformed data. The density for the original data is then obtained by a change of variables. The transformation parameters and the parameters of the Gaussian mixture are jointly estimated by the expectation-maximization (EM) algorithm. The methodology for partially and completely bounded data is illustrated using both simulated and real data.
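The change-of-variables idea can be illustrated with a minimal sketch. This is not the paper's method: it assumes a fixed log transformation for nonnegative data, whereas the paper estimates the transformation parameters jointly with the mixture. All function and variable names here (fit_gmm_1d, density_nonneg, and so on) are hypothetical, and the data are simulated only for illustration. With y = log(x), the density on the original scale is f_X(x) = f_Y(log x) / x for x > 0 and zero otherwise.

```python
import math
import random

def normal_pdf(y, mean, sd):
    """Density of a univariate Gaussian at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_gmm_1d(ys, k=2, iters=200):
    """Plain EM for a univariate Gaussian mixture with k components."""
    n = len(ys)
    ys_sorted = sorted(ys)
    # Initialise means at spread-out quantiles, common sd, equal weights.
    means = [ys_sorted[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(ys) / n
    sd0 = math.sqrt(sum((y - overall_mean) ** 2 for y in ys) / n)
    sds = [sd0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for y in ys:
            dens = [w * normal_pdf(y, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted updates of weights, means and standard deviations.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * y for r, y in zip(resp, ys)) / nj
            var = sum(r[j] * (y - means[j]) ** 2
                      for r, y in zip(resp, ys)) / nj
            sds[j] = max(math.sqrt(var), 1e-6)
    return weights, means, sds

def density_nonneg(x, weights, means, sds):
    """Density on the original scale via the change of variables y = log(x)."""
    if x <= 0.0:
        return 0.0  # no mass outside the support, unlike an untransformed GMM
    y = math.log(x)
    f_y = sum(w * normal_pdf(y, m, s)
              for w, m, s in zip(weights, means, sds))
    return f_y / x  # Jacobian of the log transform: |dy/dx| = 1/x

# Simulated nonnegative data (illustrative only).
random.seed(0)
xs = ([random.lognormvariate(0.0, 0.5) for _ in range(300)]
      + [random.lognormvariate(2.0, 0.3) for _ in range(200)])

# Fit the mixture on the transformed (log) scale, then evaluate on the
# original scale.
weights, means, sds = fit_gmm_1d([math.log(x) for x in xs], k=2)
print(density_nonneg(1.0, weights, means, sds))
print(density_nonneg(-1.0, weights, means, sds))
```

Because the mixture is fitted on the log scale, the resulting density is exactly zero for x <= 0 and still integrates to one over the positive half-line. A completely bounded variable would need a different transformation (for instance a logit-type map on the unit interval), which this sketch does not cover.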


Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1452519