Nonparametric regression in inference for finite populations

RANALLI, Maria Giovanna
2011

Abstract

In recent years, the growing availability of auxiliary information in inference for finite populations has motivated the development of estimators that incorporate such information as alternatives to the unbiased Horvitz-Thompson estimator. In a design-based estimation framework, such estimators aim to increase precision at the expense of introducing a small bias. In the past decades, two main approaches have been used to this end: generalized regression (GREG) estimation and calibration (CAL) estimation. Each approach provides a whole class of estimators; the two classes share some similarities but can also be quite different. The former is explicitly assisted by a superpopulation model. Estimators that belong to the latter, on the other hand, are developed without any explicit reference to an assisting model, although it is well known that linear models can be associated with some of them. In this set of talks we will first review the extension of the classical GREG and CAL estimators to a wider set of assisting parametric models (non-linear, generalized linear and mixed models). We will then look closely at estimators in these two frameworks that are assisted by, or associated with, nonparametric regression models. These are models in which the relationship between the variable of interest and one or more auxiliary variables does not have a pre-specified parametric form, but is left unspecified and learnt from the data. In particular, we will consider GREG and CAL estimators that use the following nonparametric regression techniques to approximate such relationships: kernel and local polynomial regression, generalized additive models, penalized splines, generalized additive mixed models, and neural networks. Most, but not all, of the estimators considered require complete auxiliary information (i.e., the values of a set of auxiliary variables have to be known for each unit in the population).
Nonparametric regression techniques have also been introduced in those areas of survey methodology in which statistical models play an important role. We will therefore review the use of nonparametric regression models for the treatment of nonresponse and measurement error and, in a model-dependent framework, for small area estimation. The methods discussed will be illustrated on data from an environmental survey of lakes in the North-Eastern US conducted by the EPA, from the Labor Force Survey conducted by ISTAT, and from the Italian Survey of Households’ Income and Wealth conducted by the Bank of Italy.
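
As a rough illustration of the model-assisted idea described in the abstract, the following Python sketch compares the Horvitz-Thompson estimator of a total with two estimators in difference (GREG) form: one assisted by a linear model and one by a local-constant (Nadaraya-Watson) kernel smoother, the simplest member of the kernel and local polynomial family. It is not taken from the talks. The function names, the simulated population, the simple random sampling design and the bandwidth `h=0.1` are all assumptions made for the example, and a single auxiliary variable is assumed to be known for every population unit.

```python
import math
import random


def ht_total(y_s, pi_s):
    """Horvitz-Thompson estimator of a total: sample values weighted by 1/pi."""
    return sum(y / p for y, p in zip(y_s, pi_s))


def fit_linear(x_s, y_s, w_s):
    """Design-weighted least squares fit of y = a + b*x; returns a predictor."""
    sw = sum(w_s)
    x_bar = sum(w * x for w, x in zip(w_s, x_s)) / sw
    y_bar = sum(w * y for w, y in zip(w_s, y_s)) / sw
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(w_s, x_s))
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(w_s, x_s, y_s))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def fit_kernel(x_s, y_s, w_s, h):
    """Design-weighted local-constant smoother with a Gaussian kernel."""
    def predict(x):
        k = [w * math.exp(-0.5 * ((x - xi) / h) ** 2) for xi, w in zip(x_s, w_s)]
        return sum(ki * yi for ki, yi in zip(k, y_s)) / sum(k)
    return predict


def model_assisted_total(predict, x_pop, x_s, y_s, pi_s):
    """Predictions summed over the population plus HT-weighted sample residuals."""
    synthetic = sum(predict(x) for x in x_pop)
    correction = sum((y - predict(x)) / p for x, y, p in zip(x_s, y_s, pi_s))
    return synthetic + correction


# Simulated population with a non-linear mean function (illustrative only).
rng = random.Random(2011)
N, n = 1000, 100
x_pop = [rng.random() for _ in range(N)]
y_pop = [10 + 5 * math.sin(2 * math.pi * x) + rng.gauss(0, 1) for x in x_pop]

# Simple random sampling without replacement: inclusion probability n/N.
idx = rng.sample(range(N), n)
x_s = [x_pop[i] for i in idx]
y_s = [y_pop[i] for i in idx]
pi_s = [n / N] * n
w_s = [1 / p for p in pi_s]

print("true total       ", round(sum(y_pop), 1))
print("Horvitz-Thompson ", round(ht_total(y_s, pi_s), 1))
print("linear GREG      ", round(
    model_assisted_total(fit_linear(x_s, y_s, w_s), x_pop, x_s, y_s, pi_s), 1))
print("kernel-assisted  ", round(
    model_assisted_total(fit_kernel(x_s, y_s, w_s, 0.1), x_pop, x_s, y_s, pi_s), 1))
```

The residual correction term is what keeps both assisted estimators design-based: the model only supplies predictions, and any lack of fit is compensated by the weighted sample residuals. A single run on one sample says nothing about precision; comparing the estimators would require repeating the sampling step many times.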

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1030092