Small area estimation with linked data

Ranalli, M. G.;
2020

Abstract

Data linkage can be used to combine values of the variable of interest from a national survey with values of auxiliary variables obtained from another source, such as a population register, for use in small area estimation. However, linkage errors can induce bias when fitting regression models; moreover, they can create non-representative outliers in the linked data in addition to the presence of potential representative outliers. In this paper, we adopt a secondary analyst’s point of view, assuming that limited information is available on the linkage process, and develop small area estimators based on linear mixed models and M-quantile models to accommodate linked data containing a mix of both types of outliers. We illustrate the properties of these small area estimators, as well as estimators of their mean squared error, by means of model-based and design-based simulation experiments. We further illustrate the proposed methodology by applying it to linked data from the European Survey on Income and Living Conditions and the Italian integrated archive of economic and demographic micro data in order to obtain estimates of the average equivalised income for labour market areas in central Italy.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1478719