Since 1998, AlmaLaurea—a consortium of 80 Italian universities and a member of the Italian National Statistical System—has conducted an annual census on graduates’ employment status. The survey provides estimates of descriptive indicators at both the population level and for specific subpopulations (domains) of interest, such as degree programmes. Some domains have very few observations due to a small population size and non-response. In this paper, we address this estimation problem within a Small Area Estimation framework. Specifically, we propose using generalized linear mixed models that incorporate two variables as proxies for graduates’ response propensity, making the assumption of non-informative non-response more plausible. Degree programme estimates of employment rates are derived as (semi-parametric) empirical best predictions using a finite mixture of logistic regression models, with their mean squared error estimated via a second-order, bias-corrected, analytical estimator. Sensitivity analysis is conducted to assess the explanatory power of variables modelling response propensity and to evaluate potential correlations between area-specific random effects and observed heterogeneity.

When non-response makes estimates from a census a small area estimation problem: the case of the survey on graduates’ employment status in Italy

Ranalli, Maria Giovanna
;
Bartolucci, Francesco;
2025

Abstract

Since 1998, AlmaLaurea—a consortium of 80 Italian universities and a member of the Italian National Statistical System—has conducted an annual census on graduates’ employment status. The survey provides estimates of descriptive indicators at both the population level and for specific subpopulations (domains) of interest, such as degree programmes. Some domains have very few observations due to a small population size and non-response. In this paper, we address this estimation problem within a Small Area Estimation framework. Specifically, we propose using generalized linear mixed models that incorporate two variables as proxies for graduates’ response propensity, making the assumption of non-informative non-response more plausible. Degree programme estimates of employment rates are derived as (semi-parametric) empirical best predictions using a finite mixture of logistic regression models, with their mean squared error estimated via a second-order, bias-corrected, analytical estimator. Sensitivity analysis is conducted to assess the explanatory power of variables modelling response propensity and to evaluate potential correlations between area-specific random effects and observed heterogeneity.
2025
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11391/1597834
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact