Student assessment tests are routinely validated through item response theory (IRT) models that assume unidimensionality and the absence of observable differential item functioning (DIF). In this paper, we investigate whether these assumptions hold for two national tests administered in Italy to lower secondary school students: the Language Test and the Mathematics Test. To this end, we rely on an extended class of multidimensional latent class IRT models characterised by: (i) a two-parameter logistic parameterisation for the conditional probability of a correct response, (ii) latent traits represented through a random vector with a discrete distribution, and (iii) the inclusion of (uniform) DIF to account for students’ gender and geographical area. We also propose a classification of the items into unidimensional groups, represented by a dendrogram obtained from a hierarchical clustering algorithm. The results provide evidence of observable DIF effects for both tests. Moreover, the assumption of unidimensionality is rejected for the Language Test, whereas it is reasonable for the Mathematics Test.
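
To make the three model ingredients listed above concrete, the following is a minimal sketch of a two-parameter logistic response probability with a uniform DIF shift, averaged over a discrete latent distribution. It is an illustration under stated assumptions, not the parameterisation used in the paper: the function names, the argument names, and the choice to model uniform DIF as an additive shift of the item difficulty are all hypothetical.

```python
import math


def p_correct(theta, discrimination, difficulty, dif_shift=0.0):
    """2PL probability of a correct response.

    Assumption: uniform DIF is modelled as a constant shift of the item
    difficulty for a given group (e.g. gender or geographical area), so it
    moves the whole item curve without changing its slope.
    """
    logit = discrimination * (theta - (difficulty + dif_shift))
    return 1.0 / (1.0 + math.exp(-logit))


def marginal_p_correct(support, weights, discrimination, difficulty, dif_shift=0.0):
    """Average the 2PL probability over a discrete latent distribution.

    `support` holds the ability level of each latent class and `weights`
    holds the class probabilities (assumed to sum to 1).
    """
    return sum(
        w * p_correct(t, discrimination, difficulty, dif_shift)
        for t, w in zip(support, weights)
    )


# Hypothetical values: three latent classes and one item.
support = [-1.0, 0.0, 1.0]
weights = [0.25, 0.5, 0.25]
reference = marginal_p_correct(support, weights, 1.2, 0.0)
focal = marginal_p_correct(support, weights, 1.2, 0.0, dif_shift=0.3)
```

With these made-up values, a positive `dif_shift` makes the item harder for the focal group at every ability level, which is what "uniform" refers to here: `focal` is smaller than `reference`.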

Joint assessment of the latent trait dimensionality and observed differential item functioning of students’ national tests

Gnaldi, Michela; Bacci, Silvia
2016


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1355904