In this study, we present a novel system for automatically classifying text complexity in Italian, focusing on the phraseological dimension. Quantitative assessment of text complexity is crucial for applications such as readability measurement, text simplification, and support for educators during evaluation. We use a dataset of texts written by learners of Italian as a second language (L2), each classified according to the levels of the Common European Framework of Reference for Languages (CEFR). From these texts we compute phraseological features, which serve as input to multiple machine-learning classifiers whose performance in predicting proficiency levels we compare. Our experimental results show that the proposed framework uses phraseological complexity features effectively and achieves high classification accuracy in determining proficiency levels.