IRIS - Res&Arch Institutional Research Information System - Research & Archive

Learning to Classify Text Complexity for the Italian Language Using Support Vector Machines

Santucci V.; Forti L.; Santarelli F.; Spina S.; Milani A. (each listed as Member of the Collaboration Group)

2020

Abstract

Natural language processing is one of the most active research fields in the machine learning community. In this work we propose a supervised classification system that, given a text written in Italian as input, predicts its linguistic complexity as a level of the Common European Framework of Reference for Languages (CEFR). The system was built in three steps: (i) collecting a dataset of texts labeled by linguistic experts, (ii) applying vectorisation procedures that transform any text into a numerical representation, and (iii) training a support vector machine model. Experiments followed a statistically sound design, and the results show that the system reaches good prediction accuracy.
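The pipeline described in the abstract (vectorise a text, then classify it with a support vector machine) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The three features in `vectorise`, the sub-gradient trainer `train_linear_svm`, the one-vs-rest wrapper, and the four toy sentences with their CEFR labels are all hypothetical stand-ins chosen for brevity. The record does not specify the actual features, kernel, or dataset.

```python
import random
import re

def vectorise(text):
    """Map a text to a small numeric vector (illustrative features, not the paper's)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not words:
        return [0.0, 0.0, 0.0]
    avg_sentence_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    return [avg_sentence_len, avg_word_len, type_token_ratio]

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=200, seed=0):
    """Binary linear SVM (labels +1/-1): sub-gradient descent on the
    L2-regularised hinge loss. A simple stand-in for a library SVM."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # regularisation shrink
            if margin < 1:  # example violates the margin: hinge sub-gradient step
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def train_one_vs_rest(X, labels):
    """Train one binary SVM per CEFR level (one-vs-rest, an assumed scheme)."""
    models = {}
    for level in sorted(set(labels)):
        y = [1 if lab == level else -1 for lab in labels]
        models[level] = train_linear_svm(X, y)
    return models

def predict(models, x):
    """Return the level whose binary SVM gives the highest decision score."""
    def score(level):
        w, b = models[level]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)

# Toy data invented for this sketch; the labels are NOT expert annotations.
texts = [
    "Il gatto dorme sul letto. Oggi piove.",
    "Vado a scuola ogni giorno. Mi piace leggere.",
    "Nonostante le difficoltà incontrate, la commissione ha deliberato "
    "l'approvazione del regolamento.",
    "L'interpretazione giurisprudenziale prevalente considera inammissibile "
    "tale ricorso straordinario.",
]
labels = ["B1", "B1", "C1", "C1"]

models = train_one_vs_rest([vectorise(t) for t in texts], labels)
print(predict(models, vectorise("Il cane corre nel parco. Fa caldo.")))
```

In practice one would more likely use an established SVM library, standardise the features, and evaluate with repeated cross-validation, which is closer in spirit to the statistically sound design the abstract mentions.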

Year: 2020
Series: LECTURE NOTES IN ARTIFICIAL INTELLIGENCE
ISBN: 978-3-030-58801-4; 978-3-030-58802-1
Appears in categories: 4.1 Contribution in conference proceedings

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1476941
