Dissertation (December 15, 2025): Vanessa Santos Soares

Student: Vanessa Santos Soares

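Title: Avaliação de modelos de aprendizado de máquina para a correção automática de redações segundo as competências do ENEM (Evaluation of machine learning models for the automatic scoring of essays according to the ENEM competencies)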

Advisors: Eduardo Bezerra da Silva (advisor) and Gustavo Paiva Guedes e Silva (co-advisor)

Committee: Eduardo Bezerra da Silva (Cefet/RJ), Gustavo Paiva Guedes e Silva (Cefet/RJ), Diego Moreira de Araújo Carvalho (Cefet/RJ), and Geraldo Bonorino Xexéo (UFRJ)

Date/Time: December 15, 2025 / 10 a.m.

Room: Auditório V

Abstract: With the growth of remote education and the administration of large-scale exams such as ENEM, automated essay grading has become increasingly necessary. This work investigates different machine learning strategies for the automatic evaluation of essays written in Portuguese, based on the five assessment competencies defined by ENEM. A total of 9,599 essays were analyzed, collected from the Vestibular Brasil Escola portal and covering 102 topics published between 2009 and 2024. Two main approaches are compared: (i) traditional methods based on TF-IDF and hand-engineered linguistic features extracted from the texts, and (ii) fine-tuned pre-trained language models (XLM-RoBERTa with LoRA). Model performance is evaluated using the Quadratic Weighted Kappa (QWK) metric, which measures agreement with human raters. The study aims to demonstrate that pre-trained models provide significant improvements in robustness and reliability, outperforming feature-engineering-based approaches. This research contributes to the advancement of Automatic Essay Scoring (AES) in Portuguese by offering a benchmark and comparative analysis that can support future studies and educational applications.
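For readers unfamiliar with the QWK metric mentioned in the abstract, the following is a minimal, illustrative sketch of how it can be computed for one competency. It is not taken from the dissertation: the function name `quadratic_weighted_kappa`, the assumption that scores are already mapped to integer levels 0..num_levels-1, and the sample ratings are all hypothetical.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_levels):
    """Quadratic Weighted Kappa between two lists of integer ratings.

    Assumes every rating is an integer in range(num_levels); mapping raw
    essay scores onto such levels is left to the caller (an assumption of
    this sketch, not something stated in the abstract).
    """
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")

    n = len(rater_a)

    # Observed confusion matrix: observed[i][j] counts essays rated
    # i by the first rater and j by the second.
    observed = [[0] * num_levels for _ in range(num_levels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal histograms of each rater's scores.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_levels))
              for j in range(num_levels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_levels):
        for j in range(num_levels):
            # Quadratic penalty: larger score gaps cost more.
            weight = (i - j) ** 2 / (num_levels - 1) ** 2
            # Count expected under chance agreement (independent raters).
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        # Both raters gave one identical level to every essay; returning
        # 1.0 here is a convention chosen for this sketch.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings for one competency on six levels (0..5).
human = [5, 4, 3, 3, 2, 0, 4, 1]
model = [5, 3, 3, 4, 2, 1, 4, 1]
print(quadratic_weighted_kappa(human, model, num_levels=6))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. Because the penalty grows with the square of the score gap, a model that misses by one level is penalized far less than one that misses by several.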