We Can Rely on ChatGPT as an Educational Tutor: A Cross-Sectional Study of its Performance, Accuracy, and Limitations in University Admission Tests

Saul Beltozar-Clemente; Enrique Díaz-Vega; Joselyn Zapata-Paulini; Raul Enrique Tejeda-Navarrete

doi:10.3991/ijep.v14i1.46787

Authors

Saul Beltozar-Clemente Dirección de cursos básicos, Universidad Científica del Sur, Lima, Perú https://orcid.org/0000-0002-3742-6326
Enrique Díaz-Vega Departamento de ciencias, Universidad Privada del Norte, Lima, Perú
Joselyn Zapata-Paulini Escuela de Posgrado, Universidad Continental, Lima, Perú https://orcid.org/0000-0002-4500-5249
Raul Enrique Tejeda-Navarrete Departamento de Ciencias, Universidad Tecnológica del Perú, Lima, Perú

Keywords:

ChatGPT, performance, entrance exams, university

Abstract

This study evaluated the performance of ChatGPT in answering image-free multiple-choice questions from the entrance exams of the National University of Engineering (UNI) and the Universidad Nacional Mayor de San Marcos (UNMSM) over the past five years. In this prospective exploratory study, 1182 questions were gathered from the UNMSM exams and 559 from the UNI exams, covering academic aptitude, reading comprehension, humanities, and scientific knowledge. For UNMSM, ChatGPT answered 72% (853/1182) of questions correctly, a significantly higher proportion of correct than incorrect answers (p < 0.001). For UNI, it answered 52% (317/552) correctly, with no significant difference between the proportions of correct and incorrect answers (p = 0.168). By course, ChatGPT achieved its highest overall accuracy, 91%, in World History (p = 0.037) and its lowest, 55%, in the language course (p = 0.172). In conclusion, fully harnessing the potential of ChatGPT in education requires continuous evaluation of its performance, ongoing feedback to improve its accuracy and minimize biases, and adaptations tailored to educational settings.
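
The abstract does not state which statistical test produced the reported p-values. As a rough illustration only, the sketch below applies a two-sided one-proportion z-test to the UNMSM counts (853 correct out of 1182), assuming a null hypothesis that correct and incorrect answers are equally likely. The function name `one_proportion_z_test` and the null value of 0.5 are assumptions made for this example, not details taken from the study.

```python
import math


def one_proportion_z_test(correct, total, p0=0.5):
    """Two-sided z-test of an observed proportion against p0.

    Uses the normal approximation; p0 = 0.5 is an assumed null
    (correct and incorrect answers equally likely), not a value
    stated in the abstract.
    """
    p_hat = correct / total                      # observed proportion correct
    se = math.sqrt(p0 * (1 - p0) / total)        # standard error under the null
    z = (p_hat - p0) / se                        # standardized distance from p0
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail probability
    return p_hat, z, p_value


# UNMSM counts as reported in the abstract
p_hat, z, p_value = one_proportion_z_test(853, 1182)
print(f"proportion correct: {p_hat:.3f}")
print(f"z statistic: {z:.2f}")
print(f"p < 0.001: {p_value < 0.001}")
```

Under these assumptions the observed proportion rounds to the reported 72% and the test agrees with the reported significance level for UNMSM. An exact binomial test would be a reasonable alternative for smaller samples.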

Published

2024-01-30

How to Cite

Beltozar-Clemente, S., Díaz-Vega, E., Zapata-Paulini, J., & Tejeda-Navarrete, R. E. (2024). We Can Rely on ChatGPT as an Educational Tutor: A Cross-Sectional Study of its Performance, Accuracy, and Limitations in University Admission Tests. International Journal of Engineering Pedagogy (iJEP), 14(1), pp. 50–60. https://doi.org/10.3991/ijep.v14i1.46787