NLP-Enhanced Techniques for Cheating Detection in Virtual Exams: A Comparative Study of String and Semantic Similarity Measures with K-Shingling, Minhashing, LSH, and K-Means

Authors

El Rhezzali, N., Hilal, I., & Hnida, M.

DOI:

https://doi.org/10.3991/ijim.v19i03.49897

Keywords:

Cheating detection, Online Exams, String-based Similarity, Semantic-based Similarity, K-shingling, Winnowing, Minhashing, Locality Sensitive Hashing (LSH), K-means.

Abstract


As online learning gains popularity, cheating has become a recurring topic in discussions and scientific papers on virtual and distance education. The shift from in-person to online exams has raised concerns that it makes cheating easier. A direct way to detect cheating in online exams is to measure how close students' answers are to one another. In this paper, we introduce an approach based on similarity measures from text mining that analyzes answers to identify common patterns among academic responses. To this end, we extend the K-Shingling, Minhashing, and locality-sensitive hashing (LSH) technique with a K-means clustering step, and we apply the approach to a use case to efficiently find similar answers in a large dataset. A comparison with related work underscores the efficiency and applicability of our method in educational contexts, and it contributes to text similarity detection by highlighting the reliability and consistency of mathematical methods.
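To make the pipeline named in the abstract more concrete, the following is a minimal sketch of the K-shingling, Minhashing, and LSH stages. It is not the authors' implementation. It assumes character-level shingles, 64 hash functions derived from seeded BLAKE2b, and 16 LSH bands. All function names and parameter values here are hypothetical choices made for illustration. The K-means step is left out because the abstract does not say how it is integrated.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=5):
    """Return the set of character k-shingles of a lower-cased, whitespace-normalised text."""
    s = " ".join(text.lower().split())
    if len(s) < k:
        return {s}  # very short answer: treat the whole string as one shingle
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def _hash(seed, shingle):
    """Deterministic 64-bit hash of a shingle; the seed selects one hash function (assumed scheme)."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=64):
    """MinHash signature: for each hash function, keep the minimum hash over all shingles."""
    return [min(_hash(seed, sh) for sh in shingle_set) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidate_pairs(signatures, bands=16):
    """Split each signature into bands; answers sharing any identical band become a candidate pair."""
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for doc_id, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(doc_id)
        for ids in buckets.values():
            candidates.update(combinations(sorted(ids), 2))
    return candidates


if __name__ == "__main__":
    # Hypothetical exam answers keyed by student id.
    answers = {
        "s1": "Photosynthesis converts light energy into chemical energy.",
        "s2": "Photosynthesis converts light energy into chemical energy.",
        "s3": "Mitosis is the process by which a cell divides.",
    }
    sigs = {sid: minhash_signature(shingles(ans)) for sid, ans in answers.items()}
    for a, b in sorted(lsh_candidate_pairs(sigs)):
        print(a, b, estimated_jaccard(sigs[a], sigs[b]))
```

Only the pairs that LSH returns as candidates need a full similarity check, which is what makes the search practical on a large set of answers. With more bands and fewer rows per band, less similar answers are also flagged, at the cost of more false candidates.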

Published

2025-02-12

How to Cite

El Rhezzali, N., Hilal, I., & Hnida, M. (2025). NLP-Enhanced Techniques for Cheating Detection in Virtual Exams: A Comparative Study of String and Semantic Similarity Measures with K-Shingling, Minhashing, LSH, and K-Means. International Journal of Interactive Mobile Technologies (iJIM), 19(03), pp. 56–72. https://doi.org/10.3991/ijim.v19i03.49897

Section

Papers