A Scalable Code Similarity Detection with Online Architecture and Focused Comparison for Maintaining Academic Integrity in Programming
DOI:
https://doi.org/10.3991/ijoe.v16i10.14289
Keywords:
plagiarism detection, scalability, academic integrity, programming, computing education
Abstract
Many code similarity detection techniques have been developed to maintain academic integrity in programming. However, most of them assume that the student programs are locally available and that the computation can be run on a computer of any specification. Further, raising suspicion with them is time-consuming, as the student programs are compared pairwise with one another. This paper proposes a scalable code similarity detection technique with an online architecture and focused comparison. The former enables student programs to be shared among lecturers and guarantees that the computation is runnable. The latter shortens the execution time, as only some students are considered, with inclusion criteria determined by the lecturers. To boost scalability, the similarity algorithm is cosine correlation, whose computation takes linear time. Our evaluation shows that focused comparison leads to fewer comparisons and that cosine correlation leads to shorter execution time.
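The two ideas in the abstract, cosine similarity over token frequencies and comparing only a focused subset of students, can be illustrated with a minimal sketch. It is not the paper's implementation: the regex tokenizer, the function names `token_counts`, `cosine_similarity`, and `focused_comparison`, the `focus` list standing in for the lecturer-defined inclusion criteria, and the 0.8 threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def token_counts(source: str) -> Counter:
    """Count tokens with a crude regex tokenizer (an assumption, not the paper's)."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+|\S", source))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two token-frequency vectors.

    Runs in time linear in the number of distinct tokens.
    """
    if not a or not b:
        return 0.0
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)


def focused_comparison(programs, focus, threshold=0.8):
    """Compare only the students in `focus` against everyone else.

    `programs` maps a student name to source text; `focus` is a hypothetical
    stand-in for the lecturer-defined inclusion criteria.
    """
    vectors = {student: token_counts(src) for student, src in programs.items()}
    seen = set()      # unordered pairs already compared
    flagged = []
    for target in focus:
        for other in vectors:
            pair = frozenset((target, other))
            if other == target or pair in seen:
                continue
            seen.add(pair)
            score = cosine_similarity(vectors[target], vectors[other])
            if score >= threshold:
                flagged.append((target, other, score))
    # Most similar pairs first.
    return sorted(flagged, key=lambda item: -item[2])


if __name__ == "__main__":
    submissions = {
        "alice": "x = 1\ny = x + 2",
        "bob": "x = 1\ny = x + 2",
        "carol": "print('hello')",
    }
    for target, other, score in focused_comparison(submissions, ["alice"]):
        print(target, other, round(score, 3))
```

With n students of whom k are in focus, this sketch performs at most k(n - 1) comparisons rather than the n(n - 1)/2 of a full pairwise run, which is where the saving from focused comparison comes from.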