Title: A web-based plagiarism detection method for student reports using intrinsic analysis

Authors: Maryam Elamine; Lamia Hadrich Belguith

Addresses: MIRACL Laboratory, Faculty of Economics and Management of Sfax, University of Sfax, Tunisia; MIRACL Laboratory, Faculty of Economics and Management of Sfax, University of Sfax, Tunisia

Abstract: With the advent of complex language models and the massive amount of data available on the web, students have had an easier time committing plagiarism. This research describes a web-based system for identifying plagiarism in student reports using intrinsic analysis. To detect plagiarism, we use a combination of stylistic and semantic features as well as a similarity matching technique. We experimented with a dataset of scientific papers mostly published in French, the predominant language in our institutions. Our plagiarism detection method examines the writing style of suspect documents, locates relevant sources on the internet, and compares them to the suspicious documents using external text matching. The preliminary results are promising, with our intrinsic and extrinsic methods reaching an F-score of 40.3% and 89% accuracy, respectively.
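The abstract does not specify which stylistic features or decision rule the intrinsic step uses. As a rough illustration of the general idea behind intrinsic analysis (flagging passages whose writing style departs from the rest of the same document), the following minimal Python sketch may help. It is not the authors' system: the three features, the z-score rule, the function names `style_features` and `flag_style_outliers`, and the `threshold` value are all assumptions chosen for illustration.

```python
import re
import statistics


def style_features(text):
    """Return a small, illustrative stylometric vector for one text chunk.

    The three features are assumptions for this sketch, not the paper's set.
    """
    words = re.findall(r"\w+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return [0.0, 0.0, 0.0]
    mean_word_len = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    mean_sentence_len = len(words) / len(sentences)
    return [mean_word_len, type_token_ratio, mean_sentence_len]


def flag_style_outliers(chunks, threshold=1.5):
    """Return indices of chunks whose style deviates from the document norm.

    A chunk is flagged when its mean absolute z-score across the features
    exceeds `threshold` (a hypothetical cut-off, not a value from the paper).
    """
    if not chunks:
        return []
    vectors = [style_features(c) for c in chunks]
    n_features = len(vectors[0])
    # Per-feature mean and population standard deviation over the document.
    means = [statistics.mean([v[i] for v in vectors]) for i in range(n_features)]
    stdevs = [statistics.pstdev([v[i] for v in vectors]) for i in range(n_features)]
    flagged = []
    for idx, vec in enumerate(vectors):
        # Skip features that do not vary at all within the document.
        z = [abs(x - m) / s for x, m, s in zip(vec, means, stdevs) if s > 0]
        if z and sum(z) / len(z) > threshold:
            flagged.append(idx)
    return flagged


if __name__ == "__main__":
    plain = "The cat sat on the mat. The dog ran to the park."
    odd = ("Notwithstanding epistemological considerations, interdisciplinary "
           "methodologies fundamentally characterise contemporary stylometry.")
    print(flag_style_outliers([plain, plain, plain, plain, odd]))
```

In a pipeline like the one the abstract outlines, flagged chunks would then be candidates for the web search and external text-matching step. A practical system would need far richer features and a tuned decision rule.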

Keywords: web-based plagiarism detection; intrinsic analysis; writing style analysis; student reports; semantic analysis; text-matching; plagiarism in education.

DOI: 10.1504/IJDMMM.2025.150989

International Journal of Data Mining, Modelling and Management, 2025 Vol.17 No.4, pp.480 - 496

Received: 14 May 2024
Accepted: 07 Oct 2024

Published online: 07 Jan 2026
