Title: A web-based plagiarism detection method for student reports using intrinsic analysis
Authors: Maryam Elamine; Lamia Hadrich Belguith
Addresses: MIRACL Laboratory, Faculty of Economics and Management of Sfax, University of Sfax, Tunisia; MIRACL Laboratory, Faculty of Economics and Management of Sfax, University of Sfax, Tunisia
Abstract: With the advent of complex language models and the massive amount of data available on the web, plagiarism has become easier for students to commit. This research describes a web-based system for identifying plagiarism in student reports using intrinsic analysis. To detect plagiarism, we combine stylistic and semantic features with a similarity matching technique. We experimented with a dataset of scientific papers mostly published in French, the predominant language in our institutions. Our plagiarism detection method examines the writing style of suspicious documents, locates relevant sources on the internet, and compares those sources to the suspicious documents using external text matching. The preliminary results are promising: our intrinsic method reaches an F-score of 40.3% and our extrinsic method reaches 89% accuracy.
Keywords: web-based plagiarism detection; intrinsic analysis; writing style analysis; student reports; semantic analysis; text-matching; plagiarism in education.
DOI: 10.1504/IJDMMM.2025.150989
International Journal of Data Mining, Modelling and Management, 2025 Vol.17 No.4, pp.480 - 496
Received: 14 May 2024
Accepted: 07 Oct 2024
Published online: 07 Jan 2026