Title: Plagiarism detection based on semantic analysis

Authors: Indrajit Mukherjee; Bipul Kumar; Samarth Singh; Kishan Sharma

Addresses: Department of Computer Science and Engineering, BIT Mesra, Ranchi-835215, India (all authors)

Abstract: Plagiarism means copying and pasting a text, changing some of its words, or substituting synonymous or near-synonymous words, without citing the source. Plagiarism is on the rise, especially in academia and research, owing to the availability of digital text documents on the internet that can easily be copied and pasted. Existing approaches to plagiarism detection have either ignored or made limited use of information about semantic similarity between words. We propose a method that measures the semantic similarity between documents by mapping keywords (verbs, adverbs, adjectives, descriptors, etc.) to nouns and then computing the similarity between the mapped words, which addresses these shortcomings. The algorithm is evaluated on the Corpus of Plagiarised Short Answers (Clough and Stevenson, 2011). The experiments show that the proposed algorithm accurately detects semantic similarity between documents and outperforms previously published methods.
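The idea of mapping keywords to nouns and comparing the mapped words can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the mapping or scoring rules, so everything below is an assumption made for illustration. Documents are assumed to be already reduced to (noun, keyword) pairs, a small hand-written synonym table (`SYNONYM_GROUPS`) stands in for WordNet, and the function names `word_similarity`, `pair_similarity` and `document_similarity` are hypothetical.

```python
# Toy synonym groups standing in for WordNet synsets (hypothetical data,
# not taken from the paper).
SYNONYM_GROUPS = [
    {"quick", "fast", "rapid"},
    {"car", "automobile"},
    {"big", "large"},
]


def word_similarity(a, b):
    """Return 1.0 for identical words or words sharing a toy synonym group, else 0.0."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return 1.0 if any(a in g and b in g for g in SYNONYM_GROUPS) else 0.0


def pair_similarity(p, q):
    """Similarity of two (noun, keyword) pairs: the product of the two word scores."""
    return word_similarity(p[0], q[0]) * word_similarity(p[1], q[1])


def document_similarity(pairs_a, pairs_b):
    """Average best-match score between two lists of pairs, taken in both directions."""
    if not pairs_a or not pairs_b:
        return 0.0

    def directed(src, dst):
        # For each pair in src, keep its best match in dst, then average.
        return sum(max(pair_similarity(p, q) for q in dst) for p in src) / len(src)

    return (directed(pairs_a, pairs_b) + directed(pairs_b, pairs_a)) / 2


# Assumed input: (noun, keyword) pairs already extracted from each document.
source = [("car", "fast"), ("house", "big")]
suspect = [("automobile", "quick"), ("house", "large")]
unrelated = [("river", "cold")]

print(document_similarity(source, suspect))
print(document_similarity(source, unrelated))
```

In this sketch a paraphrase that swaps every word for a synonym still scores as a full match, which exact string comparison would miss. A graded WordNet-based measure could replace the binary `word_similarity` without changing the rest of the structure.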

Keywords: semantic similarity; plagiarism detection; documents; WordNet.

DOI: 10.1504/IJKL.2018.092316

International Journal of Knowledge and Learning, 2018 Vol.12 No.3, pp.242 - 254

Received: 15 Jun 2017
Accepted: 16 Jan 2018

Published online: 14 Jun 2018
