Q-grams-imp: an improved q-grams algorithm aimed at edit similarity join
by Yunxia Liu; Zhaobin Liu; Zhiyang Li
International Journal of Computational Science and Engineering (IJCSE), Vol. 18, No. 3, 2019

Abstract: Similarity join is increasingly important and has attracted widespread attention from the research community. It is used in many applications, such as spell checking, copy detection, entity linking and pattern recognition. In many web and enterprise scenarios, where typos and misspellings often occur, an efficient algorithm is needed to handle such errors. In this paper, we propose q-grams-imp, an improved q-grams algorithm aimed at edit similarity join. The algorithm reduces the number of tokens and thus the space cost, and it is best suited to strings of the same size. Strings of different sizes must first be processed to fit the algorithm. Our results show that the proposed algorithm performs better than the traditional method.
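
For readers unfamiliar with the setting, the sketch below illustrates a conventional q-grams approach to edit similarity join. It is not the q-grams-imp algorithm, whose details the abstract does not give. The function names (`qgrams`, `edit_distance`, `passes_count_filter`, `edit_similarity_join`) and the parameters `q` and `k` are assumptions made for illustration. The sketch uses the standard count filter: two strings within edit distance `k` share at least max(|s|, |t|) - q + 1 - k*q unpadded q-grams, so pairs below that bound can be skipped before the exact distance is computed.

```python
from collections import Counter


def qgrams(s, q):
    """Return the overlapping, unpadded q-grams (tokens) of s."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def edit_distance(s, t):
    """Levenshtein distance by dynamic programming, one row at a time."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # substitute or match
        prev = cur
    return prev[-1]


def passes_count_filter(s, t, q, k):
    """Necessary condition for edit_distance(s, t) <= k (count filter)."""
    shared = sum((Counter(qgrams(s, q)) & Counter(qgrams(t, q))).values())
    needed = max(len(s), len(t)) - q + 1 - k * q
    return shared >= needed


def edit_similarity_join(strings, q, k):
    """Return all pairs of strings whose edit distance is at most k."""
    pairs = []
    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            s, t = strings[i], strings[j]
            # Cheap filters first; exact verification only on survivors.
            if (abs(len(s) - len(t)) <= k
                    and passes_count_filter(s, t, q, k)
                    and edit_distance(s, t) <= k):
                pairs.append((s, t))
    return pairs


words = ["similarity", "similarty", "simularity", "distance"]
result = edit_similarity_join(words, q=2, k=1)
```

In this baseline a string of length n yields n - q + 1 tokens, which is the token count, and hence the space cost, that the abstract says the proposed algorithm aims to reduce.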

Online publication date: Tue, 26-Mar-2019
