Title: Cross-lingual analysis of English and Chinese web search

Authors: Peiguang Lin; Tong Zhang; Menglong Xia; Jin Zhou; Peiyao Nie

Addresses: School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, 250001, China ' School of Electronic and Information Engineering, South China University of Technology, Guangzhou, 510000, China ' Faculty of Hospitality and Tourism Management, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, 999078, China ' Shandong Provincial Key Laboratory of Network-Based Intelligent Computing, University of Jinan, Jinan, 250001, China ' School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, 250001, China

Abstract: The non-English Web has grown considerably in recent years, so language-dependent and user-based search paradigms are becoming increasingly important for search engines. Unfortunately, most of the available work on web search analysis is still English-based. To understand the behavioural commonalities and distinctions of non-English users, we propose a framework for analysing the web search behaviour of users in a cross-lingual context. The framework is composed of 10 factors, which can be applied at the query level, session level and corpus level respectively. Used together, these factors help characterise the web search behaviour of users, even across different languages, from both statistical and semantic perspectives. The framework shows better efficiency not only in revealing the commonalities and distinctions of web search, but also in informing the design of search paradigms in a cross-lingual scenario.

Keywords: cross-lingual analysis; web search analysis; search query; POS distribution; search session; session entropy; query reformulation; click graph analysis; query features; web search burstiness.

DOI: 10.1504/IJWGS.2018.095663

International Journal of Web and Grid Services, 2018 Vol.14 No.4, pp.376 - 399

Received: 07 Sep 2017
Accepted: 22 Nov 2017

Published online: 13 Oct 2018
