Title: Framework for pattern generation from discriminating datasets

Authors: A. Muthusamy; A. Subramani

Addresses: Department of Computer Science, K.S.R. College of Arts and Science, Tiruchengode, Tamil Nadu, India; Department of Computer Science, K.S.R. College of Arts and Science, Tiruchengode, Tamil Nadu, India

Abstract: Searching the web for a specific person by name is challenging when many people share that name. The proposed method uses alias detection to extract lexical patterns and personal attributes and to rank candidate aliases. First, we construct semi-structured training datasets consisting of a person's alias, nickname or real name, together with the person's profession and location, compiled with the help of social media networks such as Wikipedia. Second, a pattern generator applies a pattern extraction algorithm and an attribute extraction algorithm. Pattern extraction generates lexical patterns manually from the dataset, and the generated patterns are then submitted to the Google search engine. The extracted lexical patterns are ranked according to the evaluation scheme. To rank a person's candidate aliases, a graph mining ranking algorithm with various similarity measures is used. This paper presents an overview of the people search engine, definitions, the proposed model, the semi-structured training dataset and the evaluation scheme for the manually created lexical patterns.
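
The idea of a lexical pattern linking a name to an alias can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract states that patterns are created manually and then sent to a search engine, whereas this sketch derives patterns automatically from a few invented text snippets and ranks them by raw frequency, which stands in for the paper's evaluation scheme. The records, the `[NAME]` and `[ALIAS]` placeholders, and all function names are hypothetical, and no search engine is queried.

```python
import re
from collections import Counter

# Hypothetical training records: (real name, alias, text snippet).
# The people and snippets are invented for illustration only.
TRAINING = [
    ("Robert Example", "Bob Example",
     "Robert Example, also known as Bob Example, is a teacher in Chennai."),
    ("Katherine Sample", "Kate",
     "Katherine Sample, also known as Kate, works in Salem."),
    ("Jonathan Demo", "Jon",
     "Jonathan Demo alias Jon is a singer."),
]


def extract_patterns(records):
    """Count the text found between each name and its alias as a pattern."""
    counts = Counter()
    for name, alias, snippet in records:
        start = snippet.find(name)
        if start == -1:
            continue  # name not present in the snippet
        end = snippet.find(alias, start + len(name))
        if end == -1:
            continue  # alias does not follow the name
        between = snippet[start + len(name):end]
        counts["[NAME]" + between + "[ALIAS]"] += 1
    return counts


def rank_patterns(counts):
    """Order patterns by descending frequency, then alphabetically."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [pattern for pattern, _ in ordered]


def find_candidate_aliases(pattern, name, text):
    """Apply one pattern to new text and return capitalized candidates."""
    prefix = pattern.replace("[ALIAS]", "").replace("[NAME]", name)
    # Assumption: an alias is a run of capitalized words after the prefix.
    regex = re.escape(prefix) + r"((?:[A-Z][\w'-]*\s?)+)"
    return [match.strip() for match in re.findall(regex, text)]


if __name__ == "__main__":
    ranked = rank_patterns(extract_patterns(TRAINING))
    print(ranked)
    print(find_candidate_aliases(
        ranked[0], "Maria Test",
        "Maria Test, also known as Mia Test, lives in Erode."))
```

In a fuller pipeline, the text passed to `find_candidate_aliases` would come from search engine results for a query built from the pattern, and the candidates would then go to a separate ranking step such as the graph mining ranking the abstract mentions.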

Keywords: web search; people search; pattern extraction; search engine evaluation; pattern generation; discriminating datasets; search engines; name searching; alias detection; lexical patterns; person attributes; aliases; nicknames; real names; social media; social networks.

DOI: 10.1504/IJCI.2015.071234

International Journal of Collaborative Intelligence, 2015, Vol. 1, No. 2, pp. 115-123

Received: 04 Apr 2015
Accepted: 25 May 2015

Published online: 17 Aug 2015
