Title: Survey on test collections and techniques for personal name matching

Authors: Patrick Reuther, Bernd Walter

Addresses: Department for Databases and Information Systems (DBIS), University of Trier, 54296 Trier, Germany; Department for Databases and Information Systems (DBIS), University of Trier, 54296 Trier, Germany

Abstract: This paper gives an overview of personal name matching, which is of great importance for all applications that deal with personal names. Personal names are not unique, and sometimes many variations of a single name exist. As a result, a database may contain several entries for one and the same person and, conversely, a single entry covering many different persons. Test collections are of great importance for evaluating personal name matching algorithms. This paper surveys existing test collections and presents two new ones based on real-world bibliographic data. It also describes state-of-the-art techniques and a new approach based on semantics.

Keywords: personal name matching; duplicate detection; duplicates; name disambiguation; record linkage; data test collections; social networks; co-authorship networks; personal names; semantics.

DOI: 10.1504/IJMSO.2006.011006

International Journal of Metadata, Semantics and Ontologies, 2006 Vol.1 No.2, pp.89 - 99

Published online: 03 Oct 2006
