Title: A best-effort integration framework for imperfect information spaces
Authors: Ashraf Jaradat; Ahmed Abu Halimeh; Aziz Deraman; Fadi Safieddine
Addresses: College of Business Administration, American University of the Middle East (AUM), Egaila, Kuwait; College of Engineering and Technology, American University of the Middle East (AUM), Egaila, Kuwait; School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu, 21030 Kuala Terengganu, Terengganu, Malaysia; College of Business Administration, American University of the Middle East (AUM), Egaila, Kuwait
Abstract: Entity resolution (ER) with imperfection management is widely accepted as a major aspect of integrating heterogeneous information sources that exhibit entities with varied identifiers, abbreviated names, and multi-valued attributes. Many novel integration applications, such as personal information management and web-scale information management, require the ability to represent and manipulate imperfect data. This requirement raises issues that span the whole process, from starting with imperfect data to producing a probabilistic database. However, the classical data integration (CDI) framework fails to cope with this requirement for explicit management of imperfect information. This paper introduces an alternative integration framework, based on the best-effort perspective, to support the automation of instance integration. The new framework explicitly incorporates probabilistic management into the ER tasks. This probabilistic management comprises a new probabilistic global entity, a new pair-wise source-to-target ER process, and probabilistic decision model logic as alternatives. The paper presents how these processes operate together to address current challenges in integrating heterogeneous sources.
Keywords: data integration; information integration; uncertainty management; best-effort integration framework; probabilistic instance integration; data quality.
DOI: 10.1504/IJIIDS.2018.096592
International Journal of Intelligent Information and Database Systems, 2018 Vol. 11 No. 4, pp. 296-314
Received: 15 Oct 2017
Accepted: 12 Apr 2018
Published online: 06 Dec 2018