Title: Linking science: approaches for linking scientific publications across different LOD repositories

Authors: Arben Hajra; Klaus Tochtermann

Addresses: South East European University (SEEU), Tetovo/Skopje, Macedonia; Leibniz Information Centre for Economics (ZBW), Kiel/Hamburg, Germany | Leibniz Information Centre for Economics (ZBW), Kiel/Hamburg, Germany

Abstract: Enriching the content of a digital library (DL) with additional information from other DLs and domains would facilitate scholarly communication, scientific findings, and knowledge distribution. Implementing semantic technologies to interlink resources opens a new vision for interoperability among different DLs. This research therefore explores bibliographic Linked Open Data (LOD) repositories by investigating alignments among them. Global unigram frequency is applied to determine the importance of terms in the set of metadata. The semantic relatedness of the retrieved publications is measured by comparing two main approaches: a Vector Space Model using TF-IDF and cosine similarity, versus a deep learning approach using the Word2Vec implementation of word embeddings. In summary, the two approaches differ by 40.5% in the relevant publications they retrieve. In addition to the given metadata, word embeddings achieve better performance for short texts, such as publication titles.
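
To make the first of the two compared approaches concrete, the following is a minimal sketch of TF-IDF weighting with cosine similarity over publication titles. It is an illustration only, not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the helper names (tokenize, tfidf_vectors, cosine), and the sample titles are all assumptions introduced here. The Word2Vec side of the comparison is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so terms found in every document keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (count / len(tokens)) * idf[t] for t, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical titles, invented for illustration.
titles = [
    "linking scientific publications across linked open data repositories",
    "interlinking publications in linked open data repositories",
    "protein folding dynamics in yeast cells",
]

vectors = tfidf_vectors(titles)
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
print(f"related: {sim_related:.3f}, unrelated: {sim_unrelated:.3f}")
```

Because this representation only rewards exact term overlap, two short titles that share no words score zero even when they are topically close. That limitation is one plausible reason an embedding-based measure could behave differently on short texts such as titles.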

Keywords: digital libraries; linked open data; semantic web; word embeddings; data mining; recommender systems.

DOI: 10.1504/IJMSO.2017.090778

International Journal of Metadata, Semantics and Ontologies, 2017 Vol.12 No.2/3, pp.124 - 141

Received: 20 Nov 2017
Accepted: 18 Dec 2017

Published online: 27 Mar 2018
