Title: Unsupervised topics extraction of microblogging posts: a DBpedia-based approach

Authors: Fahd Kalloubi; El Habib Nfaoui; Omar El Beqqali

Addresses: LIIAN Laboratory, Sidi Mohamed Ben Abdellah University, Fez, Morocco; LIIAN Laboratory, Sidi Mohamed Ben Abdellah University, Fez, Morocco; LIIAN Laboratory, Sidi Mohamed Ben Abdellah University, Fez, Morocco

Abstract: Automatic topic extraction has received considerable attention on the social web, since many applications that process social data use this technique to extract the central ideas in social media posts. Moreover, these applications must extract entities, link them to entities in a knowledge base and classify them into a set of topics. However, few systems address linking and classification together, especially in the context of micro-posts, and most of those that do are supervised. In this paper, we present a novel system for unsupervised topic extraction from micro-posts based on DBpedia, a community effort to extract structured information from Wikipedia. Our approach leverages the taxonomic nature of DBpedia to process a given tweet with a hierarchical resolution. Finally, to show the effectiveness of our system, we compare it with a well-known system for social media text.

Keywords: information extraction; unsupervised topic extraction; named entity linking; semantic web; social web; linked open data; natural language processing; NLP; DBpedia; centrality algorithm; tweet annotation; tweets; Twitter; microblogging posts; classification; social media.

DOI: 10.1504/IJCAET.2017.084912

International Journal of Computer Aided Engineering and Technology, 2017, Vol. 9, No. 3, pp. 337-350

Received: 18 Jun 2015
Accepted: 20 Oct 2015

Published online: 09 Jul 2017
