OLAP textual aggregation approach using the Google similarity distance
by Mustapha Bouakkaz; Sabile Loudcher; Youcef Ouinten
International Journal of Business Intelligence and Data Mining (IJBIDM), Vol. 11, No. 1, 2016

Abstract: Data warehousing and online analytical processing (OLAP) are essential elements of decision support. For textual data, decision support requires new tools, mainly textual aggregation functions, that enable better and faster high-level analysis and decision making. Such tools provide textual measures to users who wish to analyse documents online. In this paper, we propose a new aggregation function for textual data in an OLAP context, based on the K-means method. This approach yields aggregates that are semantically richer than those provided by classical OLAP operators. The distance used in K-means is replaced by the Google similarity distance, which takes the semantic similarity of keywords into account when aggregating them. The performance of our approach is analysed and compared with that of other methods such as Topkeywords, TOPIC, TuBE and BienCube. The experimental study shows that our approach performs better in terms of recall, precision, F-measure, complexity and runtime.
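To make the idea of a keyword distance concrete, the following is a minimal sketch of the normalised Google distance in its standard published form, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), followed by the nearest-centre assignment step of a K-means-style loop. It is not the authors' implementation: the function names (`ngd`, `keyword_distance`, `assign_to_centres`) are hypothetical, and the counts are assumed to come from a small local document collection rather than from search-engine hit counts.

```python
import math

def ngd(fx, fy, fxy, n):
    """Normalised Google distance from occurrence counts.

    fx, fy: number of documents containing each keyword;
    fxy: number containing both; n: total number of documents.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # keywords never co-occur: treat as maximally distant
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    if denom == 0:
        return 0.0  # both keywords occur in every document
    return (max(lx, ly) - lxy) / denom

def keyword_distance(docs, a, b):
    """Assumption: counts are taken from a local corpus of keyword sets."""
    fx = sum(1 for d in docs if a in d)
    fy = sum(1 for d in docs if b in d)
    fxy = sum(1 for d in docs if a in d and b in d)
    return ngd(fx, fy, fxy, len(docs))

def assign_to_centres(docs, keywords, centres):
    """Assignment step only: attach each keyword to its nearest centre keyword."""
    clusters = {c: [] for c in centres}
    for kw in keywords:
        nearest = min(centres, key=lambda c: keyword_distance(docs, kw, c))
        clusters[nearest].append(kw)
    return clusters

# Toy corpus: each document is represented by its set of keywords.
docs = [
    {"olap", "cube"},
    {"olap", "cube", "warehouse"},
    {"text", "mining"},
    {"text", "mining", "olap"},
]
clusters = assign_to_centres(
    docs, ["olap", "cube", "warehouse", "text", "mining"], ["cube", "mining"]
)
```

In this toy corpus, "olap" is closer to "cube" than to "mining" because it co-occurs with "cube" in two documents and with "mining" in only one. A full clustering loop would also need a rule for updating the centres; since keywords have no arithmetic mean, choosing a medoid keyword per cluster is one possible option, again an assumption rather than something stated in the abstract.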

Online publication date: Fri, 06-May-2016
