
 

 

Title: Terms analytics service for CouchDB: a document-based NoSQL

 

Authors: Richard K. Lomotey; Ralph Deters

 

Addresses:
Department of Computer Science, University of Saskatchewan, Saskatoon, Canada
Department of Computer Science, University of Saskatchewan, Saskatoon, Canada

 

Abstract: The scientific, industry and research communities now have to deal with the potential of 'Big Data'. The high-dimensional data (in digitised format) at our disposal can create opportunities such as the discovery of new knowledge, the creation of new online communities, and the improvement of product and service delivery. The challenge, however, is that there are only a few research efforts, architectural designs and tools that can aid data mining processes on NoSQL databases. By focusing on terms and topic mining, this work proposes a data analytics framework that enables knowledge discovery through information retrieval and filtering from a document-based NoSQL database (specifically, CouchDB). The tool is built and tested using two algorithms, namely the inference-based apriori and the Baum-Welch algorithm. Preliminary test results for the proposed tool are also discussed in terms of the accuracy of each algorithm; of the two, the inference-based apriori model performs better.

 

Keywords: data mining; NoSQL databases; Bayesian rule; unstructured data; inference-based apriori; hidden Markov model; HMM; Baum-Welch algorithm; analytics-as-a-service; AaaS; big data; data analytics; knowledge discovery; information retrieval; filtering.

 

DOI: 10.1504/IJBDI.2015.067567

 

Int. J. of Big Data Intelligence, 2015 Vol.2, No.1, pp.23-36

 

Submission date: 18 May 2014
Date of acceptance: 22 Aug 2014
Available online: 17 Feb 2015

 

 
