Int. J. of Big Data Intelligence   »   2017 Vol.4, No.2

 

 

Title: Optimising the calculation of statistical functions

 

Authors: André Rodrigues; Carla Silva; Paulo Borges; Sérgio Silva; Inês Dutra

 

Addresses:
NLPC Lda., Praça Mouzinho de Albuquerque, 113 – 5º, 4100-359 Porto, Portugal
NLPC Lda., Praça Mouzinho de Albuquerque, 113 – 5º, 4100-359 Porto, Portugal
NLPC Lda., Praça Mouzinho de Albuquerque, 113 – 5º, 4100-359 Porto, Portugal
NLPC Lda., Praça Mouzinho de Albuquerque, 113 – 5º, 4100-359 Porto, Portugal
Department of Computer Science, CRACS INESC TEC and University of Porto, Rua do Campo Alegre, 1021, 4169-007, Porto, Portugal

 

Abstract: Statistical data analysis methods are well known for their difficulty in handling large numbers of instances or parameters. In this paper, we study popular statistical functions commonly applied in data analysis and assess their performance as implemented in SPSS, MATLAB, R and our own software, DataIP. Using medium to large datasets, we show that DataIP outperforms SPSS, MATLAB and R by several orders of magnitude. We argue that the design and implementation of these functions need to be rethought to meet today's data challenges.

 

Keywords: statistical data analysis; statistical functions; performance evaluation; SPSS; MATLAB; optimisation.

 

DOI: 10.1504/IJBDI.2017.10002936

 

Int. J. of Big Data Intelligence, 2017 Vol.4, No.2, pp.123-138

 

Submission date: 18 Mar 2016
Date of acceptance: 12 Sep 2016
Available online: 01 Feb 2017

 

 
