Title: Big data: web-crawling and analysing financial news using RapidMiner

Authors: Jesse Lane; Hak J. Kim

Addresses: Hofstra University, Hempstead, NY 11590, USA; Hofstra University, Hempstead, NY 11590, USA

Abstract: Big data is one of the hottest topics in the ICT field today, yet many questions remain about what it is, what it really means, and how it can be used. This paper presents the notion of big data and then attempts to analyse it using a typical analytics tool, RapidMiner. We use real-world social media data as an empirical test. Our preliminary results show that social media data does not provide information useful for predicting the stock market. However, we believe that the analysis of big data is meaningful if more sophisticated methodology and data collection procedures are used.

Keywords: big data; social media; financial news; RapidMiner; web crawling; news analysis; stock markets; stock market prediction; machine learning; data mining; text mining; predictive analytics; business analytics.

DOI: 10.1504/IJBIS.2015.069064

International Journal of Business Information Systems, 2015 Vol.19 No.1, pp.41-57

Available online: 13 Apr 2015
