Authors: Vuong M. Ngo; Nhien-An Le-Khac; M-Tahar Kechadi
Addresses: Ho Chi Minh City Open University, Ho Hao Hon 35, District 1, HCMC, Vietnam; School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland; School of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
Abstract: The introduction of modern information technologies for collecting and processing agricultural data has revolutionised agricultural practices. Agricultural data mining is today considered a Big Data application in terms of volume, variety, velocity and veracity. Hence, it is both a challenge and a key foundation for establishing a crop intelligence platform. Such a platform, which processes vast amounts of complex and diverse information, will enable efficient resource management and high-quality agronomic decision-making. In this paper, we design and implement a continental-level agricultural data warehouse (ADW). ADW is characterised by its (1) flexible schema; (2) data integration from multiple real agricultural datasets; (3) data science and business intelligence support; (4) high performance; (5) high storage capacity; (6) security; (7) governance and monitoring; (8) consistency, availability and partition tolerance; (9) cloud compatibility. We also evaluate the performance of ADW and present some complex queries that extract and return the knowledge needed for crop management.
Keywords: data warehouse architecture; constellation schema; Hive; MongoDB; Cassandra; smart agriculture; agricultural data challenges.
International Journal of Business Process Integration and Management, 2020 Vol.10 No.1, pp.17-28
Received: 04 Dec 2019
Accepted: 28 Feb 2020
Published online: 19 Feb 2021