Title: Investigation and comparative analysis of data mining techniques for the prediction of crop yield

Authors: Kanwal Preet Singh Attwal; Amardeep Singh Dhiman

Addresses: Department of Computer Science and Engineering, Punjabi University, Patiala, India; Department of Computer Science and Engineering, Punjabi University, Patiala, India

Abstract: Crop yield is affected by climatic, management, geographical, biological and other factors. Data mining techniques can be used to analyse the effect of these factors on crop yield and to predict yield from them. This paper sets out the sequence of steps to be followed in the data mining process for crop yield prediction, from determining the research goals to applying data mining techniques to build a model. The study applies this process to build a model for predicting paddy yield from different climatic factors. It also provides insight into the metrics that can be used to evaluate supervised data mining techniques. The metrics are divided into three categories: threshold evaluation metrics, numerical evaluation metrics, and build time and size metrics. Five supervised data mining techniques are compared on the basis of their performance in these three categories.
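Illustrative note: the three metric categories named in the abstract can be sketched in a few lines of Python. The abstract does not list the individual metrics or the tooling used in the paper, so everything below is an assumption made for illustration: accuracy, precision and recall stand in for threshold metrics, MAE and RMSE for numerical metrics, and wall-clock training time plus pickled model size for build time and size. The function names and the `train_majority` learner are hypothetical.

```python
import math
import pickle
import time


def threshold_metrics(y_true, y_pred, positive):
    """Accuracy, precision and recall, treating one class label as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def numerical_metrics(y_true, prob_positive, positive):
    """MAE and RMSE between predicted probabilities and 0/1 targets."""
    errors = [
        p - (1.0 if t == positive else 0.0)
        for t, p in zip(y_true, prob_positive)
    ]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


def build_time_and_size(train, rows, labels):
    """Return the model, training time in seconds, and pickled size in bytes."""
    start = time.perf_counter()
    model = train(rows, labels)
    elapsed = time.perf_counter() - start
    return model, elapsed, len(pickle.dumps(model))


def train_majority(rows, labels):
    """Hypothetical stand-in learner: remembers the most frequent label."""
    return {"label": max(set(labels), key=labels.count)}
```

Calling each function on the same held-out predictions for every candidate technique gives one row per technique across the three categories, which is the shape of comparison the abstract describes.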

Keywords: agricultural data mining; yield prediction; data mining process; data mining tasks; data mining techniques; classification techniques; classification evaluation metrics.

DOI: 10.1504/IJSAMI.2020.106540

International Journal of Sustainable Agricultural Management and Informatics, 2020 Vol.6 No.1, pp.43 - 74

Received: 01 Aug 2019
Accepted: 31 Aug 2019

Published online: 09 Apr 2020
