Statistical Analysis and Evaluation of Feature Selection Techniques and implementing Machine Learning Algorithms to Predict the Crop Yield using Accuracy Metrics

Smaranika Mohapatra1

Neha Chaudhary2

1Department of Information Technology, Manipal University Jaipur, Jaipur, 303007, Rajasthan, India.

2Department of Computer Science and Engineering, Manipal University Jaipur, Jaipur, 303007, Rajasthan, India.

Abstract

Innovations in the agriculture sector help to increase the yield of farmland and to liberalize the market economy. Crop-yield prediction using selected features will help to develop and sustainably increase food production. In this study, feature selection was integrated with crop-yield prediction. The Decision Tree obtained an accuracy metric of 87.8 using the Random Forest importance method and forward feature selection, while Random Forest obtained 90.9 using the Random Forest importance technique and 91.6 using forward feature selection. The Support Vector Regressor obtained an accuracy score of 86.3 using forward feature selection and 82.5 using the Random Forest importance method. Multiple Linear Regression did not perform well, with an accuracy metric of up to 61.6 using information gain and 61.2 using a correlation heat map. K-Nearest Neighbor obtained 86.3 using backward feature selection and 84.8 using information gain. According to these results, the Random Forest regressor performs better than the other models, with the R2 metric evaluated as an average of 89.98 kg ha-1. Using feature selection made it possible to predict crop yield from the significant features, and it also improved the performance of the models.
