Reinforced XGBoost machine learning model for sustainable intelligent agrarian applications

Abstract

The development in science and technical intelligence has incited to represent an extensive amount ofdata from various fields of agriculture. Therefore an objective rises up for the examination of the available data and integrating with processes like crop enhancement, yield prediction, examination of plant infections etc. Machine learning has up surged with tremendous processing techniques to perceive new contingencies in the multi-disciplinary agrarian advancements. In this pa- per a novel hybrid regression algorithm, reinforced extreme gradient boosting is proposed which displays essentially improved execution over traditional machine learning algorithms like artificial neural networks, deep Q-Network, gradient boosting, ran- dom forest and decision tree. Extreme gradient boosting constructs new models, which are essentially, decision trees learning from the mistakes of their predecessors by optimizing the gradient descent loss function. The proposed hybrid model performs reinforcement learning at every node during the node splitting process of the decision tree construction. This leads to effective utilizationofthesamplesbyselectingtheappropriatesplitattributeforenhancedperformance. Model’sperformanceisevaluated by means of Mean Square Error, Root Mean Square Error, Mean Absolute Error, and Coefficient of Determination. To assure a fair assessment of the results, the model assessment is performed on both training and test dataset. The regression diagnostic plots from residuals and the results obtained evidently delineates the fact that proposed hybrid approach performs better with reduced error measure and improved accuracy of 94.15% over the other machine learning algorithms. Also the performance of probability density function for the proposed model delineates that, it can preserve the actual distributional characteristics of the original crop yield data more approximately when compared to the other experimented machine learning models.

Keywords

Crop yield prediction reinforcement learning extreme gradient boosting intelligent agrarian application

1 Introduction

On all accounts of learning and perception, individuals have developed an enormous framework of data that empowers them to understand and foresee fluctuating natural incongruity. Machine learning, an articulation of the more extensive field artificial intelligence and a prompt successor of statistical modeling utilizes this massive data framework to define the underlying significant information [26]. Exuberant advances in machine learning have evidently in- definite potential results. Machine learning has surged together with huge data advances and better measure implementation that make new opportunities to envision data exhaustive processes in agrarian functional environment [19]. Agriculture essentially being a conventional source of sustenance is a standout amongst the most demanding sectors in a country’s economy. Numerous researchers and experts in current agribusiness are examining their hypothesis at an undeniable predominant scale, in assistance to progressively ac- curate and reliable forecast [16, 24]. Existing agrarian frameworks can explore necessarily more machine learning methods to utilize upgrades and essentiality more productively, conforming to various environ- mental changes. In the agrarian development machine learning enhances crop administration that aids in forecasting crop yields [39], management of crop disease [22 , 44], plant weed recognition [31, 35], affirmation of plant arrangements [2, 3], management of soil characteristics [18], estimation of agro-climate and rain- fall [1, 9], monitoring ground water characteristics [5], etc.

Machine learning algorithms are broadly classified as supervised, unsupervised and semi supervised depending on the training dataset. Supervised learning incorporates analysis on a pre-characterized arrangement of training models. Unsupervised learning doesn’t comprise of pre-trained data but are generally dependant with finding associations inside information [25]. Further more to the above mentioned machine learning approaches another class of machine learning approach is characterized as reinforcement learning [45]. Reinforcement learning resembles the performance of an intelligent agent that describes in achieving a task with an increasingly rewarding action. This ensures in constructing a better non-linear frame- work with a robust performance. Machine learning algorithms sustain near or even improved potential in contrary to traditional statistics in expressing the linear or non-linear relations for the agricultural frameworks, and in addition has imperative predicting capability.

Initially at the start of every cropping season, agrarian coordinators require evaluating the yield for arduous crops. Predicting the crop yield is critical since it depends on different associated factors like climate, soil, ground water, manures and pests, rotation of crops, irrigation system, etc. In addition crop yield also varies transiently and spatially with a non-linear aspect manifesting huge variation every year. The advantage of machine learning advancement for agronomical applications is that they are sufficiently precise, non-destructive and yield enduring results. Gradient boosting is a supervised machine learning algorithm for regression and classification based problems, which delivers a better forecast model from an ensemble of weak predicting models [15]. The algorithmic optimizations are performed in function space rather than parameter space, making simple use of custom loss function. Boosting concentrates gradually at every step on crucial precedents providing a decent strategy to manage unbalanced datasets by reinforcing the effect of positive class. In comparison to a single machine learning framework a hybrid approach of associating two or more approaches to attain better execution model is still under research. Endorsing this approach for study, a hybrid model accommodating the functionality of the extreme gradient boosting (XGBoost) algorithm with reinforcement learning is analyzed.

In this paper, a new hybrid forecasting machine learning algorithm based on extreme gradient boosting and reinforcement learning is constructed to strengthen the crop yield forecasting accuracy with rewarding iterations. With regards to the ensemble based methods gradient boosting is particularly convincing for the proposed application. Gradient boosting repeatedly develops a group of learners, with an objective that every new learner exerts to address the errors of its forerunners by improving certain differentiable loss function. A reinforcement learning model is imported in the tree construction process to achieve a predominant predictive efficiency of the crop yield prediction model. The rest of the paper is organized as follows. Section 2 presents the review of existing works in literature. Section 3 explains in detail the proposed hybrid reinforced XGBoost crop yield prediction model. Section 4 defines the agrarian dataset and the area considered for study. Section 5 explains the experimental frameworks and outcomes of the proposed hybrid model over other machine learning algorithms. Section 6 wraps up with conclusion and future works.

2 Related works

Dynamic progress in machine learning have considerably unlimited potential outcomes. Progressively to accomplish new contingencies machine learning has advanced together with enormous information progresses. It has been examined that agriculturist’s revenue rise or fall contingent upon the yields they secure from their harvests. In significance to reinforce the procedure of decision making it is imperative to perceive the well defined existing relationship between the crop yield and numerous factors impacting it [34]. Further- more improved measure prerequisites are in need to regulate, estimate and envision information extensive techniques in agrarian environment [43]. Several examiners and experts in present agribusiness are observing at their hypothesis at dynamically predominant scale, encouraging in achieving progressively exact and reliable prediction. Present day agrarian frameworks can determine significantly more machine learning advances to preserve water resources, utilize enhancements and vitality more considerably and incorporate to substantial environmental transformations.

Forecasting the crop yield is an exceptional process among the most convincing fields in precision agriculture. It is of high relevance for yield mapping, yield assessment, organizing crop market activity and crop agency to improve the effectiveness. Gradient boosting is one of the ensemble based supervised machine learning algorithms providing solutions to numerous agrarian applications. With respect to the application of crop yield prediction [46], gradient boosting algorithm has contributed to the construction of gridded crop models with downscaled spatial temporal approaches [11]. Analysis of the land satellite data to determine the land coverage for cropping [40], urban classification and formulating the essential vegetation indices using the gradient boosting algorithm is attained [41]. Gradient boosting algorithm also contribute to the determination and forecasting of essential climatic parameters assisting for crop yield like solar radiation, precipitation [28], solar irradiance [23] and wind energy [4]. As an intriguing contributor to precision agriculture by enhancing crop yield using the remotely sensed data, the gradient boosting machine learning algorithm has assisted in identifying the essential soil properties like soil organic matter (SOM), cation exchange capacity (CEC), magnesium (Mg), potassium (K), and pH [37]. A decision tree based gradient boosting algorithm has contributed in observing the soil moisture indications over the Tibetan plateau thereby contributing to crop based application in that area [47].

Another propelled machine learning approach is the reinforcement learning algorithms. In agrarian systems till date reinforcement learning has been exposed only to some constrained applications. Reinforcement learning focuses on interactions among the various states of an environment, obtaining result based training that can get familiar with the performance dependant interactions [42]. Some of the prevalent applications of reinforcement learning in agronomy are effective and dynamic resource allocation for superior economic benefits [45]. Construction of intelligent weed map dependant on spatial dissemination of weeds among the crop yields utilizing markov random field reinforcement learning framework [33]. Reinforcement learning also enables to interpret the relative forest quality based on a spatial optimizing agent [14]. Administering the maize crop irrigation scheme depending on dynamic reinforcement algorithm produced persistent outcomes to irrigation professional [8]. Reinforcement learning additionally helps with increasing water assignment for river basins encountering sparse water resource [6].

From the existing literature it is recognized that the hybrid prediction algorithms are limitedly examined, and the predominant techniques have constraints in their processing and assessment criteria. Accordingly a unique hybrid prediction algorithm, contrary to the existing hybrid techniques which incorporates the extreme gradient boosting ensemble algorithm and markov decision based reinforcement algorithm is proposed. Gradient boosting is an ensemble based machine learning algorithm which forms a group of weak learner to produce a strong base learner. Reinforcement learning provides a superior optimization framework that can be definitely characterized and conformed. The following section briefs about the proposed hybrid prediction algorithm for the enhanced crop yield prediction.

3 Proposed methodology

Machine learning has advanced together with enormous information growth and better measure persistence to conceive unique contingencies for agrarian systems by regulating, evaluating and acknowledging data intensive procedures. Constructing a machine learning model for the agrarian based framework is very arduous, since they are extremely eccentric in nature and they maintain a dynamic non-linear behavior. In designing an agrarian framework functional task should be analyzed with persistent description of actions; and the system’s execution should be adequately compelling to embrace continually dynamic activities.

Among the ensemble based algorithms gradient boosting is exceptionally compelling to the proposed work. Gradient boosting iteratively constructs an ensemble of learners, essentially decision trees, so that each new learner endeavors to correct the errors of its predecessors, enhancing some differentiable loss function. To accomplish a superior prognostic performance for the crop yield prediction model, reinforcement based learning model is introduced in the extreme gradient boosting ensemble machine learning algorithm. Reinforced extreme gradient boosting is a hybrid tree based model that delineates a basically improved execution over the existing machine learning models. The proposed hybrid model implements reinforcement learning at each variable split selection of decision tree, which are the learners of the boosting algorithm. This empowers the trees constructed to handle the accessible information samples in an increasingly improved manner. In the forthcoming sections the extreme gradient boosting and random forest algorithms are explained in de-tailed followed bythe proposed hybrid prediction model.

3.1 Extreme gradient boosting (XGBoost) algorithm

In XGBoost stands for “eXtreme Gradient Boosting”, and was introduced by TianqiChen. XGBoost implements the gradient boosting decision tree algorithm [21]. Boosting is an ensemble based procedure where the new models are added to address the errors made by the existing models. In boosting techniques the selection of subsamples are made more intelligently by adding weights to the misclassified observations or errors. Gradient boosting is a methodology where the new models are made that predict the errors or residuals of earlier models and are then included together to make the final prediction. It is called gradient boosting for the reason that it utilizes the gradient descendant algorithm [12] to reduce the loss while including new models. This methodology supports both regression and classification predictive modeling problems. The statistical architecture gives boosting a role as a numerical optimization problem where the motivation is minimize the loss of the model by including weak learners utilizing a gradient descendant procedure. The XG Boost concentrates on the following three elements during the tree construction:

Loss function optimization

Enabling a weak learner to make predictions

An additive model to add weak learners to reduce the loss function

The XGBoost algorithm fits the ensemble models of the type as in Equation (1) $g (x) = \sum_{m = 0}^{M} g_{m} (x) = g_{0} (x) + \sum_{m = 1}^{M} θ_{m} φ_{m} (x)$ (1)

Here g₀ (x) is the initial guess, θ_m is the weight for the ‘m’th estimator, φ_m (x) is the base estimators of the ‘m’th iteration. The product of weight and the base estimator θ_mφ_m (x) is defined as a “step” at the ‘m’th iteration.

The boosting algorithm to be solved by the XGBoost using the gradient descent for each iteration can be viewed as follows in Equation (2)

$\begin{matrix} {θ_{m}, φ_{m}} = & argmin \sum_{i = 1}^{n} L (y_{i}, g^{(m - 1)} (x_{i}) \\ + θ_{m} φ_{m} (x_{i}) \end{matrix}$ (2)

Here g^(m-1) represents the current estimation. In this way the ensemble problem is improved greedily as a forward stagewise additive model. The ensembles are not optimized in a global manner, but rather enhance the outcome dependant on the current estimate as shown in Equation (3). At each iteration a regression tree model is fitted to forecast the negative gradient. Generally the squared error is utilized as a surrogate loss. $φ_{m} = argmin \sum_{i = 1}^{n} {[(- g_{m} (x_{i})) - φ (x_{i})]}^{2}$ (3)

XGBoost is faster on comparison to the normal gradient boosting because for gradient boosting the weights are generally the aggregate of the gradients. While for XGBoost it is the sum of the gradients scaled by the sum of the hessians. This results in faster execution of XGBoost eliminating the necessity for line search as in gradient boosting. XGBoost also enhances more randomization using sub sampling for optimum results.

3.2 Extreme Gradient Boosting (XGBoost) Algorithm

Reinforcement learning is an intriguing classification of machine learning algorithms, where an agent in an environment learns to act bcarrying out rewarding actions. The actions are performed successively that is, the outcome depends on the contribution of the current state. For this procedure of study a Markov decision process (MDP) is characterized that sustains the compliance for embracing the reinforcement learning problems. MDP is a tuple (S, A, Psa, D, R) which represents set of all possible states (S) of the agent; set of all possible actions (A) an agent can engage into; the state transition probability (Psa) from one state to another by an agent; a discount factor (D) which includes value from 0 to 1; and a reward function (R).

For every state a policy π is defined which determines the action the agent performs. The policy is generally a function mapping the states and the actions of an agent. For a given stateaction pair (s, a) the value Vπ (s, a) is an analogy of the expected future reward. The value function [20] satisfying the Bellman equations and is given in Equation (4).

$V^{π} (s, a) = R (s, a) + γ \sum_{s^{'} \in S, a \in A} P_{s π (s, a)} (s^{'}, a) V^{π} (s^{'}, a)$ (4)

The reward R is discounted by a discount factor γ. An agent’s goal or objective is to identify an optimal policy π* as shown in Equation (5), that expands the sum of the discounted rewards amid the agent’s life time. $π^{*} (s) = \begin{matrix} argmax \\ a ɛ A \end{matrix} γ \sum_{s^{'} \in S} P_{sa} (s^{'}, a) V^{*} (s^{'}, a)$ (5)

The value function that is obtained using the optimal policy forms the optimal value function as defined in Equation (6).

$V^{*} (s, a) = R (s, a) + \begin{matrix} max \\ a ɛ A \end{matrix} γ \sum_{s^{'} \in S} P_{sa} (s^{'}, a) V^{*} (s^{'}, a)$ (6)

In reinforcement learning to accomplish the best result, the expected cumulative reward should be augmented.

3.3 Reinforced extreme gradient boosting

The proposed reinforced XGBoost prediction algorithm is a predominant extreme gradient boosting of decision trees with uniquely selected split variables. So as to recognize the most significant variable at any internal node, formerly introduce the XGBoost and appropriate the importance measures for all the cofactors. Further utilizing the essential variables the nodes are split. While executing this recursively in every daughter node of the tree, the variables are chosen which prompts to a tree that is adaptable to a minimum prediction error in the more extended run. The splitting of nodes continues until the samples obtained are meeting the stopping criterion and the gradient descent loss function for the tree models are minimized. The proposed hybrid prediction algorithm framework is presented in Fig. 1.

Fig. 1

Framework of proposed hybrid Reinforced XGBoost prediction model.

XGBoost performs based on the boosting technique; where the trees are build sequentially aiming to reduce the errors of the previous trees. Boosting comprises of the following steps:

Defining an initial model M0 to predict the target variable y associated with a residual (y-M0)

From the residuals of the previous step a new model r1 is fit.

A boosted version of M0 which is M1 is produced by combining M0 and r1, resulting in an lower error than M0.

M_{1} (x) \leftarrow M_{0} (x) + r_{1} (x)

(7)

When generalizing to m iterations Equation (7) can be rewritten as follows: $M_{m} (x) \leftarrow M_{m - 1} (x) + r_{m} (x)$ (8)

While constructing the loss function a gradient descendant algorithm is considered which enables to minimize any differential function. The average gradient component for every node is computed. For each node in considering the split the discount factor γ is multiplied with the residual accounting for the difference in the impact of every branch split. XGBoost assists in predicting an optimal gradient for every additive model. The following algorithm summarizes the reinforced XGBoost prediction algorithm.

Algorithm 1: Reinforced XGBoost Model

Input: Data Samples n, state value function v

Constructing Reinforced XGBoost Model

XGB ← empty set of trees

While training XGB do

L_t← empty list

For all state in v do

Reset environment

While timestamp i in state do

Read state s_i from environment

p ← empty set

For all action a do

p_a ← 1/|A_n|, where A_n is number of actions

For all trees t in XGB do

Compute weight w_s as per t

Initialize boosting algorithm M₀ (x):

M_{0} (x) = argmin \sum_{i = 1}^{n} L (y_{i}, γ)

Determine Gradient descent loss function:

g_{m} (x) = \frac{\partial L (y, M (x))}{\partial (M (x)} M (x) = M^{(m - 1)} (x)

Fit residual model r_m

Define Multiplicative factor γ

Boosted Model Mm(x):

M_m (x) ← M_m-1 (x) + r_m (x)

p ← p + w_s

Reward r_i← from a_i in environment

L_t ← L_t append (s_i, r_i, a_i, i)

For all a, s,i,r in L_t do

Compute v (s, a)

update v

Train ensemble tree t to predict action a from s

XGB ← XGB append tree t

The proposed approach ensures using the most significant variables for constructing the trees subsequently providing better prediction. The following sections briefs about the dataset utilized for the proposed method and the area under which the study is carried out.

4 Dataset and study area description

The data required for this examination is procured from the various village blocks of the Vellore district in the state of Tamil Nadu, India. The blocks of the Vellore district considered for the study include Ponnai, Arcot, Sholinghur, Ammur, Thimiri, and Kalavai. The paddy crop is considered for the examination of yield. Paddy being one of the dominant financial crops developed in this region therefore, this district is considered for analysis. In differing to the regular climatic and soil parameters, the dataset includes distinctive climate, soil, groundwater properties together with the fertilizer volume devoured by crops of the study area. Some of the parameters explored in the present study like evapotranspiration, ground frost frequency, aquifer characteristics, wet day frequency and groundwater nutrients are not recognized in the prior works existing in the literature. The data is taken for a period length of 30 years. Figure 2 presents a snapshot of few of the 37 dataset parameters of the actual paddy crop dataset used for the current study. The dataset contains paddy yield in terms of area cultivated (in hectares), paddy production (in tons) and yield acquired (in kg/hectare). The information pertinent to climatic components like precipitation, temperature, potential evapotranspiration, reference crop evapotranspiration and distinctive climatic parameters like ground frost frequency, diurnal temperature range, humidity, wind speed has been used which is acquired from the Indian water portal ‘metdata’ tool. The soil and groundwater properties comprises of soil PH, topsoil density, measure of the soil macronutrients (Nitrogen, Phosphorus a Potassium) present and the distinctive hydro-chemical properties of groundwater like aquifer type, transmissivity, permeability, electrical conductivity, pre-monsoon and post-monsoon micro-nutrients (sodium, magnesium and chloride) content in groundwater. Table 1 presents concise information about the different parameters utilized in the examination.

Fig. 2

Snapshot of few crop dataset parameters used for constructing the yield prediction models.

Table 1

List of dataset parameters and their description

S.No	Parameter Name	Description	Uts
1	Net Cropped Area	The total geographic area on which the crop has been planted at least once during a year	Integer (Ha)
2	Gross Cropped Area	Total area planted to crops during all growing seasons	Integer (Ha)
3	Net Irrigated Area	The total geographic area w has received irrigation throughout the year	Integer (Ha)
4	Gross Irrigated Area	The total area under crops that have received irrigation during all the growing seasons.
5	Area Rice	Total area planted for rice cp
6	Quantity Rice	Total production of rice.	Integer (Ton)
7	Yield Rice	The total amount of rice yield obtained	Integer (Ton)
8	Soil PH	A measure of the soil acidity and alkalinity.	Integer
9	Topsoil Depth	The soil outermost layer rich in organic matter and microorganisms	Integer (m)
10	N Soil	The amount of nitrogen present in the soil.	Integer (Kg/Ha)
11	P Soil	The amount of phosphorus present in the soil.	Integer (Kg/Ha)
12	K Soil	The amount of potassium present in the soil	Integer (Kg/Ha)
13	QNitro	Quantity of nitrogen fertilizers used	Integer (Kg)
14	QP2O5	Quantity of Phosphorus fertilizers used	Integer (Kg)
15	QK2O	Quantity of Potassium fertilizers used	Integer (Kg)
16	Precipitation	Rain or water vapor condensation from the atmosphere	Integer (mm)
17	Potential Evapotranspiration	Amount of evaporation that would occur in an area with sufficient water source.	Integer (mm/day)
18	Reference Crop Evapotranspiration	The evapotranspiration rate from a crop reference surface that is not short of water	Integer (mm/day)
19	Ground Frost Frequency	Number of days that refers to the condition when the temperature of the upper layer of the soil falls below the freezing point of water	Integer (No. of days)
20	Diurnal Temperature range	Difference between the daily maximum and minimum temperature	Integer (°C)
21	Wet Day Frequency	The number of days in which an amount of rain is recorded in an observed period of 24 hrs.	Integer (No.of days)
22	Vapour Pressure	The pressure exerted by water vapor in thermodynamic equilibrium with its condensed phase.	Integer (Hectopascal)
23	Maximum Temperature	The highest air temperature recorded.	Integer (°C)
24	Minimum temperature	The lowest air temperature recorded.	Integer (°C)
25	Average temperature	The average air temperature recorded.	Integer (°C)
26	Humidity	The amount of water vapor in the atmosphere.	Integer (Percentage)
27	Wind Speed	The rate at which the air moves in a particular area	Integer (Miles/ Hr)
28	Aquifer Area Percentage	Percentage of the area covered by an aquifer. The aquifer is a body of permeable rock that can contain or transmit groundwater.	Integer (Percentage)
29	Aquifer Permeability	A measure of the rock property which determines how water and other fluids can flow through it.	Integer (m/day)
30	Pre–Electrical Conductivity	Average Pre-monsoon Electrical conductivity of groundwater that determines the ability or power to conduct or transmit heat or electricity	Integer (Siemens/m)
31	Post-Electrical Conductivity	Average Post-monsoon Electrical conductivity of groundwater that determines the ability or power to conduct or transmit heat or electricity	Integer (Siemens/m)
32	Groundwater Pre-Magnesium	Average Pre-monsoon Magnesium level in groundwater	Integer (mg/L)
33	Groundwater Post-Magnesium	Average Post-monsoon Magnesium level in groundwater	Integer (mg/L)
34	Groundwater Pre-Sodium	The average Pre-monsoon Sodium level in groundwater	Integer (mg/L)
35	Groundwater Post-Sodium	Average Post-monsoon Sodium level in groundwater	Integer (mg/L)
36	Groundwater Pre-Chloride	Average Pre-monsoon Chloride level in groundwater	Integer (mg/L)
37	Groundwater Post-Chloride	Average Post-monsoon Chloride level in groundwater	Integer (mg/ L)

All these data parameters together form a productive dataset that aspires in enhancing the crop yield prediction, by achieving better precision over the conventional methodologies. The following section briefs the experimental results attained by the proposed approach and the other machine learning algorithms.

5 Results and discussion

Machine learning algorithms demand a permissible portion of information for efficient data processing. Information with consistent attributes streamlines the effort of finding regularities by eliminating the redundant features with respect to the learning objective. The viability of a learning model is constrained by evaluating the model distinctive execution measures or by examining the performance by various assessment measures. This section briefs about the experimental outcomes attained utilizing the proposed approach and the other existing machine learning algorithms.

During the advancement of machine learning models, the dataset is arbitrarily split into training set and test set, where the prevalent measure of information is taken for training. Despite the fact that the test dataset is small, there exists a possibility of overlooking certain significant data that may have enhanced the model. This may likewise brings about a high variance in the dataset. To regulate this problem, k-fold cross validation is used. It is a strategy that is utilized to evaluate the machine learning models by reassessing the training data for enhancing the performance. Randomly portioning a time series data for cross-validation results in temporal dependency as there prevails a constant dependence on past observation. At the same time a leakage from the response variable to lag variable will occur undoubtedly, leading to a non-stationarity in the information space. In order to overcome this forward chaining cross validation is performed. For the proposed reinforced extreme gradient boosting model, five-fold forward chaining cross validation is enforced. It starts with a small subset of data for training, predict for the following data and determining the efficiency of the predicted data. The same predicted data are confined as a segment for the following training subset and its consecutive data points are predicted. This results in contriving an increasingly precise model, where the model is constructed on past information and predicts the onward information. The results of five-fold forward chaining cross validation are tabulated in Table 2.

Table 2
Results of 5- fold forward chaining cross-validation for the Reinforced XGBoost model

Data subset Training Data Set (Years) Test Data Set (Years) R² Score Measure Correlation Measure

1 1996–2000 2001–2003 0.70 0.73

2 1996–2003 2004–2006 0.837 0.85

3 1996–2006 2007–2009 0.75 0.69

4 1996–2009 2010–2012 0.824 0.85

5 1996–2012 2013–2016 0.88 0.89

Data subset	Training Data Set (Years)	Test Data Set (Years)	R² Score Measure	Correlation Measure
1	1996–2000	2001–2003	0.70	0.73
2	1996–2003	2004–2006	0.837	0.85
3	1996–2006	2007–2009	0.75	0.69
4	1996–2009	2010–2012	0.824	0.85
5	1996–2012	2013–2016	0.88	0.89

Pre-processing of the dataset is carried out using the min-max scaling technique which normalizes the information. The whole information is partitioned into five subsets and then administered to training and testing with a test size of 0.25. The model is prepared all through the five-fold forward chaining cross validation process and the error metric is characterized. The determined error metric is the R2 score, which is delineated in every iteration and achieves the best value depicting the principal model efficiency.

The proposed model stability is defined by examining the model against different execution measures or definite evaluation measures. The empirical system which consolidates the enhance execution over the other machine learning algorithms and linear regression models. Generally decision trees produce lower bias and are likely to over-fit. Therefore ensemble procedures of running numerous trees are sufficient for pertinent predictions. Gradient boosting repeatedly develops an ensemble of learners, basically decision trees, with a goal that each new tree tries to address the mistakes of its predecessors, improving some differentiable loss function. Reinforced extreme gradient boosting is a hybrid tree based model that portrays an essentially improved execution over the current machine learning models. The proposed model enforces reinforcement learning at every variable split of the decision tree, which is the learner of the boosting algorithm. This engages the trees built to deal with open datasets in an undeniably improved way. The expected output is acquired by collecting the sum of the gradients scaled by the sum of the hessians. The proposed technique efficiently reduces both bias and variance. As larger number of trees may decrease the variance the number of trees is chosen depending on the precision function enforced by the computation time. Here the number of trees is chosen as 500 and the number of randomly chosen feature at each node is selected as 20. Contingent upon the idea of pruning that can fix potential over fitting, it is observed that above the sufficient number of nodes the efficiency prevailed to be the same prompting faster calculations. Hence the maximum depth of the tree is expanded up to 10 nodes. The learning rate is set to 0.1 and the least squares regression loss function is implemented for the tree construction. With this parameter framework, the proposed model resulted in an R2 value of 0.88 and an accuracy of 94.15%.

The proposed reinforced extreme gradient boosting model is examined and tested with five other substantial and significant machine learning algorithms like decision tree, random forest, gradient boosting, artificial neural networks and deep Q-network. For the construction of the feed forward propagation artificial neural network an input layer of 37 neurons representing the crop dataset parameters, a single hidden layer with 15 neurons and an output layer with one neuron representing the crop yield value is constructed. A sigmoid activation function is used to introduce non linearity in the neuron output for better non linear representations. The weights of the neural networks are assigned randomly. The crop model developed with artificial neural network resulted in a model accuracy of 91.5%. A convincing aspect of the assessment metrics is their capacity to segregate between the results of various learning models. The various assessment metrics considered for the proposed work are Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE) and Determination Coefficient (R2).

Mean Squared Error is one of the most frequently used regression loss function. It is the total sum ofthe squared distance between the predicted and the target variables. It is a risk function with respect to the determined value of squared error. The randomness of the MSE makes it to almost remain strictly positive. Since the model developed is a regression model, the linear regression line equation considered is given in Equation (9). $y^{'} = mx + c$ (9)

Here m is the slope and c is the intercept. After the identification of the optimal parameters the result is assessed using MSE. The minimum the MSE value the better the model performance. The formula for MSE is defined in Equation (10). $MSE = \frac{1}{N} \sum_{i = 1}^{n} (y_{i} - y_{i}^{'})^{2}$ (10) $y_{i}^{'} = {mx}_{i} + c$ (11)

Here y_i is the predicted value and $y_{i}^{'}$ is the actual value. To observe how each individual parameter affects the MSE partial derivatives are used with respect to m and c. The function of MSE with respect to m and c is defined in Equation (12). $f (m, c) = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - ({mx}_{i} + c))}^{2}$ (12)

Differentiating Equation (12) with respect to m and b is defined in Equations (13) and (14). $\frac{\partial f}{\partial m} = \frac{1}{n} \sum_{i = 1}^{n} - 2 x_{i} (y_{i} - ({mx}_{i} + c))$ (13) $\frac{\partial f}{\partial c} = \frac{1}{n} \sum_{i = 1}^{n} - 2 (y_{i} - ({mx}_{i} + c))$ (14)

To attain better execution results the MSE should to maintained as minimum as possible which results in the tuning of the slope and intercept parameters of the regression equation.

Root Mean squared error is an intermittently used measure to observe the difference between the actual value and the value predicted by a model or an estimator. In other words it is just the square root of the mean squared error. RMSE peorms to accumulate the prediction error magnitudes of various time series into a single predictive measure. It is a measure of accuracy to analyze prediction errors of different models for a specific dataset since it is scale dependant. The formula for defining the RMSE is presented in Equation (15). $RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{n} (y_{i} - y_{i}^{'})^{2}}$ (15)

The RMSE in terms of an estimated parameter θ is defined in Equation (16). $RMSE (θ) = \sqrt{\frac{1}{N} \sum_{i = 1}^{n} (y_{i}^{θ} - y_{i}^{'})^{2}}$ (16)

Differentiating Equation (16) partially with respect to θ to observe the performance of the individual parameter is defined in Equation (17).

$\frac{\partial}{\partial θ} \sqrt{\frac{1}{N} \sum_{i = 1}^{n} (y_{i}^{θ} - y_{i}^{'})^{2}} = \frac{\frac{1}{n} \sum_{i = 1}^{n} (y_{i}^{θ} - y_{i}^{'}) y_{i}^{θ} \log (y_{i})}{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i}^{θ} - y_{i}^{'})}^{2}}}$ (17)

Mean Absolute Error is a loss function that is utilized for regression models. It is the sum of the absolute differences between the predicted and the target variables. Hence it determines the average degree of the errors in a group of predictions without taking into account the directions of predictions. Its value ranges from 0 to ∞. The MAE is described as follows in Equation (18). $MAE = \frac{\sum_{i = 1}^{N} | y_{i} - y_{i}^{'} |}{N}$ (18)

As described it is the arithmetic average of the absolute error $| y_{i} - y_{i}^{'} |$ . It is the most prevailing measure of prediction error in time series analysis. The optimal value of the mean absolute error of a variable a with respect to the random variable R is defines in Equation (19). $MAE = E (| R - a |)$ (19)

The probability distribution of R exists such that ‘mn’ is a median of R iff ‘mn’ is the reducer of MAE with respect to R. To be more specific ‘mn’ is a reference median iff ‘mn’ reduces the average mean of the absolute deviations. This median is defined as the multivariate or spatial median and is defined in Equation (20). $Spatial Median = E (| R - a | - | R |)$ (20)

This optimization of the median of the MAE is utilized for statistical data analysis.

The coefficient of determination is an evaluation metric used to assess a model and is closely analogous to correlation coefficient. It is the ratio of thevariance of the dependant variable which is predicted from the independent variable. It is ud to analyze how the differences in one variable can be defined by the differences in another variable. It possesses an advantage of being scale free and defines the value between 0 and 1. The formula to determine the coefficient of determination is defined in Equation (21). $R^{2} = {(\frac{n (\sum xy) - (\sum x) (\sum y)}{\sqrt{[n \sum x^{2} - (\sum {x)}^{2}] [n} \sum y^{2} - (\sum y)^{2}]})}^{2}$ (21)

To assure fair assessment of the results, the estimation is performed on both training and test set. Accordingly two sets of these six models were built to analyze the predicting accuracy. Table 3 defines the performance of the models on both training and test datasets.

Table 3

Accuracy and evaluation metrics of the proposed Reinforced XGBoost model and other machine learning models

Model	Training Dataset				Validation Dataset				Accuracy (%)
	RMSE	MSE	MAE	R²	RMSE	MSE	MAE	R²
Reinforced XGBoost	0.19	0.08	0.16	0.86	0.14	0.02	0.12	0.88	94.15
Artificial Neural Network	0.29	0.15	0.24	0.80	0.24	0.08	0.20	0.85	91.5
Deep Q-Network	0.16	0.10	0.22	0.83	0.13	0.05	0.15	0.88	92.13
Gradient Boosting	0.62	0.72	0.42	0.69	0.51	0.79	0.47	0.71	85.44
Random Forest	0.63	0.71	0.57	0.61	0.56	0.69	0.50	0.61	72.7
Decision Trees	0.73	0.78	0.61	0.63	0.68	0.73	0.57	0.52	68

In analyzing the importance of the residual spread, the performance and the accuracy of the model is determined. By means of adaptability and precision, the proposed reinforced XGBoost model outruns other machine learning models with an accuracy of 94.15% and improved error measure. On the other hand the performance of the artificial neural networks and deep-Q learning algorithm are analogously close to the proposed reinforced XGBoost model. Figure 3 explains the experimented model’s performance metric evaluation results.

Fig. 3

Evaluation metric results of the machine learning models.

In order to improve the exploratory performance of the model diagnostic plots are created for the regression analysis [27]. In determining the model efficiency, residuals which are the difference between the actual and predicted values are usually leftover after fitting a model to the data. A normal Q-Q plot defines if residuals are normally distributed. Q-Q plot represents good result if residuals are lined well on the straight dashed line [17]. Scale-Location plot also defined as the spread location plot depicts if the residuals are equally distributed along the predictor range [36].

The scale-location plot with a horizontal line and equally spread points represent better data distribution. The residual-leverage graph identifies the influential outliers in a regression model [10]. A margin is created using the cooks distance measure, the outliers possessing higher cooks distance score [13] or those occurring outside of the cooks distance are the influential outliers [32]. These outliers need to be observed carefully as they may enhance or degrade the models performance by their presence or absence. The residuals assisting in defining the unexplained patterns in the data by means of diagnostic graphs are represented in Fig. 4.

Fig. 4

Linear regression analysis diagnostic plots.

Evaluation and interpretation of the model efficiency is a fundamental part of the model enhancement process. It entitles in interpreting the ideal model for representing the information and administering the model for following timestamps. Accuracy describes the ratio of predictions that the model has predicted precisely. Accuracy interprets the closeness of the predicted values to the actual values. The model’s performance in terms of accuracy measure of the proposed reinforced XGBoost prediction algorithm along with the existing machine learning models is explained graphically in Fig. 5.

Fig. 5

Accuracy measure of the (a) Proposed Reinforced XGBoost algorithm, (b) Deep Q-Network algorithm, (c) Artificial neural network algorithm, (d) Gradient Boosting algorithm, (e) Random Forest algorithm, (f) Decision tree algorithm.

To ascertain whether the proposed reinforced XGBoost model sustained the actual data distributional properties, the probability density functions (PDF) of the actual crop yield data and the tested models are observed. PDF is an analytic interpretation that outlines probability distribution for a continuous random variable against a discrete random variable. In graphically characterizing the PDF, the area under the curve will define the interval where the predicted variable occurs. The complete area in the graph interval associates the probability of occurrence of the continuous random variable. It empowers to ascertain the probabilities of the range of outcomes. The probability density functions of the actual crop yield data and the predicted crop yield data using the proposed reinforced XGBoost model and other machine learning algorithms are defined in Fig. 6. It is performed to examine if the proposed model and the other experimented algorithms can preserve the distributional characteristics of the original crop yield data.

Fig. 6

Probability density functions of the actual crop yield data along with other experimented models: (a) Reinforced XGBoost, (b) Artificial Neural Networks, (c) Deep Q-Network, (d) Gradient Boosting, (e) Random Forest, (f) Decision Trees.

Model execution time is the time taken for the machine learning model to run the algorithm for the experimented dataset. It enables to determine the computational cost of training process and define better hyperparameter settings that eventually enhance the model accuracy. For the experimented models the execution time is measured for 1000 epochs. An epoch is a total performance of the dataset that is to be learned by a learning machine. All the experimented models are developed in Python environment.

The time module of the python enables to determine the model execution time using the perf_counter() function.

The hardware specification of the system used for experimenting the machine learning algorithms for the paddy crop dataset is as follows: Model –HP pavilion, CPU –3.2 GHz Intel core i5, RAM –8GB 1867 MHzDDR3, GPU –AMD Radeon r9 m390 2GB, OS –Windows 10. To determine the model computation time performance for the experimented models, the total time taken for building the training model and the time taken to determine the results from the test data are examined and are tabulated in Table 4. From the model computation time of the experimented machine learning models it is observed that the proposed reinforced XGBoost model comprises of faster computation time for both training and test data.

Table 4

Experimented machine learning model execution time

Model Name	Training Data	Test Data
	Execution Time (secs)	Execution Time (secs)
Reinforced XGBoost	4467	4774
Artificial Neural Network	5306	5488
Deep Q-Network	8049	7226
Gradient Boosting	6593	6164
Random Forest	6841	6163
Decision Trees	7741	7992

Decision Trees are essentially precise and agile. They are transparent in development and gives successfully interpretable principles that encourage in perceiving the significant fields. On account of huge datasets, they are expensive in processing information because of the development of an extensive number of sub-trees. More over decision trees doesn’t regulate outliers and missing values effectively. Random forests are simpler for legitimate analysis but its theoretical assessment is very unpredictable. Random forests use decision trees which are susceptible for over-fitting. In order to attain higher accuracy random forest develops large number of trees based on bagging. The underlying objective is to resample the data regularly and for every sample training a new tree. Different trees over-fit the data in various ways and based on voting the differences are averaged. Xgboost is a boosting algorithm which is constructed on weak trees. Its objective is to add a tree at a time, so that the next tree is trained to enhance the already trained ensemble. Boosting lowers the both bias and variance. By utilizing multiple models it handles variance and reduces the bias by training the consequent models based on the residuals obtained from the previous models. The proposed model with embedded XGBoost and reinforcement learning upgrades regularization refraining from data over-fitting. Reinforced XGBoost integrating sparsity aware splitting method empowers to deal data sparsity productively. Also the data is handled in non-continuous memory blocks, empowering it to be reuse in consecutive iterations rather than re-computing resulting in faster parallel data computation. The experimental results obtained illustrates the fact that the proposed reinforced extreme gradient boosting model presents better performing results than the other experimented machine learning models. On the other hand the outputs acquired from the artificial neural networks and the deep Q-learning models predict moderately closer results to the proposed approach. This can be inspected and acknowledged from the investigations in recent literature [7, 38], that the applied machine learning has explained the outcomes of deep learning and ensemble based systems are identical for reasonable sample size. The prior consideration of the proposed work is to keep the systems and techniques as transparent and more conceivable as possible, to lean in favor of reinforced XGBoost approach. Besides performing internal cross validation, the proposed method needs less parameter tuning and are simpler than neural networks, which are generally analyzed as black boxes. Though reinforced XGBoost model is implemented with un-pruned regression trees, limiting the tree’s depth is a choice. Still this doesn’t have any negative impact by means of over-fitting, but instead has a positive outcome for agile calculations. The results interpret improved model performance by bringing down the model’s complexity and experimental errors. The models accuracy measure concludes that the proposed reinforced extreme gradient boosting model is better with 94.15% over the other machine learning models. Further the model’s predicting ability is inspected from their performance with lower error measures. The following section concludes the paper and discusses the future scope.

6 Conclusion and future work

Agriculture is a prevalent sector among the most rigorous fields to encompass an ultimate objective of statistical assessment. There exists a staggering potential for machine learning to enhance agribusiness by incorporating varying factors like climate, soil characteristics, crop disease and pest infestations. Machine learning models secure an immense degree to interpret the significant data, transform the data acquired giving deeper insight to the process. A hybrid reinforced XGBoost prediction algorithm is proposed to predict the crop yield by coordinating the various climate, soil and ground water factors. A reinforced tree node splitting is introduced in the decision tree construction of the extreme gradient boosting framework. The proposed hybrid model with embedded XGBoost and reinforcement learning enhances regularization through L1 and L2 regression methods preventing over-fitting of the data. XGBoost utilizes block structure to store data, enabling the data to be reused in subsequent iterations rather than re-computing it. The gradient statistics are also stored in these non-continuous memory blocks. This results in faster parallel data computation of the proposed model. Thus the proposed reinforced XGBoost model enables faster execution and also precise prediction results when compared to the prevailing machine learning algorithms. The results achieved using the proposed reinforced XGBoost model illustrates improved performance by bringing down the model intricacy and experimental errors. This is attained by evaluating using various assessment metrics like Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE) and Determination Coefficient (R2). In addition the accuracy measure obtained concludes that the proposed model performs better with an accuracy of 94.15% over the other experimented machine learning models, which defines the model’s prediction capability and also preserved the original crop data distributional properties which are described by the probability density function graphs. Furthermore the diagnostic plots define the regression model exploratory performance by observing the influential outliers, distribution of the data and residual spread. The process of achieving a tight error bound for node splitting and contrasting it with other models becomes tedious. The ensuing step of future research to extend the proposed model to an easier embedded framework and relatively to a restricted baseline function in a prospect to accomplish tight error bound. Administering the reinforcement learning approach to several other agrarian applications along with varying hybrid algorithms furthermore can be an interesting area of research.

Footnotes

Acknowledgments

We thank the India water portal for providing the meteorological data relevant to climatic factors from their MET data tool. The MET data tool provides district wise monthly and the annual mean of each metrological indicator values. We also thank the Joint Director of Agriculture, Vellore, Tamil Nadu, India for providing the details regarding the soil and groundwater properties for the respective village blocks.

References

Gümüşçü

, Tenekeci

M.E.

and Bilgili

A.V.

, Estimation of wheat planting date using machine learning algorithms based on available climate data, Sustainable Computing: Informatics and Systems, (2019).

Khamparia

, Singh

, Luhach

A.K.

, Pandey

and Pandey

D.K.

, Classification and Identification of Primitive Kharif Crops using Supervised Deep Convolutional Networks, Sustainable Computing: Informatics and Systems (2019).

Gili

, Álvarez

, Bagnato

and Noellemeyer

, Comparison of three methods for delineating management zones for site-specific crop management, Computers and Electronics in Agriculture 139 (2017), 213–223.

Torres-Barrán

, Alonso

Á.

and Dorronsoro

J.R.

, Regression tree ensembles for wind energy and solar radiation prediction, Neurocomputing 327 (2019), 151–160.

Garg

, Gadi

V.K.

, Feng

Y.-C.

, Lin

, Qinhua

, Ganesan

and Mei

, Dynamics of soil water content using field monitoring and AI: A case study of a vegetated soil in an urban environment in China, Sustainable Computing: Informatics and Systems (2019).

Abolpour

, Javan

and Karamouz

, Water allocation improvement in river basin using Adaptive Neural Fuzzy Reinforcement Learning approach, Applied Soft Computing 7(1) (2007), 265–285.

Badrinarayanan

, Kendall

and Cipolla

, Seg Net: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 39(12) (2017), 2481–2495.

Bergez

J.E.

, Eigenraam

and Garcia

, Comparison between dynamic programming and reinforcement learning: A case study on maize irrigation management. In Proceedings of the 3rd European Conference on Information Technology in Agriculture (EFITA01). (2001).

Bouaita

, Moussaoui

and Bachari

N.I.

, Rainfall Estimation from MSG Images Using Fuzzy Association Rules, Journal of Intelligent & Fuzzy Systems 37(1) (2019), 1357–1369.

10.

Barrett

B.E.

and Brian

, Gray, Leverage, residual, and interaction diagnostics for subsets of cases in least squares regression, Computational Statistics & Data Analysis 26(1) (1997), 39–52.

11.

Folberth

, Baklanov

, Balkovič

, Skalský

, Khabarov

and Obersteiner

, Spatio-temporal downscaling of gridded crop model yield estimates based on machine learning, Agricultural and Forest Meteorology 264 (2019), 1–15.

12.

White

C.C.

and White

D.J.

, Markov decision processes, European Journal of Operational Research 39(1) (1989), 1–16.

13.

Kim

, Lee

and Park

B.U.

, Cook’s distance in local polynomial regression, Statistics & Probability Letters 54(1) (2001), 33–40.

14.

Bone

and Dragićević

, Simulation and validation of a reinforcement learning agent-based model for multi-stakeholder forest management, Computers Environment and Urban Systems 34(2) (2010), 162–174.

15.

Krauss

, Do

X.A.

and Huck

, Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500, European Journal of Operational Research 259(2) (2017), 689–702.

16.

Küçük

Ç.

, Taşkın

and Erten

, Paddy-Rice Phenology Classification Based on Machine-Learning Methods Using Multitemporal Co-Polar X-Band SAR Images, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 9(6) (2016), 2509–2519.

17.

Reimann

and Filzmoser

, Normal and lognormal data distribution in geochemistry: death of a myth. Consequences for the statistical treatment of geochemical and environmental data, Environmental Geology 39(9) (2000), 1001–1014.

18.

Deb

, Intelligent decision making device for residue incorporation in soil or biomass power plants, Journal of Intelligent & Fuzzy Systems (2020), 1–10.

19.

Elavarasan

, Vincent

D.R.

, Sharma

, Zomaya

A.Y.

and Srinivasan

, Forecasting yield by integrating agrarian factors and machine learning models: A survey, Computers and Electronics in Agriculture 155 (2018), 257–282.

20.

Elavarasan

and Vincent

P.M.D.

, Crop Yield Prediction Using Deep Reinforcement Learning Model for Sustainable Agrarian Applications, in, IEEE Access 8 (2020), 86886–86901. doi: 10.1109/ACCESS.2020.2992480

21.

Vincent

D.R.

, Deepa

, Elavarasan

, Srinivasan

, Chauhdary

S.H.

and Iwendi

, Sensors Driven AI-Based Agriculture Recommendation Model for Assessing Land Suitability, Sensors 19 (2019), 3667.

22.

, Yang

, Zhang

and Wan

, Identification of tea leaf diseases by using an improved deep convolutional neural network, Sustainable Computing: Informatics and Systems 24 (2019).

23.

Verbois

, Rusydi

and Thiery

, Probabilistic forecasting of day-ahead solar irradiance using quantile gradient boosting, Solar Energy 173 (2018), 313–327.

24.

Aghighi

, Azadbakht

, Ashourloo

, Shahrabi

H.S.

and Radiom

, Machine Learning Regression Techniques for the Silage Maize Yield Prediction Using Time-Series Images of Landsat 8 OLI, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11(12) (2018), 4563–4577.

25.

Jain

and Dixit

V.S.

, Recommendations with Context Aware Framework Using Particle Swarm Optimization and Unsupervised Learning, Journal of Intelligent & Fuzzy Systems 36(5) (2019), 4479–4490.

26.

Lee

J.H.

, Shin

and Realff

M.J.

, Machine learning: Overview of the recent progresses and implications for the process systems engineering field, Computers & Chemical Engineering 114 (2018), 111–121.

27.

Ferré

, Brown

S.D.

, Tauler

and Walczak

, 3.02 - Regression Diagnostics, Comprehensive Chemometrics, Elsevier (2009), 33–89.

28.

Fan

, Wang

, Wu

, Zhou

, Zhang

, Yu

, Lu

and Xiang

, Comparison of Support Vector Machine and Extreme Gradient Boosting for predicting daily global solar radiation using temperature and precipitation in humid subtropical climates: A case study in China, Energy Conversion and Management 164 (2018), 102–111.

29.

Karadağ

, Tenekeci

M.E.

, Taşaltın

and Bilgili

, Detection of pepper fusarium disease using machine learning algorithms based on spectral reflectance, Sustainable Computing: Informatics and Systems (2019).

30.

, Ma

, Ju

and Li

, Graph-based semi-supervised learning for spectral-spatial hyperspectral image classification, Pattern Recognition Letters 83 (2016), 133–142.

31.

Dang

L.M.

, Hassan

S.I.

, Suhyeon

, Sangaiah

A.K.

, Mehmood

, Rho

, Seo

and Moon

, UAV based wilt detection system via convolutional neural networks, Sustainable Computing: Informatics and Systems (2018).

32.

Pinho

L.G.B.

, Nobre

J.S.

and Singer

J.M.

, Cook’s distance for generalized linear mixed models, Computational Statistics & Data Analysis 82 (2015), 126–136.

33.

Bonneau

, Gaba

, Peyrard

and Sabbadin

, Reinforcement learning-based design of sampling policies under cost constraints in Markov random fields: Application to weed map reconstruction, Computational Statistics & Data Analysis 72 (2014), 30–44.

34.

Holzman

M.E.

, Carmona

, Rivas

and Niclòs

, Early assessment of crop yield from remotely sensed water stress and solar radiation data, ISPRS Journal of Photogrammetry and Remote Sensing (2018).

35.

Pantazi

X.E.

, Tamouridou

A.A.

, Alexandridis

T.K.

, Lagopodi

A.L.

, Kontouris

and Moshou

, Detection of Silybummarianum infection with Microbotryumsilybum using VNIR field spectroscopy, Computers and Electronics in Agriculture 137 (2017), 130–137.

36.

Rousseeuw

P.J.

, A diagnostic plot for regression outliers and leverage points, Computational Statistics & Data Analysis 11(1) (1991), 127–129.

37.

Khanal

, Fulton

, Klopfenstein

, Douridas

and Shearer

, Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield, Computers and Electronics in Agriculture 153 (2018), 213–225.

38.

Sangineto

, Nabi

, Culibrk

and Sebe

, Self paced deep learning for weakly supervised object detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 41(3) (2018), 712–725.

39.

Liu

, Cossell

, Tang

, Dunn

and Whitty

, A computer vision system for early stage grape yield estimation based on shoot detection, Computers and Electronics in Agriculture 137 (2017), 88–101.

40.

Godinho

, Guiomar

and Gil

, Using a stochastic gradient boosting algorithm to analyse the effectiveness of Landsat 8 data for montado land cover mapping: Application in southern Portugal, International Journal of Applied Earth Observation and Geoinformation 49 (2016), 151–162.

41.

Georganos

, Grippa

, Vanhuysse

, Lennert

, Shimoni

and Wolff

, Very High Resolution Object-Based Land Use–Land Cover Urban Classification Using Extreme Gradient Boosting, IEEE Geoscience and Remote Sensing Letters 15(4) (2018), 607–611.

42.

Sutton

R.S.

and Barto

A.G.

, Introduction to reinforcement learning, Cambridge: MIT Press 135 (1998).

43.

Rehman

T.U.

, Mahmud

Md.S.

, Chang

Y.K.

, Jin

and Shin

, Current and future applications of statistical machine learning algorithms for agricultural machine vision systems, Computers and Electronics in Agriculture 156 (2019), 585–605.

44.

Huang

, Yang

, Huang

and Qiao

, Detecting sugarcane borer diseases using support vector machine, Information Processing in Agriculture 5(1) (2018), 74–82.

45.

Jiang

, Hao

, Cai

and Ding

, An improved reinforcement-immune algorithm for agricultural resource allocation optimization, Journal of Computational Science 27 (2018), 320–328.

46.

Zhao

, Vergopolan

, Baylis

, Blekking

, Caylor

, Evans

, Giroux

, Sheffield

and Estes

, Comparing empirical and survey-based yield forecasts in a dryland agro-ecosystem, Agricultural and Forest Meteorology, 262 (2018), 147–156.

47.

Wei

, Meng

, Zhang

, Peng

and Meng

, Downscaling SMAP soil moisture estimation with gradient boosting decision tree regression over the Tibetan Plateau, Remote Sensing of Environment 225 (2019), 30–44.