Prediction of average daily gain of swine based on machine learning

Abstract

Understanding factors affecting growth rates in swine is important in the productivity of pig farms. We herein propose machine learning-based schemes to predict the average daily gain (ADG) of pig weight using temperature, humidity, feed intake, and the current weight of the pig. In order to address the lack of available growth data for pigs, we generate a synthetic dataset describing the weight of swine in relation to environmental factors based on the theoretical growth model and experimentally measured data, in an attempt to facilitate the application of machine learning techniques. Using the generated growth data, linear regression, tree regression, adaptive boosting (AdaBoost), and a deep neural network (DNN) are applied to estimate ADG. By means of a performance evaluation, we confirm that the machine learning algorithms are capable of predicting the ADG of swine accurately even when the growth characteristics of pigs are heterogeneous, i.e., each pig follows a different growth curve. Moreover, we also find that DNN can provide a higher predictive accuracy than other machine learning-based schemes.

Keywords

Average daily gain swine machine learning prediction deep learning

1. Introduction

Growth rates in swine are of the utmost importance in pig farming and in the swine industry in general. If the pig grows too slowly, it will take too long to mature, increasing the cost to the farmer. If growth is too rapid, the quality of the meat is usually poor, e.g., the backfat of the pig may be too thick [1, 2]; it may become vulnerable to disease and be more exposed to health risks, such that again the revenue of the pig farmer could be severely reduced. Accordingly, an accurate growth model for swine is necessary to allow them to be raised at a suitable growth rate on farms. A growth model of swine would also be useful in commercial pork production to determine the proper swine weight for slaughtering [1, 2], and to devise an optimal feeding strategy [3]. A growth model is also beneficial for educational purposes, too. These factors mean that extensive research has already been undertaken to develop a growth model for pigs [4 , 19]. In particular, given that growth is affected by many environmental factors, e.g., temperature, nutrition, and mineral concentration in feed, type of pig, humidity, feed intake, and birth weight [4], the relationship between growth and these factors has for some time been a major topic in swine research.

There are two ways to characterize the growth of swine. The first approach is via the use of growth curve. In most cases, the growth of an individual can be modeled as an asymmetric S-function such that the rate of growth increases gradually as the weight increases, then at some point the rate asymptotically approaches zero when it converges to a final body weight. Most of the current work on this topic focuses on the investigation of the appropriate parameters of the growth curve such that it matches actual measurements [5]. In [5] various growth-related measures were discussed, i.e., average daily gain of body weight (ADG) and average daily feed intake (ADFI), using various growth-curve models. In [6], the growth of pigs from birth to sixty days of age was investigated using three different growth curves, namely Bridges, Gompertz, and Michaelis-Menten, and the effect of birth weight on growth rate was evaluated. The growth of the Liangshan pig, a traditional Chinese breed, was examined using the Von Bertalanffy growth model in [7]. In [8], the growth characteristics of pigs of various genotypes and for various feeding strategies were examined using magnetic resonance imagery (MRI), in which the ratio of muscle to fat tissue was measured. The parameter estimation of the S-curve for pig growth was investigated in [9].

Another way to characterize the growth of a pig is via the use of the Net Energy (NE) model in which the metabolism of the pig is taken into account [10 –12]. The authors of [12] investigated the digestible energy, metabolizable energy, and the NE of various types of feed. The effect of specific nutrition on growth is considered in conjunction with metabolism, such as the effect of sodium zeolite-A and copper [4] and the effect of dietary soybean oil [13]. This type of research is generally undertaken in the feed industry to determine the nutritional characteristics of feeding programs. Although the growth of pigs has been investigated extensively, the current hand-crafted models only consider a few environmental factors and are unable to take account of the complicated interactions between different environmental conditions.

More recently the lower cost of sensors and the emergence of Internet-of-Things (IoT) technologies has meant that smart pig farms equipped with various sensors for collecting biological and environmental data are now more common [18]. In the near future, smart pig farms will allow huge amounts of data to be collected, yielding large amount of high-quality data on the growth of pigs. These data sets could allow machine learning techniques to be applied to predict pig growth.

There are a handful of studies in which machine learning techniques have been adopted in the field of animal science. In [14, 15], the productivity of dairy cattle, e.g., pregnancy rates and first-service conception rates, was analyzed using various machine learning techniques including decision trees and support vector machines. In [16], an abnormal livestock detection algorithm was developed based on deep learning. The productivity of sows was investigated using a machine learning scheme in [17]. In our previous work [18, 19], we developed an approach to predict growth rates in pigs, based on temperature, humidity, CO₂, and wind speed using a decision tree algorithm; however, the accuracy of this model was somewhat limited due to the lack of available data, i.e., only 264 growth records were used and only one machine learning scheme was considered.

Herein, we develop and compare four machine learning-based growth models of pigs based on environmental factors. The main contributions of our work are as follows:

2.
We develop a growth model of a pig as a means of predicting its daily weight gain (i.e., ADG) based on temperature, humidity, ADFI, and current weight, using four machine learning algorithms, namely linear regression, tree regression, adaptive boosting (AdaBoost), and deep neural network (DNN). To address the lack of experimental data, we artificially generate growth data for pigs by combining a conventional growth function based on an asymmetric S-function with a measured growth record. To the best of our knowledge, this is the first attempt to apply deep learning to predict growth in pigs.

We examine the accuracy of the developed growth model. Our results justify the use of machine learning techniques in the prediction of pig growth and reveal that the DNN-based model outperforms other prediction models. The validity of using a synthetic dataset is also confirmed.

The remainder of this paper is organized as follows. We describe the considered pig growth data in Section 2. In Section 3, the development of our machine learning-based pig growth model is explained. Our simulation results and conclusions are provided in Sections 4 and 5, respectively.
2. Pig Growth Data

In this section, we describe the characteristics of empirical pig growth data and outline some of the problems associated with these datasets. We then present the approach we use to generate synthetic pig growth records based on a conventional growth model and an experimental dataset.

2.1 2.1 Characteristics of Measured Pig Growth Data

In this paper, we use the dataset presented in [18] and [19], obtained from a small pig farm with three pigs over a period of 85 days during 2015. The dataset includes the average, maximum and minimum temperature, humidity, CO₂, and wind speed for each day. It also contains the weights of individual pigs for each day, from which the ADG of the pigs can be calculated. Unlike previous work investigating body composition, i.e., the ratio of fat to muscle in the carcass [4], we focus purely on the body weight of the pig. The statistics of the empirical data are summarized in Table 1.

Table 1
Statistics of Environmental Data of Pig Farm and Weight Data of Pigs. Avg., Std., Max., Min., Temp., Hum., and Wind. denote average, standard deviation, maximum, minimum, temperature, humidity, and wind speed, respectively

Parameter Avg. Std. Min. Max.

Avg. Temp. (°C) 22.41 2.04 17.12 25.97

Max. Temp. (°C) 27.71 3.25 20.07 34.07

Min. Temp. (°C) 17.86 3.79 7.97 22.80

Avg. Hum. (%) 68.38 17.95 34.30 99.90

Max. Hum. (%) 89.26 10.93 61.20 99.90

Min. Hum. (%) 43.01 21.20 12.60 99.90

Avg. CO₂. (PPM) 576.62 176.37 421.33 1854.72

Max. CO₂. (PPM) 778.14 226.91 538.00 2121.00

Min. CO₂. (PPM) 476.084 104.91 365.00 1030.00

Avg. Wind. (m/s) 0.37 0.19 0.02 0.71

Max. Wind. (m/s) 1.16 0.43 0.43 2.13

Min. Wind. (m/s) 0.02 0.04 0.00 0.15

Weight (kg) 58.16 11.58 34.50 85.60

ADG (kg) 0.64 1.10 -3.30 8.00

Parameter	Avg.	Std.	Min.	Max.
Avg. Temp. (°C)	22.41	2.04	17.12	25.97
Max. Temp. (°C)	27.71	3.25	20.07	34.07
Min. Temp. (°C)	17.86	3.79	7.97	22.80
Avg. Hum. (%)	68.38	17.95	34.30	99.90
Max. Hum. (%)	89.26	10.93	61.20	99.90
Min. Hum. (%)	43.01	21.20	12.60	99.90
Avg. CO₂. (PPM)	576.62	176.37	421.33	1854.72
Max. CO₂. (PPM)	778.14	226.91	538.00	2121.00
Min. CO₂. (PPM)	476.084	104.91	365.00	1030.00
Avg. Wind. (m/s)	0.37	0.19	0.02	0.71
Max. Wind. (m/s)	1.16	0.43	0.43	2.13
Min. Wind. (m/s)	0.02	0.04	0.00	0.15
Weight (kg)	58.16	11.58	34.50	85.60
ADG (kg)	0.64	1.10	-3.30	8.00

Table 1 indicates that the standard deviation of the temperature is low, because the pig farm considered here is equipped with an air conditioning system. We also show that the ADG of the pigs is about 640 g per day. Note that the range of ADG is large, i.e., the minimum is – 3.3 kg and the maximum is 8 kg, due to inaccuracies in the weights measured by the scales, caused by the movement of the pigs.

In our previous work [18, 19], we performed a machine learning analysis of pig growth based on this empirical dataset. However, the results were poor and it was hard to gain any meaningful insights from the analysis, because 1) the size of dataset was too small to apply machine learning algorithms properly, and 2) the measurement error of the weight was too high.

To solve this problem, we generated a synthetic dataset based on conventional pig growth (i.e., the Gompertz model) and empirical data. Although a synthetic dataset is used in place of real data, performance evaluation based on this synthetic dataset can provide meaningful insights regarding predictions based on machine learning [21]. Moreover, through performance evaluation, we find that prediction of ADG based on synthetic data provides a greater accuracy than that achieved using a small empirical dataset, which justifies the use of synthetic data. Furthermore, this kind of modeling approach can be a useful alternative to experiments using real animals, greatly reducing the time and resources required for evaluation and enabling more rapid assessments to be made [22].

2.2 2.2 Generation of Pig Growth Data

We used a Gomertz growth curve, which is one of most popular models of growth in pigs [5, 18]. In a Gompertz growth curve, the weight of a pig at day n, which we denote W (n), can be written as follows: $\begin{matrix} W (n) & = & W_{m} e^{- a e^{- kn}}, \end{matrix}$ (1) where W_m is the mature weight of the pig, a is the ratio of birth weight, W_b, to the mature weight of the pig, i.e., $\frac{W_{b}}{W_{m}}$ , and k is the growth coefficient. In general, the values of W_m, a and k depend on gender and species. For example, the value of W_m is larger for barrow than it is for gilt, with typical values of 290 kg ≤W_m≤ 340 kg for barrow and 250 kg ≤W_m≤ 300 kg for gilt [19]. The ADG of the pig at day n, without considering environmental factors, can then be obtained using W_me^-a (e^{e^kn} - e^{e^k(n-1)}).

Unlike most conventional approaches that only consider the effects of genetic differences between pigs, we consider the effect of environment on ADG. More specifically, we assume that daily average temperature and humidity affect growth in pigs [20]. This assumption can be verified using our experimental results, to which end we applied a decision tree algorithm, which is one of most popular machine learning tools [18], to the empirical dataset and derived a prediction model depending solely on either average temperature or humidity.

Fig. 1.

Predicted ADG based on decision tree depending on average temperature and humidity.

The ADG predicted using a decision tree is shown in Fig. 11. It can be seen from the results that the ADG of the pig is indeed affected by the environmental factors of temperature and humidity. We find that the predicted ADG decreases when the average temperature is lower than 21°C or higher than 24°C. Moreover, the predicted ADG also decreases if the average humidity is higher than 55%. Based on this observation, we take account of the effect of these two environmental factors in our model. In addition to temperature and humidity, we also consider the effect of ADFI on ADG recognizing that ADG increases when ADFI increases, as found in previous research [8 , 22].

Let e_temp (T_n), e_hum (H_n), and e_FI (F_n) be the effects of average temperature, humidity, and ADFI on ADG, respectively, where T_n, H_n, and F_n are average temperature, humidity and ADFI at day n, respectively.

First, e_temp (T_n) is modeled as follows: $\begin{matrix} e_{temp} (T_{n}) = {\begin{matrix} 1.0, for 21 < T_{n} < 24 \\ 0.6, otherwise . \end{matrix} \end{matrix}$ (2)

Second, e_hum (H_n) is modeled as follows:

$\begin{matrix} e_{hum} (H_{n}) = {\begin{matrix} 1.0, for H_{n} < 55 \\ 0.6, otherwise . \end{matrix} \end{matrix}$ (3)

Third, e_FI (F_n) is modeled as follows: $\begin{matrix} e_{FI} (F_{n}) = \frac{FI}{2.4} . \end{matrix}$ (4)

It should be noted that e_temp (T_n) and e_hum (T_n) are devised based on the results shown in Fig. 1, and e_FI (F_n) is formulated assuming that ADG is proportional to ADFI

In this approach, we use a simple linear relationship between ADGand ADFI. Use of a more complicated model, e.g., thegamma function [22], is left for future work.

In our model, the ADG at day n, ADG_n, is then formulated using T_n, H_n, F_n, and the weight at day n, which we denote as W_n, is given as follows: $\begin{matrix} {ADG}_{n} & = & (1 + ϵ) e_{temp} (T_{n}) e_{hum} (H_{n}) e_{FI} (F_{n}) \\ \times W_{m} e^{- a} (e^{e^{k \hat{n}}} - e^{e^{k (\hat{n} - 1)}}), \end{matrix}$ (5) where $\hat{n} = W^{- 1} (W_{n})$ and W^-1 (·) is the inverse function of the Gompertz growth curve in (1). Moreover, ϵ is a random variable that indicates the unaccounted randomness of growth, e.g., different genotypes of individual pigs and other environmental effects such as CO₂. In our model, the heterogeneity in the growth of individual pigs [5, 22] can be taken into account using different values of ϵ, W_m, and W_b for each pig. It is worth noting that the proposed model is identical to the Gompertz growth model when 21 < T_n < 24, H_n ≥ 55, F_n = 2.4, and ϵ = 0. However, given that the effects of environment factors, e.g., temperature, can also be taken into account, our model is more general compared to the Gompertz growth model, which highlights the benefits of the proposed model.

Using (1), W_n can be written as follows: $\begin{matrix} W_{n} & = & W_{b} + \sum_{k = 1}^{n - 1} {ADG}_{k} . \end{matrix}$ (6)

Fig. 2.

Correlation between average temperature and average humidity in an experimental pig farm.

In this paper, temperature, humidity and ADFI are randomly generated from the statistics of the empirical data. More specifically, we randomly generate temperature and humidity based on Gaussian distributions with mean and standard deviation coincident with the values given in Table 1. ADFI is randomly generated using a Gaussian distribution with mean 2 and variance 0.04 [20]. Furthermore, the correlation between temperature, humidity and feed intake is also considered. In Fig. 1, we show the probability density function (PDF) of the temperature and humidity, and the correlation between temperature and humidity obtained from empirical measurements. It can be seen from Fig. 1 that the average temperature and humidity are correlated and the covariance is calculated as 0.4. Moreover, from [20], we find that the temperature and ADFI are negatively correlated such that the ADFI decreases as the temperature increases. Based on these observations, the temperature, humidity and ADFI are generated to follow $\frac{픻 | (T_{n} - 픻 (T_{n})) (H_{n} - 픻 (H_{n})) |}{σ_{T} σ_{H}} = 0.4$ and $\frac{픻 | (T_{n} - 픻 (T_{n})) (F_{n} - 픻 (F_{n})) |}{σ_{T} σ_{F}} = \frac{픻 | (F_{n} - 픻 (F_{n})) (H_{n} - 픻 (H_{n})) |}{σ_{F} σ_{H}} = - 0.4$ , where $픻 | \cdot |$ is the expectation and σ_T, σ_H, and σ_F are the standard deviation of T_n, H_n, and F_n, respectively. In addition, the final weight W_m, and birth weight W_b, are uniformly distributed in the range 250 to 300 and 8 to 11, respectively, by considering the growth of gilt [19].

Fig. 3.

Growth curve of a pig under different environments.

In Fig. 1, we show the growth curve of pigs as generated by our model. The average growth (Avg. Growth), the growth when the ADFI is 2.2 kg (High ADFI), the growth when the temperature is 20°C (Low temp.), and the empirical growth (Measured) are all shown. These results indicate that our model matches the empirical data quite well and the effect of environment is correctly taken into account, i.e., pigs grow more slowly when the average temperature is low, cf. Fig. 1, and grow more rapidly when the ADFI is high.

3. Growth Prediction Based on Machine Learning

In this paper, we consider four machine learning algorithms to predict the weight gain of pigs based on average temperature, average humidity, feed intake, and current body weight

2
It isworth noting that neither birth body weight nor mature body weight, both of which affect growth, are used because these values are hard to obtain in practice on a pig farm. Nevertheless, from our performance evaluation we confirm that our prediction models are sufficiently accurate.

. For performance indicators, the mean square error (MSE) and the R2 score are considered. A detailed explanation of the considered machine learning algorithms and performance metrics are given in the following subsections.

3.1 3.1 Machine Learning Algorithms

Four regression based machine learning algorithms are used for growth prediction, namely linear regression [23], tree regression [23], AdaBoost [24] and deep learning [25]. First, in the linear regression algorithm, the ADG of a pig is estimated using a linear function of T_n, H_n, F_n, W_n. More specifically, ADG_n is estimated using the following function. $\begin{matrix} {ADG}_{n} & = & α_{T} T_{n} + α_{H} H_{n} + α_{F} F_{n} + α_{W} W_{n} + β, \end{matrix}$ (7) where α_T, α_H, α_F, α_W are coefficients and β is a bias.

In the linear regression, the coefficients and bias are found such that the MSE of ADG is minimized [23], i.e., ∑|ADG_true - ADG_n|² is minimized where ADG_true is the true value of ADG and ADG_n is the predicted value. In our dataset, α_T, α_H, α_F, α_W, and β are found as 0.022, – 0.005, 0.266, 0.0036, and – 0.35, respectively. We therefore take it that the effect of ADFI is dominant and the effect of humidity is low.

Second, in tree regression, the ADG of a pig is predicted using a tree structure constructed by taking entropy into account [23]. An example of using tree regression to predict the ADG of a pig where the depth of the tree is limited to 4 is shown in Fig. 1, in which we find that humidity, temperature and weight are the dominant factors.

Fig. 4.

Example of ADG prediction based on tree regression.

Third, AdaBoost is a meta-machine learning algorithm in which the output of multiple machine learning algorithms are combined using a weighted sum [24]. The AdaBoost algorithm is adjusted such that any mispredicted examples are repredicted properly. In this paper, the tree regression algorithm is used as a base classifier for AdaBoost.

Finally, in deep learning-based prediction, a DNN that mimics the neurons in a human brain is used to predict ADG. In a DNN, a near-optimal strategy to predict ADG can be learned in cognitive way via a back-propagation algorithm [25]. One particular benefit of deep learning is that it allows highly nonlinear and complicated characteristics to be modeled.

The considered DNN structure is shown in Fig. 1. In this work, we consider a simple DNN structure containing four sets of linear and rectified linear unit (ReLu) layers connected in sequence. The input of the whole DNN are the four features, which are temperature, humidity, ADFI, and weight, and the output is the predicted ADG. In the linear layer the matrix multiplication of weight and the addition of bias take place. Let x_L be the input of the linear layer, and the output then becomes L_wx_L + L_b, where L_w and L_b are the weights and bias of the linear layer, respectively. In our DNN model, the number of units in one linear layer is set to 50. The output of the linear layer is fed into ReLU, which provides nonlinearity

Note that without ReLU, the DNN becomes a simple large matrix computation that is unable to cope with nonlinear characteristics.

to our DNN structure [27]. In the ReLU layer, the output becomes the same as the input if the input is larger than 0 and it becomes 0 otherwise, i.e., the output becomes max(x_relu, 0) where x_relu is the input of the ReLu layer. The outputs of the final linear layer are gathered into a single unit, which then generates the predicted ADG. In order to prevent overfitting, a dropout is used. In addition, the batch size of the learning is set to 128 and the training epoch is set to 20. It should be noted that we consider a relatively small DNN structure compared with those described elsewhere in the literature, e.g., VGGnet for image classification [26], because in this case there are only four features. Nevertheless, through performance evaluation, we show that DNN-based prediction outperforms other machine learning-based predictions, which verifies the benefits of deep learning.

Fig. 5.

DNN structure used for ADG prediction.

Fig. 6.

MSE of machine learning-based predictions achieved by varying the randomness in the growth model.

Fig. 7.

R2 score of machine learning-based predictions achieved by varying the randomness in the growth model.

Fig. 8.

MSE of machine learning-based predictions achieved by varying the number of samples.

3.2 3.2 Performance Metric

In this paper, the MSE and the R2 score are considered as performance metrics. Let $F$ and $A$ be the set of features (temperature, humidity, ADFI, weight) and the ADG used for performance evaluation. The MSE of the predictor can then be written as $\frac{1}{| A |} \sum_{i \in F, j \in A} (F (i) - j)^{2}$ , where | · | is the cardinality of the set, F (i) is the predicted ADG of the pig when the environmental factors are i, and j is the true value of ADG. Note that when the prediction is accurate, the MSE will be close to zero. On the other hand, the R2 score is calculated as $1 - \frac{{SS}_{res}}{{SS}_{tot}}$ where ${SS}_{res} = \sum_{i \in F, j \in A} (F (i) - j)^{2}$ and ${SS}_{tot} = \sum_{j \in A} (픻 (A) - j)^{2}$ . When the prediction of ADG is accurate, the value of the R2 score is close to 1.

4. Performance Evaluation

In this section, we illustrate the accuracy of our machine learning-based ADG predictions. In the performance evaluation, we generate the synthetic growth dataset for 500 pigs using the methodology discussed in Section 2, where the daily records for 85 days are collected starting 79 days after birth. Accordingly, 42500 growth records are generated in total, where each record is composed of the average temperature, humidity, feed intake, current weight, and ADG. In the performance evaluation, a subset of growth data is randomly chosen, which we denote as N_sample. Then, among the N_sample sample data, 90% are used for training the machine learning algorithms (training dataset) and 10% are used for testing only (test dataset). In addition to the synthetic dataset, we also examine the accuracy when the empirical dataset is used for training (empirical dataset). Furthermore, we also consider the four machine learning-based predictions presented in Section 3, i.e., DNN based prediction, linear regression (LR)-based prediction, tree regression (TR)-based scheme, and AdaBoost-based prediction.

In Figs. 6 and 7, we show the accuracy of the growth predictions achieved by varying the effect of unaccounted randomness in the growth model, ϵ. To this end, we assume that ϵ is uniformly distributed in the range -P to P, and we vary the value of P in the performance evaluation. Moreover in the simulation, we assume N_sample = 20000.

As can be seen from the results, the MSE increases as P increases because the influence of the random effect on ADG becomes more significant. For the same reason, the R2 score decreases as P increases. However, it should be noted that both the DNN-based scheme and the LR-based scheme are robust to P for an empirical dataset. Furthermore, we find that the predictions are sufficiently accurate (i.e., MSE is close to zero and the R2 score is close to 1) for the training and test datasets; however, the accuracy is severely degraded for the empirical dataset. Especially, we find that the MSE and R2 score of the TR- and AdaBoost-based predictions are 0 and 1, respectively, for the training dataset. It is due to the overfitting of decision tree based algorithm such that proper regularization techniques should be taken into account to avoid this overfitting.

Fig. 9.

R2 score of machine learning-based predictions achieved by varying the number of samples.

From these results, we also find that the DNN-based prediction outperforms all other predictions for the test and empirical dataset in view of both the MSE and the R2 score, which confirms the benefit of deep learning. Although we do not show it in the performance evaluation, we find that DNN-based prediction yields a better performance using the synthetic dataset, compared with when the empirical dataset is used for training

For example, when the empirical dataset is used for training, the measured R2 score is -0.014, which is lower than the value shown in Fig. 7.

, which validates

Nevertheless, the achieved accuracy is still low such that more accurate empirical data must be collected; we leave this question for future work.

the use of a synthetic dataset.

In Figs. 6 and 7, we show the MSE and R2 scores of the predictions achieved by varying the number of samples, which is N_sample, when the value of P is set to 0.4. From these results, we find that the DNN-based prediction outperforms other predictions for the test and empirical datasets when the number of samples is large. In particular, we find that the DNN-based prediction has a much higher accuracy for the empirical dataset due to its ability to extract more generalized knowledge from the data, and this again confirms the benefit of deep learning technology. Moreover, we find that the MSE and R2 score of the TR- and AdaBoost-based predictions are 0 and 1, respectively, for the training dataset, same as Figs. 6 and 7 due to the overfitting. Furthermore, as expected, all predictions show a higher accuracy as the number of samples increases except for the empirical dataset because the performance of machine learning techniques is improved as the dataset becomes larger. Unlike other predictions, the DNN-based prediction shows a more accurate result for empirical data as the number of samples increases, which leads to the conclusion that DNN-based predictions are more appropriate for cases where big data are available. In addition, we find that the R2 score of the DNN scheme is rather low for the training dataset due to the use of dropout.

Fig. 10.

Predicted ADG achieved by varying temperature and weight.

In Fig. 10, we show the predicted ADG achieved by varying the temperature and weight. The humidity and ADFI are set to average values, N_sample = 20000, and P = 0.4. We find that the decision boundary of the linear regression is too simple (underfit) and those of the tree regression and AdaBoost are too complex (overfit). On the other hand, the DNN-based prediction shows a general and reasonable decision boundary, in which ADG is high when the average temperature lies between 20°C and 24°C, and increases as the weight increases.

Finally, in Fig. 11, we show the predicted ADG for the empirical dataset. The results show that the predicted ADG deviates from the empirical data due to the low accuracy. Moreover, we observe that the ADG of the DNN-based prediction is closest to the empirical data.

Fig. 11.

Predicted ADG for empirical dataset.

5. Conclusions

In this paper, algorithms to predict ADG in pigs using machine learning were explored, where various environmental factors, i.e., temperature, humidity, ADFI, and weight of pigs were taken into account. In order to overcome problems caused by a lack of growth data, we proposed a new way to generate the synthetic growth dataset by combining an empirical dataset with a conventional growth model. Then, four machine learning algorithms, namely linear regression, tree regression, AdaBoost, and DNN, were applied. We used simulations to validate the use of machine learning techniques for ADG prediction, and the characteristics of all the machine learning-based predictions were investigated. We found that DNN-based prediction showed the highest accuracy for the test and empirical datasets, which justifies the use of deep learning technology in livestock science. An interesting extension of this work would be the improvement of ADG predictions using more empirical datasets.

Footnotes

Acknowledgments

This work was supported by Rural Development Administration(RDA) through Cooperative Research Program for Agriculture Science & Technology Development, funded by Ministry of Agriculture, Food and Rural Affairs (MAFRA) (Project No. PJ01229901201801).

References

Y.S.

Kim ,

S.W.

Kim ,

M.A.

Weaver and

C.Y.

Lee , Increasing the pig market weight: World trends, expected consequences and practical considerations, Asian-Australasian Journal of Animal Sciences 18 (2005), 590–600.

M.J.

Park ,

B.C.

Park ,

D.M.

Ha ,

J.B.

Kim ,

K.S.

Jang ,

D.H.

Lee ,

G.T.

Kim ,

S.K.

Jin and

C.Y.

Lee , Effects of increasing market weight of finishing pigs on backfat thickness, incidence of the 'caky-fatty' belly, carcass grade, and carcass quality traits, Journal of Animal Science and Technology 55 (2013), 195–202.

C.F.M.

De Lange ,

B.J.

Marty ,

Birkett ,

Morel and

Szkotnicki , Application of pig growth models in commercial pork production, Canadian Journal of Animal Science 81 (2001), 1–8.

T.L.

Ward ,

K.L.

Watkins ,

L.L.

Southern ,

P.G.

Hoyt and

D.D.

French , Interactive effects of sodium zeolite-a and copper in growing swine: Growth, and bone and tissue mineral concentrations, Journal of animal science 69 (1991), 726–733.

C.M.

Shull , Modeling Growth of Pigs Reared to Heavy Weights, Ph. D. Dissertation, University of Illinois at Urbana-Champaign, 2015.

A.P.

Schinckel ,

Ferrel ,

M.E.

Einstein ,

S.M.

Pearce and

R.D.

Boyd , Analysis of pig growth from birth to sixty days of age, The Professional Animal Scientist 20 (2004), 79–86.

Luo ,

Lei ,

Shen ,

Yang ,

Pu ,

Zhu ,

Li ,

Tang ,

Li ,

Zhang and

Zhu , Estimation of growth curves and suitable slaughter weight of the liangshan pig, Asian-Australasian Journal of Animal Sciences 28 (2015), 1252–1258.

Kusec ,

Kralik ,

Scitovski ,

Vincek and

Baulain , Growth characteristics of hybrid pig predicted by means of asymmetric S-curve, Poljoprivreda 14 (2008), 55–61.

Vincek ,

Sabo ,

Kusec ,

Kralik ,

Durkin and

Scitovski , Modeling of pig growth by S-function -least absolute deviation approach for parameter estimation, Archiv Tierzucht 55 (2012), 364–374.

10.

J.A.A.

Camargo , Evaluating the Ability of the NE system to Predict Growth Performance and Energy Utilization of Growing Pigs, M.S. Dissertation, Iowa State University, 2015.

11.

Nitikanchana ,

S.S.

Dritz ,

M.D.

Tokach ,

J.M.

DeR-ouchey ,

R.D.

Goodband and

B.J.

White , Regression analysis to predict growth performance from dietary net energy in growing finishing pigs, Journal of Animal Science 93 (2015), 2826–2839.

12.

Noblet ,

Fortune ,

X.S.

Shi and

Dubois , Prediction of net energy value of feeds for growing pigs, Journal of Animal Science 72 (1994), 344–354.

13.

D.Y.

Kil ,

Ji ,

L.L.

Stewart ,

R.B.

Hinson ,

A.D.

Beaulieu ,

G.L.

Allee ,

J.F.

Patience ,

J.E.

Pettigrew and

H.H.

Stein , Effects of dietary soybean oil on pig growth performance, retention of protein, lipids, and energy, and the net energy of corn in diets fed to growing or finishing pigs, Journal of Animal Science 91 (2013), 3283–3290.

14.

D.Z.

Caraviello ,

K.A.

Weigel ,

Craven ,

Gianola ,

N.B.

Cook ,

K.V.

Nordlund and

M.C.

Wiltbank , Analysis of reproductive performance of lactating cows on large dairy farms using machine learning algorithms, Journal of Dairy Science 89 (2006), 4703–4722.

15.

Shahinfar ,

Page ,

Guenther ,

Cabrera ,

Fricke and

Weigel , Prediction of insemination outcomes in holstein dairy cattle using alternative machine learning algorithms, Journal of Dairy Science 97 (2014), 731–742.

16.

Lee ,

S.H.

Kim ,

Ryu and

T.-W.

Ban , Fast detection of disease in livestock based on deep learning, Journal of Korea Institute of Information and Communication Engineering 21 (2017), 1009–1015.

17.

M.S.

Lee and

Y.C.

Choe , Forecasting sow's productivity using the machine learning models, Journal of Agricultural Extension and Community Development 16 (2009), 939–965.

18.

Han ,

Lee and

Sung , Development of a model to analyze the relationship between smart pig farm environmental data and daily weight increase based on decision tree, Journal ofKorea Institute ofInformation and Communication Engineering 20 (2016), 2348–2354.

19.

Han , A Study on the Growth Prediction of Livestock based on Machine Learning, M.S. Dissertation, Gyeongsang National University, 2017.

20.

S.C.

Pearce ,

N.K.

Gabler ,

J.W.

Ross ,

Escobar ,

J.F.

Patience ,

R.P.

Rhoads and

L.H.

Baumgard , The effects of heat stress and plane of nutrition on metabolism in growing pigs, Journal of Animal Science 91 (2013), 2108–2118.

21.

T.J.

O'Shea ,

Corgan and

T.C.

Clancy , Convolutional Radio Modulation Recognition Networks, Proc of EANN, Aberdeen, UK, 2016.

22.

Vautier ,

Quiniou ,

van Milgen and

Brossard , Modelling The Dynamics of Feed Intake in Growing Pigs; Interest for Modelling Populations of Pigs, Proc of 62nd Annual Meeting of the European Federation of Animal Science, Stavanger, Norway, 2011.

23.

I.H.

Witten ,

Frank ,

M.A.

Hall and

C.J.

Pal , Data Mining: Practical machine learning tools and techniques, Morgan Kaufmann, 2016.

24.

Freund and

R.E.

Schapire , A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences 55 (1997), 119–139.

25.

LeCun ,

Bengio and

Hinton , Deep learning, Nature 521 (2015), 436–444.

26.

Simonyan and

Zisserman , Very Deep Convolutional Networks for Large-scale Image Recognition, arXiv preprint arXiv:1409.1556, 2014.

27.

Nair and

G.E.

Hinton , Rectified Linear Units Improve Restricted Boltzmann Machines, Proc of ICML, Haifa, Israel, 2010.

Prediction of average daily gain of swine based on machine learning

Abstract

Keywords

1. Introduction

2.1 2.1 Characteristics of Measured Pig Growth Data

2 It isworth noting that neither birth body weight nor mature body weight, both of which affect growth, are used because these values are hard to obtain in practice on a pig farm. Nevertheless, from our performance evaluation we confirm that our prediction models are sufficiently accurate.

4. Performance Evaluation

Footnotes

Acknowledgments

References

2
It isworth noting that neither birth body weight nor mature body weight, both of which affect growth, are used because these values are hard to obtain in practice on a pig farm. Nevertheless, from our performance evaluation we confirm that our prediction models are sufficiently accurate.