Abstract
Pests are a major threat to the economic growth of a country. Applying pesticide is the easiest way to control pest infestation. However, excessive use of pesticide is hazardous to the environment. Recent advances in deep learning have paved the way for early detection and improved classification of pests in tomato plants, which will benefit farmers. This paper presents a comprehensive analysis of 11 state-of-the-art deep convolutional neural network (CNN) models with three configurations: transfer learning, fine-tuning and scratch learning. Training in transfer learning and fine-tuning starts from pre-trained weights, whereas random weights are used in scratch learning. In addition, data augmentation has been explored to improve performance. Our dataset consists of 859 tomato pest images from 10 categories. The results demonstrate that the highest classification accuracy of 94.87% was achieved in the transfer learning approach by the DenseNet201 model with data augmentation.
Introduction
Agriculture is one of the main sources of human sustenance on earth. It not only provides the food necessary for human consumption and existence but also plays a major role in the economy of a country [1]. In India, almost 70% of the population depends on farming, and the country is the second largest producer of agricultural products [2]. Further, India accounts for 7.39% of total global agricultural output [2]. The quality and quantity of agricultural production are affected by environmental parameters such as rain, temperature and other weather conditions that are beyond human control. Safeguarding crops from bio-aggressors such as pests and insects, which are very dangerous for the overall growth of the crops, is a matter of concern [3]. In India, approximately 18% of the crop yield is lost to pest attacks every year, a loss valued at around 90,000 million rupees [4]. It is almost impossible to apply the appropriate pest control at the right time and in the right place without gathering information about pest activity. Conventionally, sticky traps and black light traps are used for manual pest monitoring and detection on farms. However, these techniques are less effective and are prone to harming environmentally friendly insects. Manual pest monitoring techniques are also time-consuming and subject to the availability of a human expert. As a preventive measure, farmers spray pesticides in bulk quantities, which are hazardous to the ecosystem [5]. Therefore, a lot of research is being carried out around the world on better methods of pest control than the use of chemical pesticides. A program named Integrated Pest Management (IPM) has been in use since 1960 as an alternative and effective approach to pest control [6]. IPM consultants regularly monitor the environment by counting harmful pests on crops and apply control according to the actual location of the pests. However, IPM is also a time-consuming process.
In recent years, image processing technologies and robotics have been widely used in agriculture to reduce the workload and time of farmers. Many studies have applied image processing to the classification of plant leaf diseases [7, 8] and pests of various crops such as rose [9], rice [10], cotton [11], maize [12], soybean [13], sugarcane [14] and tea garden [15]. In this context, various approaches have been proposed in the literature for the detection and classification of agricultural pests such as whiteflies [16], aphids and thrips [5], honey bees [17], wasps [18], leaf miners [19], rice planthoppers [20] and many more.
In this paper, we consider the pests of the tomato plant. Tomato is the second highest agricultural product in the Indian economy [21]. Despite its high production, much of the yield is lost to pest attacks. Thus, protecting tomato from pests is crucial for improving crop quality and quantity. In the literature, several studies have been carried out on the identification of tomato leaf diseases [22, 23, 24, 25] and on tomato quality evaluation [26]. However, only a handful of studies are available on the classification of tomato pests. Approaches for detecting the borer insect, which commonly affects the tomato plant, have been proposed using morphological features [27] and cloud computing [28]. Tomato plants can be infected by two viruses, Tomato Spotted Wilt Virus (TSWV) and Tomato Yellow Leaf Curl Virus (TYLCV), which have been identified using a support vector machine (SVM) in [29]. Moreover, owing to the significant improvement in deep learning technology [30], it has also been applied in agriculture. For example, a convolutional neural network (CNN) based VGG16 [31] model and a transfer learning approach [32] have been presented to detect tomato pests and diseases in [33]. The transfer learning model reported an accuracy of 89%, slightly better than the VGG16 model, whose accuracy is 88%. Another transfer learning approach based on Google’s Inception-V3 model was presented to classify different types of tomato pests and diseases in [34] and reported 88.90% accuracy. In [35], the authors presented a deep learning-based approach for the classification of tomato plant diseases and pests. They experimented with three architectures: the faster region-based CNN (Faster R-CNN), the region-based fully convolutional network (R-FCN), and the single shot multibox detector (SSD), with various CNN-based feature extractors such as the Visual Geometry Group network (VGGNet) and the Residual Network (ResNet). The best average precision of 85.98% was reported for R-FCN with ResNet50. In [36], an approach was proposed to detect the tomato whitefly and its predatory bugs using a deep CNN model. The results were compared with insects counted by hand using the yellow sticky trap method, and the average classification accuracy was reported as 87.40%. In [37], a comparative study of K-Nearest Neighbour (KNN), SVM, Multilayer Perceptron (MLP), Faster R-CNN, and SSD classifiers was presented for distinguishing the Bemisia Tabaci egg and Trialeurodes Vaporariorum egg tomato pest classes. The highest classification accuracy of 82.51% was obtained using Faster R-CNN. Dawei et al. presented a transfer learning based pest image classification approach using the AlexNet model and reported a classification accuracy of 93.84% [38]. Recently, our group presented a tomato pest image classification approach using various pre-trained deep CNN models with the transfer learning technique and reported a classification accuracy of 88.83% [39].
The following observations have been made from the literature on tomato pest image classification: (i) only a handful of research works address tomato pest image classification, so there is a need to explore image-based tomato pest classification tasks; (ii) the dataset used in most of the research works is a mix of tomato plant diseases and pests, which may not result in a robust and reliable model for tomato pest classification; (iii) the performance of deep learning models on tomato pest image classification is found to be higher than that of shallow models, which motivates us to carry out the analysis with various deep learning models.
Details of tomato pest dataset
Sample image from each class of tomato pest.
In this work, we present the performance of 11 state-of-the-art deep CNN architectures in three configurations: transfer learning, fine-tuning and scratch learning. The 11 models used here are: ResNet50V2, ResNet101V2, ResNet152V2, InceptionV3, Xception, InceptionResNetV2, MobileNet, DenseNet121, DenseNet169, DenseNet201 and NASNetMobile. In addition, data augmentation has been applied to increase the size of our dataset and avoid overfitting. The contributions of this paper are as follows: (i) the application of deep CNN models with three configurations (transfer learning, fine-tuning and scratch learning) to tomato pest classification, which to the best of our knowledge is the first of its kind; (ii) a performance comparison of 11 state-of-the-art deep CNN models on tomato pest classification; (iii) an investigation of the effect of data augmentation on the performance of deep CNN models for the classification of tomato pest images.
The remainder of the paper is structured as follows. Section 2 describes the methodology, which consists of dataset collection and preparation, the CNN models with three configurations and the data augmentation technique. The experimental setup is described in Section 3. The results and discussion are presented in Section 4. Finally, Section 5 concludes the paper with future scope.
Dataset collection and preparation
The dataset used in this study has been collected from online sources [40, 41, 42, 43, 44]. It consists of 859 tomato pest images belonging to 10 classes. All the images are in RGB color space. The details of the dataset are provided in Table 1. Bactrocera Latifrons [45] is a pest of solanaceous crops such as potatoes, tomatoes, eggplant, capsicum and chillies. In general, its larvae can be attacked either by parasitoids or by vertebrates that eat the fruit. Bemisia Tabaci [46] attacks more than 500 species of plants from 63 plant families. It can cause damage directly, indirectly or through virus transmission. Chrysodeixis Chalcites [47] is recognized by its two silver spots and golden color. The major agricultural crop hosts of this insect pest are tobacco, tomato, cotton, crucifers, legumes, corn, soybeans, potatoes, artichokes, greenhouse crops and cauliflower. Epilachna Vigintioctopunctata [48] is a serious pest of solanaceous crops. The defoliation it causes can result in total crop failure. Helicoverpa Armigera [45] is intercepted repeatedly at ports of entry and is not easily detected. Its most important crop hosts are cotton, pigeon pea, chickpea, tomato, sorghum and cowpea. Icerya Aegyptiaca [49] is a sap-sucking insect. It damages the host through sap depletion, resulting in leaf drop and stunted growth. Liriomyza Trifolii [45] is a leaf-mining insect, commonly known as the serpentine leaf miner. It is highly polyphagous and has been recorded from 25 families. It is a major pest of ornamental and vegetable crops, including beans, capsicum, potatoes and tomatoes. Nesidiocoris Tenuis [50] feeds on solanaceous crops. It also contributes significantly to the control of greenhouse pests. Spodoptera Litura [45] is one of the important pests of agricultural crops. The damage is caused by the voracious feeding of its larvae, which leads to stripping of plants. Tuta Absoluta [45] is an insect pest that causes major losses to tomato, and it affects the crop across its life stages of egg, larva, pupa and adult. The impact of the pest includes severe yield loss, reaching 100% in the case of tomato crops. Hence, the activity of these pests must be addressed to keep crop losses to a minimum.
Deep convolutional neural network models
Deep learning models, especially convolutional neural networks (CNNs), have shown great success in image classification. CNNs are made up of learnable weights and biases. The architecture of a typical CNN can be explained in terms of four main layers: the convolutional layer, the ReLU (Rectified Linear Unit) layer, the pooling layer and the fully connected layer. A kernel or filter is convolved with the input and passed through the non-linear ReLU activation function to generate a feature map. The pooling layer reduces the spatial size of the feature map and provides a degree of translation invariance. In a deep CNN model, convolution and pooling layers are stacked alternately, followed by fully connected layers at the end, which connect every neuron in one layer to every neuron in the next layer. The model is trained using the well-known backpropagation algorithm [51]. The final layer of a CNN for classification uses the Softmax activation function, which returns a probability distribution over the target classes in a multiclass classification problem.
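The layer sequence described above can be illustrated with a minimal, framework-free sketch. It is not the implementation used in this study: the toy 5×5 image, the 2×2 kernel and the three-output fully connected weights are hypothetical values chosen only to show how convolution, ReLU, pooling and Softmax fit together.

```python
import math

def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[i + a][j + b] * kernel[a][b]
             for a in range(kh) for b in range(kw))
         for j in range(out_w)]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Element-wise ReLU: negative activations are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = len(feature_map) // size
    w = len(feature_map[0]) // size
    return [
        [max(feature_map[i * size + a][j * size + b]
             for a in range(size) for b in range(size))
         for j in range(w)]
        for i in range(h)
    ]

def softmax(scores):
    """Numerically stable Softmax returning a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy input: a 5x5 single-channel image and a 2x2 kernel.
image = [[float((i * 5 + j) % 7) for j in range(5)] for i in range(5)]
kernel = [[1.0, -1.0], [-1.0, 1.0]]

# Convolution -> ReLU -> pooling, as in one CNN stage.
features = max_pool(relu(conv2d(image, kernel)))

# Flatten and apply a hypothetical fully connected layer with 3 output classes.
flat = [v for row in features for v in row]
fc_weights = [[0.1 * (k + 1)] * len(flat) for k in range(3)]
scores = [sum(w * x for w, x in zip(row, flat)) for row in fc_weights]
probs = softmax(scores)
```

In a real model the kernel and fully connected weights are learned by backpropagation rather than fixed by hand, and many kernels are applied per layer.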
In this paper, we have explored 11 state-of-the-art deep CNN models: ResNet (50V2, 101V2, 152V2) [52], InceptionV3 [53], Xception [54], InceptionResNetV2 [55], MobileNet [56], DenseNet (121, 169, 201) [57] and NASNetMobile [58]. These models are usually trained on the ImageNet dataset [59], which has 1.2 million images in 1000 categories. The input size of VGG16 is 224
We have explored the performance of these 11 models on the tomato pest dataset by training each model with three different approaches: Transfer Learning (TL), Fine Tuning (FT) and Scratch Learning (SL). In the TL approach, a model is first trained on a large dataset, and the trained weights are then used for a new classification problem that has a small dataset. During training in the TL approach, all the layers of the pre-trained model are frozen except the last few fully connected layers. In this way, the weights of the frozen layers remain unchanged during training and only the weights of the unfrozen layers are updated. As in the TL approach, the model is loaded with pre-trained weights in the FT approach. However, unlike TL, the weights of all the layers are updated during training in FT. In the SL approach, on the other hand, the model is initialized with random weights and the weights of all the layers are updated. The concept of a deep CNN model in the three configurations (TL, FT and SL) is illustrated in Fig. 2. Further, the details of all 11 models are provided in Table 2.
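The difference between the three configurations comes down to how the weights are initialized and which layers are allowed to change. The sketch below illustrates this without any deep learning framework. The `Layer` class, the `PRETRAINED` dictionary, `build_model` and `sgd_step` are hypothetical names introduced only for this illustration, the weight values are placeholders, and a single `fc_head` layer stands in for the last few fully connected layers.

```python
import random
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    weights: list
    trainable: bool

# Placeholder for backbone weights learned on a large dataset such as ImageNet.
PRETRAINED = {"conv1": [0.5, -0.2], "conv2": [0.1, 0.3], "conv3": [-0.4, 0.7]}

def build_model(mode, n_classes=10, seed=0):
    """Return backbone layers plus a new classifier head for 'TL', 'FT' or 'SL'."""
    if mode not in ("TL", "FT", "SL"):
        raise ValueError(f"unknown mode: {mode}")
    rng = random.Random(seed)
    layers = []
    for name, pretrained_w in PRETRAINED.items():
        if mode == "SL":
            init = [rng.uniform(-1, 1) for _ in pretrained_w]  # random initialization
        else:
            init = list(pretrained_w)                          # pre-trained weights
        # Only TL freezes the backbone; FT and SL update every layer.
        layers.append(Layer(name, init, trainable=(mode != "TL")))
    # The new classification head is randomly initialized and always trainable.
    head = [rng.uniform(-1, 1) for _ in range(n_classes)]
    layers.append(Layer("fc_head", head, trainable=True))
    return layers

def sgd_step(layers, grads, lr=0.1):
    """Apply one gradient step, skipping frozen layers."""
    for layer in layers:
        if layer.trainable:
            layer.weights = [w - lr * g
                             for w, g in zip(layer.weights, grads[layer.name])]

# Show which layers would be updated in each configuration.
for mode in ("TL", "FT", "SL"):
    model = build_model(mode)
    print(mode, [(layer.name, layer.trainable) for layer in model])
```

With a real framework the same idea is usually expressed by loading pre-trained weights and toggling a per-layer trainable flag before compiling the model.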
Details of deep CNN models in TL, FT and SL configuration
Concept of training a deep CNN model in three configurations: (a) TL; (b) FT; (c) SL.
A limitation of deep learning techniques is overfitting when the dataset is small. To prevent overfitting and help the model generalize, data augmentation (DA) can be used to increase the size of the training dataset [60]. It includes techniques such as geometric transformations, color space transformations, image mixing, adversarial training and meta-learning schemes. Geometric transformation is commonly used because of its simplicity. It involves transformations such as translation, rotation, scaling, flipping and shearing of the original image that can be represented mathematically as
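As a rough sketch of these geometric transformations, and not necessarily the formulation used in this study, each one can be written as a 3×3 matrix acting on pixel coordinates in homogeneous form, so that several transformations combine by matrix multiplication. The function names and the example parameters below are assumptions made for illustration.

```python
import math

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotation(theta):
    """Counter-clockwise rotation about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def horizontal_flip():
    """Mirror the x coordinate about the origin."""
    return scaling(-1, 1)

def shear(kx):
    """Shear along x: x' = x + kx * y."""
    return [[1, kx, 0], [0, 1, 0], [0, 0, 1]]

def compose(*matrices):
    """Multiply matrices left to right; the rightmost one acts on the point first."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for m in matrices:
        result = [[sum(result[i][k] * m[k][j] for k in range(3))
                   for j in range(3)]
                  for i in range(3)]
    return result

def apply(matrix, x, y):
    """Map the pixel coordinate (x, y) through a homogeneous 3x3 matrix."""
    new_x = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    new_y = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    return new_x, new_y

# Example: rotate by 90 degrees, then translate by (2, 0).
rotate_then_shift = compose(translation(2, 0), rotation(math.pi / 2))
print(apply(rotate_then_shift, 1, 0))
```

To augment an image, a matrix of this kind is applied to every pixel coordinate and the result is resampled onto the output grid; image libraries typically perform the resampling step internally.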
OA obtained using 11 state-of-the-art models with and without DA in three configurations (TL, FT, SL) on tomato pest classification
We have trained 11 state-of-the-art deep CNN models in three configurations (TL, FT and SL) for tomato pest classification. We have resized the images to the input shape of each model. For example, images are resized to 224
Results and discussion
This paper presents an exhaustive comparison of the performance of 11 state-of-the-art deep CNN models with three approaches (TL, FT and SL) on tomato pest classification. Moreover, DA has been applied to avoid overfitting. Table 3 shows the OA with STD obtained using the 11 models with the three approaches on the tomato pest dataset. The results are shown with and without DA to validate the effect of DA. The following observations can be made from Table 3: (i) Among all the considered models, DenseNet201 achieved the highest OA in the TL and SL approaches, both with and without DA. Although DenseNet169 achieved the highest OA in FT, the difference in OA between DenseNet169 and DenseNet201 is small: DenseNet169 achieved 83.72% without DA and 86.27% with DA, whereas DenseNet201 achieved 81.86% without DA and 83.25% with DA. (ii) Comparing the three approaches, the TL approach produced the best performance for all 11 models, both with and without DA. A likely reason is that the pre-trained weights in TL helped achieve better classification within 100 epochs, whereas the FT and SL approaches might require more epochs to train the model effectively. (iii) Performance improved with DA for all 11 models and all three approaches. (iv) Overall, the highest OA of 94.87% was obtained using the DenseNet201 model with DA in the TL approach, and the STD of 2.67% in this case indicates the robustness of the model.
Other performance parameters obtained using DenseNet201 model with DA in TL configuration for tomato pest classification
Benchmarking of our approach on tomato pest classification with literature
Training performance using DenseNet201 model with DA in TL configuration: (a) validation loss vs. epochs; (b) validation accuracy vs. epochs.
ROC curve obtained using DenseNet201 model with DA in TL configuration.
For a more detailed analysis, we have also calculated class-wise accuracy, precision, sensitivity, specificity and F1-score for the DenseNet201 model with DA in the TL approach, as it produced the highest OA; these are shown in Table 4. It has been observed that five classes of pest (Pest1, Pest2, Pest4, Pest6 and Pest7) have been accurately classified i.e., class accuracy
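The class-wise parameters named above are commonly derived from a confusion matrix by treating each class in turn as the positive class (one-vs-rest). The sketch below follows that convention as an assumption; the exact definitions used for Table 4 are not restated here, and the 3-class confusion matrix is made-up toy data, not a result from this study.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics; cm[i][j] counts samples of true class i predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k samples predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp   # other classes predicted as class k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
        })
    return results

def overall_accuracy(cm):
    """Fraction of all samples that lie on the diagonal of the confusion matrix."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)

# Made-up 3-class confusion matrix, for illustration only.
toy_cm = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
metrics = per_class_metrics(toy_cm)
oa = overall_accuracy(toy_cm)
```

Sensitivity is the same quantity as recall, and the F1-score is the harmonic mean of precision and sensitivity.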
Lastly, we have benchmarked our approach against recent studies on tomato pest classification in Table 5. It can be observed that most of the research on tomato pest image classification used a dataset that is a mix of diseases and pests [32, 33, 34, 36]. Further, the number of pest classes considered in the literature is very low. In contrast, this paper presents a comprehensive analysis of tomato pest image classification on a dataset with 10 pest classes using 11 state-of-the-art models in three configurations (TL, FT and SL). Moreover, the effect of DA has been demonstrated. Finally, the DenseNet201 model in the TL configuration with DA achieved the highest classification accuracy of 94.87% and outperformed the results reported in the literature.
We have presented an extensive comparative performance analysis of 11 state-of-the-art deep CNN models for tomato pest image classification. We implemented these 11 models with three training approaches: TL, FT and SL. Moreover, we augmented the pest dataset to avoid overfitting and validated the effect of data augmentation on the performance of deep CNN models. We draw three conclusions: (1) the TL approach outperformed the other two approaches (FT and SL) because the initial weights exploit a large amount of visual knowledge already learned from the ImageNet dataset; (2) the DenseNet201 model outperformed the other considered models; (3) data augmentation improved the performance of all considered models and hence can be a useful technique when data are insufficient. In future, we would like to explore augmentation with the Generative Adversarial Network (GAN) technique for tomato pest classification tasks.
Declaration of interest statement
The authors declare no conflict of interest.
Funding
No funding has been received for this research.
