Abstract
Plant species recognition from images or videos is challenging due to the large diversity of plants and variations in orientation, viewpoint, background clutter, etc. In this paper, plant species recognition is carried out using two approaches, namely, a traditional method and a deep learning approach. In the traditional method, feature extraction is carried out using Hu moments (shape features), Haralick texture and local binary pattern (LBP) (texture features), and color channel statistics (color features). The extracted features are classified using different classifiers (linear discriminant analysis, logistic regression, classification and regression tree, naïve Bayes, k-nearest neighbor, random forest and bagging classifier). Also, different deep learning architectures are tested in the context of plant species recognition. Three standard datasets (Folio, Swedish leaf and Flavia) and one real-time dataset (Leaf12) are used. It is observed that, in the traditional method, the feature vector obtained by combining color channel statistics, LBP, Hu moments and Haralick texture with a Random Forest classifier resulted in a plant recognition accuracy (rank-1) of 82.38% for the Leaf12 dataset. The VGG 16 Convolutional Neural Network (CNN) architecture with logistic regression resulted in an accuracy of 97.14% for the Leaf12 dataset. Accuracies of 96.53%, 96.25% and 99.41% are obtained for the Folio, Flavia and Swedish leaf datasets, respectively, using the VGG 19 CNN architecture with logistic regression as the classifier. It is also observed that the VGG (Visual Geometry Group) CNN models provided a higher accuracy rate compared to traditional methods.
Keywords
Introduction
In agriculture, plant species identification is used for weed detection [5], growth estimation, and plant disease classification [1]. Plants are also used as medicines for conditions such as diabetes [4] and cardiovascular diseases [12]. In plant species recognition [2], the leaf plays a more important role than other parts such as the flower, seeds and stem. Computer vision techniques are utilized in automatic plant identification and recognition. Numerous mobile applications, such as Pl@ntNet [11] and Leafsnap [13], have also been developed.
Shape is one of the main characteristics used to classify objects. Hu et al. [9] proposed a shape descriptor known as the Multiscale Distance Matrix (MDM). MDM is a global contour-based approach that is invariant to rotation, translation, scaling, and bilateral symmetry. Decomposed Newton’s Method (DNM) and Maximum Margin Criterion (MMC) are applied for dimensionality reduction, and a nearest neighbor (1NN) classifier is then used. This method is tested on two datasets, namely, the Swedish Leaf Dataset and the ICL (Intelligent Computing Laboratory) leaf dataset.
Zhao et al. [29] proposed a counting based shape descriptor to recognize simple and compound leaves. Independent-Inner Distance Shape Context (I-IDSC) measures the count of active shape pattern rather than considering matching features. Nearest neighbor classifier is used for classification. I-IDSC descriptor is tested over five datasets namely, Swedish leaf, ICL, Smithsonian, Plumbers Island, and their own leaf dataset formed from 54 species of Hong Kong.
Naresh et al. [18] proposed a modified Local Binary Pattern (LBP) for feature extraction and nearest neighbor classifier for medicinal plant classification. This method is tested with several standard leaf datasets and a collected dataset from Mysore, India. Tomar et al. [27] observed that the directed acyclic graph multi-class least square support vector machine (DAG-MLSTSVM) classifier performed better than artificial neural network and support vector machine. Prior to classification, leaf features are extracted based on shape and texture (21-d). Further, Hybrid Feature Selection (HFS) is carried out to identify the best features.
Ghazi et al. [6] applied transfer learning to the LifeCLEF plant dataset with the help of pre-trained models such as AlexNet, GoogleNet and VGGNet. For all these deep convolutional neural networks, fine tuning is performed after data augmentation, and parameters such as batch size and number of iterations are analyzed. Lee et al. [14] used a Convolutional Neural Network (CNN) to learn leaf features and a Deconvolution Network (DN) approach to gain insight into the selected features. The authors observed that leaf veins help in more accurate plant identification than leaf shape. They also observed that learning features from hybrid local-global methods with deep learning gives better recognition than other techniques.
Sun et al. [23] proposed a 26-layer ResNet (Residual Network) model for plant identification. The BJFU100 dataset is used; it consists of 10000 images of 100 ornamental plant species found on the Beijing Forestry University campus. For experimental analysis, the BJFU100 and Flavia datasets are utilized. Deep residual networks with 18, 26, 34, and 50 layers are considered, and among these four depths, ResNet26 outperformed the other three models. For training, the learning rate is set to 0.001. On the Flavia dataset, the accuracy of ResNet26 (99.65%) is compared with that of other approaches such as Radial Basis Probabilistic Neural Network (RBPNN), Support Vector Machine (SVM) and Deep Belief Network with dropout (DBN). The ResNet26 architecture achieved a recognition rate of 91.78% on the BJFU100 dataset. Barre et al. [3] developed LeafNet, a CNN-based plant identification system. LeafNet consists of five sets of 2 convolutional layers and 1 max-pooling layer, followed by 1 convolutional layer, 1 max-pooling layer and 3 fully connected layers. LeafNet is tested on the Leafsnap, Foliage and Flavia datasets.
Based on the extensive literature survey, it is identified that the reported work on plant recognition for Indian plant species is sparse. Also, numerous research works are carried out using features such as shape, texture, color, and morphological or physiological features. Reported works on plant species recognition using deep learning architectures are limited. Hence, in this paper, an investigation is performed using traditional methods and deep learning architectures in order to achieve a higher plant recognition rate.
Methodology
Two approaches are used for plant species recognition, namely, the traditional method and deep learning methods, as depicted in Fig. 1. The conventional image classification steps include preprocessing, feature extraction and classification using machine learning classifiers. Feature extraction includes extraction of shape, texture and color features from leaf images. These features are known as handcrafted features. Local Binary Pattern (LBP) [19] and Haralick texture features [7] are used to extract the texture information. Hu moments [10] are used for shape extraction, and color channel statistics (mean and standard deviation of the three color channels) for color information. The extracted features are classified using machine learning techniques such as Logistic Regression (LR), K-Nearest Neighbour (KNN), Classification and Regression Tree (CART), Random Forest classifier (RF), Linear Discriminant Analysis (LDA), Bagging Classifier (BC) and Naïve Bayes (NB) classifier [15, 16].

Plant classification methods.
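As an illustration of two of the handcrafted features described above, the following minimal sketch computes color channel statistics and a basic 3×3 LBP histogram in pure Python. It is not the authors' OpenCV implementation: the function names, the list-of-lists image representation, and the choice of the basic 8-neighbour LBP (rather than whichever LBP variant was actually used) are assumptions made for this example.

```python
import math

def channel_statistics(image):
    """Mean and standard deviation of each of the three colour channels.

    `image` is a list of rows; each row is a list of (c0, c1, c2) tuples.
    Returns [mean0, mean1, mean2, std0, std1, std2].
    """
    pixels = [px for row in image for px in row]
    n = len(pixels)
    means = [sum(px[c] for px in pixels) / n for c in range(3)]
    stds = [math.sqrt(sum((px[c] - means[c]) ** 2 for px in pixels) / n)
            for c in range(3)]
    return means + stds

def lbp_histogram(gray):
    """Normalised 256-bin histogram of basic 3x3 LBP codes.

    `gray` is a list of rows of intensities. Border pixels are skipped.
    """
    # Clockwise neighbour offsets, starting at the top-left neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for r in range(1, len(gray) - 1):
        for c in range(1, len(gray[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if gray[r + dr][c + dc] >= gray[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Toy 3x3 inputs (hypothetical data, not taken from any leaf dataset).
image = [[(10, 20, 30)] * 3 for _ in range(3)]
gray = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]

# Concatenating the descriptors gives one feature vector per image.
features = channel_statistics(image) + lbp_histogram(gray)
```

The concatenated vector is what would then be passed to a classifier such as LDA or Random Forest; Hu moments and Haralick features would be appended in the same way.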
In the deep learning method, CNN-based pre-trained models such as VGG 16, VGG 19, Inception-v3 and Inception-ResNet-v2 are used for feature extraction. These models have been trained on the ImageNet dataset, and their weights are free to use. The ImageNet dataset contains about 1.2 million images. The pre-trained model weights are used as initial weights in the deep learning architectures to extract features from the input image. Machine learning classification techniques are then applied to the features extracted from the pre-trained model.
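The classification stage of this pipeline can be sketched as a multinomial logistic regression trained on feature vectors. The example below is a minimal pure-Python sketch under stated assumptions: the two-dimensional vectors stand in for real deep features (which would have thousands of dimensions), and the function names, learning rate and epoch count are hypothetical choices, not the settings used in the paper.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def class_scores(W, b, x):
    """Linear score of feature vector x for every class."""
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(W))]

def train_logistic_regression(X, y, n_classes, lr=0.5, epochs=200):
    """Multinomial logistic regression fitted by batch gradient descent."""
    d, n = len(X[0]), len(X)
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, label in zip(X, y):
            p = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. the class score.
                err = p[k] - (1.0 if k == label else 0.0)
                gb[k] += err
                for j in range(d):
                    gW[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * gb[k] / n
            for j in range(d):
                W[k][j] -= lr * gW[k][j] / n
    return W, b

def predict(W, b, x):
    """Index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical stand-ins for features extracted by a pre-trained CNN.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y = [0, 0, 1, 1]

W, b = train_logistic_regression(X, y, n_classes=2)
predictions = [predict(W, b, x) for x in X]
```

In practice a library implementation of logistic regression would replace this hand-written loop; the sketch only shows where the extracted features enter the classifier.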
The Convolutional Neural Network [3] consists of several convolution layers, max pooling layers and fully connected layers (FCL), as shown in Fig. 2. The convolutional layers are used as feature extractors. The max pooling layers are used to reduce the dimension of the extracted feature maps. The fully connected layers convert the feature maps to a 1-D feature vector and also act as the classification layers. The deep learning models VGG 16, VGG 19, Inception-v3, and Inception-ResNet-v2 are developed based on the CNN principle.

Convolutional neural network.
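The roles of the convolution, pooling and flattening steps can be illustrated on a toy input. The following is a minimal pure-Python sketch, assuming a single-channel image, a hand-picked 3×3 vertical-edge kernel, and 2×2 max pooling with stride 2; real CNNs learn their kernels and operate on many channels, and the helper names here are hypothetical.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation, the operation used in CNN conv layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2 (halves each spatial dimension)."""
    return [[max(fmap[r][c], fmap[r][c + 1],
                 fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]

def flatten(fmap):
    """Convert a 2-D feature map into a 1-D feature vector."""
    return [v for row in fmap for v in row]

# Toy 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool2x2(feature_map)                # 2x2 after pooling
vector = flatten(pooled)                         # 1-D vector of length 4
```

The feature map responds only around the column where the intensity changes, pooling keeps the strongest response in each 2×2 window, and the flattened vector is what a fully connected layer would receive.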
VGG 16 [21] has 16 weight layers: two sets of two convolution layers with max pooling and three sets of three convolution layers with max pooling, followed by three fully connected layers. Similarly, VGG 19 [21] has two sets of two convolution layers with max pooling and three sets of four convolution layers with max pooling, followed by three fully connected layers. There are about 138 and 144 million parameters in VGG 16 and VGG 19, respectively. In both architectures, the width of the convolution layers starts at 64 and doubles after each max pooling until 512 is reached. Both models use ReLU as the activation function in the hidden layers and softmax in the final fully connected layer. The computational complexity of VGGNet is greater compared to other models.
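The layer and parameter counts can be checked with a short calculation. The sketch below assumes the standard ImageNet configuration of VGG: 3×3 convolutions with biases, a 224×224 RGB input, block widths of 64, 128, 256, 512 and 512, convolution blocks of (2, 2, 3, 3, 3) layers for VGG 16 and (2, 2, 4, 4, 4) for VGG 19, and fully connected layers of 4096, 4096 and 1000 units. The function name `count_vgg` is hypothetical.

```python
def count_vgg(conv_blocks, widths=(64, 128, 256, 512, 512),
              fc_sizes=(4096, 4096, 1000), in_channels=3, input_size=224):
    """Return (weight_layers, parameters) for a VGG-style network."""
    params = 0
    layers = 0
    channels = in_channels
    size = input_size
    for n_convs, width in zip(conv_blocks, widths):
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel.
            params += 3 * 3 * channels * width + width
            channels = width
            layers += 1
        size //= 2  # each block ends with 2x2 max pooling, stride 2
    fan_in = channels * size * size  # flattened feature map
    for fc in fc_sizes:
        params += fan_in * fc + fc
        fan_in = fc
        layers += 1
    return layers, params

vgg16_layers, vgg16_params = count_vgg((2, 2, 3, 3, 3))
vgg19_layers, vgg19_params = count_vgg((2, 2, 4, 4, 4))

print(vgg16_layers, vgg16_params)  # 16 138357544
print(vgg19_layers, vgg19_params)  # 19 143667240
```

Under these assumptions the totals are roughly 138 million and 144 million parameters, and most of them sit in the first fully connected layer rather than in the convolution layers.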
‘Feature pooling’ is a concept used in Inception-v3 [25], in which 1×1, 3×3 or 5×5 convolutions are applied in parallel to collect as many features as possible from each convolution. Inception-v3 uses a stem (containing a few convolutional layers and max pooling layers), a few sets of filter concatenation, and fully connected layers.
Inception-ResNet-v2 [24] merges the concepts of the Inception-v3 and ResNet [8] architectures. Inception-ResNet-v2 has a stem as in Inception-v3 and residual blocks as in the ResNet model. However, filter concatenation is carried out inside every residual block, and the filter sizes vary across these residual blocks.
Three standard datasets (Folio [17], Swedish leaf [22], Flavia [28]) and one real-time dataset are used in the experimental studies. The Swedish leaf dataset is a standard dataset that is prominent because of its clarity. It contains 15 different classes with 75 images in each class, giving 1125 images in total.
The Flavia dataset contains 32 classes of leaves and 1907 images. The Folio database has 637 images in 32 classes. These two datasets have an uneven number of images in each class. Hence, for the Flavia dataset, 50 images are used in each class (1600 images). Similarly, for the Folio dataset, 18 images are used in each class (576 images). The real-time dataset is named the Leaf12 dataset. Images of twelve plant species are collected, and each class contains 320 images (3840 images). The images are photographed under different illumination conditions, color backgrounds, viewpoints and orientations using a portable camera. The list of plants in the Leaf12 dataset and their sample images are shown in Fig. 3.

Leaf12 samples.
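Using a fixed number of images per class, as done for Flavia and Folio, can be sketched as follows. The text does not state how the retained images were chosen, so random sampling with a fixed seed is an assumption here, and the file names and uneven class sizes are hypothetical placeholders.

```python
import random

def balance_classes(samples_by_class, per_class, seed=0):
    """Keep exactly `per_class` randomly chosen samples from every class."""
    rng = random.Random(seed)  # fixed seed for a reproducible selection
    balanced = {}
    for label, samples in samples_by_class.items():
        if len(samples) < per_class:
            raise ValueError(f"class {label!r} has only {len(samples)} samples")
        balanced[label] = rng.sample(samples, per_class)
    return balanced

# Hypothetical unbalanced dataset: 32 classes with 50 to 81 images each.
dataset = {
    f"class_{k:02d}": [f"class_{k:02d}/img_{i:03d}.jpg" for i in range(50 + k)]
    for k in range(32)
}

balanced = balance_classes(dataset, per_class=50)
total_images = sum(len(v) for v in balanced.values())  # 32 * 50 = 1600
```

With 32 classes and 50 images per class this gives the 1600 images used for Flavia; the same call with `per_class=18` would give the 576 images used for Folio.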
The plant species recognition rate is determined using traditional methods and deep learning methods for three standard datasets (Folio, Flavia and Swedish leaf) and one real-time dataset (Leaf12). The implementation is carried out in Python with the help of the OpenCV package. For the neural networks, the Keras package with Theano as the backend is used. The data are split into training and test sets of 70% and 30%, respectively. The results for the various datasets, including the Leaf12 dataset, for traditional methods and pre-trained models are discussed in this section.
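The 70%/30% split can be sketched with a shuffled index split. The dataset sizes below follow from the class counts given in the text; the random shuffling, the seed, and the rounding rule (floor of 30%) are assumptions, so the resulting sizes are illustrative rather than the exact partitions used in the experiments.

```python
import random

def split_indices(n_samples, test_percent=30, seed=0):
    """Shuffle sample indices and split them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding; floor is assumed.
    n_test = n_samples * test_percent // 100
    return indices[n_test:], indices[:n_test]

# Images used per dataset: classes x images per class, as stated in the text.
dataset_sizes = {
    "Folio": 32 * 18,         # 576
    "Swedish leaf": 15 * 75,  # 1125
    "Flavia": 32 * 50,        # 1600
    "Leaf12": 12 * 320,       # 3840
}

split_sizes = {}
for name, n in dataset_sizes.items():
    train_idx, test_idx = split_indices(n)
    # Train and test indices never overlap.
    assert not set(train_idx) & set(test_idx)
    split_sizes[name] = (len(train_idx), len(test_idx))
```

A stratified split, which keeps the class proportions equal in both parts, would be a natural refinement for balanced datasets such as these.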
Folio dataset
Accuracies obtained using the conventional technique and pre-trained neural networks for the Folio dataset are summarized in Table 1. For most of the handcrafted features, either LDA or RF performed better than the other classifiers. In traditional methods, handcrafted features (color channel statistics, Hu, LBP, Haralick) with LDA obtained an accuracy of 79.77%. Usage of pre-trained models (VGG 16, VGG 19, Inception-v3, Inception-ResNet-v2) improved the accuracy compared to conventional methods. VGG 19 with the LR classifier outperformed the other pre-trained models with an accuracy of 96.53%. Pawara et al. [20] reported plant species recognition accuracies of 96.35% and 95% for the AlexNet and GoogleNet architectures, respectively. An improvement of 0.18 percentage points over AlexNet is achieved with VGG 19 and the logistic regression classifier.
Accuracies of Folio Dataset (%)
The results for the Swedish leaf dataset are listed in Table 2. In traditional methods, the LDA and RF classifiers perform well, whereas LR is the best classifier for pre-trained models. Handcrafted features (Haralick, Hu) with the LDA classifier resulted in an accuracy of 92.01%. In deep learning methods, VGG 19 with the LR classifier produced an accuracy of 99.41%. The accuracy of the VGG 16 CNN architecture with LR is higher than that of the AlexNet (97.81%) and GoogleNet (98.24%) models reported by Pawara et al. [20].
Accuracies of Swedish Leaf Dataset (%)
The results for the Flavia dataset are tabulated in Table 3. Handcrafted features (color channel statistics, LBP, Hu, Haralick) with the LDA classifier resulted in an accuracy of 89.17%. With respect to pre-trained models, VGG 19 with the LR classifier produces a plant species recognition rate of 96.25%. Thanh et al. [26] reported an accuracy of 95.11% using a CNN for the Flavia dataset. VGG 19 with LR therefore improves on that CNN model by 1.14 percentage points.
Accuracies of Flavia Dataset (%)
For the Leaf12 dataset, the Random Forest (RF) classifier performed well for traditional classification. The LR classifier is the best classifier for all pre-trained models considered for analysis. The pre-trained model VGG 16 with LR resulted in an accuracy of 97.14%, and the results are tabulated in Table 4. It is also observed that the pre-trained models using deep learning architectures yield higher accuracy compared to traditional methods.
Accuracies of Leaf12 Dataset (%)
Table 5 shows the performance metrics for the various datasets. Precision, recall, F1-score, and Rank-1 and Rank-5 accuracies are the performance measures taken into consideration. It is noticeable that the VGG-based models perform best for all four datasets, the standard datasets as well as the real-time dataset. The Rank-5 accuracies for the Swedish leaf and Leaf12 datasets are 100%. Also, the performance of pre-trained models used for feature extraction is higher than that of conventional methods.
Performance Metrics of leaf datasets
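The performance measures named above can be computed from per-class scores as in the following minimal sketch. The scores and labels are hypothetical toy values, the function names are illustrative, and macro averaging over classes is an assumption, since the averaging scheme is not stated in the text.

```python
def rank_k_accuracy(score_rows, labels, k):
    """Fraction of samples whose true label is among the k top-scored classes."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)

def macro_precision_recall_f1(predicted, labels, n_classes):
    """Per-class precision, recall and F1, averaged with equal class weight."""
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = sum(p == c and t == c for p, t in zip(predicted, labels))
        fp = sum(p == c and t != c for p, t in zip(predicted, labels))
        fn = sum(p != c and t == c for p, t in zip(predicted, labels))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return (sum(precisions) / n_classes,
            sum(recalls) / n_classes,
            sum(f1s) / n_classes)

# Hypothetical classifier scores for four test samples and three classes.
scores = [
    [0.7, 0.2, 0.1],  # true class 0, ranked first
    [0.4, 0.5, 0.1],  # true class 0, ranked second
    [0.1, 0.8, 0.1],  # true class 1, ranked first
    [0.2, 0.3, 0.5],  # true class 2, ranked first
]
labels = [0, 0, 1, 2]
predicted = [max(range(3), key=row.__getitem__) for row in scores]

rank1 = rank_k_accuracy(scores, labels, k=1)
rank2 = rank_k_accuracy(scores, labels, k=2)
precision, recall, f1 = macro_precision_recall_f1(predicted, labels, n_classes=3)
```

Rank-1 accuracy coincides with ordinary classification accuracy, while Rank-5 counts a prediction as correct whenever the true species appears among the five highest-scoring classes.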
Plant species recognition is carried out by two approaches, namely, traditional methods (feature extraction followed by a classifier) and deep learning methods (pre-trained models with machine learning classifiers). Four different datasets (Folio, Swedish leaf, Flavia and Leaf12) are considered in this study. From the experimental investigation, it is observed that the deep learning models yielded a higher accuracy than the conventional methods for all datasets considered. The logistic regression classifier with pre-trained models resulted in an improved accuracy compared to other classifiers with pre-trained models. The VGG 16 and VGG 19 deep learning architectures with the LR classifier resulted in higher accuracy compared with Inception-v3 and Inception-ResNet-v2. The maximum plant recognition rates obtained for the different datasets are listed in Table 6. Further, the accuracies may be improved by increasing the number of images using data augmentation methods.
Rank-1 Accuracies of leaf datasets
