Abstract
The current framework for detecting Fake License Plates (FLP) in real time is not robust enough for patrol teams. The objective of this paper is to develop a robust license plate authentication framework, based on Vehicle Make and Model Recognition (VMMR) and License Plate Recognition (LPR) algorithms, that can be implemented on edge devices. The contributions of this paper are (i) development of a license plate database for 547 Indian cars, (ii) development of an image dataset with 3173 images of 547 Indian cars in 8 classes, (iii) development of an ensemble model to recognize vehicle make and model from frontal, rear, and side images, and (iv) development of a framework to authenticate license plates with frontal, rear, and side images. The proposed ensemble model is compared with state-of-the-art networks from the literature. Among the networks implemented for VMMR, the ensemble model, with a size of 303.2 MB, achieves the best accuracy of 89%. Because of the limited memory available, Easy OCR is chosen to recognize the license plate. The total size of the authentication framework is 308 MB. The performance of the proposed framework is compared with the literature. According to the results, the proposed framework enhances FLP recognition because it recognizes vehicles from side images. The dataset is made public at https://www.kaggle.com/ganeshmailecture/datasets.
Keywords
Introduction
According to the Accidents of India report, 2021 [1], India ranks first in the number of road accident deaths among 199 countries, which accounts for almost 11% of the accident-related deaths in the world. The number of accidents can be significantly reduced if traffic regulations are properly maintained and enforced along roadways. Currently, law and order in India is maintained by patrol teams. These patrol teams are responsible for activities such as law enforcement, accident investigation, commercial vehicle enforcement, emergency response, maintenance, and traffic enforcement. From interactions with these patrol teams, the authors observed that the teams face setbacks in detecting overspeeding, fake license plates, and fabricated accidents, as there is no adequate surveillance system [2]. Among these, fake license plate detection is a serious concern, as criminals use modified, defaced, or forged license plates to commit serious traffic violations and cause accidents. These license plates are similar in size and font to authorized license plates, but the numbers are different. Such plates are known as Fake License Plates (FLP). With the latest Intelligent Transportation System (ITS) technology, all vehicles can be monitored and tracked, but their license plates cannot be authenticated. Hence, for traffic violations, the ITS identifies the wrong owner of the vehicle, which causes serious lapses in legal procedures. Currently, FLPs are recognized in three ways [3], namely (i) visual judgment of the vehicles by traffic personnel based on past personal experience, (ii) compounding the traffic records of different violations to identify the FLP, and (iii) reports from victims of accidents caused by vehicles with FLPs. In all three cases, police personnel have to identify the FLP through fully manual inspection. Thus, there is no robust system for real-time identification of FLPs.
To identify the FLP, external features of the vehicles are to be considered for authentication. Primarily, the license plate is a unique certification of an authorized vehicle. License plate recognition is the primary method for vehicle surveillance and tracking. It holds information regarding country, state and type of vehicle (private or tourist). In India, as shown in Fig. 1, the first two letters represent the state of India, the next two digits represent the district and the next 6 alphanumeric characters indicate the registration number of the vehicle. Thus, the characters in TN 69 BA8107 represent the unique license plate number of the vehicle. Localizing the license plate from the video or image of vehicle is known as license plate detection and recognizing the characters from the detected license plate is known as license plate recognition (LPR) [4].
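The plate layout described above can be checked with a short parser. The following is a minimal sketch that assumes exactly the layout stated in the text (two letters, two digits, six alphanumeric characters); the function name `parse_plate` is hypothetical, and real plates may deviate from this pattern.

```python
import re

# Layout taken from the description above: 2 letters (state),
# 2 digits (district), 6 alphanumeric characters (registration number).
# Assumption: plates that deviate from this layout are simply rejected.
PLATE_PATTERN = re.compile(r"^([A-Z]{2})(\d{2})([A-Z0-9]{6})$")


def parse_plate(text):
    """Split a plate string into its parts, or return None if it does not fit."""
    # Drop spaces, hyphens and other separators, then upper-case the rest.
    normalized = re.sub(r"[^A-Za-z0-9]", "", text).upper()
    match = PLATE_PATTERN.match(normalized)
    if match is None:
        return None
    state, district, registration = match.groups()
    return {"state": state, "district": district, "registration": registration}


print(parse_plate("TN 69 BA8107"))
# {'state': 'TN', 'district': '69', 'registration': 'BA8107'}
```

Normalizing the string before matching keeps the check tolerant of the spacing differences that an OCR stage typically produces.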

Sample License Plate.
Different techniques such as Optical Character Recognition (OCR) [5], Support Vector Machines [2], Convolutional Neural Networks [6, 7], and attention networks [8] are used to recognize the characters in real time. However, these methods cannot identify whether the recognized license plate belongs to the authorized vehicle. Thus, a conventional LPR system cannot authenticate the license plates of vehicles. Hence, a license plate authentication system plays a crucial role in the authentication process, i.e., validating the legality of the license plate.
Other than the license plate number, vehicle body features can be used for license plate authentication. From the vehicle features, the vehicle can be classified into its vehicle make and model (VMM). The classification of vehicles based on make and model is known as Vehicle Make and Model Recognition (VMMR) [8]. The VMMR process consists of two steps, of which the first is called vehicle localization, which is the identification of the vehicle from an image or a video.
The second step involves the identification of the VMM through the extraction of fine-grained details. The details of the vehicle, such as the make (e.g., Maruti) and the model (e.g., Celerio) for the vehicle shown in Fig. 2, are the outputs of the VMMR model. Recently, VMMR systems have attracted many researchers, industries, and governments, which have invested manpower, money, time, and energy [8] to develop real-time systems. Identifying the FLP based on vehicle features and LP details can improve real-time FLP detection [3].

Maruti Suzuki-Celerio car from the custom dataset.
The literature is classified into non-Image-based frameworks, single-image-based frameworks, and Multi-image-based frameworks, as shown in Fig. 3. The proposed method is shown in bold. Verifying the authenticity of license plates has been under research since the development of automobiles. Since then, patrol teams have verified the authenticity by manual inspection of vehicle details with the owner’s details after intercepting the vehicles. Automating the process started back in the 20th century.

Classification of License Plate Authentication frameworks.
The license plate authentication frameworks are grouped into three types (i) Non-Image based framework (ii) Single Image based framework (iii) Multiple Image based framework. Research started with the development of electronic methods [9] which used radio frequency tags to store the LP number, vehicle type, and color information, and validated the details using an electronic tag reader. This method can accurately recognize the vehicle. However, the owner should cooperate with the installation of electronic tags which makes it difficult to promote to drivers. This led to image-based frameworks.
With the development of computer vision systems, image-based systems started to be actively used in various domains of Intelligent Transportation Systems (ITS) like license plate recognition, vehicle make and model recognition, traffic management, law enforcement, and speed enforcement [3]. Since license plate number is the unique certification of a vehicle, many research works have been carried out in recognizing the license plate details, which is still an active area of research [10]. This research led to the development of Automatic Number Plate Recognition systems (ANPR) or License plate recognition (LPR) systems. This technology led to single image-based authentication frameworks.
The recognition methods were based on edge detection [11–13], morphological operations [14, 15], and template matching [16]. Edge detection and morphological methods were prone to noise. Hence, many optical character recognition (OCR) methods based on template matching were developed and used in LPR. Primarily, Tesseract OCR is used to detect the license plate details [5]. The authors in [17] proposed an OCR-based FLP recognition system that detects the FLP automatically using the rule that the same car cannot appear more than once in the ITS at the same time. However, this requires the real and fake plates to appear in the ITS simultaneously. Even though the OCR methods used can recognize characters from different languages, including English, French, and Arabic, they are prone to errors caused by challenging environmental conditions and varying character sizes. Researchers hence started to use machine learning models such as Support Vector Machines (SVM) [18] to improve the detection of the license plate on the vehicle. The SVM method can detect the license plate but cannot recognize the characters. Hence, deep neural networks were trained on character datasets, including an Iranian dataset [19] and a Bangla dataset [20]. These models recognized the characters but could not classify them in harsh environmental conditions such as rainy, shiny, and dusty conditions [21]. Hence, those models were trained with images captured in different conditions [22–24]. Even though these neural networks could recognize the characters in the LP, there are many constraints on real-time implementation on edge devices [25, 26]. Firstly, these networks are designed to segment and classify LPs based on training data. Secondly, the memory size of these networks is high when they are trained to classify LPs of different languages under different conditions. Hence, they require high computational power.
Some researchers have implemented the system for specific languages such as Arabic with static cameras such as CCTVs [27–29]. These models show an average detection accuracy of 93% at a detection time of 0.6 s. Blockchain-based LPR systems are also included in the development of Intelligent Transportation Systems [30]. This single-image framework uses only a license plate database to verify the recognized LPs. Hence, the system cannot trace whether the license plate belongs to the particular vehicle under authentication.
In parallel, researchers tried to identify the make and model from the shape of the vehicle. This process is termed vehicle make and model recognition (VMMR). Image-based VMMR methods rely either on handcrafted features or on convolutional neural networks. Image processing methods such as SIFT [31] and SURF [32] use handcrafted features for VMMR. With the introduction of Convolutional Neural Networks (CNN) [33], VMMR systems became more efficient at extracting image features than methods based on handcrafted features. The primary feature for recognizing the make is the vehicle logo, which is located on the front and rear of the vehicle. Logo-based recognition is therefore possible only with frontal and rear images [34–36]. A brief scheme to recognize the make and model of a vehicle using frontal images was introduced in [17]. Hence, the datasets developed for VMMR focus primarily on frontal and rear images. The authors in [37] constructed a dataset containing 11,500 logo images of 10 manufacturers and obtained an average accuracy of 99.07% on the recognized logo images. All the models trained on popular datasets such as the Stanford Cars dataset [38], the VMMRdb dataset [39], and the Vehicles images dataset rely on logo-based recognition. If the input image is a side view of the vehicle, these models fail.
To include a two-step verification, researchers started to combine VMMR data with LPR data. The author in [2] improved the FLP detection system by combining a vehicle make recognition system with the LPR system. When the license plate was interchanged between vehicles of different makes, the system detected the FLP with an authentication accuracy of 84%. In this work, the vehicle model is not considered for FLP detection. Vehicle make and model details are used with LP details to authenticate the LPs in [3, 40]. All these multi-image-based frameworks are trained with frontal images only. They do not authenticate vehicles using rear and side images, which makes them unsuitable for a robot-based FLP system. Moreover, these frameworks are not validated in real time on edge devices.
From the literature, the following gaps have been observed: There is no common public dataset for the LPR and VMMR tasks. An LP number database is not available to implement FLP systems. There is no robust multi-image-based framework to detect FLPs from the rear features of vehicles using VMM details and LP details. The existing VMMR models fail for side views of vehicles. Single-image-based frameworks are focused on improving character recognition. Multi-image-based frameworks are focused on improving the vehicle make and model details, and none has addressed the partial authentication case. The multi-image-based frameworks are not implemented on edge devices. The multi-image-based frameworks have not been implemented to recognize FLPs from real-world surveillance systems based on CCTVs and patrol robots.
In India, most of the criminals who cause a significant number of accidents and traffic violations use fake license plates (FLP). The CCTV cameras installed in today’s intelligent transportation systems can detect the LPs but cannot authenticate the vehicle details. This leads to fines for traffic violations being charged erroneously. Moreover, the patrol team receives false information about the vehicles causing accidents. To control the use of FLPs, the patrol team intercepts vehicles randomly and cross-verifies the vehicle details. To verify the vehicle details, the patrol team has to read the LP visually, write it down, and type it manually into the VAHAN mobile application of the Government of India. This primary process retrieves information about the owner and vehicle from its database, which is then cross-verified against the observed data. Thus, the system still relies on fully manual inspection, creating a serious concern for the safety of today’s smart cities and intelligent transportation systems. This process cannot be performed manually for all vehicles, since it is laborious and time-consuming. Recently, the Government of India introduced High-Security Registration Plates (HSRP), which can also be easily faked. Such cases are on the rise in India. From the literature [2], it was observed that research had been performed to detect FLPs using vehicle make detection and LP recognition with a support vector machine classifier on frontal images of vehicles. The total authentication accuracy of FLP detection was only 84%. Moreover, FLP detection using only frontal images was performed in [12] and [3]. These multi-image models are developed from frontal images only, without considering the other sides. They fail to authenticate if the frontal part of the vehicle is modified. Moreover, they fail to authenticate parked vehicles from rear and side images.
There is a need for a comprehensive authentication framework that uses multiple images from multiple views, i.e., the frontal, rear, right-side, and left-side views of the vehicle, to overcome the above-mentioned challenges. To acquire multiple-view images, mobile robot-based image acquisition is necessary. Thus, a robust license plate authentication model is necessary to deploy the mobile robot system. Hence, the objective of this work is to develop a multi-view-image-based license plate authentication framework with vehicle make and model recognition that authenticates a vehicle from its frontal, rear, and/or side images, instead of single-view frontal images of the vehicle.
The proposed framework uses an algorithm to detect and recognize license plates from multiple images. It then compares the results against a real-time database of registered license plates. The vehicle’s make and model are also identified to ensure the results are accurate. The proposed framework is designed to provide accurate and reliable license plate authentication. Moreover, this framework is feasible for implementation in mobile robot-based patrol systems that run on edge devices such as the Raspberry Pi.
Therefore, this work focuses primarily on the development of (i) a VMMR model to recognize vehicles from frontal, rear, or side images of vehicles and (ii) a multi-image-based framework to detect FLPs from VMM and LP details that is easily implementable on edge devices. Moreover, to train and test the performance of the framework, (i) a license plate database and (ii) an image dataset have been developed.
The core objective of this work is to develop a multi-view-image based license plate authentication framework with vehicle make and model recognition that authenticates a vehicle from its frontal, rear, and/or side images, instead of single view frontal images of the vehicle.
The contributions of this paper are listed below: (i) A license plate database is developed for 547 vehicles. (ii) An image dataset is developed with 3173 images of 547 vehicles in 8 car classes in the Indian scenario. (iii) An ensemble model is developed to recognize vehicle make and model from frontal, rear, and side images. (iv) A framework is developed to authenticate the license plate with frontal and rear images.
The remainder of this paper is organized as follows. Section IV discusses the data preparation method. The framework of the authentication system is reported in Section V. The experimental results and discussion are presented in Section VI. Section VII presents limitations, while the comparison with the literature is explained in Section VIII. Section IX introduces the database update module. Section X extends the framework to a scalable system, and Section XI sums up the discussion with the conclusion.
Data preparation
In the literature, VMMR models were trained using datasets such as Stanford Cars dataset [38], VMMRdb dataset [39], and Vehicles images dataset [41]. LPR models were trained using ChineseLP, OpenALPR-EU, SSIG-SegPlate, and UFPR-ALPR datasets.
There are many constraints in these datasets for developing a robust model. Firstly, the existing datasets do not contain images of all the sides of vehicles for VMMR. Secondly, the number of frontal and rear images is not balanced. Thirdly, license plate details are not available. Since the proposed approach to the problem includes both VMMR and LPR tasks, a custom vehicle database has been built with prominent cars available in India to cater to the need for the two tasks.
Indian Authentication database
An Indian Authentication database is developed, which consists of (i) A LP database for 547 vehicles and, (ii) An image dataset consisting of 3173 images of 547 vehicles in 8 vehicle classes. For diversity, vehicles from different parts of India were included in the database. The vehicles in the dataset were at different definite angles to analyze the performance of the framework. Images of electric vehicles have also been included, following the rise in the number of electric vehicles.
The frontal view, the rear view, and views from both sides were collected for VMMR, which was not attempted in the literature. The images were then grouped into eight car classes, captured in daytime conditions in Indian road scenarios. The frontal and rear images were collected from used-car websites such as carswale.com and Olx.com. Images of around 200 parked vehicles were captured using a static camera, which significantly increases the number of images in the dataset. As the images were captured using different mobile phone cameras of varying resolution, the images in the dataset have different sizes. Training recognition models with this dataset increases the robustness of the models.
To improve the robustness of the proposed model, it is trained on vehicles with occlusions, modifications, and different oblique angles. The list of images under different varieties is detailed in Table 1. This dataset can be used by researchers who need to develop a system with both VMMR and LPR systems combined.
Images List
Details such as license plate number were collected for around 547 vehicles and stored in an Excel database. The sample LP number for the Baleno class is tabulated in Table 3. As electric vehicles are fewer in India, the number of vehicles for Tata Nexon is only 15.
Sample Baleno Numbers
From the database, it can be observed that the LP varies according to the state, the district, and the number of vehicles in the particular district. The eight classes cover the Hatchback, Multi-utility vehicle (MUV), Multi-purpose vehicle (MPV), Sports utility vehicle (SUV), and Sub-compact car segments.
There are 3173 images in the dataset. The list of images under different classes is listed in Table 2.
Details of vehicles in the database
Figure 4 (a), (b), (c), (d) represent the frontal images and Fig. 4 (e), (f), (g), (h) represent the rear images captured at different angles. This includes images of electric vehicles with green number plates. Figure 4 (i), (j), (k), (l) represent side images. Figure 4 (m), (n), (o), (p) have images of modified vehicles. Occluded LPs are shown in Fig. 4 (q), (r), (s), (t). Figure 4 (u), (v), (w), (x) represent images of LPs at 0°, 20°, 60°, and 90°.

(a), (b), (c), (d) Frontal Images, (e), (f), (g), (h) Rear Images, (i), (j), (k), (l) Side Images, (m), (n), (o), (p) Modified Images, (q), (r), (s), (t) Occluded Images, (u), (v), (w), (x) License Plates at 90°, 60°, 20°, and 0°.
After dataset collection, the ground truth generation step was performed, as shown in Fig. 5, to train and test the CNN models. In this step, the location of the region of interest is specified manually. Following [42], the LabelImg tool was used to draw a rectangle, known as a bounding box, around the vehicle and to specify the class number. An annotation file consisting of the class number and the box coordinates was generated. To avoid overfitting, the annotation files and images were split into 70% for training and 30% for testing the neural network.

Ground Truth Generation Process.
In the literature, Dongyu Guo [2] detected Fake License Plates (FLP) based on the license plate (LP) number and vehicle make details. The vehicle make was classified by an SVM classifier, which is not suitable for a large number of classes. Wei Pan [3] detected FLPs based on vehicle color and license plate details. Both works were designed to detect FLPs using frontal images of vehicles only. Also, no work in the literature has focused on the real-time implementation of FLP systems on edge devices.
Hence, it is observed that today’s FLP detection systems are predominantly based on fully manual authentication. For this reason, the authors of the present work have developed a real-time FLP detection system based on Vehicle Make and Model (VMM) and LP details that is feasible for edge devices. Since a robust FLP detection system needs to recognize vehicles from any view, the VMMR model in the framework is trained to detect the VMM from frontal, rear, or side images of vehicles, which was not discussed in the literature. Moreover, the authors tested the FLP framework on real-time images and observed the results. The framework of the proposed authentication system, shown in Fig. 6, consists of three parts, namely (i) the Vehicle Make and Model Recognition (VMMR) module, (ii) the License Plate Detection and Recognition module, and (iii) the License Plate Authentication (LPA) module.
The VMMR network consists of (i) vehicle localization and (ii) vehicle classification. The localization network localizes the vehicle in the input image by predicting a bounding box around it. After localization, the vehicle is cropped and sent to the classification network. The recognized VMM details are used for authentication.
The second process in the framework is LPR, for which the cropped vehicle image from vehicle localization is sent as the input to the License Plate Detection and Recognition module. The LP is detected in the cropped vehicle image by the LP detection neural network, which predicts a bounding box. Subsequently, the license plate image is cropped and sent to an OCR to recognize the characters. The recognized characters are used for authentication.
In the LPA module, if both the VMM and the LP are recognized, the details are cross-verified with the LP database to establish authenticity. If there is a mismatch, the LP is classified as Unauthorized, i.e., a Fake License Plate (FLP). If the details match, the LP is classified as Authorized. As there is no loss of data, these cases require no manual inspection by the patrol team. However, if either the VMM or the LP is not recognized, 50% manual inspection is required for the missing data. If neither the VMM nor the LP is recognized, 100% manual inspection is required.

Block diagram of the framework.
During the training process, the localization network must receive images of the same size, so the images of the custom dataset are resized according to the neural network’s requirements. In the proposed framework, the training and testing images were resized to 224×224×3, the input size of the VGG16 model. The Visual Geometry Group (VGG16) network [43] was used for vehicle localization. It is a deep convolutional network with 16 weight layers. There are 13 convolutional layers with 3×3 filters, the smallest size that captures the notions of left/right, up/down, and center. The convolutional stride was fixed to one pixel. Spatial pooling was carried out by five max-pooling layers placed between the convolutional layers. Following the convolutional layers, three dense layers were used. All hidden layers were equipped with the ReLU activation function. Finally, the output was classified by the softmax layer. The car is detected only if the confidence score is ≥ 0.90. The proposed model predicts four values specifying the bounding box coordinates (xmin, ymin, xmax, ymax). These four values were used to draw a bounding box over the vehicle. The localized vehicle was cropped and sent to the vehicle classification network.
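The thresholding and cropping step can be illustrated as follows. This is a minimal sketch rather than the authors' implementation: the function name `crop_vehicle` is hypothetical, the image is represented as a plain list of pixel rows, and the four box values are assumed to be pixel coordinates ordered as (xmin, ymin, xmax, ymax).

```python
CONFIDENCE_THRESHOLD = 0.90  # detection threshold stated in the text


def crop_vehicle(image, box, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return the region inside the bounding box, or None if there is no valid detection.

    image: list of rows, each row a list of pixel values.
    box: (xmin, ymin, xmax, ymax) in pixels (assumed ordering).
    """
    if confidence < threshold:
        return None  # detection is not trusted, nothing to crop
    height, width = len(image), len(image[0])
    xmin, ymin, xmax, ymax = box
    # Clamp the box to the image so slightly out-of-range predictions still crop.
    xmin, xmax = max(0, xmin), min(width, xmax)
    ymin, ymax = max(0, ymin), min(height, ymax)
    if xmin >= xmax or ymin >= ymax:
        return None  # degenerate box
    return [row[xmin:xmax] for row in image[ymin:ymax]]


# Toy 4x4 "image" whose pixel values encode their own position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
print(crop_vehicle(image, (1, 1, 3, 3), confidence=0.95))  # [[5, 6], [9, 10]]
print(crop_vehicle(image, (1, 1, 3, 3), confidence=0.50))  # None
```

The cropped region would then be resized to the classifier's input size before being passed on.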
Vehicle classification
Cropping reduces the size of the image, which reduces the training time of the model. After detection, the images are cropped to remove unwanted information; for instance, backgrounds such as roads and trees are redundant. Vehicle classification is used to recognize and classify the localized vehicles. Transfer learning is used for the vehicle classification task. The cropped image is given as input to state-of-the-art networks. The base architectures used are Inception-ResNetV2 [44], DenseNet201 [45], MobileNetV2 [46], ResNet50V2 [47], and EfficientNetV2B3 [48]. These architectures are chosen for their distinctive qualities, such as scaling capability, mitigation of the vanishing gradient problem, and feasibility on edge devices. To reduce the training time, the pretrained layers of the state-of-the-art models are frozen and only the final layers are trained. The final layer consists of eight neurons with a softmax activation function, and categorical cross-entropy is used as the loss function to classify the vehicles. The Inception-ResNetV2, DenseNet201, MobileNetV2, ResNet50V2, and EfficientNetV2B3 networks were ensembled to improve the metrics [49]. The resulting models are known as ensemble models. The convolutional, pooling, dropout, and fully connected layers extracted the feature maps during the training process. The model was trained until the validation loss became negligible. Prominent fine-tuning methods for neural networks include hyperparameter tuning and transfer-learning-based tuning. Hyperparameter tuning involves tuning hyperparameters such as the learning rate, dropout rate, number of hidden layers, batch size, and number of epochs.
The common methods of hyperparameter tuning are (i) grid search, (ii) random search, (iii) Bayesian optimization, (iv) sequential model-based optimization, (v) genetic algorithms, (vi) gradient-based optimization, (vii) randomized search cross-validation, (viii) automated hyperparameter tools, and (ix) ensemble-based methods. Transfer-learning-based tuning involves adjusting the weights of the pretrained model to fit the current dataset. Moreover, changing the architecture of the pretrained model, for example the number of neurons, or freezing/unfreezing certain layers adapts the model to the target dataset. The vehicle classification model in the proposed framework is developed by ensembling five different models. The specific steps are detailed below:
Identify the base models
The base models for the ensemble are chosen based on the complexity of the data, the number of classes, whether the data are balanced, the presence of identical features, the available training infrastructure, transfer learning opportunities, and the computational resources available on the edge device. In the proposed model, Inception-ResNetV2, DenseNet201, MobileNetV2, ResNet50V2, and EfficientNetV2B3 are chosen for their distinctive qualities, such as scaling capability, mitigation of the vanishing gradient problem, and feasibility on edge devices. Moreover, all these models are predominantly used in vehicle detection and classification.
Modify the architecture of each base model
The model’s architecture is changed according to the specific requirements of the target task. The number of layers to be frozen and to be trained is decided based on the features to be extracted. In the proposed ensemble model, the number of neurons in the output layer of each model is changed to eight to match the number of classes. The final layer is trained with the softmax activation function to predict the probability of each class. The final layer of every base model is trained, while the inner layers are kept frozen, which reduces the training time of the model. Weights pretrained on the ImageNet dataset are used for detection and classification.
Select hyperparameters of individual models
Identify hyperparameters such as the learning rate, dropout rate, number of hidden layers, batch size, number of epochs, and number of neurons in the hidden layers specific to each neural network architecture. To fine-tune the proposed model, the learning rate, batch size, and number of epochs are selected as hyperparameters. Moreover, the choice of optimizer is analyzed with the developed dataset.
Select search space
The search space is the range of values over which the hyperparameters are tuned. The search space of the learning rate is 0.0 to 1.0. Typically, 32 or 64 is chosen as the batch size. The number of epochs is chosen in the range 500 to 2000. The values are set manually, and the validation accuracy of each individual model is analyzed. The Adam optimizer is selected since it requires a learning rate of only 0.001 for optimization.
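Setting the values by hand amounts to visiting points of the search space one at a time. The sketch below enumerates such points systematically; it is illustrative only. The specific learning-rate and epoch sample points are assumptions chosen inside the stated ranges, and `toy_validation_accuracy` is a stand-in for a real training and validation run.

```python
from itertools import product

# Sample points inside the ranges given in the text (the exact points are assumed).
SEARCH_SPACE = {
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64],
    "epochs": [500, 1000, 2000],
}


def candidate_configs(space):
    """Yield every combination of hyperparameter values as a dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def select_best(space, evaluate):
    """Return the configuration with the highest score from `evaluate`."""
    return max(candidate_configs(space), key=evaluate)


def toy_validation_accuracy(config):
    # Stand-in score, NOT a real training run: it simply favours small learning rates.
    return 1.0 - config["learning_rate"]


print(len(list(candidate_configs(SEARCH_SPACE))))  # 18
print(select_best(SEARCH_SPACE, toy_validation_accuracy)["learning_rate"])  # 0.001
```

In practice `evaluate` would train one base model with the given configuration and return its validation accuracy, which is far more expensive than this toy score suggests.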
Organize the dataset
To address class imbalance in a dataset, data augmentation techniques can be used. To train and test a neural network, the dataset is split into a training set and a testing set. The developed image dataset is split into 70 percent for training and 30 percent for testing, without any data augmentation.
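A 70/30 split of this kind can be sketched as below. The text does not state how the split was drawn, so a seeded random shuffle is assumed here; the file names are hypothetical placeholders, and `split_dataset` is an illustrative helper.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle the items reproducibly and split them into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder names standing in for the 3173 images of the dataset.
image_ids = [f"img_{i:04d}.jpg" for i in range(3173)]
train_ids, test_ids = split_dataset(image_ids)
print(len(train_ids), len(test_ids))  # 2221 952
```

Each annotation file would follow its image into the same subset, so splitting on image identifiers keeps the pairs together.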
Evaluate performance
Assess the performance of the model on unseen data that was not used during training or testing. Evaluate metrics such as accuracy, precision, recall, F1 score, and model size.
Ensemble model formation
The base models are combined to form an ensemble model. The inference time of the ensemble model’s predictions is analyzed for real-time implementation. Among the ensembling methods, average ensembling provides the best accuracy.
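Average ensembling can be illustrated with a few lines of code: the class probabilities predicted by the base models are averaged, and the class with the highest mean probability is selected. The sketch below uses made-up probability vectors for three models and three classes to stay short; the proposed framework uses five base models and eight classes.

```python
def average_ensemble(prob_vectors):
    """Average per-class probabilities across models.

    prob_vectors: one probability list per model, all of equal length.
    Returns (index of the winning class, averaged probabilities).
    """
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    averaged = [
        sum(probs[c] for probs in prob_vectors) / n_models
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=averaged.__getitem__)
    return best_class, averaged


# Illustrative softmax outputs (made-up numbers, not results from the paper).
predictions = [
    [0.6, 0.3, 0.1],  # model A favours class 0
    [0.2, 0.5, 0.3],  # model B favours class 1
    [0.5, 0.4, 0.1],  # model C favours class 0
]
best_class, averaged = average_ensemble(predictions)
print(best_class)  # 0
```

Averaging probabilities rather than counting votes lets a confident model outweigh a hesitant one, which is one common motivation for this form of ensembling.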
The model weights obtained can be used for future implementations. When tested, the models recognized the vehicle using the features observed and predicted the vehicle class number, which was later mapped to the class name. The performance of the networks was compared based on metrics like recall, precision, accuracy, and F1 Score. To ensure compatibility with low-cost edge devices like the Raspberry Pi, the model size of the networks is also compared.
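The metrics named above can be computed directly from the true and predicted class numbers. The following sketch assumes macro-averaging over classes, which the text does not specify; the function name and the toy labels are illustrative.

```python
def classification_metrics(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall and F1 (averaging mode assumed)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1_scores = [], [], []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Toy two-class example (made-up labels).
metrics = classification_metrics([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
print(metrics["accuracy"])  # 0.75
```

Model size, the remaining comparison criterion, is simply the size of the saved weight file on disk and needs no separate computation.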
License plate detection and recognition
Under this process, the cropped vehicle image from vehicle localization is sent to the License Plate Detection and Recognition module. Easy OCR (Optical Character Recognition) [50] is used as the License Plate Detection and Recognition module. This is because it is a lightweight model that supports character recognition in all popular scripts, including English, Latin, Chinese, and Arabic. Moreover, the size of the model is smaller than other state-of-the-art methods. The detection model is a regression model to predict the boundaries of the bounding boxes. The output of the module can be altered by changing the hyperparameters. In this module, the VGG16 neural network model is used as the base model to predict the bounding boxes. The OCR extracts the characters of the license plate after detection.
License plate authentication
The VMMR module provides the VMM details, and the LP detection and recognition module gives the LP number. When the make and model are not recognized from one view of the vehicle, the framework uses the next side of the vehicle for the authentication process. All four sides are used for recognition, one at a time. The framework infers that the vehicle is modified on any side where there is no recognition. The VMM details registered for the recognized license number are retrieved from the database and compared with the recognized VMM details. Authentication yields one of four cases: (i) if the details mismatch, the output is Fake/Not Authorized, thus performing Successful Un-authorization; (ii) if the actual and recognized details match, the output is Legit/Authorized, thus performing Successful Authorization; (iii) if the system cannot recognize either the VMM or the LP, the output is Partially Authorized, thus performing Partial Authorization; and (iv) if the system can recognize neither the VMM nor the LP, authorization cannot be performed, thus performing No Authorization.
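The four-case decision can be sketched as below. This is a hedged illustration rather than the authors' code: the function names, the use of `None` for an unrecognized view, the dictionary standing in for the license plate database, and the treatment of a plate missing from the database as a mismatch are all assumptions.

```python
def first_recognized(predictions):
    """Return the first non-None prediction across the vehicle's views."""
    for prediction in predictions:
        if prediction is not None:
            return prediction
    return None


def authenticate(vmm_by_view, lp_by_view, database):
    """Classify a vehicle into one of the four authentication cases.

    vmm_by_view / lp_by_view: per-view results, None where recognition failed.
    database: hypothetical mapping from LP number to registered make and model.
    """
    vmm = first_recognized(vmm_by_view)
    lp = first_recognized(lp_by_view)

    if vmm is None and lp is None:
        return "No Authorization"
    if vmm is None or lp is None:
        return "Partially Authorized"
    # Assumption: a plate that is absent from the database counts as a mismatch.
    if database.get(lp) == vmm:
        return "Legit/Authorized"
    return "Fake/Not Authorized"
```

Trying the views one at a time mirrors the fallback to the next side of the vehicle when one view is not recognized.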
Experimental results and discussion
The Indian Authentication database was uploaded to Google Drive from the Kaggle website for training and testing. The Google Colab Pro platform was used to train the neural networks. The hyper parameters of the detection and classification models were tuned for the best accuracy.
Vehicle localization
The VGG16 neural network, which is the state-of-the-art network for regression, was used in this step. The images from the database were resized to 224×224×3 images for the VGG16 neural network to process.
The bounding box coordinates were predicted. Python code was used to draw a blue rectangle using the coordinates. During the training, it was noted that the VGG16 achieved 99% top-1 accuracy in localization. The localized images are shown in Fig. 7. The average inference time for the detection is 2 sec. Here, it can be noted that the model detects the vehicles with frontal, rear, and side images and even in occlusions. From Fig. 7, it can be observed that the vehicles with shadows and at different angles are also easily localized. The first row shows the detection of multi-utility vehicles, while the second and third rows show the detection of Hatchback vehicles.

Results of Vehicle Localization.
Unlike the models in the literature, the proposed classification network was trained with frontal, rear, and side images of cars. Convolutional and pooling layers were used to extract the features of cars. The performance of the models was evaluated [51, 52] using Equations (1) to (4),
where TP = True Positive, where the model correctly predicts the positive class; FP = False Positive, where the model wrongly predicts the positive class; TN = True Negative, where the model correctly predicts the negative class; and FN = False Negative, where the model wrongly predicts the negative class.
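For reference, the sketch below computes the four metrics from the confusion counts. It assumes the standard definitions of accuracy, recall, precision, and F1 score, taken in the order that the later references to Equations (2), (3), and (4) suggest; the function name and the zero-denominator handling are assumptions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, recall, precision and F1 from confusion counts.

    Standard definitions are assumed. A ratio with a zero denominator
    is reported as 0.0 rather than raising an error.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    recall = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }
```

For a multi-class problem such as VMMR, these counts would be taken per class and the per-class metrics averaged; the averaging scheme is not stated in the text.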
The license plate authentication block classifies the vehicles as FLP if false positives and false negatives are observed. Therefore, the VMMR network for fake license plate detection should produce a minimum number of false positives and false negatives. This is a serious constraint in classification.
From Table 4, the recall, precision, and F1 score of ResNet50V2 are the lowest at 0.79, 0.82, and 0.79, respectively. This is the weakest of the base architectures chosen for this work. Hence, this network cannot be used for real-time FLP detection. EfficientNetV2B3, which is a specialized architecture for developing scaled networks, classifies the vehicles with recall, precision, and F1 score of 0.81, 0.83, and 0.81. MobileNetV2, which is designed for edge devices, shows an increase of 0.02 in precision. However, these networks are still not accurate enough for real-time use, since the resulting false positives and false negatives would make the FLP detection system unreliable. On the other hand, InceptionResNetV2, which is a combination of Inception and ResNet, shows an increase in the metrics. The dense connectivity in DenseNet201 increases the metrics further to 0.86, 0.89, and 0.86. The number of false positives and false negatives decreases for the ensembling methods and is lowest for the Average Ensembling method. Hence, according to Equations (2), (3), and (4), the recall, precision, and F1 score are at their best for Average Ensembling (0.92, 0.93, and 0.92), followed by Voting-based Ensembling (0.91, 0.92, and 0.91). The Maximum Ensembling method follows at 0.90, 0.91, and 0.91, respectively. Hence, the Average Ensembling method, which has the fewest false positives and false negatives, is the one to use in edge devices.
Metrics of Recognition models
As shown in Fig. 8, among the base architectures, the MobileNetV2 network, which was developed for edge devices, predicts the VMM with an accuracy of 71% within 48 ms. This is considered the base accuracy, followed by ResNet50V2 at 73%. InceptionResNetV2 increases the accuracy to 78%. The dense layers in DenseNet201 learn the features much better, which results in a marked increase to 82%. EfficientNetV2B3 classifies at 84%. The backbone architectures classify the VMM within 100 ms as their number of parameters is lower than that of the ensemble models. When the basic architectures are ensembled by the maximum, voting, and average methods, the run time increases to approximately 220 ms. The ensembling computations are responsible for this increase in run time.

Results of vehicle recognition by various architectures.
When these basic architectures are ensembled by the Maximum, Voting, and Average methods, the accuracy increases to 89%. The average ensembling method provides the best accuracy, followed by voting-based ensembling and maximum ensembling for images in the Indian Authentication database. This shows that the ensembling methods improve classification accuracy.
As shown in Fig. 9, the model size of the ensembling models is high compared to the other models, which matters for implementation in edge devices. The model size of InceptionResNetV2 is 629.6 MB, which makes it difficult to implement in edge devices with 2 GB, 4 GB, or 8 GB of RAM.

Memory Size Vs Accuracy of various architectures.
Based on a tradeoff between time, memory size, and accuracy, three models were selected for implementation, as shown in Table 5.
Selected Model
From these three models, it can be observed that the Ensemble model is ideal for Raspberry Pi boards with 8 GB RAM as it achieves high accuracy with a moderate memory size. The size is moderate since the models are trained on multi-view images such as frontal, rear, and side images, which are not available in other frameworks. In Figs. 10, 11 and 12, the vehicle in the image is recognized as Maruti Suzuki-Celerio. The blue bounding box in each image indicates the location of the vehicle in the given input image. Similarly, the blue text represents the recognized vehicle. Thus, the framework can detect the VMM from frontal, rear, and side images.

Vehicle Classification Output - Frontal Image.

Vehicle Classification Output - Rear Image.

Vehicle Classification Output - Side Image.
The LPR module was trained to recognize the LP from frontal and rear images. The performance of the model was evaluated in real time. The license plate detection was performed by a regressor neural network model, VGG16. The bounding boxes were generated around the license plate. The image within this bounding box was cropped and sent for character recognition. Easy-OCR was used for character recognition. The location of the vehicle was enclosed by a blue bounding box. As shown in Fig. 13, the blue text indicates the VMM. The LP is enclosed by a green bounding box if the confidence level is greater than 0.4.

Results of VMMR module and LPR Module (a) Clear Image (b) Blurred Image.
Figure 13(a) shows the VMM as Tata-Nexon with a confidence level of 99.99%. The license plate is recognized as MH12SY7020 with a confidence level of 62.54% since the vehicle is straight toward the camera.
Figure 13(b) also shows the VMM as Tata-Nexon with a confidence level of 99.99%. The license plate is recognized as MH01DB2864 with a confidence level of 47.54% since the image is blurred. The module has a better confidence level if the vehicle is straight toward the camera.
The performance of optical character recognition (OCR) for the license plate character recognition task depends on factors like the quality of the image in dots per inch (DPI) and the skewness of the LP characters in the captured image. The minimum resolution required for the framework is 200 DPI. The skewness increases as the angle between the optical axis of the camera and the axis of the license plate increases. This angle is known as the tilt angle. The tilt angle, with sample camera positions, is illustrated in Fig. 14.

Tilt angle representation.
The OCR accuracy is calculated as the number of characters correctly recognized out of the total number of characters. In our work, the total number of characters for each LP is ten. The skewness increases as the tilt angle of the captured image increases, thus reducing the OCR accuracy. If the tilt angle is zero, the skewness of the characters in the LP is zero. An increase in skewness increases the computation and therefore the inference time, which reduces the processing speed. The real-time accuracy and inference time are plotted for different tilt angles in Fig. 15.
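A minimal sketch of this character-level accuracy is shown below. Comparing characters position by position and ignoring spaces are assumptions made for illustration; the function name is hypothetical.

```python
def ocr_character_accuracy(recognized, ground_truth):
    """Fraction of ground-truth characters matched at the same position.

    Position-wise matching is an assumption; spaces are ignored so that
    'MH 12 SY 7020' and 'MH12SY7020' compare as equal.
    """
    recognized = recognized.replace(" ", "").upper()
    ground_truth = ground_truth.replace(" ", "").upper()
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one character")
    correct = sum(1 for r, g in zip(recognized, ground_truth) if r == g)
    return correct / len(ground_truth)
```

With ten characters per plate, each misread character lowers the accuracy by 10 percentage points, and an empty OCR result gives an accuracy of zero.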

OCR Accuracy and Inference time for varying license plate tilt angles.
From the plot, EasyOCR recognizes the characters with an accuracy of 100 percent in 0.79 seconds at zero-degree tilt angle. When the tilt angle is 90 degrees, the OCR couldn’t recognize any characters even after 1.3 seconds. Thus, the accuracy is zero.
The performance of EasyOCR is compared with other popular OCR engines, namely Tesseract OCR, Keras OCR, and Paddle OCR, and is tabulated in Table 6. The accuracy of the various OCR engines is observed for the images in the database at zero tilt angle on Raspberry Pi hardware. It is observed that EasyOCR provides the maximum accuracy of 99 percent with an inference time of 0.79 seconds. As EasyOCR is the lightweight model among the OCR engines, it is also well suited for real-time applications.
OCR Comparison
Although KerasOCR, which is built on the Keras deep learning framework, can recognize standard license plates and license plates with curly fonts, its inference time of approximately 1.86 seconds is longer than that of EasyOCR. Hence, it is comparatively inferior for real-time applications. Paddle OCR and Tesseract OCR recognize the license plates with an accuracy of 97 percent and 90 percent at an approximate inference time of 1.34 and 1.23 seconds, respectively. From this comparative study of OCR engines, it is evident that EasyOCR provides the maximum accuracy at the minimum inference time, which makes it well suited for real-time authentication. The extracted LP information can be used to authenticate the LP, which is explained in the next section.
The Authentication block performs the license plate authentication. If the VMM is not recognized, the framework uses the next side for the process. Fig. 16(a) and (b) show the rear and frontal images of a Tata Nano modified as a mini truck. The vehicle is not recognized from the rear image, whereas the same vehicle is recognized as Tata Nano from the frontal image. The license plate details recognized from the rear image and the vehicle make and model details recognized from the frontal image are sent to the authentication block for authentication.

Results of Tata Nano (a) Modified Rear (b) Unmodified Frontal.
After the recognition process, the authentication module compares the recognized VMM details and the LP details with the database and identifies the FLP. The results of the four cases, (i) Successful Un-authorization, (ii) Successful Authorization, (iii) Partial Authorization, and (iv) No Authorization, are explained in this section.
In the Successful Un-authorization case, the VMM and LP are recognized correctly, but there is a mismatch between the recognized details and the corresponding database details. Thus, the LP is identified as fake. Figure 17(a) represents a fake case where the frontal view of a Maruti Baleno is identified with the unauthorized number HR 26 ET 3568. The rear view of a Maruti Suzuki, as in Fig. 17(b), is also identified with the unauthorized number MH 03DK 0771. The vehicle is outlined with a red boundary to represent an unauthorized vehicle. Thus, the framework identifies FLP from rear and frontal images. This case does not require manual intervention from the patrol team.

Unauthorized plate/FLP successfully detected (a) Frontal Image (b) Rear Image.
The VMM of the vehicle in Fig. 18(a) and (b) is identified as Maruti Suzuki and Baleno, respectively. The LP number is recognized as MH05DS5345. These details were cross-verified with the database details and were found to match. So, the vehicle is classified as legit or authorized. The vehicle is outlined with a green boundary to represent an authorized vehicle. The framework worked for both rear and frontal images since the model was trained with frontal and rear images. This case does not require manual inspection from the patrol team.

Authorized plates successfully detected for (a) Frontal Image (b) Rear Image.
The Partial Authorization case is caused due to two reasons.
VMM Recognized and LP Unrecognized
Partial Authorization is a case where the system fails to recognize either the LP or the VMM. The VMM of the vehicle in Fig. 19 is identified correctly as Toyota and Innova, respectively, but the LP could not be recognized since its alphanumeric characters are in a curly font.

Partial Authentication - LP not recognized.
Thus, this case identifies the LPs that are not in standard format. The output is represented by a blue box, as shown in Fig. 19. As LP is not recognized, manual inspection of LP is needed, thereby requiring 50 percent manual inspection.
LP Recognized and VMM Unrecognized
This case occurs when the model encounters untrained vehicles or vehicles that are modified on all sides. Fig. 20(a) shows a Toyota passenger car. The framework cannot recognize the make and model from the frontal, rear, or side images of this vehicle. However, it recognizes the license plate characters as MH46AL7236 with 99% confidence.

Vehicle Make and Model Not Recognized (a) Toyota Glanza (b) Tata Nano modified as helicopter.
Sample images of a Tata Nano modified as a helicopter are shown in Fig. 20(b). The model is trained to recognize the Tata Nano, but could not recognize the modified Nano as all sides of the vehicle are modified. The license plate of the vehicle is recognized as BR01AU8108 with 92% confidence. At this stage, the framework sends an alert to the patrol team to identify the vehicle’s make and model manually.
In the No Authorization case, shown in Fig. 21, both the LP and the VMM were detected but not recognized.

LP and VMM not recognized.
Hence, this case requires a complete, 100 percent manual inspection by the patrol team to recognize both the LP and the VMM. However, such cases are rare.
As shown in Fig. 22, the Maruti Suzuki-Baleno was wrongly recognized as Hyundai-Verna due to the shiny surfaces of the vehicle. The LP number was identified as AS01DZ6325, which actually belongs to a Maruti Suzuki-Baleno. In this case, the LP was recognized correctly, but the VMM was recognized incorrectly. Since the number is in the database and the recognized VMM does not match the registered VMM, the authentication result is unauthorized. This is a limitation of the proposed framework.

Not authorized case - Vehicle make and model detected wrongly.
The confidence level of the proposed VMMR model is about 34% if all sides, including the frontal, rear, and both lateral sides, are modified. In this case, even though the vehicle is among the trained classes, the framework fails to authenticate it. From the experiments, it has been observed that the framework fails when the numbers are occluded or partly visible, or under varying lighting conditions and glare. Vehicles cannot be detected under adverse environmental conditions such as glare, rain, or fog, and nonstandard fonts, such as curly letters and numbers, are not recognized. Fixtures on the number plate are recognized as characters: for example, the screws fastened to the number plate are recognized as zero, as shown in Fig. 23(a), and the borders of the number plate are recognized as ‘I’ or ‘L’, as shown in Fig. 23(b). Occluded LPs can be detected, but their characters cannot be recognized. To solve this problem, a number plate recognition neural network can be trained with curly fonts, occluded numbers, and occluded letters and deployed in the framework instead of the Easy-OCR module. However, the number of neural networks would then increase to four. Thus, the total RAM consumed increases, making it less feasible to deploy in edge devices. This causes problems like real-time data leakage, reduced image frame rate, and an increase in inference time. Hence, the number plate is detected with a confidence level greater than 0.4 if the angle of elevation or angle of depression is zero. The total memory size of the framework using the Ensembling model with Easy-OCR is 308 MB, which is easily implementable in edge devices.

(a) Screws detected as 0 (b) LP border recognized wrongly.

Sample cases for Comparison with literature (a) Authorized Vehicle - Frontal Image (b) Unauthorized vehicle with FLP - License plate interchanged (c) Authorized Vehicle - Rear Image (d) Rear Image of unauthorized vehicle which uses LP of Maruti Suzuki-Celerio in fig (a) (e) Frontal Image of unauthorized vehicle which uses LP of fig (a) Maruti Suzuki-Celerio (f) Side image of vehicle in fig (e).
The comparison of the performance of the proposed framework with the literature is listed in Tables 7 and 8. The proposed framework performs license plate authentication more effectively since the neural network introduced for VMMR extracts features with convolutional neural networks, while [2] extracts salient areas of the image and converts them into visual codebooks, and [3] uses a neural network for VMMR and LPR. An SVM classifier was used to classify the vehicle in [2], whereas the proposed framework uses a softmax layer. Moreover, LP authentication was performed with frontal images of vehicles in [2, 3], whereas in the proposed framework it is performed with frontal, rear, and side images. Thus, the proposed framework is more robust than [2] and [3], as shown in Table 7. The comparison is performed with six cases. In the first case, as shown in Fig. 24(a), the input is the frontal image of an authorized vehicle with make and model Maruti Suzuki and Celerio, respectively. The vehicle has the authorized LP number MH03DK0771. All frameworks successfully classified the vehicle as authorized since they all use vehicle detection to classify the vehicles. In the second case, as shown in Fig. 24(b), the same vehicle with FLP number HR 26EJ5682 was used. All frameworks identified the FLP and classified it as unauthorized. The rear image of the vehicle used in the first case was the input for the third case. [2] and [3] could not recognize the vehicle in Fig. 24(c), whereas the proposed framework recognized it and classified it as authorized. In the fourth case, Fig. 24(d) shows the rear view of a Maruti Suzuki-Swift car that uses the LP of the Maruti Suzuki-Celerio from the first case. [2] and [3] could not recognize cars in rear view, whereas the proposed framework classified the LP as unauthorized. In addition, [2] and [3] require manual inspection in this case, whereas the proposed framework does not. The fifth case, shown in Fig. 24(e), is the frontal view of a Maruti Suzuki-Swift car with the LP of the Maruti Suzuki-Celerio. 
The results show that [2] classified the LP as authorized, whereas the proposed framework classified the LP as unauthorized. The results show that if the LP is interchanged between vehicles of different make and model (e.g., Maruti Suzuki-Celerio and Tata-Nano), both the literature and the proposed framework identify the FLP, but if the LP is interchanged between vehicles of similar make and model (e.g., Maruti Suzuki-Celerio and Maruti Suzuki-Swift), the proposed framework and [3] identify the FLP, whereas [2] does not. Moreover, the proposed framework recognizes the VMM from rear images, while [2] fails. In the sixth case, the VMM is recognized in Fig. 24(f) by the proposed framework, whereas [2] and [3] fail. When the proposed framework was tested with a sample of 100 fake license plates, it authenticated almost 91% of the vehicles correctly since it could authenticate the LP from frontal, rear, and side images. Further, the proposed framework can be easily implemented in low-cost mobile robot patrol systems. The factors affecting the accuracy of the model are (i) extreme lighting conditions, (ii) modifications to the vehicle body, (iii) similar inter-class vehicles, (iv) view angle of the license plate, and (v) the vehicle side under authentication.
End to End Result Comparison
Comparison with literature
The conventional models from the literature are trained with frontal images only. The logo on the frontal images is one of the distinctive features for Vehicle Make and Model Recognition (VMMR). The accuracy is around 98% for frontal and rear images. For vehicles that are modified at the front and rear, side images provide the distinctive features. The vehicles can be recognized with 97% accuracy from side views, except for the classes that have overlapping features. The Type 1 and Type 2 errors are high for some inter-class vehicles when viewed from the side (e.g., Maruti Suzuki Celerio and Maruti Suzuki Baleno), reducing the accuracy of the model to around 91%, as these inter-class vehicles have similar features in the side view. The accuracy is around 97% in all views for other inter-class vehicles (e.g., Toyota Innova and Tata Nano), as their features are very distinct in all views. The vehicle’s year of manufacture is not considered in this research; i.e., the Maruti Suzuki Celerio 2019 version and the Maruti Suzuki Celerio 2022 version are intra-class vehicles, and the proposed VMMR model recognizes both versions as Maruti Suzuki Celerio with 99% accuracy. The license plate is visible at 90 degrees. As the angle of view decreases, the accuracy decreases. This effect is the same for the best LPR systems. Since the model is developed from images of different cases, including outdoor and indoor scenes, it can recognize the vehicle features in low and bright lighting conditions. However, extreme brightness or extreme darkness reduces the distinct features, which affects the accuracy of the model and leads to the partial authorization case.
The major challenges include (i) parallel processing of license plate authentication and robot motion control, (ii) shortage of Random Access Memory (RAM), (iii) data loss due to blurring of images, (iv) authentication error due to OCR, and (v) position of the license plate on the vehicle. Parallel processing of the license plate authentication and control algorithms requires high computational power and hence high RAM. This can be addressed by developing a framework with a minimum model size. Moreover, choosing the 8 GB variant among the 2 GB, 4 GB, and 8 GB variants can enhance performance. Pruning the model reduces the model size but also reduces the accuracy of detection. Image data loss such as blurring occurs when the mobile robot moves along the roadways, as in Fig. 25. This can be avoided by developing an algorithm that selects authentication or locomotion depending on the confidence level of the detected vehicle. This algorithm will also help in solving the RAM problem.

Patrol Robot.
This reduces the inference time when deployed in edge devices. The authentication error for OCR can be solved by using simple techniques like position matching and autonomous mobile robot alignment. For occluded license plates, the license plate can be viewed by using a camera moving mechanism on the mobile robot.
The database update module (DUM) is used to update the existing vehicle database with images of new vehicles. In our work, the modification of a vehicle side is defined based on the confidence level of recognition. A confidence level in the range of 85 to 94 percent for vehicle make and model recognition (VMMR) from a vehicle side image indicates that the vehicle structure has undergone a slight modification on that particular side. This is termed minor modification. A confidence level less than 85 percent indicates that the vehicle structure is severely modified on that side, which is defined as major modification. A confidence level greater than 94 percent implies that the vehicle is not modified on that side. Thus, the three levels of confidence are denoted as CL1 for a confidence level greater than 94 percent, CL2 for a confidence level in the range of 85 to 94 percent, and CL3 for a confidence level less than 85 percent. Some examples of minor modification are (i) bumper modification, (ii) light modification, and (iii) windshield modification, whereas a Tata Nano modified as a helicopter, a Tata Nano modified as an autorickshaw, and a Maruti Celerio modified as a Rolls Royce are examples of major modification. The steps for adding images to the dataset based on minor and major modification are represented as a flowchart in Fig. 26. In this work, the dataset for eight classes is developed manually by capturing images of all vehicle sides. To expand the dataset, images of new vehicles should be captured from all sides manually. The proposed model recognizes the make and model of unmodified vehicles from any side with confidence level CL1. Adding these images will not benefit the existing model, and hence these images are excluded from the dataset update. The condition box in brown verifies the CL1 value at all sides. If all the sides are recognized with CL1, the images are sent for exclusion, as shown in red. Vehicle sides that are slightly modified are recognized with CL2. 
If all sides are recognized as the same class, the sides with CL2 are added to the corresponding class in the dataset. Thus, a minor modification to an existing class is identified and added to the corresponding class in the dataset. This increases the number of images in the existing class of the dataset. Moreover, vehicle sides that are severely modified are recognized with CL3. The features of severely modified vehicles would alter the features of the existing trained classes. Hence, severely modified vehicle sides of trained classes are not added to the existing class of the dataset. The features of vehicles that have undergone major modification on all sides (e.g., a Tata Nano modified as a helicopter lookalike) are completely distinct. Hence, these vehicles are considered a new make and model and added to the dataset as a new class. Moreover, the features of new vehicle makes and models (e.g., Mahindra Thar and Audi R8) are completely distinct compared to the existing classes. Hence, the images of these vehicles are recognized with CL3. These images are added to the dataset in a new folder. The methodology prompts the user to enter the class name for the major modification or new make and model. After the addition, the methodology prompts the user for the license plate number.

Flowchart of Database Update Module.
The existing models in the ensemble model are retrained if the number of images is at least 100. All the base models are trained until the validation accuracy is greater than 95 percent. Subsequently, the sides of the next vehicle (i + 1) are processed by the methodology.
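The decision logic of the database update module can be sketched as follows. This is an illustrative reading of the description, not the flowchart itself: the function names, the return format, and the `manual_review` fallback for sides that disagree on the class are assumptions, since that situation is not covered in the text.

```python
def confidence_level(confidence_percent):
    """Map a recognition confidence (in percent) to CL1, CL2 or CL3."""
    if confidence_percent > 94:
        return "CL1"   # side not modified
    if confidence_percent >= 85:
        return "CL2"   # minor modification
    return "CL3"       # major modification or new make and model


def plan_update(side_predictions):
    """Decide what to do with the images of one vehicle.

    side_predictions: list of (class_name, confidence_percent), one per side.
    Returns (action, indices of the sides whose images are to be added).
    """
    levels = [confidence_level(conf) for _, conf in side_predictions]
    classes = {name for name, _ in side_predictions}

    if all(level == "CL1" for level in levels):
        return ("exclude", [])
    if all(level == "CL3" for level in levels):
        # Treated as a new class; the user is prompted for its name.
        return ("new_class", list(range(len(side_predictions))))
    if len(classes) == 1:
        # Only the CL2 sides are added; CL3 sides of a trained class are not.
        minor = [i for i, level in enumerate(levels) if level == "CL2"]
        return ("add_to_existing", minor)
    # Assumption: the text does not cover sides that disagree on the class.
    return ("manual_review", [])


def should_retrain(image_count, minimum_images=100):
    """Retrain only once enough images are available."""
    return image_count >= minimum_images
```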
For multi-image-based vehicle authentication, a vehicle database should consist of (i) An image dataset consisting of vehicle images and (ii) A license plate (LP) number database. In this work, the vehicle database consists of 3173 images in the image dataset and 547 LP numbers in the LP number database. The collected images and the LP numbers are grouped into 8 classes based on the make and model of the vehicle and stored in cloud. The methodology for integrating new vehicle databases into the existing framework is as shown in Fig. 27.

Database Update Module in the proposed framework.
Database update module has been introduced into the proposed license plate authentication framework for including new vehicle make and model into the framework. The new images of vehicle sides fed to the DUM, are further sent to the vehicle make and model recognition module (VMMR) to recognize the vehicle make and model. If the VMM of any one side is recognized with a confidence level of 95 percent, the VMM is used to check the legality of the license plate. Also, the VMM of all sides of the vehicle is sent to the DUM to categorize minor and major modifications/new make and model. The images of new make and model/major modifications and minor modifications are updated to the existing dataset. The retrained network is loaded to the VMMR module to recognize the modifications and new make and models. The license plate number of the corresponding vehicle is updated manually into the license plate database of the vehicle database. The case studies for integrating modified vehicle databases are explained below.
When the framework is exposed to side images of new vehicles, the VMMR module in the framework predicts the VMM for each side. Fig. 28 shows a modified Tata Nano. Compared to the manufacturer's standard vehicle structure and appearance, it is observed that the bumper is modified at both the frontal and rear sides. The features of the modified bumper reduce the confidence level of recognition at all sides. The frontal, rear, and side images in Fig. 28 are recognized as Tata Nano with confidence levels of 91.12, 93.08, and 89.13 percent, respectively.

Bumper Modification of Tata Nano.
Hence, these images are added to the Tata Nano folder and the model is retrained to recognize the vehicle. This retrained model is used for further recognition.
The vehicle shown in Fig. 29 is an autorickshaw made by modifying the structure of a Tata Nano. It is observed that the structure is severely modified at the frontal, right, and left sides. The rear side of the vehicle is severely modified by placing steel rings. The features of the vehicle are thus severely modified on all sides. Hence, the vehicle is recognized with confidence levels of 49.54, 83.99, and 79.61 percent at the frontal, rear, and left sides, respectively. The vehicle is categorized as a major modification. The framework prompts the user for a new class name. Since the class is new, a new folder is created and the images are added to the new class. The license plate number, KL10N6381, is noted and entered manually into the license plate database. The number of images for this new model is only three. When the number of images for this make exceeds 100, the images are used to train the existing VMMR model. After the training process, the model would be able to recognize this make and model with a confidence level greater than 95 percent.

Tata Nano modified as Autorickshaw
The database update module (DUM) provides scalability by adding minor and major modifications to the current dataset. The retrained model is able to recognize vehicles with minor modifications at a 95 percent confidence level. If the confidence level of recognition remains below 95 percent at all sides even after retraining through the DUM, the scalable model methodology is used.
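The confidence checks performed around the DUM can be illustrated with a short sketch. The 95 percent threshold comes from the text. The paper does not state how the boundary between minor and major modifications is drawn, so the cut-off `MAJOR_MODIFICATION_BELOW` is purely a placeholder, chosen so that the two case studies fall into the categories reported above. All function names are hypothetical.

```python
CONFIDENCE_THRESHOLD = 95.0      # percent, from the text
MAJOR_MODIFICATION_BELOW = 85.0  # percent, hypothetical cut-off (not given in the paper)


def usable_for_authentication(side_confidences, threshold=CONFIDENCE_THRESHOLD):
    """True if at least one side is recognized at or above the threshold."""
    return any(c >= threshold for c in side_confidences.values())


def categorize(side_confidences, threshold=CONFIDENCE_THRESHOLD,
               major_below=MAJOR_MODIFICATION_BELOW):
    """Label a vehicle from its per-side confidence levels (in percent)."""
    lowest = min(side_confidences.values())
    if lowest >= threshold:
        return "trained class"
    if lowest < major_below:  # assumed rule: a very weak side signals a major change
        return "major modification"
    return "minor modification"


def needs_model_expansion(side_confidences_after_retraining,
                          threshold=CONFIDENCE_THRESHOLD):
    """True if every side is still below the threshold after DUM retraining."""
    return all(c < threshold for c in side_confidences_after_retraining.values())


if __name__ == "__main__":
    nano_bumper = {"front": 91.12, "rear": 93.08, "side": 89.13}
    autorickshaw = {"front": 49.54, "rear": 83.99, "left": 79.61}
    print(categorize(nano_bumper))    # minor modification
    print(categorize(autorickshaw))   # major modification
```

The confidence values in the usage example are the ones reported for the two case studies. A real deployment would need a categorization rule supplied by the framework's designers.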
The steps shown in Fig. 30 are described below:

Scalable model of the proposed license plate authentication framework.
Evaluation metrics such as precision, recall, and F1 score are calculated for the current ensemble model. The confidence level of recognition for the new vehicle sides is also noted.
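As a reminder of how these metrics are obtained, the sketch below computes per-class precision, recall, and F1 score from true and predicted labels using the standard definitions. The function name `per_class_metrics` and the four example labels are illustrative assumptions and are not results from this paper.

```python
def per_class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} from two equal-length label sequences."""
    classes = sorted(set(y_true) | set(y_pred))
    metrics = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = (precision, recall, f1)
    return metrics


if __name__ == "__main__":
    # Made-up labels for illustration only.
    y_true = ["Maruti_Eeco", "Maruti_Eeco", "Maruti_Omni", "Maruti_Omni"]
    y_pred = ["Maruti_Eeco", "Maruti_Eeco", "Maruti_Eeco", "Maruti_Omni"]
    for name, (p, r, f) in per_class_metrics(y_true, y_pred).items():
        print(name, round(p, 3), round(r, 3), round(f, 3))
```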
Model Loading
This step involves selecting and loading neural network models for license plate authentication.
Select New Network Architectures
For real-time implementation, networks with a small memory footprint are preferred for ensemble expansion. In our work, SqueezeNet is selected for model expansion because its required memory size is less than 5 MB.
Set Initial Thresholds
The threshold parameter and its value for forming the new ensemble model are set at this step. In our work, a validation accuracy of 95 percent is selected as the threshold for including a model in the new ensemble model.
Initial Ensemble Composition
The proposed ensemble model, comprising InceptionResNetV2, DenseNet201, MobileNetV2, ResNet50V2, and EfficientNetV2B3, is the initial model. The model is trained to recognize eight classes with a confidence level (CL) greater than 95 percent. This ensemble model is retrained by the database update module to recognize minor and major modifications.
Model Expansion
When the ensemble model cannot recognize the vehicle makes and models even after retraining, it is expanded with the new neural network architectures. The current and new architectures are trained and tested on the updated database. In this framework, the models with validation accuracy greater than 95 percent form the updated ensemble model, which is used for vehicle classification. If a model does not achieve 95 percent validation accuracy, the next model is trained on the updated dataset. Average ensembling is used in this work as the combination strategy. After the updated ensemble model is formed, it is validated with more images of the trained classes. If the confidence level of recognition at all sides is greater than 95 percent, the model is loaded onto the hardware.
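The selection and averaging steps can be sketched in a few lines. This is a simplified illustration, not the authors' code: `select_models` and `average_ensemble` are hypothetical names, the accuracies and probability vectors are made up, and treating a model that reaches exactly 95 percent as acceptable is an assumption, since the text uses both "greater than" and "of 95 percent".

```python
def select_models(validation_accuracy, threshold=0.95):
    """Return the names of models whose validation accuracy meets the threshold."""
    return [name for name, acc in validation_accuracy.items() if acc >= threshold]


def average_ensemble(probabilities):
    """Average per-model class-probability vectors.

    `probabilities` is a list with one probability vector per model.
    Returns (index of the winning class, averaged probability vector).
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[k] for p in probabilities) / n_models for k in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return best, mean


if __name__ == "__main__":
    # Made-up validation accuracies, for illustration only.
    accuracy = {"DenseNet201": 0.96, "SqueezeNet": 0.95, "InceptionResNetV2": 0.91}
    print(select_models(accuracy))

    classes = ["Maruti_Eeco", "Maruti_Omni"]
    # One made-up softmax output per selected model for a single side image.
    outputs = [[0.7, 0.3], [0.4, 0.6], [0.8, 0.2]]
    best, mean = average_ensemble(outputs)
    print(classes[best], [round(m, 3) for m in mean])
```

Averaging the probability vectors lets a confident majority outweigh a single model that confuses two similar classes, which is the effect sought when the ensemble is expanded.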
Remote Model Update
The updated ensemble model is loaded onto the Raspberry Pi through wireless communication.

Maruti_Omni and Maruti_Eeco recognized as Maruti_Eeco.
The inference time for vehicle classification is noted. If the inference time is greater than 0.5 seconds, hardware accelerators are selected to distribute the computational load. Thus, the computational resources of the system are shared for real-time inference.
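The latency check can be sketched as below. The 0.5 second budget is taken from the text. The helper names and the dummy `predict` function are assumptions made for illustration; on the device, `predict` would be the ensemble's classification call.

```python
import time

LATENCY_BUDGET_S = 0.5  # seconds, from the text


def mean_inference_time(predict, sample, runs=10):
    """Average wall-clock seconds taken by predict(sample) over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        predict(sample)
    return (time.perf_counter() - start) / runs


def needs_accelerator(seconds, budget=LATENCY_BUDGET_S):
    """True when the measured inference time exceeds the budget."""
    return seconds > budget


if __name__ == "__main__":
    def dummy_predict(image):
        # Stand-in for the ensemble forward pass.
        time.sleep(0.01)
        return "Tata_Nano"

    elapsed = mean_inference_time(dummy_predict, sample=None, runs=5)
    print(f"{elapsed:.3f} s, accelerator needed: {needs_accelerator(elapsed)}")
```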
Testing
The scalability of the framework for an evolving and growing vehicle database is tested in four scenarios: (i) make and model of a trained class, (ii) minor modification, (iii) major modification, and (iv) two makes and models with similar vehicle structure.
Make and Model of trained class
When the scalable model is validated with the make and model of a trained class, all sides of the vehicles are recognized with a confidence level of 99 percent by the current ensemble model. Model expansion is not required. The current ensemble model classifies the vehicles in 0.2 seconds when deployed on the Raspberry Pi.
Minor Modification
When the scalable model is validated with vehicles having minor modifications, all sides of the vehicles are recognized with a confidence level of 99 percent by the current ensemble model. This outcome is observed because the database update module in the framework retrains the current ensemble model for minor modifications. Model expansion is not required. The current ensemble model classifies the vehicles in approximately 0.3 seconds when deployed on the Raspberry Pi.
Major modification/New make and model
The current ensemble model recognizes vehicles with major modifications/new makes and models with a confidence level of 99 percent. This outcome is observed because the database update module in the framework retrains the current ensemble model for major modifications. Model expansion is not required. The current ensemble model classifies the vehicles in approximately 0.3 seconds when deployed on the Raspberry Pi.
Two makes and models with similar vehicle structure
Rarely, the confidence level decreases because of the similarity in vehicle structure between two classes. This occurs when the features of a trained vehicle are similar to those of a major modification/new make and model. More detailed features are then required to classify the make and model effectively. Hence, new base models such as SqueezeNet are trained alongside the existing models on the updated dataset. In our work, the proposed model predicted the side view of Maruti_Omni as Maruti_Eeco with a confidence level of 96.22% even after training. Including Maruti_Omni in the database also reduced the confidence level for Maruti_Eeco, since the side views are similar.
Hence, model expansion is needed to extract more features and make a better decision. Among the models, DenseNet201, MobileNetV2, ResNet50V2, EfficientNetV2B3, and SqueezeNet reached a validation accuracy of 95 percent at 500 epochs. Ensembling these models produced a confidence level of 95 percent at all sides, which improved the recognition of the Omni and Eeco classes. The updated ensemble model is loaded onto the Raspberry Pi for implementation. The average inference time observed for make and model recognition is 0.34 seconds, so this model is used for real-time license plate authentication.
Conclusion
In this work, a multi-image authentication framework is proposed based on vehicle make, model, and license plate details from frontal, rear, and side images of vehicles. Because no dataset with frontal, rear, and side images was available, the authors developed a dataset of 547 Indian cars, together with a license plate database for these vehicles. To classify vehicles from side images, an ensemble model was proposed and compared with state-of-the-art neural networks for VMMR. The average ensembling method achieved an accuracy of 0.89 for vehicle classification, which is better than the state-of-the-art networks considered. License plate recognition was implemented using the Easy OCR package, chosen because of the limited memory available for real-time recognition. The total memory size of the authentication framework, including VMMR and LPR, is only 308 MB, which is suitable for real-time mobile robot patrol using edge devices such as the Raspberry Pi. The performance of the system was analyzed in four distinct authentication cases. When the system was validated with 100 FLP images, it authenticated 91% of the images correctly. From the results of the test cases, we observed that this framework reduces the dependence on a fully manual license plate authentication system. In the future, more vehicle features such as manufacturing year, color, and license plate color could be added to the multi-image framework. Recognizing the level of modification to the vehicle would improve the model. The model could also be trained to recognize vehicle sides so that, according to the position of the license plate, the patrol robot can be aligned to recognize the license plate characters, which would increase the accuracy of license plate recognition with OCR. The model should be retrained automatically for every new vehicle make and model; this could be done seamlessly with blockchain technology.
Acknowledgments
The authors acknowledge the Management of Thiagarajar College of Engineering and National Institute of Technical Teachers Training and Research for their support in completing the research.
