Abstract
Image recognition is sensitive to its recognition parameters, which leads to low recognition accuracy, long recognition time and large storage cost. Therefore, an automatic image recognition method based on the Boltzmann machine is proposed. Image distortion correction is performed on the basis of the threshold method and the fuzzy set method. The mean filter and median filter are combined to eliminate the influence of image noise, and image preprocessing is completed with fuzzy image enhancement. Based on the restricted Boltzmann machine, the network model is dynamically evolved, and the recognition parameters of each shape and contour are obtained. Different shapes and contours are then classified and recognized. Simulation results show that the proposed image recognition method has high recognition ability, shortens the time cost and greatly reduces the space needed for node storage.
Introduction
With the development and progress of science and technology, recognition technology, such as voice recognition, image recognition, and spam SMS or email recognition, has moved from simple theory into our daily life. Image recognition technology is essential for fingerprint unlocking on mobile phones and computers, for clock-in machines, and for facial-recognition check-in systems. Image recognition technology has developed from simple digit recognition to object recognition and face recognition, and the related technologies are also developing and maturing [1]. However, more and more kinds of objects need to be classified and recognized, and the content of the recognized objects is becoming more and more complex [2, 3]. Therefore, improving the recognition rate is of great significance. For example, the face recognition mentioned above is directly related to security [4, 5].
Image recognition generally includes image preprocessing, image segmentation, image feature extraction, feature matching, recognition and classification, and many scholars have done a lot of research on it. In order to make better use of the hidden feature information in image data, Yi and Deng [6] proposed a multi-channel convolutional neural network image recognition method based on image gradient, which takes multi-directional gradient information as the basic expression of edge information. First, the image is processed by the Sobel operator, and four gradient images in the horizontal, vertical and two diagonal directions are obtained. Then, four multilayer convolutional neural networks are established to learn the features of the four gradient images. Next, the four directional features are randomly fused to obtain the sample features, which are then processed by batch normalization. Finally, the classification results are obtained by a classifier. Compared with a single-channel convolutional neural network, the recognition error rate on the two databases is reduced by 9.85% and 0.38% respectively. This method has good generalization ability, but its recognition accuracy is not high enough. Tang et al. [7] introduced a convolutional neural network for unsupervised training on the basis of the traditional generative adversarial network (GAN), and extended the GAN into a conditional model through the conditional generative adversarial network. Combining the advantages of both, a conditional deep convolutional generative adversarial network model is established that exploits the powerful feature extraction capability of the convolutional neural network. On this basis, conditional auxiliary generation of samples is used, and the structure is optimized, improved and applied to image recognition.
The method can effectively improve image recognition accuracy, but its image correction step is not detailed enough. Hua [8] proposed a face recognition method based on Zernike feature extraction and a linear discriminant analysis (LDA) classifier. First, the image is normalized and cropped. Second, the Zernike moment is used to extract the global features, and principal component analysis is used to reduce the dimension of the features. Finally, a decision fusion algorithm is used to fuse the local and global features, and the final feature vector is used as the input of the LDA classifier for face recognition. This method has a lower computational cost, but its contour recognition accuracy is not high. A calculation model of the heat release rate and fuel type of circulating pool fire is established in [9]. The pool diameter, average flame height, heat release rate and fuel type are estimated by using flame video image recognition technology and the inverse modeling method of traditional fire dynamics theory. In the process of image recognition, automatic seed placement and region growing are used to separate the flame from the non-flame elements of each frame. This method is effective in removing non-flame elements and improves the flexibility of the model for large-scale flame video, but its recognition accuracy for contour images is not high. Corti et al. [10] used an oscillatory neural network based on VO2 insulator-metal transition switches for image recognition. The VO2 oscillator is fabricated on silicon with CMOS technology. A fully connected network of coupled oscillators is studied with programmable resistors as the coupling elements. In this method, image information input and data processing are carried out in the time domain. In particular, the phase relationship between oscillators can be controlled by adjusting the coupling resistance, which is used to memorize and identify patterns in analog circuits.
The noise reduction effect of this method is poor, and its treatment of image details is not deep enough. Building on the above five methods, this paper combines the Boltzmann machine algorithm with digital image processing technology, so that UAVs can give full play to their unique advantages of simplicity, flexibility and wide application range. Image processing is carried out on the images obtained by UAV aerial photography to improve the recognition accuracy of image contours, gray images and other aspects, and to complete the automatic recognition of the required target. This saves a lot of human and material costs and achieves fast and accurate intelligent monitoring and judgment [2].
Automatic image recognition based on Boltzmann machine algorithm
Image preprocessing
Geometric correction of the image mainly corrects the nonlinear distortion of the digital camera lens. The camera used is a common digital camera, and there is optical distortion at the edge of the image. The distortion shifts the actual image point positions, displaces the image point coordinates, changes the ground position of the actual object, and finally affects the matching accuracy and the generation of digital orthophoto products. Therefore, the distortion must be corrected before aerial triangulation densification and the other later stages of image processing can be carried out.
Lens distortion has radial deformation and tangential deformation. For lens distortion correction of image, the self-inspection method in aerial triangulation operation of regional network is mainly used to introduce possible system errors, including the actual measurement focal length
Among them,
In the formula,
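The radial and tangential terms described above can be illustrated with a short sketch. It assumes the widely used Brown-style distortion model with two radial coefficients (`k1`, `k2`) and two tangential coefficients (`p1`, `p2`) around a principal point (`x0`, `y0`); these names, the number of terms and the sign convention are assumptions for illustration and may differ from the self-calibration model used in the paper.

```python
import numpy as np

def distortion_offsets(x, y, x0, y0, k1, k2, p1, p2):
    """Radial + tangential displacement (dx, dy) at image point (x, y).

    Assumed Brown-style model; coefficient names are hypothetical.
    """
    xb, yb = x - x0, y - y0            # coordinates relative to the principal point
    r2 = xb**2 + yb**2                 # squared radial distance
    radial = k1 * r2 + k2 * r2**2      # radial distortion factor
    dx = xb * radial + p1 * (r2 + 2 * xb**2) + 2 * p2 * xb * yb
    dy = yb * radial + p2 * (r2 + 2 * yb**2) + 2 * p1 * xb * yb
    return dx, dy

def correct_points(points, x0, y0, k1, k2, p1, p2):
    """Subtract the modeled displacement from measured image points (N x 2)."""
    pts = np.asarray(points, dtype=float)
    dx, dy = distortion_offsets(pts[:, 0], pts[:, 1], x0, y0, k1, k2, p1, p2)
    return np.column_stack([pts[:, 0] - dx, pts[:, 1] - dy])
```

With all coefficients set to zero the points are returned unchanged, which is a quick sanity check before plugging in calibrated values.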
The steps of image preprocessing are as follows:
Step 1: mean filtering
Suppose that the image
where
The degree of blur in the image increases as the neighborhood size increases. The threshold method is often used to reduce this blurring effect [13], which is expressed as:
where, nonnegative real number
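As an illustration of Step 1, the following is a minimal sketch of a mean filter with the threshold rule. It assumes the common form of the rule, in which a pixel is replaced by its neighborhood mean only when it differs from that mean by more than a nonnegative threshold `T`; the function names, the square window, the default `T` and the edge padding are assumptions, not details taken from the paper.

```python
import numpy as np

def mean_filter(img, size=3):
    """Neighborhood average over a size x size window (size odd, edge-padded)."""
    img = np.asarray(img, dtype=float)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    # Sum the shifted copies of the image that make up each window.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def thresholded_mean_filter(img, size=3, T=20.0):
    """Replace a pixel by its neighborhood mean only if it deviates by more than T."""
    img = np.asarray(img, dtype=float)
    avg = mean_filter(img, size)
    return np.where(np.abs(img - avg) > T, avg, img)
```

Pixels that are already close to their neighborhood mean are left untouched, which is how the threshold limits the blurring introduced by plain averaging.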
Step 2: median filtering
Firstly, an area
Median filtering produces less blurring and is more suitable for eliminating isolated noise in images. Its prominent feature is that it suppresses sharp impulse noise without significantly degrading the edges and details of the image, so it preserves edges better than the mean filtering method.
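A minimal sketch of Step 2 is given below, assuming a square window of odd size and edge padding (both are illustrative choices, not taken from the paper). It shows the two properties mentioned above: an isolated impulse is removed, and a step edge is left intact.

```python
import numpy as np

def median_filter(img, size=3):
    """Replace each pixel by the median of its size x size neighborhood."""
    img = np.asarray(img, dtype=float)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    # Collect every shifted view of the image that belongs to the window.
    windows = [padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
               for dy in range(size) for dx in range(size)]
    return np.median(np.stack(windows, axis=0), axis=0)
```

In practice a library routine would usually be preferred for speed; the explicit version is only meant to make the operation visible.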
Step 3: fuzzy image enhancement
With the extensive application of fuzzy set theory in various fields, its value in image processing and recognition should not be underestimated. The fuzzy enhancement method for UAV images can achieve better results than the traditional method [14]. Fuzzy image enhancement combines the theory of fuzzy mathematics with the concept of image enhancement. Its basic idea is to transform the image from the spatial domain to the fuzzy domain by using a membership function to obtain the fuzzy feature plane, to enhance the image on the fuzzy feature plane, and finally to transform it back to the spatial domain to obtain the enhanced image. The basic principle of the Pal fuzzy enhancement method is introduced as an example.
The membership function of the Pal fuzzy enhancement method is:
where,
In the formula, in order to achieve the purpose of enhancing contrast, it is necessary to increase those
where
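The three stages of Step 3 (fuzzification, enhancement on the fuzzy plane, and the inverse transform) can be sketched as follows. The sketch assumes the classical Pal-King formulation, with membership `mu = (1 + (x_max - x) / Fd) ** (-Fe)` and the standard contrast intensification operator; the parameter names `Fd`, `Fe` and `x_c` (the crossover gray level, defaulting to the image mean) are assumptions for illustration.

```python
import numpy as np

def pal_king_enhance(img, Fe=2.0, x_c=None, passes=1):
    """Fuzzy contrast enhancement in the style of Pal and King (assumed form)."""
    x = np.asarray(img, dtype=float)
    x_max = x.max()
    x_c = x.mean() if x_c is None else float(x_c)
    if x_max <= x_c:                       # constant image: nothing to enhance
        return x.copy()
    # Choose Fd so that the membership equals 0.5 at the crossover level x_c.
    Fd = (x_max - x_c) / (2.0 ** (1.0 / Fe) - 1.0)
    mu = (1.0 + (x_max - x) / Fd) ** (-Fe)            # spatial -> fuzzy domain
    for _ in range(passes):                            # intensification operator
        mu = np.where(mu <= 0.5, 2.0 * mu**2, 1.0 - 2.0 * (1.0 - mu)**2)
    out = x_max - Fd * (mu ** (-1.0 / Fe) - 1.0)       # fuzzy -> spatial domain
    return np.clip(out, 0.0, x_max)
```

Memberships above 0.5 are pushed toward 1 and those below 0.5 toward 0, so gray levels above the crossover become brighter and those below become darker.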
On the basis of the above image preprocessing, the image is recognized. The recognition algorithm is based on the restricted Boltzmann machine (RBM), which removes the connections between nodes in the same layer. It is a generative stochastic network that can learn the probability distribution of its input set [15]. As shown in Fig. 1, the RBM model is a bipartite graph with a visible layer and a hidden layer: the former is used to input data, the latter is used to extract features, and there are no connections between nodes within the same layer. The values of RBM nodes are random binary values of 0 or 1, where 1 represents neuron activation and 0 represents inhibition. The joint probability distribution satisfies the Boltzmann distribution.
Restricted Boltzmann diagram.
Based on the restricted Boltzmann graph, the energy function of the restricted Boltzmann machine is constructed, and its expression is as follows:
where,
In the formula,
where
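For reference, the energy function and conditional activation probabilities of a binary RBM can be sketched as below. The sketch assumes the standard form `E(v, h) = -a·v - b·h - vᵀWh`, with `W` of shape (visible, hidden), `a` the visible biases and `b` the hidden biases; the symbol names are assumptions and may not match the notation of the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def energy(v, h, W, a, b):
    """Energy of a joint configuration (v, h) of a binary RBM."""
    v, h = np.asarray(v, dtype=float), np.asarray(h, dtype=float)
    return -(a @ v) - (b @ h) - (v @ W @ h)

def p_h_given_v(v, W, b):
    """Probability that each hidden unit is active given the visible layer."""
    return sigmoid(b + np.asarray(v, dtype=float) @ W)

def p_v_given_h(h, W, a):
    """Probability that each visible unit is active given the hidden layer."""
    return sigmoid(a + W @ np.asarray(h, dtype=float))
```

Because there are no connections within a layer, each conditional factorizes over units, which is what makes the sigmoid expressions above exact under this energy.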
The RBM model has achieved remarkable results in many areas, such as dimension reduction, recognition and feature extraction. It can be trained in a supervised or unsupervised way according to the task [17]. A simple method for training a stack of multi-layer RBM models is the contrastive divergence algorithm proposed by Hinton. First, initialize the visible layer under the given data, and then iterate
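One update of contrastive divergence with a single Gibbs step (often called CD-1) can be sketched as follows. This is a minimal illustration under assumed conventions: `v0` is a batch of binary visible vectors of shape (batch, visible), `W` has shape (visible, hidden), and the learning rate `lr` and the use of hidden probabilities in the gradient estimate are common choices rather than details confirmed by the paper.

```python
import numpy as np

def cd1_step(v0, W, a, b, lr=0.1, rng=None):
    """One CD-1 parameter update for a binary RBM; returns new (W, a, b)."""
    rng = np.random.default_rng() if rng is None else rng
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    v0 = np.asarray(v0, dtype=float)
    # Positive phase: hidden probabilities and samples driven by the data.
    ph0 = sig(b + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible layer, then recompute hidden probs.
    pv1 = sig(a + h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sig(b + v1 @ W)
    n = v0.shape[0]
    # Difference between data-driven and reconstruction-driven statistics.
    W_new = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a_new = a + lr * (v0 - v1).mean(axis=0)
    b_new = b + lr * (ph0 - ph1).mean(axis=0)
    return W_new, a_new, b_new
```

Calling `cd1_step` repeatedly over mini-batches gives a basic training loop; stacking several RBMs trained this way, layer by layer, is the usual route to a multi-layer model.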
According to the dynamic evolution of the above network model, the degree of each image node is calculated, and the image recognition parameters are obtained. The identification parameters can be obtained only through this calculation, which simplifies the calculation process and reduces the cost of the identification method. More parameters are obtained with less computation to describe the shape contour features. Then the calculation formula of image recognition parameters is:
where,
After the recognition parameters of each contour are obtained according to the above steps, different contour can be classified and recognized. The specific steps are as follows:
Step 1: set up identification groups
The sample group and the test group of the shape contours are established respectively. The sample group is a set of shape contours with known classification. Different sample contours are divided into different sample contour groups according to the known categories. The number of contour samples in the sample group can be determined according to the actual situation of the recognition task. The test group is a set of shapes and contours to be determined [23].
Step 2: calculation and identification results
The sample group and test group of the shape contour are marked as
Step 3: group by contour
According to the calculation results, the contours of the test group are divided into the known groups. When the second-order norm
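The grouping step can be illustrated with a small sketch. It assumes that each contour is described by a vector of recognition parameters and that a test contour is assigned to the sample group containing the parameter vector at the smallest second-order (Euclidean) norm distance; the nearest-sample decision rule, the function name and the dictionary layout are assumptions for illustration, since the exact criterion is not recoverable from the text.

```python
import numpy as np

def classify_contours(sample_groups, test_params):
    """Assign each test parameter vector to the closest known sample group.

    sample_groups: dict mapping a group label to an array of shape
                   (n_samples, n_params) of known contours (hypothetical layout).
    test_params:   array of shape (n_tests, n_params).
    """
    labels = list(sample_groups)
    results = []
    for t in np.atleast_2d(np.asarray(test_params, dtype=float)):
        # Smallest L2 distance from the test vector to any sample in each group.
        dists = [np.linalg.norm(np.asarray(sample_groups[g], dtype=float) - t,
                                axis=1).min() for g in labels]
        results.append(labels[int(np.argmin(dists))])
    return results
```

For example, with one group around the origin and another around (10, 10), a test vector at (1, 1) is assigned to the first group and one at (9, 8) to the second.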
To sum up, we first preprocess the image, then, based on the restricted Boltzmann machine, complete the dynamic evolution of the network model, extract the image recognition parameters, and complete the classification of different shapes and contours.
Experimental environment
Simulation experiments are carried out to verify the accuracy, time cost and storage space of the proposed method. The automatic image recognition method based on the Boltzmann machine is analyzed by comparing it with the multi-channel convolutional neural network image recognition method based on image gradient proposed in [6] and the conditional deep convolutional generative adversarial network image recognition method proposed in [7].
In order to provide a unified experimental platform, make the experimental data obtained by different recognition methods and the experimental data of the same method in different experimental environments comparable, and ensure the validity of the simulation experimental data, all experiments in this paper will be performed in the following listed hardware and software environments.
Hardware environment:
CPU: Intel i5 with a frequency of 2.5 GHz
Memory: 4 GB
Software environment:
Operating system: Mac OS X
Simulation software: 4 GB memory space
Under this unified software and hardware platform, the recognition method in this paper is compared with the multi-channel convolutional neural network image recognition method based on image gradient proposed in [6] and the conditional deep convolutional generative adversarial network image recognition method proposed in [7].
Plane database
The Plane database is a dynamic database of fighter attitude profiles from Shenyang Aircraft Industry Group. The overall profiles of different groups are similar to each other, while the capture angle and profile within the same group also differ. Overall, it is more complex than the MPEG database. Therefore, in recognition, the first six images of each group, 42 in total, form the sample group, and the other 168 form the test group to verify the effectiveness of the method. The number of shape profiles in the sample group accounts for 20% of the total number of shape profiles in the whole database. Some Plane fighter profile database samples are shown in Fig. 2.
Some Plane fighter profile database samples.
The MPEG database is a static database with data sourced from networks; it has fewer recognition objects and relatively simple contour shapes. When recognizing the shapes and contours, we take the first image of each group to form the sample group, 12 images in total, and the remaining 108 images form the test group to verify the effectiveness of the method. The number of shape profiles in the sample group accounts for 10% of the total number of shape profiles in the whole MPEG database. Some MPEG shape profile database samples are shown in Fig. 3.
Some MPEG shape profile database samples.
The ORL face database consists of a series of human face images taken by Olivetti Laboratory from April 1992 to April 1994, with a total of 40 subjects of different ages, genders and races. ORL is the most widely used standard face database, containing 15 groups of gray-scale human face images, 11 in each group, for a total of 165. Different shooting angles are used to obtain various facial expressions and details under different lighting conditions. All database images have uniform specifications, with a resolution of 100 × 100, 256 gray levels, and BMP image format. Some ORL face database samples are shown in Fig. 4.
Some ORL face shape contour database samples.
Plane contour image recognition
Because a smaller distance threshold is conducive to reflecting the details of the shape and contour, the initial value
Shape and contour image of fighter.
Comparison of recognition accuracy of fighter shape and contour image by three methods.
The recognition accuracy on fighter shape and contour images is compared among the proposed automatic image recognition method, the multi-channel convolutional neural network image recognition method based on image gradient, and the conditional deep convolutional generative adversarial network image recognition method. The comparison results are shown in Fig. 6.
According to Fig. 6, the recognition accuracy of the proposed method on fighter contour images improves as the number of iterations increases, and exceeds 80% at 50 iterations. At the other iteration counts, the accuracy of the proposed method is 5% higher than that of the image-gradient-based method and more than 10% higher than that of the conditional deep convolutional generative adversarial network method. Therefore, the proposed method achieves the highest recognition accuracy of the three methods.
After calculating the determination parameter
MPEG shape contour image.
Comparison of MPEG shape contour image recognition accuracy of three methods.
The recognition accuracy on MPEG shape contour images is compared among the proposed method, the multi-channel convolutional neural network image recognition method based on image gradient, and the conditional deep convolutional generative adversarial network image recognition method. The comparison results are shown in Fig. 8.
Comparison of recognition accuracy of face shape contour image by three methods.
Number of gray level contour points corresponding to different gray level thresholds.
According to Fig. 8, the recognition accuracy of all three methods rises slowly between 30 and 80 iterations. At 30 iterations the proposed method already exceeds 40%, while the other two methods remain below 30%. As the number of iterations increases, the proposed method reaches 84% at 100 iterations, compared with 50% for the conditional deep convolutional generative adversarial network method and 25% for the image-gradient-based method, which is 34 and 59 percentage points higher, respectively. Therefore, the proposed method achieves higher accuracy than the comparison methods in contour image recognition.
When extracting the gray-scale contour of an image, the gray-scale threshold
The proposed method, multi-channel convolutional neural network image recognition method based on image gradient and conditional deep convolution generative adversarial network image recognition method are used to compare and analyze the recognition accuracy of face shape and contour images. The comparison results are shown in Fig. 9.
As can be seen from Fig. 9, the proposed recognition method can effectively extract features from face shapes and contours, achieving a recognition accuracy of 90%. In contrast, the maximum face contour recognition accuracy of the conditional deep convolutional generative adversarial network method is 30%, and that of the image-gradient-based method is 79%, which are 60 and 11 percentage points lower than that of the proposed method, respectively. Therefore, the proposed method achieves high recognition accuracy in face contour recognition.
Storage space
The first gray image of the first group in ORL database is selected as the analysis object, and the number of nodes of the model based on different gray thresholds is shown in Fig. 10.
It can be seen from Fig. 10 that the network model has the fewest nodes, 372, when the gray threshold is 100, and 885 nodes when the gray threshold is 160. The average number of gray-scale contour points extracted by the network model over the 8 groups of gray-scale thresholds is 528. Without considering the dynamic allocation of storage units, the proposed recognition method, the multi-channel convolutional neural network method based on image gradient and the conditional deep convolutional generative adversarial network method are compared, with the storage for the multiple groups of gray-scale contours allocated separately. The number of contour nodes, the size of the adjacency matrix and the total number of storage units are shown in Table 1.
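To make the relation between node count and storage concrete, the sketch below tabulates the adjacency matrix size for a given number of contour nodes. It assumes a dense n × n adjacency matrix and counts the total as the nodes plus the matrix entries; both assumptions, as well as the function and key names, are illustrative and are not taken from Table 1.

```python
def adjacency_storage(node_counts):
    """Dense-storage estimate for each node count (assumed n + n*n units)."""
    rows = []
    for n in node_counts:
        rows.append({
            "nodes": n,
            "adjacency_entries": n * n,      # dense n x n adjacency matrix
            "total_units": n + n * n,        # assumed: node list plus matrix
        })
    return rows

# Node counts quoted in the text for gray thresholds 100 and 160.
for row in adjacency_storage([372, 885]):
    print(row)
```

Because the matrix grows quadratically with the node count, a threshold that yields fewer contour nodes reduces storage far more than proportionally under this assumption.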
Storage cost comparison of three identification methods
