Abstract
At present, the automatic attendance modes used in distance education are not conducive to confirming and analyzing information after class. To study an effective automatic recognition algorithm for the distance education classroom, this study takes the Internet+ intelligent innovation and entrepreneurship education classroom as an example. Facial features are adopted as the basis of recognition, corresponding positioning points are established, and a precise positioning method is constructed for real-time feature capture. The ASM algorithm is used to extract facial features and is refined to improve the extraction results. In addition, this paper proposes the Gabor wavelet packet set and the Gabor beamlet set for auxiliary recognition, which improves the recognition rate. Finally, experiments are designed to analyze the performance of the proposed algorithm. The results show that the proposed algorithm has practical value and can provide a theoretical reference for subsequent related research.
Introduction
With the rapid development and increasing popularity of information technology, informatization has spread into many fields of social development. The education system and its models have been strongly affected, which has promoted new education models such as distance education and adult education. Online education moves beyond traditional pen-and-paper teaching, improves the efficiency of teachers, and saves considerable manpower, material, and financial resources. It also optimizes teaching management, improves teaching quality, and removes geographical and time constraints. With these advantages, the scope of application of online education is expanding. For fast-growing online education, traditional video surveillance cannot meet the monitoring requirements. It is therefore necessary to develop computer monitoring systems that realize comprehensive information monitoring and support the smooth operation of online education systems. Such systems have good research and application prospects, and face recognition technology plays an important monitoring role in the intelligent entrepreneurship education classroom.
Research on two-dimensional face feature point positioning has made considerable progress, and many classic algorithms have emerged. The statistical active shape model (ASM) and active appearance model (AAM) proposed by T. F. Cootes et al. are representative. Both are based on a point distribution model, in which a set of discrete constrained points describes the shape of the object. Principal component analysis is used to model the variation of the constraint points, and an iterative search finds the optimal solution that locates each feature point [1]. ASM searches faster than AAM, but it is weaker than AAM in matching image texture. Moreover, when ASM and AAM locate feature points on face images with large expression changes, closed eyes, or obvious illumination changes, the positioning accuracy decreases rapidly [2]. In recent years, Tang et al. introduced deep convolutional networks and multi-task learning into feature point localization. This method also performs well under large pose deflection. However, it requires a large number of training samples to learn features, and the learning process is complicated. To locate face feature points under large pose deflection and occlusion at the same time, Wu et al. used a cascaded regression model that also predicts the occlusion [3].
Compared with two-dimensional face feature point location, three-dimensional face feature point location is a relatively new research topic. Early 3D face recognition algorithms took the point closest to the 3D scanner as the tip of the nose, and Lee also studied this approach. The method is easy to implement, but it is sensitive to noise and does not apply to points other than the tip of the nose [4]. Based on the curvature differences among feature points on the facial surface, Dorai et al. proposed using the Shape Index to locate feature points [5]. The method is invariant to scale and rotation, but it can only locate points with obvious curvature features, such as eye corner points, nose points, and mouth corner points, and it is not suitable for large expressions or noisy points [6]. To locate the feature points of 3D faces automatically and in real time, Zhang Xiaobo et al. proposed a local descriptor based on histogram statistics and surface division and located each feature point by calculating the similarity of the descriptors [7]. Zhao et al. proposed a statistical method for facial feature point localization. It builds a statistical face feature model by training a global 3D face deformation model together with the local texture and local shape description of the feature points. Although the method achieves 99.09% accuracy on the FRGC v1.0 face database, its feature point positioning precision is not high, and the real-time performance of the algorithm needs to be improved [8]. Stefano et al. located the tip of the nose according to the gray value and curvature features on the depth map and used Laplacian of Gaussian and difference of Gaussian calculations to locate the two nose points on either side of the nose tip. Finally, the scale invariant feature transform descriptor was used to locate the eye corner points and the mouth corner points. This method positions well when the expression changes little, but it cannot handle large expressions [9]. Jahanbin et al. located feature points by fusing two-dimensional and three-dimensional modalities, and the fused results have high positioning accuracy. However, this method requires a one-to-one correspondence between the two-dimensional and three-dimensional data, so it is not widely applicable [10].
In 1888 and 1910, Galton [11] published two articles on face recognition in the authoritative international journal Nature, which opened up the exploration of human face recognition. After Bledsoe and Chan began research on automatic face recognition in the mid-1960s, more and more scholars gradually joined the field [12]. Since the 1990s, face recognition technology has developed rapidly, and many classic recognition algorithms have emerged. Today, a growing number of well-known research institutions and commercial companies, including Google, Baidu, Alibaba, Netease, Tencent, and Microsoft, are involved in face recognition research. Face databases collected in different environments are constantly being supplemented. Moreover, the standards for evaluating algorithm performance have gradually improved, face recognition research keeps moving forward, and face recognition products have begun to enter people’s lives [13]. For example, the access control systems of this year’s National Day military parade, the APEC summit, and the Second World Internet Conference held in Wuzhen [14] all applied face recognition technology. Other examples are Alipay’s face payment and the face recognition and authentication function of the Netease mailbox [15]. With the continuous improvement of hardware such as modern computers and face acquisition systems, and with continued breakthroughs in face recognition algorithms, face recognition technology will become an indispensable part of people’s daily life [16].
At present, the widely used automatic identification methods in distance education mainly confirm information by locking personnel on the screen, which is not conducive to confirming and analyzing information after class [17]. Starting from the automatic identification problem in distance education, this study uses feature recognition technology to analyze automatic identification in the Internet+ intelligent and innovative classroom, with the aim of improving the efficiency of this education method and promoting the smooth development of distance education.
Related work
This study proposes a method based on ASM+GWT for the precise location of face feature points, and the method is suitable for both 2D and 3D face feature points. The method consists of two steps: coarse positioning and precise positioning. Because coarse positioning requires a fast search but only modest accuracy, ASM is first used to roughly position the face feature points. Next, the similarity between the Gabor wavelet packet set at each point of the coarse location area of a feature point and the beamlet set of the corresponding manually marked feature point in the training set is calculated, which yields a similarity map of the coarse location area for each feature point. Finally, the point corresponding to the maximum value in each similarity map is selected as the final position of the feature point. The experimental results on FRGC v2.0 and OLD show that the proposed algorithm locates facial feature points accurately under illumination changes and large expression changes and has high positioning accuracy and strong robustness [18].
Appropriate preprocessing of two-dimensional images and three-dimensional point clouds before feature point positioning improves the performance of the subsequent algorithm. A two-dimensional face image is first normalized. Then, to weaken the influence of illumination and enhance the contrast of the image, this study applies white balance processing to all two-dimensional face images, using the classic mirror (perfect reflector) method. For the 3D point cloud, the isolated noise points are first removed, and the point cloud is then converted to a depth map after linear interpolation on the x and y coordinates.
According to imaging theory, when a light source of any color illuminates a pure white reflecting surface, the color of the reflected light is the color of the light source.
Face feature image processing
Image preprocessing
Assume that the picture I(x, y) contains a pure white area. Under the canonical light source, the R, G, and B channel values of any pixel in that area reach the maximum of 255, or a given pure white value. Under any other light source, the pure white points are still the brightest points in I(x, y), but their pixel values are smaller than the values of the pure white points under the canonical light source. Therefore, as long as the pixel values of the R, G, and B channels are normalized to the pixel values of the pure white points, the white balance of I(x, y) can be achieved [19].
In the formula,

White balance effect image of a two-dimensional image.
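The white balance normalization described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in this study: the function name, the use of the per-channel maximum as the reference white, and the rounding to 8-bit values are assumptions made for the example.

```python
import numpy as np

def white_balance_perfect_reflector(image, white=255.0):
    """Scale each channel so that its brightest value maps to `white`.

    `image` is an H x W x 3 array (R, G, B). Using the per-channel maximum
    as the reference white point is an assumption of this sketch.
    """
    img = np.asarray(image, dtype=np.float64)
    # The brightest value in each channel stands in for the pure white point.
    channel_max = img.reshape(-1, 3).max(axis=0)
    # Guard against division by zero for an all-black channel.
    gain = white / np.maximum(channel_max, 1e-12)
    balanced = np.clip(np.rint(img * gain), 0.0, white)
    return balanced.astype(np.uint8)

# Toy example: a 1 x 2 image captured under a reddish light source.
demo = np.array([[[200, 100, 50], [100, 50, 25]]], dtype=np.uint8)
balanced_demo = white_balance_perfect_reflector(demo)
```

After the correction, the brightest pixel of the toy image becomes pure white, and the other pixel keeps its relative brightness in every channel.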
During acquisition, 3D face data may contain outliers whose positions change abruptly, for example because of changes in reflectivity, and these outliers often lead to incorrect depth information when mapped to the depth map. Therefore, before the 3D point cloud is converted to the depth map, it needs to be denoised to remove these outliers. Let d be a dynamic threshold. If the average distance between a point and its neighbor point set is greater than d, the point is an outlier and needs to be removed. The dynamic threshold d is defined as:
In equation (3), μ is the average distance between a point of the 3D face point cloud and its neighbor point set, and σ is the variance of that distance. μ and σ are obtained through training, as shown in Fig. 2.

Smoothing of 3D face data.
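The outlier test described above can be illustrated as follows. Because the exact form of equation (3) is not reproduced here, the sketch assumes a threshold of the form d = μ + k·σ, with σ taken as the standard deviation of the mean neighbor distances; the factor `k_sigma`, the neighbor count `k_neighbors`, and the estimation of μ and σ from the cloud itself (rather than from training data) are all hypothetical choices.

```python
import numpy as np

def remove_outliers(points, k_neighbors=8, k_sigma=2.0):
    """Drop points whose mean distance to their nearest neighbors exceeds d.

    The threshold d = mu + k_sigma * sigma is an ASSUMED form; the source
    defines d through mu and sigma but its exact equation is not shown.
    """
    pts = np.asarray(points, dtype=np.float64)
    # Pairwise Euclidean distances (brute force, adequate for a small sketch).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # After sorting, column 0 is the point itself (distance 0), so skip it.
    nearest = np.sort(dist, axis=1)[:, 1:k_neighbors + 1]
    mean_dist = nearest.mean(axis=1)
    mu, sigma = mean_dist.mean(), mean_dist.std()
    d = mu + k_sigma * sigma  # assumed dynamic threshold
    keep = mean_dist <= d
    return pts[keep], keep

# Toy cloud: a 5 x 5 planar grid (1 mm spacing) plus one far-away point.
grid = np.array([[x, y, 0.0] for x in range(5) for y in range(5)])
cloud = np.vstack([grid, [[100.0, 100.0, 100.0]]])
clean_cloud, keep_mask = remove_outliers(cloud, k_neighbors=4)
```

For large scans, a spatial index such as a k-d tree would replace the brute-force distance matrix; the thresholding logic stays the same.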
Since holes are left where the outliers are removed, the holes need to be filled by the cubic interpolation algorithm. The effect of converting a 3D point cloud into a depth map is shown in Fig. 3. In addition, a preprocessing step is needed to cut the face region out of the three-dimensional point cloud. Since the 3D face point cloud reflects the true size of the face, this paper uses a spherical area centered on the tip of the nose with a radius of 90 mm to obtain the face area, and only the mapping points inside the spherical area are retained on the depth map, as shown in Fig. 3. The cutting of the face area can be performed after the feature positioning, and the positioning method of the nose point is described later in this section.

Preprocessing of 3D point cloud.
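The spherical cropping step can be sketched as below. The function name and the convention that coordinates are given in millimeters are assumptions of the example; the nose tip is taken as an input because its localization is handled separately.

```python
import numpy as np

def crop_face_sphere(points, nose_tip, radius_mm=90.0):
    """Keep only the points inside a sphere centered on the nose tip.

    `points` is an N x 3 array in millimeters and `nose_tip` a 3-vector.
    """
    pts = np.asarray(points, dtype=np.float64)
    center = np.asarray(nose_tip, dtype=np.float64)
    inside = np.linalg.norm(pts - center, axis=1) <= radius_mm
    return pts[inside]

# Toy example: three points within 90 mm of the nose tip, one outside.
scan = np.array([[0.0, 0.0, 0.0],
                 [50.0, 0.0, 0.0],
                 [0.0, 89.9, 0.0],
                 [100.0, 0.0, 0.0]])
face_region = crop_face_sphere(scan, nose_tip=[0.0, 0.0, 0.0])
```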
The ASM algorithm is a method based on a point distribution model, which represents the contours of a human face by a series of points. The algorithm consists of three parts: marking feature points, establishing the global shape model, and establishing the local texture model. Marking feature points means determining which points are used to represent the contour of the face, and it is the prerequisite for the ASM algorithm to locate the feature points effectively. In general, points with prominent contour features are selected, such as corner points, points with higher curvature, and points at T-junctions. Some points are also inserted between these points to make the density of the marked points uniform. The principles for marking feature points are as follows:
(1) Points with obvious contour features are selected, such as the nose point, the eye corner points, and the mouth corner points.
(2) Additional points are marked evenly between the selected feature points so that the point density is uniform and the contour is described more smoothly.
(3) The density of the marked feature points needs to be appropriate. If the density is too high, the marking workload increases and so does the time needed to search for feature points. If the density is too low, the quality of feature point positioning suffers.
Therefore, to balance the quality and efficiency of feature point positioning, the feature points marked in this paper are those shown in Fig. 4.
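To make the global shape model of ASM more concrete, the following sketch builds a point distribution model with principal component analysis and constrains a candidate shape to plausible values. It follows the standard ASM formulation rather than the specific refinement used in this study; the function names, the 95% retained variance, and the ±3 standard deviation limit are conventional choices assumed for the example, and the training shapes are assumed to be already aligned.

```python
import numpy as np

def build_shape_model(shapes, variance_kept=0.95):
    """Build a point distribution model from aligned training shapes.

    `shapes` is an N x 2P array; each row stacks the coordinates of P
    landmarks. Returns the mean shape, the retained modes, and their variances.
    """
    X = np.asarray(shapes, dtype=np.float64)
    mean_shape = X.mean(axis=0)
    # SVD of the centered data yields the principal modes of shape variation.
    _, s, vt = np.linalg.svd(X - mean_shape, full_matrices=False)
    eigvals = s ** 2 / max(len(X) - 1, 1)
    ratio = np.cumsum(eigvals) / eigvals.sum()
    t = int(np.searchsorted(ratio, variance_kept)) + 1
    return mean_shape, vt[:t].T, eigvals[:t]

def constrain_shape(shape, mean_shape, modes, eigvals, limit=3.0):
    """Project a shape onto the model and clip each mode coefficient."""
    b = modes.T @ (np.asarray(shape, dtype=np.float64) - mean_shape)
    bound = limit * np.sqrt(eigvals)
    return mean_shape + modes @ np.clip(b, -bound, bound)

# Toy training set: 4 landmarks varying along a single direction.
base = np.arange(8, dtype=np.float64)
direction = np.array([1, 0, -1, 0, 1, 0, -1, 0]) / 2.0
training = base + np.linspace(-1, 1, 11)[:, None] * direction
mean_shape, modes, eigvals = build_shape_model(training)
fitted = constrain_shape(base + 10 * direction, mean_shape, modes, eigvals)
```

In the toy example the implausible input shape is pulled back to the edge of the range seen in training, which is the role the global shape model plays during the ASM search.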

Gabor wavelet packet set in the left inner eye corner point.
To address the shortcomings of the Gabor beamlet, this paper proposes the Gabor wavelet packet set and the Gabor beamlet set, built on the Gabor wavelet packet and the Gabor beamlet. The Gabor wavelet packet set at any point on the face is defined as the set of Gabor wavelet packets at that point and its 8 neighbors, 9 points in total. The Gabor beamlet set is the collection of Gabor wavelet packet sets taken at the same feature point on different faces. The Gabor wavelet packet set is therefore defined at any point of the face, while the Gabor beamlet set is defined only at face feature points. Taking the left inner eye corner as an example, this study extracts the Gabor wavelet packets of the left inner eye corner point and its 8 neighbors, 9 points in total, to form the Gabor wavelet packet set of the left inner eye corner.
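As an illustration of how a wavelet packet set over a point and its 8 neighbors might be assembled, the sketch below computes complex Gabor responses at the 9 points. It uses the Gabor kernel form commonly applied to face images; the 5 scales, 8 orientations, the parameters k_max = π/2, f = √2, σ = 2π, the fixed 33×33 kernel size, and all function names are assumptions of the example, not values stated in this paper.

```python
import numpy as np

def gabor_kernel(scale, orientation, size=33, n_orient=8,
                 k_max=np.pi / 2, f=np.sqrt(2.0), sigma=2 * np.pi):
    """Complex Gabor kernel with a DC-compensation term (assumed parameters)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    k = k_max / f ** scale
    phi = np.pi * orientation / n_orient
    envelope = (k ** 2 / sigma ** 2) * np.exp(
        -(k ** 2) * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    carrier = (np.exp(1j * k * (np.cos(phi) * x + np.sin(phi) * y))
               - np.exp(-sigma ** 2 / 2))
    return envelope * carrier

def wavelet_packet_set(image, row, col, kernels):
    """Gabor responses at (row, col) and its 8 neighbors: a 9 x K array."""
    half = kernels[0].shape[0] // 2
    pad = half + 1  # one extra pixel so the 8-neighborhood stays in bounds
    padded = np.pad(np.asarray(image, dtype=np.float64), pad, mode="reflect")
    jets = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr + pad, col + dc + pad
            patch = padded[r - half:r + half + 1, c - half:c + half + 1]
            jets.append([np.sum(patch * kern) for kern in kernels])
    return np.array(jets)

kernels = [gabor_kernel(s, o) for s in range(5) for o in range(8)]
rng = np.random.default_rng(0)
toy_image = rng.random((64, 64))
packet_set = wavelet_packet_set(toy_image, 30, 30, kernels)
```

Under these assumptions each point contributes 40 complex coefficients, so the set for one location is a 9 × 40 array; a beamlet set would stack such arrays for the same feature point across the training faces.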

CMC curve of the FRGCv2.0 experiment.
The similarity of two different wavelet packet sets is defined as:
Considering that the test face is generally not in the training set, and even if the features of the same face are not necessarily the same at different feature points, the similarity between the wavelet packet set and the beamlet set is defined as:
Where
Based on the proposed Gabor wavelet packet set and Gabor beamlet set, the coarse positioning area of each feature point is searched further to locate the feature points precisely. The procedure is as follows:
1) 50 two-dimensional face or three-dimensional point cloud data with different characteristics such as gender, age, expression and skin color are selected as the training set, and the training set data is preprocessed.
2) The Gabor beamlet features of the 7 manually marked feature points of each face in the training set are extracted.
3) A rectangular area of 25×25 pixels is established centered on each of the 7 feature points from the ASM coarse positioning and is used as the precise positioning search area of that feature point.
4) In order to reduce the time cost of searching, in the search area of each feature point, taking LEIC as an example, the similarity
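The precise positioning step, which selects the maximum of a similarity map inside the 25×25 search area, can be sketched as follows. The similarity measure itself is passed in as a function, since its definition is not reproduced here; `similarity_at` and the exhaustive scan of the window are assumptions of the example, and the toy similarity used in the demonstration is purely illustrative.

```python
import numpy as np

def refine_feature_point(similarity_at, coarse_rc, window=25):
    """Return the location with maximal similarity inside the search window.

    `similarity_at(row, col)` is a hypothetical callable that scores a
    candidate location against the trained beamlet set.
    """
    half = window // 2
    r0, c0 = coarse_rc
    sim_map = np.empty((window, window))
    for i in range(window):
        for j in range(window):
            sim_map[i, j] = similarity_at(r0 - half + i, c0 - half + j)
    # The maximum of the similarity map gives the final feature position.
    i_best, j_best = np.unravel_index(np.argmax(sim_map), sim_map.shape)
    return (r0 - half + int(i_best), c0 - half + int(j_best)), sim_map

# Toy similarity that peaks at (105, 98); the coarse estimate is (100, 100).
toy_similarity = lambda r, c: -((r - 105) ** 2 + (c - 98) ** 2)
refined_rc, toy_map = refine_feature_point(toy_similarity, (100, 100))
```

Because the coarse ASM estimate only needs to fall within 12 pixels of the true location, the fine search can recover from moderate coarse positioning errors.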
FRGC v2.0 is the most widely used 3D face database, and experiments on it can better verify the validity of the features proposed in this section. To this end, we randomly selected the 3D face point clouds of 82 people from FRGC v2.0, a total of 684 scans with different expressions. Among them, five 3D face point clouds with different expressions such as smile, laughter, and closed eyes were selected as the training set for NLDA dimensionality reduction, and the rest were used as the test set. NLDA was adopted to reduce all features to 82 dimensions, and the experimental CMC curve is shown in Fig. 5.
To further illustrate the validity of the proposed GL-GRIFR features, recognition experiments were also performed on OLD. A total of 595 3D face point clouds with different expressions from 68 people, collected with the high-precision 3D grating measurement system independently developed by the laboratory, were randomly selected for the experiment. Among them, five 3D face point clouds with different expressions such as closed eyes, smiles, and laughter were selected as the training set for NLDA dimensionality reduction, and the rest were used as the test set. NLDA was used to reduce all features to 68 dimensions. The experimental CMC curve is shown in Fig. 6.

CMC curve of the OLD experiment.
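For readers unfamiliar with CMC curves, the following sketch shows how the Rank-k recognition rates reported in such a curve can be computed from a probe-by-gallery similarity matrix. It is a generic illustration with toy numbers, not the evaluation code of this study, and it assumes that every probe identity appears exactly once in the gallery.

```python
import numpy as np

def cmc_curve(similarity, probe_ids, gallery_ids, max_rank=None):
    """Cumulative match characteristic from a probe x gallery similarity matrix.

    Assumes each probe identity occurs exactly once in the gallery.
    Entry k-1 of the result is the Rank-k recognition rate.
    """
    sim = np.asarray(similarity, dtype=np.float64)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(-sim, axis=1)  # best match first
    hits = gallery_ids[order] == probe_ids[:, None]
    first_hit = hits.argmax(axis=1)   # 0-based rank of the correct identity
    max_rank = max_rank or sim.shape[1]
    return np.array([(first_hit < r).mean() for r in range(1, max_rank + 1)])

# Toy example with 3 probes and a gallery of 3 identities.
toy_sim = [[0.9, 0.1, 0.2],
           [0.3, 0.4, 0.8],
           [0.2, 0.1, 0.7]]
toy_cmc = cmc_curve(toy_sim, probe_ids=[0, 1, 2], gallery_ids=[0, 1, 2])
```

In the toy example two of the three probes are matched correctly at the first rank and the third at the second rank, so the curve rises from 2/3 at Rank-1 to 1 at Rank-2.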
Based on the above analysis, the algorithm was applied to classroom images, and the recognition pictures produced by the algorithm model were collected. The results are as follows.
Fig. 7 shows the original image used for online education classroom recognition. It can be seen from the figure that the background is complicated, the number of people is large, and the expressions of different students vary greatly, so automatic identification in classroom education is difficult. The image is then preprocessed, and the result is shown in Fig. 8.

Original image of online education classroom recognition.

Preprocessed image.
It can be seen from the preprocessed image that many interference factors in the background have been eliminated and that the faces and facial features have been calibrated. However, considerable background interference remains in the picture, and image segmentation has not yet been achieved. The proposed algorithm is then used for personnel identification, and the results are shown in Fig. 9.

System identification image.
As shown in Fig. 9, the system segments the faces directly from the background, and face recognition is then performed on this basis using face features. In this study, the students were allowed to move freely, and personnel were selected for identification at fixed time intervals. A total of 5 sets of recognition results were collected, and the final results are shown in Table 1.
Recognition rate statistics
As shown in Fig. 5, the recognition rate of the GRIFR feature, which is based on the fitting coefficients of the Gabor real and imaginary parts (Rank-1 of 95.20%), is better than that of the L-GRIFR feature, which is based on the fitting coefficients of the Log-Gabor real and imaginary parts (Rank-1 of 93.40%). Moreover, the GL-GRIFR feature obtained by fusing the GRIFR and L-GRIFR features reaches the highest Rank-1 recognition rate, 97.44%. The reason is that the GL-GRIFR feature combines the optimal joint time-frequency localization of Gabor wavelets (in the sense of the uncertainty principle) with the wider frequency band of Log-Gabor wavelets. The GL-GRIFR feature, which incorporates the local fitting features at the feature points, also outperforms the GGL-GRIFR feature (Rank-1 of 93.10%), which does not incorporate them. This shows that local features at the feature points are more robust to face deformation such as expression changes.
It can be seen from Fig. 6 that although the recognition rate of all features on OLD decreases slightly, the overall results lead to the same conclusions as on FRGC v2.0. The GL-GRIFR feature, obtained by fusing the GRIFR and L-GRIFR features, still reaches the highest recognition rate, 96.86%. This is 5.46 percentage points higher than the Rank-1 recognition rate of 91.40% achieved by the GGL-GRIFR feature, which does not incorporate the local fitting features at the feature points. This indirectly shows that integrating the local fitting features of the feature points improves robustness to expressions and further supports the effectiveness of the GL-GRIFR features extracted in this paper.
As shown in Table 1, except for the fourth recognition, which is 81%, all recognitions reach 100%. The reason is that during the fourth recognition the faces of two students were not detected directly because of their posture, which led to system identification errors. However, in actual use the system detects faces over continuous frames, so this situation does not occur. It can be seen that the automatic recognition algorithm constructed in this study has practical value.
Face recognition technology has developed greatly in recent decades because of its broad application prospects in identity authentication and identification. As one of the most challenging topics in pattern recognition and computer vision, face recognition is becoming more mature, but it still faces many difficulties. Although face recognition based on two-dimensional images performs well in a restricted environment, illumination, posture, and expression remain the bottleneck factors that limit further improvement of recognition performance. The root cause is that a two-dimensional face image is only a projection of a three-dimensional face onto a two-dimensional plane. With the rapid development of 3D scanning technology, and because 3D facial geometry is almost unaffected by illumination, 3D face recognition based on 3D face information has gradually attracted the attention of researchers. From the perspective of the 3D face recognition system developed by our laboratory, this paper studies the key issues of two-dimensional face recognition, including face key feature point location, and the 3D face recognition algorithm.
Although the research in this paper has achieved some results, there is still room for improvement. First, the coarse positioning of the feature points uses the ASM algorithm, which performs poorly under large posture deflection. If this can be improved, the key feature points of faces with large posture deflection can also be positioned accurately. Second, the method does not yet fully extract the essential features. If deep learning can be used to further reduce the dimension of the GL-GRIFR feature, higher-level essential features can be extracted from the mid-level features, which would further improve the recognition rate.
Conclusion
Starting from the automatic identification problem in distance education, this study uses feature recognition technology to analyze automatic identification in the Internet+ intelligent and innovative classroom, with the aim of improving the efficiency of this education method and promoting the smooth development of distance education. This study proposes a method based on ASM+GWT for the precise location of face feature points. The method is suitable for 2D and 3D face feature points and consists of two steps: coarse positioning and precise positioning. A spherical area with a radius of 90 mm centered on the tip of the nose is used to obtain the face area, and only the mapping points inside the spherical area are retained on the depth map. Based on the proposed Gabor wavelet packet set and Gabor beamlet set, the coarse positioning area of each feature point is searched further to locate the feature points precisely. Moreover, from the perspective of the 3D face recognition system developed by this laboratory, this paper studies the key issues of two-dimensional face recognition, including face key feature point location, and the 3D face recognition algorithm. Finally, recognition experiments are designed to further illustrate the validity of the proposed GL-GRIFR feature. The results show that the automatic recognition algorithm constructed in this study has practical value.
Acknowledgments
National Natural Science Foundation of China, "Concurrent scheduling and cost optimization for multiple workflows in hybrid cloud computing environment" (No. 61363004).
