Abstract
At present, online education evaluation models perform poorly on small-scale evaluation data sets. To discriminate the learner’s learning state, this paper studies machine learning methods for online teaching and introduces an adaptive learning rate and a momentum term into the gradient descent method of the BP neural network to improve the convergence rate of the model. Moreover, this study proposes a deep neural network model to handle complex, high-dimensional, large-scale data sets. For supervised prediction, the model uses support vector regression as the predictor, which maps complex non-linear relationships into a high-dimensional space where they can be treated like a linear relationship in a low-dimensional space. Small-scale teaching quality evaluation data sets and large-scale data sets are then input into the model for experiments, and the proposed model is compared with other shallow models. The results show that the proposed model is effective and advantageous both in evaluating teaching quality in universities and in processing large-scale data sets.
Introduction
In the 21st century, China’s higher education has entered the stage of mass education, which has brought quality problems with it. Improving the quality of higher education is an enduring theme of educational development. In 2001, the Ministry of Education issued “Several Opinions on Strengthening Undergraduate Teaching in Higher Education and Improving Teaching Quality”, which proposed 12 measures for strengthening undergraduate teaching and improving the quality of teaching in universities. In 2003, the Ministry of Education’s “Notice on Starting Quality Construction of Teaching Quality and Teaching Reform Projects in Colleges and Universities” emphasized that the implementation of quality engineering is an important starting point for improving the quality of higher education, and that the purpose of the project is to improve that quality comprehensively. In 2004, the Ministry of Education issued “Several Opinions on Further Strengthening Undergraduate Teaching in Higher Education”. The document emphasizes that improving the quality of personnel training is an important and fundamental task for colleges and universities. Since teaching quality is the lifeline of undergraduate colleges and universities, the central position of teaching work must be firmly established. In 2012, the Ministry of Education issued “Several Opinions of the Ministry of Education on Comprehensively Improving the Quality of Higher Education”. The document proposes a new teaching quality evaluation plan for colleges and universities, adopts measures to improve the teaching ability and level of teachers, implements a multi-agent, multi-evaluation method, takes the basic state of teaching as the main content of evaluation, and is committed to improving teachers’ teaching quality [1].
It can be seen from this that establishing and improving the teaching quality evaluation system of colleges and universities is an important measure for improving their teaching quality, and that objective evaluation of teaching quality by each evaluation subject is the basic premise of perfecting the evaluation system. Research on the teaching quality evaluation system helps the educational goals of colleges and universities keep up with the times, improves teaching methods and the teaching quality of teachers, and thereby supports the important task of training talents in colleges and universities [2].
Establishing a teaching quality evaluation system for university teachers is of great significance for cultivating real talents, achieving educational goals, strengthening school education management, and improving education systems. First of all, from the perspective of students, the evaluation of teachers’ teaching quality focuses on students’ learning achievements and on the interests of students, so that students can better understand their dominant position in teaching and take active responsibility for the quality of their learning [3]. Secondly, in terms of improving teaching quality, evaluating the teaching quality of college teachers continuously encourages teachers to adjust the teaching direction, improve teaching methods, optimize teaching content, carry out classroom teaching evaluation accurately, and strive for the best teaching effect. At the same time, it promotes cooperation among teachers and between teachers and students, brings the teaching relationship closer, and greatly helps improve the quality of teaching [4]. Finally, from the teacher’s perspective, effective evaluation of teaching quality enables the school education department to understand more accurately the teachers’ work status and teaching progress at the current stage, and to carry out college teaching in a purposeful and planned manner. Moreover, feedback from the evaluation results helps teachers summarize their own teaching experience, discover deficiencies in their teaching work, and continuously improve themselves.
Related work
The main content of the essence of classroom teaching evaluation includes research on definition, type, and constituent elements. Regarding the definition of classroom teaching evaluation, there is currently no unified conclusion. The literature [5] believed that classroom teaching evaluation is an interactive activity between learners and the teaching environment in order to complete the preset learning effect. The literature [6] believes that classroom teaching evaluation exists as part of the teaching process, has certain goals, and can be used as the research object of researchers. The literature [7] believes that the evaluation of classroom teaching is mainly to promote the language learning process as a selective activity for learners. In addition, researchers have different classification standards for classroom teaching evaluation. The literature [8] proposed a teaching mode of direct teaching, and believed that the classroom teaching evaluation under this model should include seven categories including statement of learning objectives, review of knowledge points, presentation of new materials, learning inquiry, independent student practice, assessment feedback, practice and review. The literature [9] believed that classroom teaching evaluation mainly includes eight activities including lecture, tutoring, experiment, project learning, question and answer, review, forum discussion and demonstration. The literature [10] introduced 75 types of online learning activities from the perspective of learning activity flow according to the sequence of learning activities and the relationship of flow direction. The literature [11] put forward the concept of online learning circle, and summarized 42 kinds of classroom teaching evaluation for the “teacher” in the circle. 
Although researchers understand the definition of classroom teaching evaluation differently, this has not hindered the development of design tools for classroom teaching evaluation or related research. The literature [12] designed a special tool for learning activities, which can be used to guide and support teachers in creating, modifying, and sharing learning activities and resources. CSALT has established a framework model that distinguishes between “tasks” planned by designers and “activities” carried out by learners. The IMS Learning Design specification defines the main elements of learning activities, the sequence and methods of learning activities, and their orientation to learning results. In addition, the IMS organization also proposed Caliper Analytics to record learning activities in a standardized manner. It records learning activities in the form of “metric profiles”, and developers of system platforms can build learning platforms based on this activity recording model. The literature [13] described the metadata framework for classroom teaching evaluation, developed three tools and a teaching resource library, and defined the model in five dimensions: teacher, student, content, process, and resource. The literature [14] defined seven elements of the learning situation, comprising four necessary elements and three optional elements, and provided an active sentence structure in the form of triples, which contains the learning environment, learners, learning activities, and learning content. The literature [15] constructed a semi-open-loop framework system for evaluating the instructor’s classroom teaching evaluation process in an information environment. The framework system mainly includes three major categories and seven minor items, and repeatedly collects and analyzes classroom teaching evaluation data generated by teachers and students during teaching in a real information environment. Moreover, that study used a questionnaire survey to verify the effectiveness of the framework tool.
The connotation research of classroom teaching evaluation mainly includes related research such as concept definition, type, characteristics, design principles and so on. The literature [16] believes that classroom teaching evaluation is a bilateral activity of teaching and learning between teachers and students, which means that students actively master scientific knowledge and related skills under the guidance of teachers. The literature [17] believed that classroom teaching evaluation refers to the use of a series of teaching methods in the classroom for teachers and students to complete teaching and learning tasks in accordance with the teaching goals. Moreover, the literature divided the evaluation of high school ideological and political classroom teaching into four types of activities based on their overall level: memory classroom teaching evaluation, understanding classroom teaching evaluation, inquiry classroom teaching evaluation and creative classroom teaching evaluation. The literature [18] believed that classroom teaching evaluation is a bilateral activity of teacher-student teaching and learning, which is composed of successive links. Moreover, the literature also proposed the role of the four elements of English classroom instruction evaluation design, such as subjectivity, objectiveness, authenticity and effectiveness, in English classroom instruction evaluation design. The literature [19] believed that classroom teaching evaluation is a practical activity in which teachers and students as the main body jointly act on teaching resources and aim at recognizing the objective world and enhancing the subjective world. The literature [20] believed that English classroom teaching evaluation is a learning process in which students actively use the target language to achieve communicative competence in real communicative situations. 
The literature [21] believed that classroom teaching evaluation is a process that uses certain teaching media and teaching methods to teach students content, so as to achieve the goal of teaching.
BP neural network theory
The three-layer BP neural network with a hidden layer has strong nonlinear processing capabilities and is widely used. Its network structure is shown in Fig. 1 [22–24]:

Fig. 1. Three-layer BP neural network structure.
The learning process of a three-layer BP neural network with a hidden layer is as follows. We assume that the numbers of neurons in the input layer, hidden layer, and output layer of the BP neural network are n, p, and q, respectively, and that the number of data samples is m. The input feature vector of the input layer is $x = (x_1, x_2, \cdots, x_n)$; the input vector and output vector of the hidden layer are $hi = (hi_1, hi_2, \cdots, hi_p)$ and $ho = (ho_1, ho_2, \cdots, ho_p)$; the input vector and output vector of the output layer are $yi = (yi_1, yi_2, \cdots, yi_q)$ and $yo = (yo_1, yo_2, \cdots, yo_q)$; and the target expected output vector of the network is $t = (t_1, t_2, \cdots, t_q)$. Then, $w_{ij}$ and $w_{jt}$ are the connection weights between the input layer and the hidden layer and between the hidden layer and the output layer, respectively, and $b_j$ and $b_t$ are the thresholds of each neuron in the hidden layer and the output layer, where $i = 1, 2, \cdots, n$; $j = 1, 2, \cdots, p$; $t = 1, 2, \cdots, q$. The activation function is the Sigmoid function.
Initialize the network: The connection weights $w_{ij}$, $w_{jt}$ and the thresholds $b_j$, $b_t$ of the network are assigned random numbers in (-0.5, 0.5), and the target accuracy ɛ of the network, the maximum number of iterations M, and the error function are set.
The k-th sample, with input vector $x(k) = (x_1(k), x_2(k), \cdots, x_n(k))$ and target expected output $t(k) = (t_1(k), t_2(k), \cdots, t_q(k))$, is randomly selected from the data set.
Through the sample data $x(k)$, the connection weights $w_{ij}$, and the thresholds $b_j$ of the hidden layer, the input value hi and the output value ho of each neuron in the hidden layer are calculated.
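For reference, the Sigmoid activation, the per-sample error function, and the hidden-layer calculation can be written in the following commonly used form. This is a sketch under two assumptions that the text does not state: the threshold is added to the weighted sum (some formulations subtract it), and the error is the usual half squared error over the q outputs.

```latex
\begin{align}
f(z) &= \frac{1}{1 + e^{-z}} \\
E(k) &= \frac{1}{2} \sum_{t=1}^{q} \bigl( t_t(k) - yo_t(k) \bigr)^2 \\
hi_j(k) &= \sum_{i=1}^{n} w_{ij}\, x_i(k) + b_j, \qquad
ho_j(k) = f\bigl( hi_j(k) \bigr), \qquad j = 1, 2, \cdots, p
\end{align}
```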
According to the output $ho_j(k)$ of the hidden layer, the connection weights $w_{jt}$, and the thresholds $b_t$ of the output layer, the input value yi and the output value yo of each neuron in the output layer are calculated.
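The forward propagation through the hidden and output layers can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the helper names (`init_layer`, `layer_forward`, `forward`), the layer sizes, and the sample input are hypothetical, and the threshold is assumed to be added to the weighted sum.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Weights w[i][j] and thresholds b[j], drawn uniformly from the range -0.5 to 0.5."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return w, b

def layer_forward(inputs, w, b):
    """Return (net_input, output) of one fully connected Sigmoid layer."""
    net = [sum(inputs[i] * w[i][j] for i in range(len(inputs))) + b[j]
           for j in range(len(b))]
    return net, [sigmoid(z) for z in net]

def forward(x, w_ij, b_j, w_jt, b_t):
    hi, ho = layer_forward(x, w_ij, b_j)    # hidden layer: input hi, output ho
    yi, yo = layer_forward(ho, w_jt, b_t)   # output layer: input yi, output yo
    return hi, ho, yi, yo

rng = random.Random(0)
n, p, q = 3, 4, 2                           # hypothetical layer sizes
w_ij, b_j = init_layer(n, p, rng)
w_jt, b_t = init_layer(p, q, rng)
hi, ho, yi, yo = forward([0.2, 0.5, 0.9], w_ij, b_j, w_jt, b_t)
```

Because both layers use the Sigmoid function, every entry of `ho` and `yo` lies strictly between 0 and 1.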
The above is the forward propagation process in the learning process of the BP neural network. The input data is transmitted layer by layer until it is output by the output layer. The following introduces the error back propagation process, which corrects the error generated by forward propagation. The error function calculates the error between the actual output $yo(k)$ and the target expected output $t(k)$. If the error does not satisfy the set accuracy ɛ, the partial derivative $\delta_t(k)$ of the error with respect to each neuron in the output layer is calculated.
The partial derivative $\delta_t(k)$ of the output layer neurons and the output $ho_j(k)$ of the hidden layer neurons are used to correct the connection weights $w_{jt}(k)$ between the hidden layer and the output layer and the thresholds $b_t(k)$ of the output layer. In the correction formula, N denotes the value before correction and N + 1 the value after correction, and μ is the learning step of the correction, whose value range is (0, 1).
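A commonly used form of the output-layer correction is sketched below. It assumes the half squared error and the Sigmoid activation, defines $\delta_t(k)$ as the negative partial derivative of the error with respect to the net input $yi_t(k)$, and assumes that the threshold is added to the weighted sum. These conventions are assumptions rather than statements taken from the text.

```latex
\begin{align}
\delta_t(k) &= \bigl( t_t(k) - yo_t(k) \bigr)\, yo_t(k)\, \bigl( 1 - yo_t(k) \bigr) \\
w_{jt}^{N+1} &= w_{jt}^{N} + \mu\, \delta_t(k)\, ho_j(k) \\
b_t^{N+1} &= b_t^{N} + \mu\, \delta_t(k)
\end{align}
```

With this sign convention, a positive $\delta_t(k)$ means the output is below its target, so the correction increases the weights attached to active hidden units.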
According to the connection weights $w_{jt}(k)$ between the hidden layer and the output layer, the partial derivative $\delta_t(k)$ of the output layer error, and the output $ho(k)$ of the hidden layer, the partial derivative $\delta_h(k)$ of the hidden layer neurons is calculated. Furthermore, the input value $x_i(k)$ of each neuron in the input layer is used to correct the connection weights $w_{ij}(k)$ between the input layer and the hidden layer and the thresholds $b_j(k)$ of the hidden layer.
The global total error E over all m samples is calculated.
Whether the network error satisfies E < ɛ is judged. If it does, the BP neural network learning process ends. Otherwise, the next sample is randomly selected and the algorithm returns to step (3), the calculation of the hidden layer, and continues training until the error meets the requirement or the number of iterations reaches the maximum M.
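The complete learning procedure described above can be sketched as a short training loop. This is a minimal illustration under stated assumptions, not the authors' code: the global error is taken as $E = \frac{1}{2m}\sum_k \sum_t (t_t(k) - yo_t(k))^2$, thresholds are added to the weighted sums, and the function names, the toy data set (logical OR), and the values of `mu`, `eps`, and `max_iter` are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_ij, b_j, w_jt, b_t):
    ho = [sigmoid(sum(x[i] * w_ij[i][j] for i in range(len(x))) + b_j[j])
          for j in range(len(b_j))]
    yo = [sigmoid(sum(ho[j] * w_jt[j][t] for j in range(len(ho))) + b_t[t])
          for t in range(len(b_t))]
    return ho, yo

def global_error(data, params):
    """Assumed form: E = 1/(2m) * sum over samples and outputs of squared errors."""
    total = 0.0
    for x, target in data:
        _, yo = forward(x, *params)
        total += sum((tt - y) ** 2 for tt, y in zip(target, yo))
    return total / (2 * len(data))

def train(data, n, p, q, mu=0.5, eps=1e-3, max_iter=20000, seed=0):
    rng = random.Random(seed)
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(n)]
    b_j = [rng.uniform(-0.5, 0.5) for _ in range(p)]
    w_jt = [[rng.uniform(-0.5, 0.5) for _ in range(q)] for _ in range(p)]
    b_t = [rng.uniform(-0.5, 0.5) for _ in range(q)]
    params = (w_ij, b_j, w_jt, b_t)
    history = [global_error(data, params)]
    for _ in range(max_iter):
        x, target = rng.choice(data)                 # randomly select sample k
        ho, yo = forward(x, *params)                 # forward propagation
        # Output-layer deltas (negative error gradient w.r.t. the net input).
        d_out = [(target[t] - yo[t]) * yo[t] * (1 - yo[t]) for t in range(q)]
        # Hidden-layer deltas, computed before w_jt is modified.
        d_hid = [sum(d_out[t] * w_jt[j][t] for t in range(q)) * ho[j] * (1 - ho[j])
                 for j in range(p)]
        for j in range(p):
            for t in range(q):
                w_jt[j][t] += mu * d_out[t] * ho[j]
        for t in range(q):
            b_t[t] += mu * d_out[t]
        for i in range(n):
            for j in range(p):
                w_ij[i][j] += mu * d_hid[j] * x[i]
        for j in range(p):
            b_j[j] += mu * d_hid[j]
        history.append(global_error(data, params))
        if history[-1] < eps:                        # stop when E < eps
            break
    return params, history

# Toy data (logical OR), used only to exercise the loop.
toy = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
params, history = train(toy, n=2, p=3, q=1)
```

The list `history` records the global error after each update, so it can be plotted to inspect how quickly a given learning step μ converges.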
Deep neural networks have a large number of basic models. This section mainly discusses the automatic encoder model and the restricted Boltzmann machine model.
(1) Auto-encoder model
The concept of the auto-encoder was first proposed in 1986. It is an unsupervised algorithm that is mainly used for dimensionality reduction or feature extraction. The auto-encoder is a three-layer network model composed of an encoding network and a decoding network. The idea is to use the BP algorithm to back-propagate the error and continuously adjust the weights and thresholds between the network layers, so that the error between the original input data and the final output data is minimized and the original input is reconstructed. The auto-encoder first transforms the input data through the encoding network, for example from a high dimension to a low dimension. It then restores the original input data through the decoding network, uses the error function to calculate the error between the original input and the output, and minimizes this error to reconstruct the original input. The purpose of the auto-encoder is to find an approximate identity function that makes the output as close as possible to the input. Its structure is shown in Fig. 2. We assume that the input feature vector is $x = (x_1, x_2, \cdots, x_n)$, the feature vector in the hidden layer is $h = (h_1, h_2, \cdots, h_m)$, and the output feature vector is $y = (y_1, y_2, \cdots, y_n)$. The encoder then maps the input layer to the intermediate hidden layer, and the decoder maps the intermediate hidden layer to the output layer.
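In the usual notation, the encoder and decoder mappings can be written as below. This is a sketch of the standard form; assigning the threshold $b_i$ to the encoder and $b_j$ to the decoder is an assumption made here for illustration.

```latex
\begin{align}
h &= f(x) = s_f\bigl( W_f\, x + b_i \bigr) \\
y &= g(h) = s_g\bigl( W_g\, h + b_j \bigr)
\end{align}
```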

Fig. 2. Auto-encoder model structure.
Among them, f and g are the encoding function and decoding function, respectively; $s_f$ and $s_g$ are the activation functions of encoding and decoding, which are generally nonlinear; $W_f$ and $W_g$ are the weight matrices of the network; and $b_i$ and $b_j$ are the thresholds of the network.
Autoencoders generally use gradient descent to adjust the weights and thresholds between layers. The purpose is to minimize the error between the input feature vector x and the output feature vector y so as to reconstruct the original input. The cost function is generally a mean square error function or a cross-entropy loss function.
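The two cost functions mentioned above can be sketched as follows. The function names are hypothetical, and averaging over the n components (rather than summing) is an assumption; the cross-entropy form additionally assumes inputs scaled to [0, 1] and Sigmoid outputs.

```python
import math

def mse_loss(x, y):
    """Mean square error between input x and reconstruction y."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

def cross_entropy_loss(x, y, eps=1e-12):
    """Cross-entropy between input x in [0, 1] and reconstruction y in (0, 1)."""
    total = 0.0
    for xi, yi in zip(x, y):
        yi = min(max(yi, eps), 1.0 - eps)   # clamp to avoid log(0)
        total -= xi * math.log(yi) + (1.0 - xi) * math.log(1.0 - yi)
    return total / len(x)

x = [1.0, 0.0, 1.0]          # hypothetical input vector
y = [0.9, 0.2, 0.6]          # hypothetical reconstruction
reconstruction_mse = mse_loss(x, y)
reconstruction_ce = cross_entropy_loss(x, y)
```

Both losses are zero only for a perfect reconstruction and grow as y moves away from x, which is what gradient descent exploits during unsupervised training.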
(2) Restricted Boltzmann Machine Model
The restricted Boltzmann machine model is an improvement of the Boltzmann machine by Hinton et al. and it is a two-layer undirected graph model. The restricted Boltzmann machine model only connects the nodes between the visible layer and the hidden layer. Moreover, the connection between Boltzmann machine units is limited to two adjacent layers, and there is no connection between the same layer unit and the cross-layer unit. The model structure is shown in Fig. 3:

Fig. 3. Model structure of the restricted Boltzmann machine.
We assume that the visible layer units v and the hidden layer units h of the restricted Boltzmann machine (RBM) model may follow any exponential-family distribution and form a bipartite graph; here they are binary, $v_i \in \{0, 1\}$ and $h_j \in \{0, 1\}$, where $i = 1, 2, \cdots, n$ and $j = 1, 2, \cdots, m$. Once the parameters $\theta = \{w_{ij}, a_i, b_j\}$ are obtained, the restricted Boltzmann machine model is determined. Here $w_{ij}$ is the connection weight between the visible layer and hidden layer units, and $a_i$ and $b_j$ are the offsets of the visible layer and hidden layer units, respectively. Each given state (v, h) of the RBM is assigned an energy.
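For binary units, the energy of a state (v, h) is usually written in the following standard form, using the parameters θ defined above. The expression is given as a reference sketch.

```latex
\begin{equation}
E(v, h \mid \theta) = - \sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j
                      - \sum_{i=1}^{n} \sum_{j=1}^{m} v_i\, w_{ij}\, h_j
\end{equation}
```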
After the energy function is exponentiated and normalized, the joint probability distribution of (v, h) is obtained, that is, the probability that the visible layer units and the hidden layer units are each in a given state.
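For a very small model, the joint distribution $P(v, h \mid \theta) = e^{-E(v, h \mid \theta)} / Z(\theta)$ can be computed by enumerating every state, which makes the role of the normalization factor concrete. The sketch below assumes the standard binary RBM energy; the function names and the parameter values are hypothetical, and the enumeration is only feasible for a handful of units.

```python
import itertools
import math

def energy(v, h, w, a, b):
    """Standard binary RBM energy: -(a.v + b.h + v^T W h)."""
    visible = sum(a[i] * v[i] for i in range(len(v)))
    hidden = sum(b[j] * h[j] for j in range(len(h)))
    pair = sum(v[i] * w[i][j] * h[j]
               for i in range(len(v)) for j in range(len(h)))
    return -(visible + hidden + pair)

def joint_distribution(w, a, b):
    """Return {(v, h): P(v, h)} by brute-force enumeration of all binary states."""
    n, m = len(a), len(b)
    states = [(v, h)
              for v in itertools.product((0, 1), repeat=n)
              for h in itertools.product((0, 1), repeat=m)]
    weights = [math.exp(-energy(v, h, w, a, b)) for v, h in states]
    z = sum(weights)                       # normalization factor Z(theta)
    return {s: wt / z for s, wt in zip(states, weights)}

# Hypothetical parameters for 2 visible and 2 hidden units.
w = [[0.5, -0.3], [0.1, 0.8]]
a = [0.0, 0.2]
b = [-0.1, 0.4]
p = joint_distribution(w, a, b)
```

Summing `p` over all hidden states for a fixed v gives the marginal distribution of the visible layer, and vice versa.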
Among them, Z (θ) is the normalization factor. From formula (18), the marginal distributions of the visible layer units and the hidden layer units can be obtained. Subsequently, Hinton proposed the deep belief network (DBN) model in 2006. The deep belief network can be interpreted as a Bayesian probabilistic generative model, which is composed of multiple stacked restricted Boltzmann machines trained with a greedy layer-wise learning algorithm. Its structure is shown in Fig. 4:

Fig. 4. Structure of the deep belief network model.
The evaluation of university teaching quality is a multi-objective, multi-level, and complex nonlinear problem. Moreover, the existing methods and models for evaluating teaching quality in colleges and universities suffer from weights that are difficult to determine, strong subjectivity and randomness, a tendency to overfit, and slow optimization. In addition, the standard BP neural network converges slowly and easily falls into local minima. In view of these problems, this chapter proposes an adaptive BP neural network model. The main idea is to introduce an adaptive learning rate and a momentum term into the gradient descent method of the BP neural network to improve the convergence speed, and to optimize the network structure to ensure the stability of the model. In addition, new evaluation indicators are added to the traditional ones to construct a cost-oriented teaching quality evaluation indicator system, so that the model evaluates classroom teaching comprehensively. Finally, the evaluation index sample data set is normalized before being used as the input feature vector of the model, which improves the calculation efficiency of the model.
(1) Network structure determination
According to experience, there are multiple formulas for determining the number $n_i$ of hidden layer neurons in the three-layer BP neural network. In this paper, formula (19) is selected, and the best number of neurons is determined by trial and error.
Among them, the number of neurons in the input layer is n, the number of neurons in the output layer is m, and a is a constant between [1, 10].
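The trial-and-error search can be illustrated with a short sketch. It assumes that formula (19) is the widely used empirical rule $n_i = \sqrt{n + m} + a$; the text only lists the symbols, so this form, the function name, and the layer sizes in the usage line are assumptions.

```python
import math

def candidate_hidden_sizes(n_inputs, n_outputs, a_values=range(1, 11)):
    """Candidate hidden-layer sizes from the assumed rule n_i = sqrt(n + m) + a."""
    base = math.sqrt(n_inputs + n_outputs)
    return sorted({round(base + a) for a in a_values})

# Hypothetical layer sizes; each candidate would be trained and compared by error.
candidates = candidate_hidden_sizes(16, 9)
```

Each candidate size would then be trained in turn, and the size with the lowest validation error kept.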
(2) Adaptive learning rate and added momentum term
The learning rate, also known as the learning step size, is fixed in the standard BP neural network. When the learning rate is too large, it will lead to instability of the network structure and network oscillation. However, when the learning rate is too small, the network convergence speed will be slow. Therefore, it is difficult to ensure the best learning efficiency of the entire network structure in practical specific problems. The adaptive learning rate is to automatically adjust the learning rate according to the change of the network error to repeatedly correct the weights and thresholds between the connected layers to appropriately increase the convergence speed. We assume that the initial learning rate is μ (0), and the network error calculated by the nth iteration of the model is E (n). Then, the change of the learning rate is shown in formula (20).
In general, the values of β and γ are 1.05 and 0.7, respectively.
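A minimal sketch of such an adaptive rule is given below. It assumes that formula (20) multiplies the learning rate by β when the error decreases and by γ when the error increases, leaving it unchanged otherwise; the exact conditions of formula (20) are not reproduced in the text, so this branching and the function name are assumptions.

```python
def adapt_learning_rate(mu, prev_error, curr_error, beta=1.05, gamma=0.7):
    """Return the next learning rate from the change in network error."""
    if curr_error < prev_error:
        return beta * mu      # error fell: take slightly larger steps
    if curr_error > prev_error:
        return gamma * mu     # error rose: shrink the step to damp oscillation
    return mu                 # error unchanged: keep the current rate

mu = 0.1                      # hypothetical initial learning rate mu(0)
mu_after_improvement = adapt_learning_rate(mu, prev_error=1.0, curr_error=0.5)
mu_after_worsening = adapt_learning_rate(mu, prev_error=0.5, curr_error=1.0)
```

Because γ is much further from 1 than β, the rate shrinks quickly after a bad step and grows back only gradually.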
In the process of error back propagation, adaptively adjusting the learning rate can effectively improve the convergence rate. However, only the gradient descent direction at the current time t is considered, and the gradient directions before time t are not. To address this limitation, a momentum term is introduced, so that the previous weight correction plays a damping role in the process of error back propagation; its formula is shown in (21). If formula (21) is regarded as a time series (0 < t < N) with t as the variable, it can be regarded as a first-order difference equation of Δw (n), as shown in (22):
Among them, α is the momentum term (0 < α < 1), w is the weight, μ is the learning rate, and E (n) is the error. Then, the weight adjustment of the BP neural network is shown in formula (23):
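The weight adjustment with a momentum term can be sketched as follows. The sketch assumes the standard form $\Delta w(n) = -\mu\, \partial E / \partial w + \alpha\, \Delta w(n-1)$ followed by $w(n+1) = w(n) + \Delta w(n)$, which matches the symbols defined above but is not quoted from formulas (21) to (23). The function name and the default values of `mu` and `alpha` are hypothetical.

```python
def momentum_step(w, grad, prev_delta, mu=0.1, alpha=0.65):
    """One gradient step with momentum; returns (new_weights, delta)."""
    delta = [-mu * g + alpha * d for g, d in zip(grad, prev_delta)]
    new_w = [wi + di for wi, di in zip(w, delta)]
    return new_w, delta

# First step: no previous correction, so the step is plain gradient descent.
w1, d1 = momentum_step([1.0], [2.0], [0.0], mu=0.5, alpha=0.5)
# Second step with zero gradient: the weight still moves because of momentum.
w2, d2 = momentum_step(w1, [0.0], d1, mu=0.5, alpha=0.5)
```

The second step shows the damping role of the momentum term: part of the previous correction carries over even when the current gradient vanishes.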
(3) By adding two first-level indicators for pre-teaching preparation and the situation in the teaching process, this chapter combines the 23 second-level indicators included in the traditional evaluation indicators to ensure a more comprehensive evaluation of the teacher’s teaching process. Moreover, this study normalizes the evaluation sample data to reduce the difficulty of adjusting the weight and threshold due to the large change of the input value, thereby improving the calculation efficiency of the BP neural network.
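The normalization step can be illustrated with min-max scaling of one indicator column. The text does not state which normalization is used, so min-max scaling to [0, 1], the function name, and the sample scores are assumptions made for illustration.

```python
def min_max_normalize(column):
    """Scale one indicator column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

scores = [60, 80, 100]                 # hypothetical raw scores for one indicator
normalized = min_max_normalize(scores)
```

Scaling every indicator to the same range keeps the Sigmoid units away from their flat regions, which is why it eases the adjustment of weights and thresholds.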
The support vector machine (SVM) was first proposed by Vapnik in 1995 and is a supervised machine learning algorithm based on statistical learning theory. The idea is to establish a classification hyperplane as a decision surface so that the margin between the positive and negative examples is maximized. It is mainly used for classification, pattern recognition, and regression. Support vector machines have certain advantages in solving complex nonlinear problems and high-dimensional pattern recognition. The method is an approximate implementation of structural risk minimization, which gives good generalization from a limited number of training samples. Vapnik et al. proposed the regression model of the support vector machine in 1997, which is called support vector regression. Because teaching quality in colleges and universities has many evaluation indexes, and the relationship between these indexes and the evaluation results is complex and non-linear, it is difficult to express mathematically. However, support vector regression has a good nonlinear function fitting ability, which can be used to solve this problem. Therefore, this chapter chooses support vector regression as the predictor of the output layer of the deep neural network model to evaluate and predict teaching quality.
Support vector regression is divided into linear regression and nonlinear regression, according to whether the data are embedded in a high-dimensional space. Because the quality of teaching in colleges and universities is a complex non-linear problem, this section mainly discusses the non-linear regression of support vector regression. Nonlinear support vector regression maps the complex nonlinear relationship into a high-dimensional space and then realizes, in that space, a linear relationship similar to the one in a low-dimensional space. We assume that the data set is $S = \{(x_i, y_i)\}_{i=1}^{l}$ with $x_i \in R^n$ and $y_i \in R$.
For a data set S that cannot be linearly separated in the original space $R^n$, a nonlinear mapping function ϕ is first set to map S to a certain high-dimensional space, so that ϕ (S) has a good linear regression feature in the feature space H. Therefore, to linearize the nonlinear problem, linear regression is first performed in the feature space H, and the result is then returned to the original space $R^n$. Given a kernel function $K(x_i, x) = \langle \phi(x_i), \phi(x) \rangle$, a nonlinear regression function can be constructed.
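In its standard dual form, the resulting nonlinear regression function is a kernel expansion over the training samples. The expression below is the textbook form; the symbols $\alpha_i$, $\alpha_i^{*}$ (Lagrange multipliers), $b$ (bias), and $l$ (number of samples) are introduced here for illustration.

```latex
\begin{equation}
f(x) = \sum_{i=1}^{l} \bigl( \alpha_i - \alpha_i^{*} \bigr)\, K(x_i, x) + b
\end{equation}
```

Only the samples with $\alpha_i - \alpha_i^{*} \neq 0$, the support vectors, contribute to the prediction.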
The frequently used kernel functions are (1) the linear function, (2) the polynomial function, (3) the radial basis function, and (4) the Sigmoid function, where σ is a kernel parameter.
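The four kernel families can be sketched in plain Python using their common textbook forms. The exact parameterizations used in the study are not given, so the forms below and the parameter names `degree`, `coef0`, `kappa`, and `c` are assumptions; only σ is named in the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(xi, x):
    return dot(xi, x)

def polynomial_kernel(xi, x, degree=2, coef0=1.0):
    return (dot(xi, x) + coef0) ** degree

def rbf_kernel(xi, x, sigma=1.0):
    """Radial basis function: exp(-||xi - x||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def sigmoid_kernel(xi, x, kappa=1.0, c=0.0):
    return math.tanh(kappa * dot(xi, x) + c)

u, v = [1.0, 2.0], [3.0, 4.0]          # hypothetical feature vectors
values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v),
    "rbf": rbf_kernel(u, v),
    "sigmoid": sigmoid_kernel(u, v),
}
```

The radial basis function depends only on the distance between the two vectors, which is one reason it is a common default for nonlinear regression.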
The online teaching quality evaluation model constructed on the basis of the above analysis is shown in Fig. 5.

Fig. 5. Teaching quality evaluation model.
In the process of evaluating the teaching quality of the model, in order to achieve the best evaluation and prediction performance of the model, the relevant parameters of the model need to be set. At present, the setting is mainly carried out experimentally. During the experiment, the relevant parameters of the model are constantly adjusted to improve the computing power and prediction accuracy of the model, and to obtain the optimal combination of relevant parameters of the model. In this section, the unsupervised learning training process and supervised prediction output are used to optimize and adjust the relevant important parameters of the model to improve the prediction accuracy of the model to evaluate the teaching quality.
We assume that the number of hidden layers in the DDAE-SVR deep neural network model (a deep denoising auto-encoder with a support vector regression predictor) is 3, and that the number of neurons in each hidden layer is 20. The mean squared error function is used to calculate the error between the unsupervised training output data and the original input data. The gradient descent algorithm, the RMSProp algorithm, the momentum algorithm (with the momentum term set to 0.65), and the Adam algorithm are used as optimization algorithms for training, and their error curves are shown in Fig. 6 and Table 1. It can be seen from the figure that although the errors of the gradient descent method and the momentum algorithm decline throughout training, they decline slowly and require many iterations, resulting in slow convergence. With the RMSProp and Adam algorithms, the error between the output feature vector and the original data drops quickly in the first 500 iterations. As the number of iterations continues to increase, the error continues to decline, but the curve flattens. It can also be seen from the figure that the Adam algorithm reconstructs the original input data best during unsupervised training, so it is selected as the optimization algorithm for the unsupervised learning training process.
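For readers unfamiliar with the Adam update selected above, a minimal scalar sketch follows. It uses the standard Adam update rule with its customary default coefficients; the quadratic objective is a hypothetical stand-in for the reconstruction error, and the function name and step count are illustrative only.

```python
import math

def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    """Minimize a scalar function with Adam, given its gradient function."""
    w, m, v = w0, 0.0, 0.0
    for step in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** step)          # bias correction
        v_hat = v / (1 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Stand-in objective E(w) = w^2 with gradient 2w, starting from w = 5.
w_final = adam_minimize(lambda w: 2.0 * w, w0=5.0)
```

Adam combines a momentum-like average of past gradients with a per-parameter step scaling, which is consistent with its fast early descent reported in the comparison above.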

Fig. 6. Comparison of training errors of different optimization algorithms.
Table 1. Comparison of training errors of different optimization algorithms.
To determine the appropriate number of hidden layers in the deep neural network model while taking the scale of the evaluation sample data set into account, the number of hidden layers is varied from 2 to 5, the number of hidden neurons is set to 20, and the Adam algorithm is used as the optimization algorithm for the unsupervised learning training process. The features of the evaluation sample data set are input into the DDAE-SVR deep neural network model for training. After unsupervised training of the deep denoising auto-encoder, the error curves between the reconstructed output feature vector and the original input data set are shown in Fig. 7 and Table 2. It can be seen from the figure that, for a fixed number of hidden layers, the error between the reconstructed output data and the original input data gradually decreases as the number of training iterations increases. For a fixed number of training iterations, the error gradually increases with the number of hidden layers. Therefore, the error between the reconstructed output data after unsupervised training and the original input data is smallest when the DDAE-SVR deep neural network model has 2 hidden layers.

Fig. 7. Comparison of errors for different numbers of hidden layers.
Table 2. Comparison of errors for different numbers of hidden layers.
After that, the effectiveness of the algorithm in this study is verified. The teaching quality evaluation model of this study is used to score teaching over a 60-day period, and its scores are compared with the manual scoring results. The results are shown in Table 3 and Fig. 8.

Fig. 8. Comparison of model scoring results.
Table 3. Comparison of model scoring results.
It can be seen from Fig. 8 that the research model and the manual scoring results are relatively close, so the research model has high reliability in the teaching quality scoring and can be applied to the teaching quality evaluation system.
In order to solve the problems of the existing methods and models when dealing with small-scale data sets and the problems of BP neural network, such as slow convergence speed and easy to fall into local minimum, this paper proposes an adaptive BP neural network model. This model mainly introduces adaptive learning rate and momentum to improve the gradient descent method of BP neural network to improve the convergence speed of the model and optimize the network structure to ensure the stability of the model. In addition, a new evaluation index is added to the input feature vector of the model and normalized to ensure a more comprehensive evaluation of the model and improve the efficiency of model calculation. Finally, the evaluation sample data set is input into the model for training and compared with other models. The results show that the model can not only solve the problems of the existing evaluation methods and models that are too subjective and random, prone to over-fitting, and slow convergence speed, but also the predicted evaluation results are optimal. Therefore, this model is effective in solving the teaching quality evaluation of colleges and universities.
The model proposed in this paper is composed of a deep denoising auto-encoder for unsupervised training and support vector regression for supervised prediction. In the unsupervised training process, the error between the reconstructed output data and the original input data is minimized to obtain the essential feature vector of the original input data. In the supervised prediction process, the acquired essential feature vectors are input into the support vector regression as input feature vectors for prediction and evaluation.
