Deep learning and multimodal target recognition of complex and ambiguous words in automated English learning system

Abstract

Built on convolutional neural networks, deep learning algorithms apply convolution layers to the input image to build a hierarchical representation of feature information, which makes pattern recognition simpler and more accurate. In the theory of multimodal discourse analysis, the nonverbal features of communication are now studied as a symbol system similar to language. This paper analyzes the application of deep learning and multimodal target recognition in the English education system. As teaching resources become richer, multimodal teaching is gaining practical significance. The large-scale application of multimedia technology in the college English classroom is conducive to the construction of a real language environment. The simulation results show that the multi-layer, one-dimensional convolution structure of the convolutional neural network can effectively handle many natural language problems, including lexical tagging and semantic role labeling, and thus improve the accuracy of natural language processing. The multimodal teaching mode helps students memorize vocabulary images more deeply. 84% of students think that the multimodal teaching mode is closer to life, and multimedia teaching display is more readily accepted. College English teachers should renew their teaching concepts and adapt themselves to the new teaching mode.

Keywords

Deep learning; multimodal target recognition; information technology; artificial intelligence

1 Introduction

Multimodal discourse refers to communication that draws on auditory, visual, and tactile channels together, through means such as language, image, sound, and action. In addition to the content of speech, gestures, body posture, multimedia equipment, the environment and other factors can convey the meaning of discourse. Nonverbal and paralinguistic features are objects of linguistic research. In the theory of multimodal discourse analysis, the nonverbal features of communication are now studied as a symbol system similar to language. All kinds of symbol systems can be used in social communication to express the meaning of speakers (Li Zhanzi, 2003). A modality (mode) is a symbolic resource that simultaneously realizes discourse and types of communication, and it can be realized through a medium. Modes cooperate with one another, and the coordination between them needs to be designed.

The New London Group was the first to apply multimodal discourse theory to language teaching. In 2003, the Chinese scholar Li Zhanzi introduced the theory of multimodal discourse analysis to China, and Chinese language researchers have since taken up this research. Hu Zhuanglin (2007) distinguishes multimodality from multimedia and studies the application of multimodality in academic work and teaching. Based on systemic functional linguistics, Zhu Yongsheng (2007) introduces the theoretical basis and analytical methods of multimodal discourse research. Also based on systemic functional linguistics, Zhang Delu (2010) studied the application of multimodal discourse analysis theory in foreign language teaching and established a comprehensive framework for multimodal discourse analysis. This study tries to apply the theory of multimodal discourse analysis to vocabulary teaching in the primary school, aiming at the characteristics and existing problems of College English teaching. The author studies the influence of multimodal teaching on students’ vocabulary acquisition and learning performance, and explores the multimodal vocabulary teaching mode together with its operating principles, strategies, methods and means, so as to provide effective evidence for multimodal English teaching.

The general principle of multimodal discourse analysis is to make full use of modern media technology to express the meaning of the speaker to the maximum and achieve the best effect. Combined with the theory of multimodal discourse analysis, some inspirations for Higher Vocational English teaching are as follows:

Higher vocational English teaching based on the digital language laboratory. A modern digital language laboratory has many functions, such as broadcasting, a foreign language database, chat rooms, network teaching and so on. The digital language laboratory can stimulate learning motivation and mobilize the various senses in an all-round way, thus realizing multimodal knowledge construction. At the same time, it supports exploratory, independent and cooperative learning, and provides strong support and guarantee for the continuous development of Higher Vocational English teaching.

Cultivate multiple reading and writing ability (multiliteracy). Multiliteracy refers to the ability to read information in all modes and to produce the corresponding material. English teaching can be equipped with multimedia equipment and channels, such as simulation software, image recognition software and scientific simulation. Multimodal discourse theory tries to analyze the path of information transmission and interpretation, and discusses how to produce a diversified overall meaning. The New London Group put forward a teaching mode with four aspects: situated practice, overt instruction, critical framing and transformed practice. In Higher Vocational English teaching, we can explore the specific process and training mode suitable for this kind of teaching according to the existing teaching methods, and cultivate students’ multiliteracy.

Multimodal class types for English teaching in higher vocational education. In line with the principle of “employment oriented and vocational ability training as the goal”, English teaching in higher vocational colleges can use many classroom types from the multimodal perspective. One of them is the practical skill training type, in which students master certain skills by following set procedures in teaching. Taking job orientation and company interviews as an example, the teaching content is designed around personal capability mining, target positioning, resume making, interview preparation, interview simulation, interview assessment and so on. In a specific virtual workplace, the basic teaching steps are organized according to “participation, plan, implementation, inspection and evaluation”. According to the designed skill requirements, students present through PPT display, oral expression, role demonstration and other modes, so as to strengthen training and complete the planned reflection and feedback evaluation of practical skills.

The main contribution of this paper is to design a human-computer interactive English experience teaching based on fuzzy set and BP neural network. With the development of intelligent education, the improvement of English teaching quality is particularly important, especially the effectiveness of classroom teaching. The construction of interactive education system has changed the traditional classroom teaching mode and made it possible for individual autonomous learning.

This paper is organized as follows: related work is introduced in Section II. The BP neural network and the error revision method are described in Section III. Cloud computing and education are presented in Section IV, and the case analysis and the judgment matrix of AHP are presented in Section V. Finally, conclusions are given in Section VI.

2 Related work

Globalization enriches different cultures around the world and deepens the interaction between them. At the same time, the development of science and technology has brought tremendous changes to the way people communicate. It is worth noting that, in addition to language, rich symbolic resources coordinate and realize the meaning of communication. For a long time, linguists neglected images, music, sounds and other non-linguistic symbolic resources and focused their research on language, which was considered the only carrier of meaning. With the rapid development of modern technology, people have become aware that language alone carries only part of the meaning. Over the past ten years, the ways of communication have changed tremendously, and the English teaching model also needs to be updated. To cater to the large-scale growth of cultural categories and their integration with each other, the new curriculum reform has put forward new requirements for teaching methods and content. In addition, the “requirements for the teaching of College English Courses” promulgated in 2004 point out that the number of college students is increasing rapidly while the existing educational resources are relatively limited, so we should make full use of the opportunities brought by multimedia and the Internet and improve the old teacher-led lecture mode. The multimodal teaching method has emerged in response, and the study of multimodal analysis has become urgent.

People have multiple sensory organs and different channels for receiving information, but there is no appropriate way to combine these channels together. Multi-modality uses symbols to build bridges between these channels, modifies the behavior of people’s information exchange, and gives different functions based on different environments and purposes. Vision and hearing are the main perceptual information that human beings receive, and it mainly undertakes the duties of information exchange and interpersonal communication. The rest of the senses, such as smell and taste, lack certain intuition for the transmission and reception of information, and are often used as an auxiliary means. These perceptions are not completely fragmented, and it can be converted to each other in a specific environment to achieve different purposes. In information communication, the interlocutor can decide what kind of modality to choose finally according to many factors.

The main theoretical basis of multimodal teaching is the theory of multimodal discourse analysis proposed in the 1990s. The development of this theory cannot be separated from the linguist Halliday’s theory of functional linguistics, which holds that language is a symbolic form with a social role. Multimodal discourse theory holds that discourse is multilevel and consists of various senses and symbols of different forms. The main content of multimodal discourse analysis is to analyze language systematically and to study the concepts and meanings of social semiotics at the linguistic level. The role of multimodal discourse is realized only in a specific context; outside of context it has no practical significance. In the late twentieth century, the idea of a multimodal teaching model was first proposed by the New London Group. Based on the development of multimedia technology, this theory holds that teaching should be combined with science and technology and that a variety of channels should be used to exchange information with students. For example, by showing teaching pictures and playing teaching audio and video, we can stimulate students’ senses in all aspects and improve their learning enthusiasm, in place of the traditional single channel of transmission.

As early as the 20th century, foreign scholars had done a great deal of research on multimodal discourse theory and published a series of discussions. For example, Roland Barthes’s “Rhetoric of the Image” gave a detailed description of the relationship between language and image information. O’Halloran has made a great contribution to the study of multimodal phenomena. Leies introduced the ideas of these predecessors to education and held that multimodal discourses complement one another, which can effectively improve the effectiveness of second language teaching. Domestic research in this field started relatively late, its theoretical basis is not deep, and the results are relatively few, but there are also many outstanding scholars and articles. For example, Gu Yueguo expounded the difference and relationship between media and modality by comparing multimedia teaching and multimodal teaching. Combining practical experience, Zhang Delu put forward his own views on the theory of multimodal discourse analysis.

Traditional English teaching is divided into three aspects: reading, listening and speaking. In the teaching of listening, domestic universities generally adopt a relatively single teaching mode, which mainly transmits information to the students’ hearing. This single teaching mode exists not only at the university stage but runs through the whole education system, from primary school to high school. It is long and boring, it hardly arouses students’ interest or sustains their enthusiasm for learning, and teachers cannot fully achieve their teaching goals. To change this situation, the author thinks that, in addition to traditional listening exercises, teachers should also use audio, video and pictures to assist teaching, and, when necessary, touch can be introduced to improve students’ participation in class. In some college English listening courses, practical applications of this theory have emerged, such as group activities, drama plays, PPT topic presentations and so on. The use of auxiliary means should be adapted to the actual conditions and must not put the cart before the horse, since these means ultimately serve the desired teaching effect. The focus of listening teaching is still listening, so listening should also keep the largest share in the distribution of modalities.

The level of speaking ability reflects the practical effect of language learning to a great extent, and the speaking course also has its own characteristics. Speaking teaching should therefore not be limited to the content provided by the teaching materials; it is necessary to make greater use of network resources and to teach multimodally. Speaking must eventually be used in practice, so that students are actually able to speak. Practical teaching should emphasize this by arranging students’ speaking time and theme content, carrying out targeted exercises, and making students the main body of the class. In the process of practice, we should also pay attention to the diversity and interest of the form, and build on topics that are popular among the students, such as popular movies or the stories behind songs. Activities can use voice, speech, performance and other forms to enhance the practical ability of students. Group activities are also conducive to promoting communication and interaction between students, and when students study films and drama works, their accent problems can also be corrected under the influence of these works.

The article [26] addresses issues such as the enormous volume of big data and proposes the concept of SmartBuddy to form a smart environment that makes use of human behavior and human factors. The article [27] discusses the development of a directed acyclic graph for video coding algorithms for motion estimation in parallel reconfigurable computing systems; the partitioning algorithm plays a major part in speeding up the video processing. The article [28] deals with exploiting IoT and big data analytics using the Hadoop environment in real-time scenarios, and an IoT-based smart city is realized through these processes. The article [29] centers on IoT and its significant role in supporting human behavior and activities; it also deals with the collection of varied data from different resources connected to the web. The literature [30] addresses different issues in the field of vehicle communication and recommends a generic unified and distributed spectrum sensing model [31]. The application of the cooperative cognitive paradigm minimizes conflict and other hidden problems [32].

3 The establishment of model

3.1 AHP method

This paper takes non-English majors at a university as the study sample: 204 students in classes A and B of the 2017 cohort. Class A is taught with the traditional English teaching method and class B with the multimodal English teaching method. The eight teachers have equivalent postgraduate education, teaching experience and teaching level. The questionnaire is based on the theory of multimodal teaching and is divided into three parts:

The evaluation of students for teachers’ multimodal teaching in class B, which includes satisfaction survey of vision, hearing and other aspects.

The investigation on students’ interest for different teaching modes.

Comparison of final grades between class A and B.

The construction of the comparison matrix of layer B with respect to layer A.

(3) The judgment matrix is determined: $A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix}$ (1)

(4) The weight vector ɛ (A) = (ɛ₁, ɛ₂, ɛ₃, ⋯ , ɛ_n) is obtained: $ɛ_{i} = \frac{\sqrt[n]{\prod_{j = 1}^{n} a_{ij}}}{\sum_{k = 1}^{n} \sqrt[n]{\prod_{j = 1}^{n} a_{kj}}}$ (2)

The maximum eigenvalue is: $λ_{\max} = \frac{1}{n} \sum_{i = 1}^{n} \frac{{(A ɛ)}_{i}}{ɛ_{i}}$ (3)

The consistency index of the judgment matrix of order n is: $CI = \frac{λ_{\max} - n}{n - 1}$ (4)

The consistency ratio is obtained, where RI is the random consistency index: ${CR}_{A} = \frac{CI}{RI}$ (5)

The consistency ratio of the second-level indices is: ${CR}_{B} = \frac{\sum_{i = 1}^{n} ɛ_{i} {CI}_{i}}{\sum_{i = 1}^{n} ɛ_{i} {RI}_{i}}$
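The AHP steps in Eqs. (1) to (5) can be illustrated with a minimal sketch. It assumes the geometric-mean (n-th root) weighting and Saaty's commonly tabulated random index values; the function names and the example matrix are hypothetical and are not taken from the questionnaire data of this paper.

```python
import math

# Saaty's random consistency index (RI) for matrix orders 1..9 (standard table,
# assumed here; the paper does not list the RI values it used).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Eq. (2): normalised n-th roots of the row products."""
    n = len(matrix)
    roots = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(roots)
    return [r / total for r in roots]


def ahp_consistency(matrix):
    """Return (weights, lambda_max, CI, CR) following Eqs. (2)-(5)."""
    n = len(matrix)
    if n < 2:
        raise ValueError("the judgment matrix must be at least 2 x 2")
    w = ahp_weights(matrix)
    # (A w)_i for every row i
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam_max = sum(aw[i] / w[i] for i in range(n)) / n        # Eq. (3)
    ci = (lam_max - n) / (n - 1)                             # Eq. (4)
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0                          # Eq. (5)
    return w, lam_max, ci, cr


# Hypothetical, perfectly consistent 3 x 3 judgment matrix (a_ij = w_i / w_j).
example = [[1.0, 2.0, 4.0],
           [0.5, 1.0, 2.0],
           [0.25, 0.5, 1.0]]
weights, lam_max, ci, cr = ahp_consistency(example)
```

For a perfectly consistent matrix such as this one, the maximum eigenvalue equals the order n, so CI and CR are zero; a judgment matrix is usually accepted when CR is below 0.1.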

3.2 BP neural network

With the rapid development of multimedia network technology, it is possible to exchange and share educational resources all over the world. At present, many colleges and universities have built campus network, which is in line with the Internet. It provides a huge teaching resource bank and a good teaching platform for college English teaching. Teachers and students can download useful learning resources from the Internet at any time to enrich English classroom teaching.

Both computer teaching and classroom teaching have strengthened the cultivation of students’ comprehensive abilities in listening, speaking, reading, writing and translation, but each has its own emphasis. Computer teaching focuses on listening and speaking while also covering reading, writing and translation; classroom teaching focuses on reading, writing and translation while also covering listening and speaking. In this way, students can take full advantage of the audio-visual combination in computer teaching to strengthen listening and speaking training, while teachers can use the classroom to impart reading, writing and translation knowledge and skills. Under the guidance of teachers, students choose suitable learning content according to their own English level and, with the help of computers, strengthen self-regulated learning and improve rapidly.

Although the multimedia teaching mode has incomparable advantages over the traditional teaching mode, it still has the following limitations. Some multimedia courseware emphasizes form over content and carries too much information, so students cannot write down the useful information in the limited time; after a class their minds are blank, and they get half the result for twice the effort. Because students’ English levels differ, excellent students are not challenged enough while weaker students cannot keep up, and it is difficult to achieve the desired teaching objectives. Due to the lack of systematic training, many teachers are not skilled in operating the equipment, and when problems occur they must start from scratch, which not only wastes time and affects the teaching progress but also easily makes students anxious and bored, affecting the teaching effect.

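The relationship equations between input data and output data are: $β_{j} = \sum_{m = 1}^{n} α_{mj} x_{m}$ (6) $y_{j} = f (β_{j} - λ_{j})$ (7), where x_m are the inputs, α_{mj} are the connection weights to neuron j, and λ_j is its threshold.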

The threshold function is: $f (x) = \begin{cases} 1, & x ⩾ 0 \\ 0, & x < 0 \end{cases}$ (8)
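As a small illustration of Eqs. (6) to (8), the sketch below evaluates a single threshold neuron. The function names and the logical-AND example (weights of 1 and a threshold of 1.5) are hypothetical choices made only for this illustration.

```python
def step(x):
    """Threshold function of Eq. (8): 1 when x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def neuron_output(weights, inputs, threshold):
    """Eqs. (6)-(7): weighted sum of the inputs, shifted by the threshold."""
    beta = sum(w * x for w, x in zip(weights, inputs))
    return step(beta - threshold)


# Hypothetical example: with both weights equal to 1 and a threshold of 1.5,
# the neuron fires only when both inputs are 1 (logical AND).
and_outputs = [neuron_output([1.0, 1.0], [a, b], 1.5)
               for a in (0, 1) for b in (0, 1)]
```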

(2) Operation algorithm

In the network topology (Fig. 2), the input layer, the hidden layer and the output layer are arranged from bottom to top, and adjacent layers are linked by connection weights.

(3) Normal transmission process

In the forward pass from the input layer, the output of hidden-layer node n is: $y_{n} = f (\sum_{m = 0}^{M} α_{mn} x_{m})$ (9)

In the above equation, x is the input-layer variable and y is the hidden-layer output; m is the index of the input-layer node (M input nodes in total), and n is the index of the hidden-layer node. Setting x₀ = –1 turns the weight α_{0n} into the threshold of hidden node n. In other words, the threshold is introduced at the hidden layer.

In the forward pass from the hidden layer, the output of output-layer node j is: $v_{j} = \sum_{n = 0}^{N} t_{nj} y_{n}$ (10) $L_{j} = f (v_{j}) = f (\sum_{n = 0}^{N} t_{nj} y_{n})$ (11)

In the above equations, t_{nj} is the weight connecting hidden-layer node n (N hidden nodes in total) to output-layer node j, and v_j is the resulting net input. Setting y₀ = –1 introduces the threshold at the output layer.
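The forward pass of Eqs. (9) to (11) can be sketched as follows. The sketch assumes a sigmoid activation for f (suggested by the x(1 - x) factor in the error terms of this section, but not stated explicitly), and the weight values are hypothetical. The threshold of each node is stored in row 0 of its weight matrix and is multiplied by a constant input of -1.

```python
import math


def sigmoid(z):
    """Assumed activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, inputs):
    """weights[m][n] links source node m to target node n; row 0 is the threshold row."""
    extended = [-1.0] + list(inputs)          # x_0 = -1 (or y_0 = -1)
    n_targets = len(weights[0])
    return [sigmoid(sum(weights[m][n] * extended[m]
                        for m in range(len(extended))))
            for n in range(n_targets)]


def forward(alpha, t, x):
    """Eq. (9) for the hidden layer, then Eqs. (10)-(11) for the output layer."""
    y = layer_forward(alpha, x)
    L = layer_forward(t, y)
    return y, L


# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output, all thresholds 0.
alpha = [[0.0, 0.0], [1.0, -1.0], [1.0, -1.0]]
t = [[0.0], [1.0], [1.0]]
hidden, output = forward(alpha, t, [0.0, 0.0])
```

With zero inputs and zero thresholds, both hidden nodes output 0.5, so the net input of the output node is 1.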

(4) Error correction transfer process

Calculation error of each layer $E = x_{m}^{j} (1 - x_{m}^{j}) (x_{m}^{j} - y_{m}^{j})$ (12)

The number of layers is the same as the number of verification.

If they are different, the calculation error exists in each layer. $E = x_{m}^{n} (1 - x_{m}^{n}) (x_{m}^{n} - y_{m}^{n})$ (13)

During forward propagation of the signal, the output deviates from the expected value to some degree. This deviation is expressed by the error signal, where r_j is the expected output of output node j and J is the number of output nodes. $e = \frac{1}{2} \sum_{j = 1}^{J} {(r_{j} - L_{j})}^{2}$ (14)

Expanding the error signal to the hidden layer, with N hidden nodes, gives the following equation. $e = \frac{1}{2} \sum_{j = 1}^{J} {[r_{j} - f (\sum_{n = 0}^{N} t_{nj} y_{n})]}^{2}$ (15)

Expanding it further to the input layer, with M input nodes, gives the following equation. $e = \frac{1}{2} \sum_{j = 1}^{J} {(r_{j} - f [\sum_{n = 0}^{N} t_{nj} f (\sum_{m = 0}^{M} α_{mn} x_{m})])}^{2}$ (16)

In the above equations, e is a function of the connection weights of each layer.
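Because the error e is a function of the connection weights, its gradient with respect to a hidden-to-output weight can be written in closed form and checked numerically. The sketch below assumes a sigmoid activation, for which the gradient is (L_j - r_j) L_j (1 - L_j) y_n; all names and numeric values are hypothetical.

```python
import math


def sigmoid(z):
    """Assumed activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def outputs(t, y):
    """L_j = f(sum_n t[n][j] * y[n]) for the hidden outputs y (y[0] = -1)."""
    return [sigmoid(sum(t[n][j] * y[n] for n in range(len(y))))
            for j in range(len(t[0]))]


def error(t, y, r):
    """Error signal e = 1/2 * sum_j (r_j - L_j)^2."""
    return 0.5 * sum((rj - lj) ** 2 for rj, lj in zip(r, outputs(t, y)))


def gradient(t, y, r):
    """Analytic de/dt[n][j] = (L_j - r_j) * L_j * (1 - L_j) * y[n]."""
    L = outputs(t, y)
    delta = [(lj - rj) * lj * (1.0 - lj) for lj, rj in zip(L, r)]
    return [[delta[j] * y[n] for j in range(len(delta))]
            for n in range(len(y))]


# Hypothetical hidden outputs (first entry is the constant -1), weights and targets.
y = [-1.0, 0.3, 0.8]
t = [[0.1, -0.2], [0.4, 0.5], [-0.3, 0.2]]
r = [1.0, 0.0]
analytic = gradient(t, y, r)[1][0]

# Central finite difference on the same weight as an independent check.
h = 1e-6
t_plus = [row[:] for row in t]
t_minus = [row[:] for row in t]
t_plus[1][0] += h
t_minus[1][0] -= h
numeric = (error(t_plus, y, r) - error(t_minus, y, r)) / (2 * h)
```

The agreement between the analytic and numeric values is what back-propagation relies on when the error is passed from the output layer towards the input layer.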

The workflow of the neural network computation on the computer is shown in Fig. 3.

3.3 Error revision

(1) Overall revision. If the output does not meet the accuracy requirement, the calculation is repeated. During the repeated calculation, the threshold is adjusted. The adjustment formula is as follows. $\partial_{ij} (v + 1) = \partial_{ij} (v) - Δ e_{i}^{k} x_{j}^{k - 1}$ (17)

(2) Partial revision. Because the BP neural network converges slowly and is prone to being trapped in local minima, a momentum factor is added to adjust for the local problem (Seifollahi, 2012). This can be done with the traingda and traingdm functions in MATLAB.

The formula for solving the local problem is as follows. $Δ x (v + 1) = ɛ Δ x (v) + γ (1 - ɛ) \frac{\partial e (v)}{\partial x (v)}$ (18) $x (v + 1) = x (v) + Δ x (v + 1)$ (19)

In the above equation, v is the number of iterations in the computer calculation, γ is the learning rate, with a value between 0 and 1, and ɛ is the momentum factor, also between 0 and 1; in this paper, the momentum factor is set to 0.95.

To solve the convergence problem, the adjustment formula is as follows. $γ (v + 1) = \begin{cases} 1 γ (v), & e (v + 1) < e (v) \\ 0.5 γ (v), & e (v + 1) > 1 e (v) \\ γ (v), & \text{otherwise} \end{cases}$ (20)
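The momentum update of Eqs. (18) and (19) and an adaptive rate rule in the spirit of Eq. (20) can be sketched as below. Several points are assumptions made for illustration only: the step is taken in the descent direction (against the gradient of e), γ is treated as the step-size factor, the increase factor 1.05, decrease factor 0.7 and tolerance ratio 1.04 are illustrative constants rather than values from this paper, and the quadratic error function is a toy problem.

```python
def momentum_step(prev_delta, grad, rate, momentum):
    """Eq. (18)-style update, taken in the descent direction (assumed sign)."""
    return momentum * prev_delta - rate * (1.0 - momentum) * grad


def adapt_rate(rate, e_new, e_old, inc=1.05, dec=0.7, max_ratio=1.04):
    """Eq. (20)-style rule with illustrative constants (not from the paper)."""
    if e_new < e_old:
        return rate * inc          # error fell: enlarge the step
    if e_new > max_ratio * e_old:
        return rate * dec          # error rose too much: shrink the step
    return rate                    # otherwise keep the step unchanged


def error(x):
    """Toy error function e(x) = x^2, used only for demonstration."""
    return x * x


def error_grad(x):
    return 2.0 * x


x, delta, rate = 1.0, 0.0, 0.1
history = [error(x)]
for _ in range(50):
    delta = momentum_step(delta, error_grad(x), rate, momentum=0.95)
    x_new = x + delta                                   # Eq. (19)
    rate = adapt_rate(rate, error(x_new), error(x))
    x = x_new
    history.append(error(x))
```

The list `history` records the error after every iteration, so the effect of different momentum and rate settings can be compared by inspecting it.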

(3) Weight revision. To overcome the serious deficiencies of the grey model and multiple regression, the weights are adjusted by back-propagating the error through the learning algorithm, as follows. $γ_{mn} = γ_{lk} - α \frac{\partial e}{\partial γ}$ (21)

There is the following equation. $\frac{\partial e}{\partial v_{mn}} = \sum_{j = 1}^{m} \frac{\partial e}{\partial v_{mn}}$ (22)

The correlation coefficient is determined. $X^{2} = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (\hat{x}_{i} - \bar{\hat{x}})}{\sqrt{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2} \sum_{i = 1}^{n} (\hat{x}_{i} - \bar{\hat{x}})^{2}}}$ (23)

Among them, there are the following equations. $\bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}$ (24) $\bar{\hat{x}} = \frac{1}{n} \sum_{i = 1}^{n} \hat{x}_{i}$ (25)
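A minimal sketch of the correlation coefficient in Eqs. (23) to (25) is given below. It assumes that the plain values are observations and the hatted values are network predictions, which the text does not state explicitly; the function name and the sample data are hypothetical.

```python
import math


def correlation(actual, predicted):
    """Correlation between observed and predicted values, Eqs. (23)-(25)."""
    n = len(actual)
    mean_a = sum(actual) / n                      # Eq. (24)
    mean_p = sum(predicted) / n                   # Eq. (25)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Hypothetical data: a perfectly linear relation and a perfectly inverse one.
r_up = correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
r_down = correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

Values close to 1 indicate that the predictions follow the observations closely, and values close to -1 indicate an inverse relation.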

4 Education with the technology

4.1 Research on effectiveness experiment

The evaluation of multimodal teaching in the teachers’ classroom of class B is shown in Table 1 and Fig. 4.

Table 1
Multimodal teaching evaluation in Teachers’ Classroom

Type	Very satisfied	Basically satisfied	Uncertain	Dissatisfied
Auditory modality	66%	27%	5%	2%
Visual modality	88%	9%	2%	1%
Other modalities	70%	20%	5%	5%

From Table 1 and Fig. 4, the satisfaction survey of multimodal teaching in the teachers’ classroom, it can be found that students are most satisfied with the visual modality of the teacher, where the share of very satisfied students exceeds 80%. This is mainly influenced by the teaching model of college English. As technology develops, multimedia technology is becoming more and more mature; it appears frequently in the university classroom and plays a great role, especially in English teaching. Multimedia creates the preconditions for teachers to carry out multimodal teaching. When designing teaching PPT, teachers can get rid of the constraints of plain text, add pictures, audio and even videos to the PPT, and allocate their proportions rationally, so as to enhance students’ learning enthusiasm and vividly convey the intended information. Students are comparatively less satisfied with the teachers’ other modalities of teaching, which indicates that classroom teaching is still mainly based on traditional teaching methods, with student participation only as a supplement.

Fig. 1

Graph of BP Neuron Model Function.

The relationship between teaching modality and student enthusiasm is shown in Table 2 and Fig. 5.

Table 2

The relationship between teaching modality and student enthusiasm

Id	Which modalities can improve enthusiasm for English learning?	Proportion
1	Let the students retell the text	92%
2	Explain important knowledge points	86%
3	Rich and vivid courseware	83%
4	Group discussion	76%
5	Classroom questioning	84%
6	Role playing	78%
7	Eye contact	82%
8	Explain the text knowledge point one by one	49%

From Table 2 and Fig. 5, it can be seen that the teaching modality with the greatest influence on interest in English learning is having the students retell the text, which indicates that students prefer modalities that let them participate in the learning process. The lowest score goes to the traditional model in which the teacher explains the text point by point, which shows that the application of multimodality is beneficial to the effect of English learning.

Fig. 2

Neural network topology diagram.

After a semester of teaching, the final English scores of the two classes are compared, as shown in Table 3 and Fig. 6.

Table 3

Comparison of English scores

	A class (traditional teaching)	B class (multimodal teaching)
Speaking	80	92
Reading comprehension	85	86
Writing	83	91
Cloze test	86	87
Comprehensive	83.5	89

Fig. 3

Flow chart of the neural network computation on the computer.

From Table 3 and Fig. 6, it can be seen that multimodal teaching is more effective than traditional teaching, especially in speaking and writing, where it is clearly better. This is because multimodal teaching breaks away from traditional teaching methods and takes the students as the leading role in teaching through visual, auditory and other modalities. Therefore, the effectiveness of multimodal teaching in English teaching is demonstrated by the three groups of experiments.

Fig. 4

The evaluation chart of multimodal teaching in Teachers’ Classroom.

Fig. 5

The Contrast diagram of relationship between teaching modality and student enthusiasm.

Fig. 6

The contrast diagram of English scores.

4.2 Teaching strategy under multimodal teaching

Introducing the concept of multimodal literacy. In the process of multimodal learning, students are not simply information receivers. Besides understanding the functions and meanings of different media modalities, it is necessary to make practical use of it in the exchange of information, and effectively improve the communicative competence.

Training teachers’ ability to use multimedia technology. Many teachers are unskilled in the use of multimedia technology or seldom use it, and some older teachers are even psychologically resistant to it. Teachers should make use of multimedia to enrich the content of courseware, and their explanations should be combined with multimodal theory. However, if teachers use too many modalities, the focus of the class becomes unclear. Teachers should therefore aim to maximize the effectiveness of multimodal teaching.

Encouraging students to learn autonomously and expanding their learning beyond the classroom. Education and learning are a long-term and multi-field process, and relying only on teachers’ guidance and explanation in the classroom cannot meet the needs of learning. Extracurricular learning is needed not only to consolidate classroom knowledge but also as an important way to acquire new knowledge and improve students’ comprehensive quality. The development of network technology also makes it convenient for students to learn from multimodal information, and teachers should play the role of good organizer and supervisor in this process.

Multimodal means should be adopted for the acceptance of English teaching results. The traditional evaluation method is mainly the examination based on the text level, but this way has a certain degree of disconnection with the multimodal English teaching process, and the evaluation means should be matched with the teaching mode and carry out the modal reform.

To sum up, teachers’ use of multimodality (mainly the visual and auditory modalities) in the English classroom helps students understand and memorize what they are learning, stimulates their interest in learning and focuses their attention, and is popular among most students. Therefore, teachers should design multimodal teaching according to the specific situation of the students, so as to optimize the teaching effect. It is worth noting that although multimodality can attract students’ attention, too many multimodal elements distract it. Teachers need to adjust the amount of multimodal material in the design process.

Comparison between experimental and control class vocabulary pretest results

From Table 4, we can see that the average pre-test score of the experimental class is 23.56, while that of the control class is 23.14, so the difference between the two classes is very small. Further, an independent-samples t-test on the two classes (Table 5) gave a two-sided significance (Sig.) of 0.224, which is greater than 0.05, indicating that there was no significant difference in overall level between the experimental class and the control class before the experiment.

4.3 Data analysis after experiment

Questionnaire 2 was administered at the end of the experiment to the students in the experimental class. It aims to investigate students’ attitudes towards multimodal vocabulary teaching and the influence of traditional and multimodal vocabulary teaching on students’ vocabulary learning after the three-month experiment. A total of 64 questionnaires were issued and 64 valid questionnaires were collected. The pretest results are shown in Table 4, the independent-samples t-test for the pretest results in Table 5, the analysis of the results of questionnaire 2 in Table 6, and the post-test results in Table 7.

Table 4

Pretest results

Pretest vocabulary achievement	Class	N	Mean value	Standard deviation	Standard error of mean value
	1	64	23.56	1.664	0.208
	2	57	23.14	2.090	0.277

Table 5

Independent sample T test for pretest results

		Levene’s test for equality of variances		T test for equality of means
		F	Sig.	t	df	Sig. (bilateral)	Mean difference	Standard error of difference	95% confidence interval of difference
									Lower limit	Upper limit
Pretest vocabulary achievement	Equal variances assumed	1.096	0.303	1.224	119	0.224	0.418	0.342	–0.259	1.095
	Equal variances not assumed			1.208	106.823	0.230	0.418	0.346	–0.268	1.105
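
The T-test figures can be checked from the summary statistics alone. The following is a minimal sketch, not the authors’ procedure: it assumes the “Standard value” column of the pretest table is the sample standard deviation, and the function name `t_test_from_summary` is hypothetical. It computes the pooled (equal variances assumed) and Welch (equal variances not assumed) forms of the two-sample t statistic, with their degrees of freedom and standard errors.

```python
import math


def t_test_from_summary(m1, s1, n1, m2, s2, n2, equal_var=True):
    """Return (t, df, se) for a two-sample t test from means, SDs and sizes."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # squared standard errors of each mean
    if equal_var:
        # Pooled-variance (Student) form.
        df = n1 + n2 - 2
        pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
        se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    else:
        # Welch form with Welch-Satterthwaite degrees of freedom.
        se = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df, se


# Summary statistics as printed in the pretest table (class 1, then class 2).
pooled = t_test_from_summary(23.56, 1.664, 64, 23.14, 2.090, 57, equal_var=True)
welch = t_test_from_summary(23.56, 1.664, 64, 23.14, 2.090, 57, equal_var=False)

for label, (t, df, se) in (("pooled", pooled), ("welch", welch)):
    print(f"{label}: t={t:.3f}, df={df:.3f}, se={se:.3f}")
```

Because the printed means are rounded to two decimals, the recomputed t values differ slightly from those reported in Table 5, while the standard errors and the Welch degrees of freedom agree closely with the reported values.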

Table 6

Analysis of the results of questionnaire-2

Option\Question No.	1	3	4	8	10
A	5.2%	1.7%	1.7%	3.4%	1.7%
B	0%	1.7%	3.4%	0%	3.4%
C	8.6%	5.2%	6.9%	10.3%	6.9%
D	56.9%	55.2%	41.4%	51.7%	51.7%
E	29.3%	36.2%	46.6%	34.5%	36.3%

Table 7

Post test results

	Class	N	Mean value	Standard deviation	Standard error of mean value
Total score of post test	1	64	83.11	11.523	1.429
	2	57	77.40	19.493	2.582

Questions 1, 3, 4, 8 and 10 of questionnaire 2 investigate students’ perceptions of the multimodal teaching method. 86% of the students think that when teachers use multimedia to simulate real situations in the classroom, they experience a real target-language environment and improve their ability to use vocabulary. 91% of the students like this multichannel, multisensory teaching mode. 88% of the students are more willing to answer the teacher’s questions in such a relaxed and active learning atmosphere. 86% of the students believe that multimodal vocabulary teaching stimulates their interest in learning English and has a positive impact on their learning attitude. 88% of the students believe that multimodal interaction between teachers and students improves their initiative and enthusiasm for classroom participation. Thus, the multimodal vocabulary teaching method helps students understand and memorize vocabulary and stimulates their interest in and enthusiasm for learning.
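
The percentages quoted for questions 1, 3, 4, 8 and 10 match the sum of options D and E in Table 6. The sketch below illustrates that arithmetic; it assumes that D and E are the two agreeing options, which the table does not state explicitly, and the names `table6` and `agreement` are hypothetical.

```python
# Percentages copied from Table 6 for options D and E only.
# Assumption: D and E are the two "agree" options of the questionnaire.
table6 = {
    "D": {1: 56.9, 3: 55.2, 4: 41.4, 8: 51.7, 10: 51.7},
    "E": {1: 29.3, 3: 36.2, 4: 46.6, 8: 34.5, 10: 36.3},
}


def agreement(table, question, options=("D", "E")):
    """Sum the share of respondents who chose any of the given options."""
    return sum(table[option][question] for option in options)


# Round to whole percentages, as quoted in the text.
agreement_by_question = {q: round(agreement(table6, q)) for q in (1, 3, 4, 8, 10)}
print(agreement_by_question)
```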

Questions 2, 5, 9, 15, 16 and 17 of questionnaire 2 investigate students’ opinions on the memory advantage of multimodal teaching. 95% of the students believe that, compared with the traditional teaching mode, which lacks interaction, cooperative learning in which students discuss in groups is more conducive to memorizing knowledge and makes it stick more firmly. 78% of the students believe that blackboard teaching is not vivid, while a multimodal PPT can display pictures, sounds and videos, making teaching more vivid and interesting and improving attention in class. 78% of the students think that words learned in the traditional teaching mode are usually learned by rote and soon forgotten, while the multimodal teaching mode makes the image memory of words deeper. 90% of the students believe that PPT courseware, projection, multimedia video and other multimedia tools are conducive to the systematic learning of vocabulary, helping them better grasp and understand the meaning of words and improving their ability to use them. 91% of the students think that multimodal PPT courseware is more helpful for broadening knowledge and enlarging vocabulary. 84% of the students think that multimodal teaching is closer to life, and that teaching rooted in life is easier to accept and more readily produces associations.

5 Conclusions

In the age of information technology, using multimodal discourse analysis theory to guide teaching is a meaningful teaching model. Through the presentation of images and the body language of teachers, teaching can be made more vivid, which improves students’ interest in learning and broadens their horizons. English teaching can apply multimodal methods in the use of modern language laboratories, multiliteracy training, classroom types and the design of the teaching process. The multimodal teaching model can thus help cultivate higher vocational talents with better and more practical English ability.

As teaching resources become richer, multimodal teaching is gradually gaining practical significance. The large-scale application of multimedia technology in the college English classroom is conducive to the construction of a real language environment. The development from traditional paper-based teaching to genuinely multi-sensory teaching stimulates students’ enthusiasm for learning and improves classroom participation. It allows students’ English proficiency to improve through practical application, helps them escape the predicament of learning English only to be tested rather than to use it, and supports the cultivation of high-quality personnel. This paper verifies the effectiveness of multimodal English teaching through examples and offers some preliminary observations and suggestions. It is hoped that this paper will help front-line educators apply the theory to teaching practice and contribute to college English education in China.

