Abstract
This study aims to minimize color overflow and color patch generation in intelligent images and to promote the application of the Internet of Things (IoT) intelligent image-positioning studio classroom in English teaching. The Convolutional Neural Network (CNN) algorithm is introduced to extract and classify the features of intelligent images, and the extracted features are then used to position images in real time. The performance of the CNN algorithm is verified through training. Subsequently, two senior high school classes are selected for experiments, and the influence of the IoT intelligent image-positioning studio classroom on students’ performance is analyzed by comparing the experimental class with the control class. The results show that the CNN algorithm can optimize the intelligent image, accelerate image classification, reduce color overflow, sharpen edge color, and reduce color patches, which facilitates intelligent image editing and dissemination. The feasibility analysis supports the effectiveness of the IoT intelligent image-positioning studio classroom, which is in line with students’ language learning rules and interests, involves students in classroom activities, and encourages self-learning. Meanwhile, interaction and cooperation help students master learning strategies efficiently. The experimental class taught with the IoT intelligent positioning studio has made significant progress in academic performance, especially in the post-term exam. In short, the CNN algorithm can promote IoT technologies and is feasible in English teaching.
Introduction
In recent years, the rapid development of the Internet of Things (IoT) and Artificial Intelligence (AI) technology has brought many breakthroughs, showing great application potential in medical imaging, intelligent driving, and urban management [2,4,29]. The IoT is essentially a technological extension of the Internet. The emergence and development of IoT technology enriches market demand, inspires new ideas, and provides a novel perspective for addressing problems in various fields [11]. In education and teaching in particular, the IoT has shown its advantages [3,9]. Integrating IoT technology into the middleware of current education and teaching platforms can help schools and teachers supervise teaching activities and students’ physiological indexes, thereby effectively monitoring students’ health status through the digitalized teaching process. Meanwhile, the application of the IoT can greatly facilitate the school’s management of teaching equipment, books, and personnel.
With the continuous optimization of education and teaching, the studio classroom is becoming the focus of teaching and learning activities, improving the construction of teaching resources and distance education [40]. In general, the studio classroom offers functions such as intelligent recording, positioning, and broadcasting, which facilitate open courses and distance education. Studio classrooms can manage and control teaching and learning activities uniformly through a central control system, an image positioning system, a courseware recording system, an intelligent audio and broadcasting system, and a post-production nonlinear editing system. In short, the studio classroom can greatly optimize the current education and teaching mode.
Deep Learning (DL) is a branch of AI, and DL and the Neural Network (NN) are the two most frequently mentioned concepts in AI technologies [18,31]. Within DL, the Convolutional Neural Network (CNN) has extensive applications, especially in image feature extraction and processing and in indoor positioning [1]. Teaching methods have diversified with scientific and technological advancement. In particular, the advent of the Internet and multimedia has revolutionized traditional teaching concepts. In recent years, many researchers have studied the application of Computer Technology (CT) and multimedia equipment in education to innovate traditional teaching methods.
This work explores the feasibility of the IoT intelligent positioning studio classroom in English teaching. First, the IoT is introduced along with the equipment distribution of the studio classroom. Afterward, the CNN algorithm is explained, and the experimental contents are presented. Finally, the CNN algorithm is evaluated in real-time positioning for intelligent images, and the feasibility of the IoT intelligent image-positioning studio classroom in English teaching is discussed.
Related works
IoT intelligent positioning and studio classroom in education and teaching
Many studies have been conducted on the application of studio classrooms in actual teaching. For example, Zhu launched the One Teacher-One Course campaign based on the studio classroom, which provided a new outlook for teaching and learning and thereby promoted balanced urban-rural education [44]. Wang believed that the studio classroom could offer functions such as recording, live broadcasting, editing, saving, and on-demand broadcasting, thus promoting undergraduate teaching with online training courses [36]. Bautista et al. thought that teaching skills could be optimized through the analysis of classroom video, and noted that the studio classroom was only just getting underway compared with traditional mathematics and science education [6]. By comparison with traditional classroom teaching, Prakash et al. found that students showed better concentration and higher knowledge acceptance in the studio classroom with a poetic visual context [27]. Intelligent positioning is an important part of the studio classroom system. At present, experts and scholars have discussed the application of the IoT in intelligent positioning. Yogish et al. handled garbage bins through IoT technology to calculate and utilize the losses, using technologies such as the Global Positioning System (GPS) and global mobile communication systems [43]. Xu and Liang studied logistics GPS based on the IoT and found that logistics information could be dynamically monitored, with good stability, by introducing wireless transmission technology, GPS, and anti-theft warning technology [37]. Venkatanarayanan et al. designed an intelligent sensor for bicycle positioning through an open IoT platform using MATLAB, showing that the platform had good stability [34]. Yang et al. proposed an IoT network integrated with visible light communication and positioning. They introduced a low-complexity iterative algorithm and considered the joint optimization of access point selection, bandwidth allocation, and power allocation, finding that the proposed scheme could significantly improve data transmission efficiency and positioning accuracy [41]. To find a general positioning scheme, Martin-Escalona and Zola put forward a passive Wireless Fidelity (Wi-Fi) indoor positioning scheme that combined the time difference algorithm with the fine time measurement algorithm, revealing its application value in the field of intelligent positioning [24].
Application of CNN in image processing
There are many studies on the application of the CNN model in image processing. Emoto and Hirata proposed an end-to-end batch processing method for gaze estimation and event detection based on a lightweight CNN for eye image processing. They found that this method could simultaneously detect the gaze direction and event occurrence with small memory and low computational complexity [12]. Kruthiventi et al. analyzed the visual attention mechanism and introduced a fully convolutional neural network, DeepFix, finding that DeepFix had network layers with very large receptive fields, captured semantics at multiple scales, and performed excellently in image processing [20]. Ponce et al. introduced CNN into the classification of olive fruit images, finding that the classification accuracy could reach 95.91% with the Inception-ResnetV2 architecture, which was expected to be applied to the processing and classification of postharvest olive fruit [25]. For face image processing, Yin and Liu suggested a pose-directed multi-task CNN based on the learning of different identity features. The test results on the Multi-PIE data set showed that this method was also applicable to in-the-wild data sets for pose-invariant face recognition, thus showing excellent performance in face image processing [42]. Guzewich and Zahorian analyzed CNN’s ability to remove noise and reverberation in image processing, revealing the superiority of the Deep Neural Network (DNN) in the image processing field [16].
The above analysis indicates that both the IoT intelligent positioning technology and the CNN algorithm show advantages in positioning and image processing. At the same time, the superiority of the studio classroom in education and teaching has been highlighted. However, their application in English teaching is still scarce at present. It is believed that the studio classroom based on intelligent positioning is also applicable to English teaching, which is the basis of the subsequent research.
Method
Fusion of video data and the IoT
The IoT was first defined by MIT’s Auto-ID research center in 1999 as a network that connects information and equipment through Radio Frequency Identification (RFID), Quick Response Code (QR code), and sensors to manage data intelligently [8]. The IoT identifies objects through RFID tags and exchanges information through CT, RFID, and sensor equipment [26]. CT, the Internet, and the IoT are regarded as three successive global technological revolutions, and the industrial value of the IoT is believed to reach 30 times that of the Internet [28]. Generally, IoT devices are defined as sensor equipment that can acquire information through laser Radio Frequency (RF), RFID, and infrared sensing technology [30]. A network of goods and information can be established through IoT technologies, in which people can exchange goods through intelligent identification and positioning [7].
In recent years, short video platforms such as TikTok and Kwai have gone viral and attracted much attention. With the development of modern Information Technology (IT) and the IoT, the big data fusion of video and the IoT is finding wider application, for example in safeguarding and monitoring goods, transportation, personnel access, logistics, and market demand. The IoT is an innovation on, and a further development of, the Internet. Video technology can record things, events, and data, as well as entertain people. Meanwhile, the combination of video and IoT technologies can facilitate people’s life and production. Coupled with intelligent hardware and software, video data can promote IoT development remarkably, because videos present information more effectively and straightforwardly. The intelligent camera is a high-performance IoT terminal and is crucial for acquiring the high-quality information that serves as input to the overall IoT system. The intelligent visual IoT connects objects to the Internet through sensor equipment using network protocols to exchange information, thus realizing intelligent identification, monitoring, management, and positioning [39].
Therefore, the intelligent visual IoT has diverse functions and is sure to play an important role in education. Here, the real-time positioning for intelligent images is studied based on the IoT, and the feasibility of the studio classroom is explored in English teaching.
Equipment distribution of the studio classroom
The distribution of Audio-Visual (AV) equipment in the studio classroom varies according to system requirements and configuration [15]. Here, a studio classroom that accommodates 60 students is chosen, and the equipment distribution is shown in Fig. 1. Cameras and multimedia sound acquisition devices are the key to the studio classroom, and more than one of each is often installed for a better classroom experience and AV recording. The functions of the devices are as follows. In a studio classroom, the volume of the sound signal is uneven due to position, noise, and echo, so a single device performs poorly [14]; multiple sound acquisition devices are therefore installed on the ceiling to collect sound signals from all directions [19]. The acquired sound signals can be further distinguished and refined so that only the principal elements are tuned and recorded while the others are minimized, thus improving the quality of the studio classroom [22]. The data can be transmitted in a timely manner to the host machine of the platform through the Wireless Network (WN).

Fig. 1. Equipment distribution of the studio classroom.
DL is a set of Machine Learning (ML) algorithms based on multi-layer NNs and is widely used in image processing. DL learns and extracts features through the multi-layer NN to handle complex and large-scale data operations [23]. DL algorithms, such as the CNN (see Fig. 2), the Backpropagation Neural Network (BPNN), and the multi-layer Recurrent Neural Network (RNN), are widely used [33]. The CNN is derived from the artificial NN, is well suited to image recognition and classification [5], and mainly consists of the pooling layer and the convolution layer [10].

Fig. 2. CNN model.
The pooling layer samples the data, reduces their dimension, and extracts the essential features; mean pooling and maximum pooling are the most commonly used algorithms [21]. Pooling downsamples high-resolution images so that the image data remain tractable to compute, which also helps prevent overfitting. The convolution layer computes, at each position, the sum of the products of the input elements and the corresponding weights. The algorithm is shown in Eq. (1).
In Eq. (1), x denotes the input matrix in the convolution layer, W represents the weight matrix, and a and b stand for the height and width of the weight matrix in the convolution network, respectively.
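The two layer operations described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names `conv2d_valid` and `max_pool`, the unit stride, the absence of padding, and the toy input are all assumptions made here. As is common in CNN practice, the weight matrix is applied without flipping.

```python
import numpy as np

def conv2d_valid(x, W):
    """Slide W over x and, at each position, sum the element-wise products.

    x is the input matrix, W the weight matrix of height a and width b
    (the symbols used for Eq. (1)). Stride 1 and no padding are assumed.
    """
    a, b = W.shape
    out_h = x.shape[0] - a + 1
    out_w = x.shape[1] - b + 1
    y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            y[i, j] = np.sum(W * x[i:i + a, j:j + b])
    return y

def max_pool(x, size=2):
    """Non-overlapping maximum pooling.

    Trailing rows or columns that do not fill a complete window are dropped.
    """
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Toy 4 x 4 input and a 2 x 2 weight matrix of ones (illustrative values only).
x = np.arange(16, dtype=float).reshape(4, 4)
W = np.ones((2, 2))
features = conv2d_valid(x, W)  # 3 x 3 feature map
pooled = max_pool(x, size=2)   # 2 x 2 map, one maximum per window
```

Pooling halves each spatial dimension here, which is the dimension reduction the text refers to; mean pooling would replace `max` with `mean` in the last line of `max_pool`.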
The objective function, which is composed of a penalty term and a loss function, is crucial to image data processing in the CNN model [32], as shown in Eq. (2).
In Eq. (2),
In Eq. (3) and (4), V stands for a constant in the penalty term. The difference between
The loss function is commonly used in image classification and regression. The probabilities assigned to all categories must sum to one, that is, they must form a valid probability distribution. The softmax function classifies image data through Eq. (5).
In Eq. (5), n is the number of categories in image classification and
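As an illustration of the softmax step, the following minimal sketch converts a vector of class scores into probabilities that sum to one. The function name, the max-subtraction used for numerical stability, and the example scores are assumptions made here for illustration; they are not taken from the experiments.

```python
import numpy as np

def softmax(scores):
    """Map n class scores to n probabilities that sum to one."""
    shifted = scores - np.max(scores)  # subtracting the maximum avoids overflow
    exp = np.exp(shifted)
    return exp / np.sum(exp)

# Hypothetical scores for n = 3 image categories.
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
predicted = int(np.argmax(probs))  # index of the most probable category
```

Subtracting the maximum score does not change the result, because the common factor cancels between numerator and denominator.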
The aim is to improve students’ listening, vocabulary, and comprehensive scores while giving full play to their personalities. The IoT equipment in the studio classroom is used to train students. Two classes of first-year high school students with similar academic performance are chosen for experimental analysis. The experimental class is taught English in the studio classroom and is compared with the control class, which is taught with the traditional English teaching method. In the studio classroom, multimedia courseware is utilized, and a fresh and interesting teaching and learning environment is created through diversified AV [17]. Intelligent images are introduced into reading comprehension in the studio classroom to arouse students’ interest. For example, teachers can prepare Questions and Answers (Q&A) through intelligent image positioning equipment and interact with students in comprehensive text analysis. Consequently, students’ thinking skills and a positive learning attitude are cultivated [38]. When teaching listening skills, teachers can use intelligent image positioning equipment to design teaching scenarios that actively involve students. This helps students better understand what they have learned, get involved in classroom activities, and become more focused. When teaching vocabulary, teachers can explain words and phrases more straightforwardly through videos or images using intelligent image tracking equipment so that students can memorize them with associative methods. Meanwhile, interactive games can be introduced in the studio classroom, in which students are paired to draw and guess according to images related to the glossary. Hence, students’ competitiveness, collaboration, interest, learning, and memory are cultivated and strengthened [13].
At the end of the term, the English scores of the two classes are compared and analyzed to explore the feasibility of the IoT intelligent image-positioning studio classroom in English teaching.
Results and discussion
Application of CNN in intelligent image positioning
The CNN model is trained on the intelligent images to extract their visual features, with which the images can be positioned. The loss curve of the model training is shown in Fig. 3.

Fig. 3. Convergence of model training loss.
In the CNN model, each pre-training set contains 10 samples, and training runs for a maximum of 20 phases. Training stops when the loss function falls below 0.01. Figure 3 shows that when the CNN model is trained with intelligent images, the loss function falls below 0.01 from the sixth phase, indicating that the CNN algorithm trains effectively and classifies images quickly. With regard to shortening image training and prediction time, the number of training samples is positively correlated with the pixel rate of stroke coverage. The CNN can segment the image into multiple superpixels quickly. The color overflow of the image is also significantly reduced through the CNN algorithm: the edge color becomes clearer, and the color patches are reduced. The experimental results show that the CNN algorithm can effectively extract and classify the features of intelligent images and can accurately position the features to optimize the image color. Consequently, intelligent images can easily be edited and propagated.
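The stopping rule described above (at most 20 phases, stop once the loss falls below 0.01) can be sketched as follows. The function name is hypothetical, and the loss values are made-up placeholders chosen only to mirror the reported behavior of crossing the threshold in the sixth phase; they are not the measured values behind Fig. 3.

```python
def train_until_converged(loss_per_phase, threshold=0.01, max_phases=20):
    """Return (phases_run, final_loss) under the stopping rule in the text.

    loss_per_phase yields one loss value per training phase. In a real
    setup each value would come from one pass of CNN training.
    """
    phases_run = 0
    final_loss = None
    for phase, loss in enumerate(loss_per_phase, start=1):
        if phase > max_phases:
            break  # phase budget exhausted
        phases_run = phase
        final_loss = loss
        if loss < threshold:
            break  # converged: loss is below the threshold
    return phases_run, final_loss

# Placeholder loss curve (illustrative only, not experimental data).
placeholder_losses = [0.9, 0.4, 0.15, 0.05, 0.02, 0.008, 0.006]
phases, loss = train_until_converged(placeholder_losses)
```

With these placeholder values the loop stops after the sixth phase; a curve that never drops below the threshold would run for the full 20 phases.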
Intelligent image positioning is a major indicator of the development of IoT technologies and is applied in many fields, such as transportation, home, and electronics. Here, intelligent image positioning is realized through the CNN model, a combination of DL NN and the IoT. The CNN algorithm can extract image features and position images, thereby promoting the development and expansion of IoT technology. When the image information acquired through the CNN algorithm is further processed in the ubiquitous network, new forms of application may be found for the IoT.
To explore the feasibility of the IoT intelligent image-positioning studio classroom in English teaching, two classes of students in a high school are selected for experiments, and their scores over one term are compared. Figure 4 shows the comparison of the listening scores of the two classes over the term.

Fig. 4. Comparison of the one-term listening scores.
Figure 4 compares the listening scores of the two classes across the three exams. In the pre-term exam, the average scores of the two classes are very close, and the pass rates show no significant difference. In the mid-term and post-term exams, the average scores and pass rates of the experimental class become significantly higher than those of the control class. An in-depth analysis of the average scores and pass rates of the three listening exams shows that the students in the experimental class have made greater progress in the mid-term and post-term exams through the studio classroom. The average score of the students in the control class is 15.8 in the pre-term exam, 16.8 in the mid-term exam, and 17.2 in the post-term exam, an increase of only 1.4 points over the term. Hence, the students in the control class have not made much progress by comparison. The analysis of the pass rate of the control class likewise shows that the traditional English teaching model has little effect on students’ listening scores. Thus, the studio classroom is better suited to language learning and interest-guided education. The intelligent image positioning equipment can create learning scenarios and guide students to participate actively in the classroom and to learn on their own initiative. Consequently, students can effectively master English listening skills and learning strategies through classroom interaction and cooperation.
The results show that the students in the experimental class have improved in vocabulary learning much faster than those in the control class. The statistical analysis results are shown in Fig. 5.

Fig. 5. Comparison of one-term vocabulary scores.
Figure 5 compares the vocabulary scores of the two classes across the three exams. In the pre-term exam, the average score and pass rate of the experimental class are 10.8 and 19%, which are lower than the 11.5 and 24% of the control class. In the mid-term exam, the average score and pass rate of the experimental class become significantly higher than in the pre-term exam, whereas the improvement in the control class is relatively small. In the post-term exam, the average score and pass rate of the experimental class are 17.4 and 81%, which are much higher than the 14.6 and 34% of the control class. Over the term, the average score of the experimental class rises from 10.8 to 17.4, a gain of 6.6 points, while that of the control class rises from 11.5 to 14.6, a gain of only 3.1 points. The gap in improvement between the two classes thus widens noticeably. The above analysis indicates that the intelligent equipment can help students overcome some disadvantages of traditional English vocabulary learning, arouse their interest, and overcome weariness and fear in vocabulary learning. A reasonable scenario design through intelligent equipment is very effective in English vocabulary teaching. The associative memory method and interactive games can strengthen students’ in-depth memory of English vocabulary. Hence, the English vocabulary scores of the experimental class are considerably higher than those of the control class.
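The vocabulary gains quoted above can be checked with a short calculation. The figures are those reported in the text; the dictionary layout and variable names are illustrative choices made here.

```python
# Reported vocabulary results: average score and pass rate (%) per class.
vocabulary = {
    "experimental": {"pre": (10.8, 19), "post": (17.4, 81)},
    "control": {"pre": (11.5, 24), "post": (14.6, 34)},
}

# Gain in average score over the term, rounded to one decimal place.
score_gain = {
    name: round(r["post"][0] - r["pre"][0], 1) for name, r in vocabulary.items()
}

# Gain in pass rate over the term, in percentage points.
pass_rate_gain = {
    name: r["post"][1] - r["pre"][1] for name, r in vocabulary.items()
}

# How much more the experimental class gained than the control class.
gap_in_score_gain = round(score_gain["experimental"] - score_gain["control"], 1)
```

The experimental class gains 6.6 points and 62 percentage points in pass rate, against 3.1 points and 10 percentage points for the control class, so the score gain of the experimental class exceeds that of the control class by 3.5 points.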
The results show that the comprehensive scores of the experimental class are much higher than those of the control class. The statistical analysis results are shown in Fig. 6.

Fig. 6. Comparison of one-term comprehensive scores.
Figure 6 compares the comprehensive scores of the two classes across the three exams. The average comprehensive score of the experimental class increases by nearly 17 points over the term, from 61.5 points in the pre-term exam to 78.08 points, while that of the control class increases by only about 10 points from 62.8 points. The improvement in the average comprehensive score of the experimental class is therefore much larger than that of the control class. An in-depth analysis of the pass rate in the comprehensive scores implies that, in the experimental class, the students’ academic improvements in the mid-term and post-term exams are both obvious. In the control class, the students’ academic performance improves significantly in the mid-term exam but hardly improves in the post-term exam. The statistical results suggest that the comprehensive scores of the two classes are more or less the same in the pre-term exam, whereas the gap between the two classes widens considerably in the post-term exam. These results support the feasibility of the studio classroom in English teaching. The application of IoT equipment in high school English teaching can effectively stimulate students’ learning potential. Therefore, the results support the effectiveness of IoT intelligent image positioning based on the CNN model in English teaching.
Here, the feasibility of the studio classroom in English teaching is explored through intelligent image positioning technology using the IoT. First, the application of the CNN algorithm in real-time image positioning is discussed. The results reveal that the CNN model can effectively extract and classify the features of intelligent images, and that the extracted features can position the image in real time, reduce color overflow, sharpen edge color, and minimize color patches in images. Then, the feasibility of the IoT intelligent image-positioning studio classroom in English teaching is analyzed. Two classes of students are selected as the experimental group and the control group. The comparison of the scores of the two classes across the three exams shows that the listening, vocabulary, and comprehensive scores of the experimental group are higher than those of the control group by the end of the term. The experimental results thus support the feasibility of the IoT intelligent image-positioning studio classroom in English teaching.
Conflict of interest
The authors have no conflict of interest to report.
