Abstract
Based on an analysis of the characteristics of artificial intelligence and agents, this paper discusses the feasibility of introducing Web services and intelligent agent technology into online teaching and learning, and proposes a Web-based modern distance education system model built on artificial intelligence agent technology. The architecture integrates the advantages of Agent technology and Web services. Starting from the shortcomings of the traditional Web-based distance teaching system, it strives to increase learners’ interest in self-directed learning, monitor students’ emotions, and support the exchange of knowledge between teaching agents. The aim is to realize on-demand learning for students and teaching according to aptitude for teachers, and ultimately to improve the system’s flexibility, personalization, and intelligence. Under the guidance of learning community theory and related theories, inference rules for a learner model ontology are constructed. Based on the learner’s relationship characteristics, the knowledge domain that the learner is interested in is inferred, and an intelligent information retrieval system is constructed on this basis. Knowledge retrieval is realized quickly and accurately, thereby verifying the application of learner relationship characteristics in digital learning.
Introduction
The era of artificial intelligence is a process in which AI technology, built on cloud computing, big data, and deep learning algorithms, advances comprehensively into every field of human production and life. Its arrival has had a profound impact on the development model of enterprises, on people’s lifestyles, and on the development of higher vocational education [1]. This series of influences has triggered new demand for talents, mainly talents with artificial intelligence thinking, artificial intelligence application talents, and cross-sector composite talents [2]. These new talent needs drive vocational education to reposition its training objectives, that is, to adapt them to the era of artificial intelligence. The objectives are described in general terms by type, level, and post, and analyzed specifically in three areas: knowledge, competence, and integrity [3, 4]. Realizing the talent training goals of higher vocational education in the era of artificial intelligence requires the cooperation and concerted efforts of countries, regions, and colleges. Talent training goals, as the standards and requirements that schools of all levels and types set for the talents they train, are significant for education as a whole and for the development of specific schools and specific professions. They are the junction of various important relationships in the school education system [5]. On the one hand, they connect to the requirements that social and economic development places on higher vocational education and reflect the latest demand for talents in social development. On the other hand, they connect to the talent training process of vocational colleges: they determine the overall direction and fundamental principles of the colleges’ work, are the basic premise for determining their curricula, and are the fundamental basis for carrying out teaching work [6].
Therefore, to achieve better and faster development, higher vocational education must accurately grasp its positioning and development direction, consciously assume the responsibility of the times to serve society, and cultivate the qualified talents needed for social development.
Research on Agent technology has made great progress. Deeper study of the technology will make Agents more intelligent and multi-agent systems (MAS) more powerful, and will widen their application in many fields [7]. In the manufacturing field, for example, applications involve all-round manufacturing systems, supply chains, robots, scheduling, assembly, and product design. In the field of information and control, they involve network automation and artificial intelligence, information control, traffic control, networked computer-aided teaching, and medical treatment. In the field of computer applications, they include expert systems, software development, virtual reality, and distributed computing. MAS is also used in the military field. For example, Leszczyński proposed an operational modeling method based on Agent action maps and constructed Agent action maps that match the knowledge background and usage habits of military personnel to meet military needs [8]. Blazer pointed out that artificial intelligence can be used to develop personalized teaching machines that hold a dialogue with individuals based on their specific situations, difficulties, and needs, to help them understand problems or achieve a certain goal [9]. Celentano proposed that the typical application of artificial intelligence in education mainly consists of intelligent tutors that assist personalized teaching and learning [10]. With the rapid development of artificial intelligence technology, the scope of its application will become wider and wider, and its application in education will also rise to a new stage [11]. What kind of talent training model will be needed in the future is a question worthy of in-depth study.
Vania emphasized the transformation from pure knowledge transfer to effective modeling. The article pointed out that an intelligent tutoring system can be used to guide students’ learning, and that different forms of support produce different learning outcomes. At the same time, in the teaching process, an artificial intelligence system can be used to establish a task-based model that diagnoses students’ knowledge and provides effective feedback for teaching [12]. Settapat proposed that artificial intelligence should also play a role in motor skills learning [13]. He pointed out that many learning tasks are completed through repeated movements, such as learning to write, drawing, playing musical instruments, practicing sports skills, dancing, and using sign language. In his view, research in the field of psychomotor learning needs to turn toward supporting physical activities rather than supporting teaching. This means that physical activities carried out in practice need to be monitored, modeled, and corrected as necessary to achieve successful motor skills learning [14]. Brown introduced the potential role of AI in various aspects of education. Susanne emphasized the application of artificial intelligence in education [15]. Artificial intelligence can provide students with an open learning environment, and new computer technology can provide new opportunities for learning, exploration, and collaboration. The role of AI in education is not limited to tutoring systems. He suggested that the system must be able to support incremental and on-demand learning, and must support users engaged in their own activities rather than users of the system.
Agent technology, a current research hotspot in the field of artificial intelligence, has opened up a new way of solving dynamic and complex tasks [16]. An MAS composed of multiple Agents and built to suit a given domain can solve practical problems with strong robustness and reliability. Complex problems in many fields, such as mechanical product design, intelligent manufacturing, and traffic control, have been solved quickly and efficiently using MAS [17]. Therefore, research on solving complex dynamic problems based on MAS has theoretical significance and practical value. This research aims to analyze the profound impact of the era of artificial intelligence on all aspects of society, to propose the talent training goals that higher vocational education should establish in order to train new talents who meet the needs of the new era, and finally to propose a path for realizing those goals, thereby promoting the ecological development of higher vocational education in China.
Design and analysis of a distance education system based on artificial intelligence agent technology
The structure of the improved agent
The basic structure of an Agent describes its main constituent parts: which modules each part is composed of, what function each module has, and how the modules interact. The structure of the Agent in this paper is shown in Fig. 1. It mainly comprises a perception layer, a behavior control and decision layer, and an evaluation layer. The perception layer contains only the environment awareness module. The behavior control and decision layer is composed of an information processing module, an execution module, an intelligent control and decision module, a communication module, a knowledge base, and a task table. The evaluation layer contains only the main evaluation module.

The structure of Agent.
The environment awareness module senses outside information. The information processing module processes and stores the information obtained from the environment awareness module. The execution module takes the final decision information from the intelligent control and decision module and passes it to the external environment of the system. The intelligent control and decision module receives information from the information processing module, mainly external environment and resource information and communication information from other agents. It uses the knowledge in the knowledge base or the real-time status information in the database to further analyze and reason about the received information, and makes reasonable decisions accordingly [18]. The communication module is responsible for the interaction and transmission of information. The knowledge base stores the relevant inference knowledge. The database stores the real-time status information of problems obtained by Agent reasoning. The task table stores the goals and tasks that the Agent needs to complete.
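The module structure described above can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `Agent`, the rule format of the knowledge base (a mapping from a perceived key and value to an action), and the scoring rule in `evaluate` are all assumptions chosen only to make the data flow between modules concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent with the modules named in the text (all logic is illustrative)."""
    knowledge_base: dict = field(default_factory=dict)  # assumed format: (key, value) -> action
    database: dict = field(default_factory=dict)        # real-time status information
    task_table: list = field(default_factory=list)      # goals and tasks to complete
    outbox: list = field(default_factory=list)          # messages queued for other agents

    def perceive(self, environment: dict) -> dict:
        # Environment awareness module: sense outside information
        # (the agent's own past actions are not treated as percepts).
        return {k: v for k, v in environment.items() if k != "actions"}

    def process(self, raw: dict) -> dict:
        # Information processing module: normalise and store the percepts.
        cleaned = {k.lower(): v for k, v in raw.items()}
        self.database.update(cleaned)
        return cleaned

    def decide(self, info: dict) -> list:
        # Intelligent control and decision module: apply knowledge-base rules.
        return [self.knowledge_base[(k, v)] for k, v in info.items()
                if (k, v) in self.knowledge_base]

    def execute(self, decisions: list, environment: dict) -> None:
        # Execution module: pass the decisions to the external environment.
        environment.setdefault("actions", []).extend(decisions)

    def communicate(self, message: str) -> None:
        # Communication module: queue a message for other agents.
        self.outbox.append(message)

    def evaluate(self, decisions: list) -> float:
        # Evaluation module (assumed rule): share of listed tasks addressed.
        if not self.task_table:
            return 0.0
        done = sum(1 for task in self.task_table if task in decisions)
        return done / len(self.task_table)

    def step(self, environment: dict) -> float:
        info = self.process(self.perceive(environment))
        decisions = self.decide(info)
        self.execute(decisions, environment)
        return self.evaluate(decisions)


# Hypothetical teaching scenario: a failed quiz triggers a review assignment.
agent = Agent(knowledge_base={("quiz", "failed"): "assign_review"},
              task_table=["assign_review"])
env = {"Quiz": "failed"}
score = agent.step(env)
```

In this sketch one call to `step` runs the perceive, process, decide, execute, and evaluate sequence once; a real agent would loop and would also exchange messages through the communication module.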
Clustering is a common data analysis tool. It divides the physical or abstract objects in a large data set into several classes or clusters using a similarity measure, so that the similarity between objects in the same cluster is as large as possible and the similarity between objects in different clusters is as small as possible [19]. In machine learning, cluster analysis is an unsupervised learning process. It starts with no knowledge of the class division, but by studying the similarity or commonality between data objects, it is able to divide them into classes or clusters. The result of cluster analysis should meet two conditions. First, there is no empty cluster, that is, each cluster contains at least one object. Second, there is no common part between clusters, that is, no two clusters intersect.
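The two conditions on a clustering result can be checked mechanically. The following is a minimal sketch; the function name `is_valid_clustering` and the representation of clusters as lists of object labels are assumptions made for illustration.

```python
def is_valid_clustering(clusters):
    """Check the two conditions from the text on a list of clusters.

    1. No cluster is empty.
    2. No two clusters share an object.
    """
    if any(len(cluster) == 0 for cluster in clusters):
        return False
    seen = set()
    for cluster in clusters:
        members = set(cluster)
        if seen & members:  # an object already placed in an earlier cluster
            return False
        seen |= members
    return True


# Hypothetical object labels.
ok = is_valid_clustering([["a", "b"], ["c"]])
has_empty = is_valid_clustering([["a"], []])
overlapping = is_valid_clustering([["a", "b"], ["b", "c"]])
```

The check covers only the two stated conditions; whether every object in the data set has been assigned to some cluster would need a separate test.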
The data matrix uses numerical values to represent data objects. For example, tooth shape, number of teeth, module, pressure angle, etc. are all attributes of gears. The form of this matrix is shown in Equation (1).
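The equation itself does not survive in this text. A data matrix of the kind described is conventionally written as follows, where the notation is an assumption: n is the number of data objects, p is the number of attributes, and x_{ij} is the value of attribute j for object i.

```latex
X =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}
\tag{1}
```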
The similarity matrix, also called a distance matrix, describes the distance between data objects. It is a symmetric square matrix.
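As a minimal sketch of how such a matrix can be built, the function below computes all pairwise distances for a small data set. The Euclidean distance (`math.dist`, Python 3.8+) is used as an assumed default metric, and the gear attribute values are hypothetical. For a metric of this kind the distance from object i to object j equals the distance from j to i, so only the upper triangle is computed and then mirrored.

```python
import math


def distance_matrix(data, metric=math.dist):
    """Return the n x n matrix of pairwise distances between rows of `data`."""
    n = len(data)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Fill both triangles at once; the diagonal stays at zero.
            d[i][j] = d[j][i] = metric(data[i], data[j])
    return d


# Hypothetical gears: (number of teeth, module in mm, pressure angle in degrees).
gears = [[20, 2.0, 20.0], [23, 2.0, 20.0], [20, 2.0, 16.0]]
D = distance_matrix(gears)
```

In practice the attributes would usually be rescaled before computing distances, since attributes measured in different units otherwise contribute unequally.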
Euclidean distance, a widely used distance measurement method, is shown in Equation (2):
Manhattan distance, as shown in Equation (3):
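Equations (2) and (3) do not survive in this text. The standard definitions of the two distances are given below; the notation is an assumption, with x_{ik} denoting the value of attribute k for object i and p the number of attributes.

```latex
\begin{align}
d_{\mathrm{E}}(i, j) &= \sqrt{\sum_{k=1}^{p} \left( x_{ik} - x_{jk} \right)^{2}} \tag{2} \\
d_{\mathrm{M}}(i, j) &= \sum_{k=1}^{p} \left| x_{ik} - x_{jk} \right| \tag{3}
\end{align}
```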
Multiple atomic operators can be combined to form a more complex compound operator. Compound operators can in turn be combined with atomic operators or with other compound operators to form still more complex compound operators, such as the double-crank mechanism in a hinged four-bar mechanism, as shown in Fig. 2.

Schematic diagram of operator division hierarchy.
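The hierarchy of atomic and compound operators can be sketched as a simple recursive data structure. The class names, the helper functions, and the decomposition of the four-bar mechanism into the parts named below are all hypothetical and serve only to illustrate the nesting.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class AtomicOperator:
    name: str


@dataclass(frozen=True)
class CompoundOperator:
    name: str
    parts: Tuple[Union["AtomicOperator", "CompoundOperator"], ...]


def combine(name, *parts):
    """Build a compound operator from atomic and/or compound operators."""
    if not parts:
        raise ValueError("a compound operator needs at least one part")
    return CompoundOperator(name, tuple(parts))


def atoms(op):
    """Flatten an operator into the names of its atomic operators."""
    if isinstance(op, AtomicOperator):
        return [op.name]
    return [name for part in op.parts for name in atoms(part)]


def depth(op):
    """Number of compound levels above the atomic operators."""
    if isinstance(op, AtomicOperator):
        return 0
    return 1 + max(depth(part) for part in op.parts)


# Hypothetical decomposition, for illustration only.
frame = AtomicOperator("frame")
crank_a = AtomicOperator("crank_a")
coupler = AtomicOperator("coupler")
crank_b = AtomicOperator("crank_b")
double_crank = combine("double_crank", crank_a, coupler, crank_b)
four_bar = combine("hinged_four_bar", frame, double_crank)
```

Here `double_crank` is a compound of atomic operators, and `four_bar` combines an atomic operator with that compound, which gives a two-level hierarchy.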
To make it easier to divide different types of row and column elements into different clusters, the row and column elements of the design structure matrix (DSM) can be divided into ordinary cluster elements, independent cluster elements, and bus cluster elements. An element that is related to most of the other row and column elements is called a bus cluster element. Independent elements do not belong to any cluster; they are rarely affected by other elements and rarely affect other elements. Therefore, they are in a parallel relationship with the other row and column elements [20]. Ordinary cluster elements are the main objects of clustering.
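A minimal sketch of this three-way division is given below. The text does not state how "most" or "rarely" is quantified, so the two thresholds `bus_ratio` and `independent_ratio` are assumptions, as is the small example matrix.

```python
def classify_dsm_elements(dsm, bus_ratio=0.6, independent_ratio=0.1):
    """Label each DSM element as 'bus', 'independent', or 'ordinary'.

    dsm[i][j] is nonzero when element i affects element j.
    The two ratio thresholds are illustrative assumptions.
    """
    n = len(dsm)
    labels = []
    for i in range(n):
        # Count the other elements that element i affects or is affected by.
        related = sum(1 for j in range(n)
                      if j != i and (dsm[i][j] or dsm[j][i]))
        share = related / (n - 1) if n > 1 else 0.0
        if share >= bus_ratio:
            labels.append("bus")
        elif share <= independent_ratio:
            labels.append("independent")
        else:
            labels.append("ordinary")
    return labels


# Hypothetical 6-element DSM: element 0 touches most others, element 5 none.
dsm = [
    [0, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
labels = classify_dsm_elements(dsm)
```

Only the elements labeled as ordinary would then be passed to the clustering step.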
The operators in the operator set are decomposed, recombined, and encapsulated to form the Agent’s knowledge base, and the perception layer and the evaluation layer are then added on top of this module. The agent perceives changes in external conditions through the perception layer, uses the inference knowledge in the knowledge base to solve the task autonomously, evaluates the solution result through the evaluation layer, and records the learning process of the solution. A complex task can be solved by multiple agents working together. An MAS is formed by organizing multiple Agents effectively into a complex network.
The effective organization of various agents can greatly improve the efficiency of task solving. To this end, this paper builds the architecture of MAS. Figure 3 shows the architecture of MAS.

MAS architecture framework.
For the entire MAS to solve tasks efficiently and accurately, the solving agent team to which each subtask is assigned should be sufficiently intelligent, and its members should be able to cooperate. In the ideal state, the solving agent team assigned to each subtask can complete the solution on its own, and the team members can communicate fully and effectively, share information, interact, and bring each member’s ability into full play. In this case, the accuracy of subtask assignment and matching is relatively high. In practice, however, because the knowledge of a solving agent team is limited, the assistance of cooperative agents is required to solve the subtasks. In this process, a solving agent may be a candidate member of two teams [22], yet when the solving agent teams are formed, a solving agent can belong to only one of them. Which team this solving agent joins affects the solving efficiency and accuracy of the whole task. Therefore, it is necessary to plan the solving agent team for each subtask reasonably.
Given the Agent features described above, applying Agent technology to distance education can overcome the limitations and shortcomings of current distance education systems [23]. Agent technology can make teaching content more engaging and user-friendly, improve teaching effectiveness, and improve teaching quality. Using Agents to manage learner information makes it possible to track learners’ dynamic behavior, which provides a more reliable basis for establishing the student model. Agents can also meet the needs of constructivist collaborative learning: each learner can be regarded as an Agent, and collaborative learning is accomplished through a coordination mechanism among learner Agents. Teachers can likewise be regarded as Agents that exchange information with students and keep abreast of students’ learning status. By using Agent ideas to analyze the overall requirements, the distance education system can fully reflect the intelligence and initiative of teaching, especially when artificial intelligence technology and Agent technology are embedded in the now popular Web technology [24]. The performance of such a system will be far superior to that of the traditional distance teaching system, and it will play a positive role in promoting teaching reform as a whole and the implementation of quality education.
The FOAF ontology contains multiple classes and attributes. Its classes mainly include agents, documents, organizations, online accounts, persons, personal profile documents, projects, etc. The description range of the FOAF ontology is therefore relatively wide. The main classes in this ontology are listed in Table 1.
Main categories in FOAF
To plan the solving agent team for each subtask reasonably, the subtask allocation scheme is updated periodically. Within an update period, the current planning and matching of the solving agent teams may differ somewhat from the last update record or from the initial allocation plan obtained by the system, but under normal circumstances the composition of the solving agent team corresponding to a subtask does not change greatly within one update cycle. When the normal update cycle is reached, the results of the solving agent team planning are evaluated and analyzed against the MAS solutions to problems of the same kind. If the composition information of the solving agent team has not changed during this time, the allocation plan is not updated. It is updated only when the composition information of the solving agent team changes substantially.
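The update rule can be sketched as follows. The text does not define how a change in team composition is measured or how large it must be to trigger an update, so the Jaccard distance between member sets and the threshold of 0.6 are assumptions, as are the function names and the agent identifiers.

```python
def composition_change(old_team, new_team):
    """Jaccard distance between two teams: 0.0 if identical, 1.0 if disjoint."""
    old, new = set(old_team), set(new_team)
    union = old | new
    if not union:
        return 0.0
    return 1.0 - len(old & new) / len(union)


def update_allocation(plan, proposed, threshold=0.6):
    """Return the allocation plan to use after one update cycle.

    A subtask's team is replaced only when it is new or when its
    composition has changed by at least `threshold` (an assumed cutoff).
    """
    updated = dict(plan)
    for subtask, team in proposed.items():
        if subtask not in plan or composition_change(plan[subtask], team) >= threshold:
            updated[subtask] = list(team)
    return updated


# Hypothetical plans: s1 is unchanged, s2 swaps one of three agents, s3 is replaced.
plan = {"s1": ["a1", "a2"], "s2": ["a3", "a4", "a5"], "s3": ["a6", "a7"]}
proposed = {"s1": ["a1", "a2"], "s2": ["a3", "a4", "a8"], "s3": ["a9", "a10"]}
new_plan = update_allocation(plan, proposed)
```

With these values only the team for `s3` is replaced; the single-member swap in `s2` stays below the threshold, so the earlier allocation is kept.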
In a system based on blackboard communication, the agents do not interact directly but use the blackboard as a public work area to exchange information, data, and knowledge. An Agent can write a call request or an operation result on the blackboard as an information item, which other Agents can respond to or use. An agent can access the blackboard at any time and use filters to extract the information it is interested in. In a multi-agent system that uses message passing, agents exchange information directly or through an intermediary agent. Message communication can achieve flexible and complex collaborative interaction, but it requires each agent in the system to hold a large amount of information about the other agents. Besides, because the content of the communication is at the knowledge level, the communication protocol must specify the communication process and the message format, and the agents involved in the communication must know the syntax and semantics of the communication language.
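Blackboard communication as described above can be sketched in a few lines. The class name `Blackboard`, the item fields, and the agent names are assumptions for illustration; a real system would also need concurrency control and a policy for removing stale items.

```python
class Blackboard:
    """Shared work area: agents write items and read them through filters."""

    def __init__(self):
        self._items = []

    def write(self, author, kind, content):
        # An agent posts a call request or an operation result.
        item = {"author": author, "kind": kind, "content": content}
        self._items.append(item)
        return item

    def read(self, predicate=lambda item: True):
        # An agent extracts only the items its filter accepts.
        return [item for item in self._items if predicate(item)]


# Hypothetical exchange between a student agent and a teacher agent.
board = Blackboard()
board.write("student_agent", "request", "explain recursion")
board.write("teacher_agent", "result", "notes on recursion")
requests = board.read(lambda item: item["kind"] == "request")
from_teacher = board.read(lambda item: item["author"] == "teacher_agent")
```

Neither agent needs to know the other's address or interface; each only needs to agree on the item format, which is the main contrast with direct message passing.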
The learner dimension includes self-efficacy, outcome expectation, perceived usefulness, perceived ease of use, and member interaction. Owing to the openness, personalization, and other characteristics of the online learning space, learning activities there require initiative and self-discipline from the learner, and learning input is more strongly affected by the learner’s characteristics. During learning in the online learning space, the learner’s initial sense of unfamiliarity and later sense of belonging to the space are affected by perceived usefulness and perceived ease of use, which in turn affect whether the learner chooses to continue learning. Once learners have gained some experience in the online learning space, they take the dominant role in learning there, and the degree to which their outcome expectations are met influences their level of learning input [25]. The teacher dimension includes teacher participation and learning activity design. Compared with learning in traditional education, which depends heavily on the teacher, learning in the online learning space is relatively weakly affected by the teacher. However, studies still show that teacher participation and learning activity design in the network environment can positively affect learners’ learning input. The e-learning space is one of the new learning environments for college students. When learners carry out learning activities in the e-learning space, they are affected by the learning atmosphere and by interpersonal relationships among its members. These factors affect whether the learner continues to carry out learning activities and the level of learning input, as shown in Fig. 4.

Schematic diagram of evaluation index analysis.
Teacher participation refers to the frequency with which teachers or teaching assistants give feedback to and evaluate college students in the online learning space. Learning activity design refers to the learning activities developed and designed by teachers in the e-learning space, such as group collaborative learning and cooperative inquiry learning. Good learning activity design can stimulate college students’ interest in learning, increase their participation in the e-learning space, and promote their learning input. The environmental dimension includes the space atmosphere and trust. The space atmosphere refers to the learning atmosphere in the online learning space. Trust refers to college students’ trust in the resources of the online learning space and the trust among its members. A strong learning atmosphere and good interpersonal trust can improve college students’ satisfaction with and sense of belonging to the online learning space, and can promote their learning input. Perceived ease of use refers to the learner’s perception of how difficult a system or platform is to use. In this study, it refers to how difficult college students find operation, the interface, and resource downloading in the online learning space, and whether they can learn on mobile devices. If college students are not restricted by technology, time, or place when using the e-learning space, they will identify with it, which promotes their learning input. Otherwise, they will be discouraged from continuing to use the e-learning space for learning.
Analysis of MAS task-solving efficiency with Agent technology
Whether a task is solved in the original model or in the MAS before or after optimization, the solution takes the form of inference over different knowledge. The effective operators used in solving the same task are the same. The original model accesses the knowledge in all of its modules one by one, that is, it accesses every operator in the model that is used for inference. The constructed MAS solves a task by dynamically calling the relevant agents, and calls only the task-related operators within those agents. Before optimization, the knowledge bases of the MAS team members are incomplete, so solving a task requires frequent requests to the central control agent to find cooperative agents, which increases the task-solving time. The optimized MAS results from clustering a large number of instances of the same type and is therefore highly targeted. The knowledge bases of its team members are relatively complete, and team members no longer need to send frequent requests to the central control agent in order to exchange information, which reduces the interaction time.
We program a simulation of the task-solving process in Java on the MyEclipse development platform, and compute and visualize the results in MATLAB. Using 12,000, 24,000, and 36,000 operators in a domain operator set respectively, and 300 questions of the same type, the task is solved to obtain three average running times: that of the original model, that of the MAS before optimization, and that of the MAS after optimization. Figure 5 lists these average running times for the different operator-set sizes, shows their trend, and gives an intuitive picture of operating efficiency.

Comparison of average running time.
The experimental data show that, in the same field and for the same problem, the solution time of the original model is much longer than that of the MAS, and the solution time of the optimized MAS is shorter than that of the MAS before optimization. This shows that the optimized MAS has higher solution efficiency. As the scale of the field expands and the number of operators in the operator set increases, the optimized MAS generally maintains high task-solving efficiency. The final performance data of the model are shown in Fig. 6.

In-game scores of A2C and PPO at each computational budget.
As shown in Fig. 7, the experimental data and performance curves reveal several findings. The early performance of A2C is lower than that of PPO, but as the computational budget increases, A2C exceeds PPO. In the PPO paper, the experiment was conducted only at a computational budget of 10M, and it was concluded that PPO performs better than A2C. In light of the present results, that conclusion does not appear to be rigorous. Because video games take a long time to train and computing resources are limited, researchers often run experiments only with small computational budgets, which makes this phenomenon difficult to observe. This is also worth bearing in mind when comparing algorithms in the future. In almost all test environments, the variance of PPO is significantly larger than that of A2C, especially in the Seaquest and Space Invaders games. Therefore, A2C can be considered more robust.

Performance curves of A2C and PPO under different environments and different calculation budgets.
The typical positioning of talent training goals is a classification of the attributes of the talents to be trained. The classification of talent types involves many fields and is determined by the demand that the social division of labor creates for different types of talents. According to different criteria, talents can be divided into different kinds. Traditionally, talents are divided into two categories, academic talents and applied talents. Applied talents can in turn be divided into engineering talents, technical talents, and skilled talents according to the level or scope of their work. In the past, the type of talent trained in higher vocational education was usually positioned as technical and skilled, emphasizing the practical ability to transform technical principles into physical entities. In the era of artificial intelligence, the development of technology has transformed technical skill processes into artificial intelligence programs. Big data, machines, products, and people together form an interconnected intelligent system. The production process in the era of artificial intelligence requires proficiency in AI technology and network technology. Intelligent talents focus on the application, operation, and maintenance of artificial intelligence systems, with an emphasis on professionalism, soft production capabilities, and technology research and development capabilities. At the same time, they can apply digital technology to traditional work and build a bridge between the departments of an enterprise and between the enterprise and its customers, so that the design, R&D, production, and sales departments can communicate with each other. Ordinary higher education pays special attention to the cultivation of intelligent R&D talents. By comparison, the intelligent talents trained in higher vocational education are oriented more toward the operation of artificial intelligence production systems. They are required to play an important role in the intelligent production process, mainly in monitoring the operation of smart machines and in repairing and maintaining them.
As shown in Fig. 8, in the t-test statistics, the absolute t values of the high-low group difference tests for the 20 items are greater than 3.000, and the significance levels are less than 0.05.

Independent sample test of the questionnaire on the status quo of system learning input.
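The item discrimination check reported above rests on an independent-samples t statistic computed between a high-scoring and a low-scoring group. The sketch below shows that calculation for a single item. The pooled-variance (Student) form is an assumption, since the text does not say which variant was used, and the item scores are hypothetical. The p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def independent_t(high, low):
    """Student's t statistic (pooled variance) and its degrees of freedom."""
    n1, n2 = len(high), len(low)
    m1, m2 = statistics.mean(high), statistics.mean(low)
    v1, v2 = statistics.variance(high), statistics.variance(low)  # sample variances
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical scores on one questionnaire item for the two groups.
high_group = [5, 4, 5, 4, 5]
low_group = [2, 3, 2, 1, 2]
t_value, dof = independent_t(high_group, low_group)
discriminates = abs(t_value) > 3.0  # the cutoff of 3 used in the text
```

An item whose absolute t value exceeds the cutoff separates high and low scorers well and would be retained.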
It can be seen that in the t-test statistics, the absolute t value of the high-low group difference test for the item is less than 3, while the values for the remaining items are greater than 3. Structural validity, also known as construct validity or theoretical validity, refers to the degree to which a measurement tool reflects a concept or proposition. If the questionnaire results measure the theoretical characteristics they are intended to measure and are consistent with theoretical expectations, the data are considered to have high structural validity. Confirmatory factor analysis is a procedure that uses AMOS software to analyze the validity of the measured variables and the fit between the factor model of the questionnaire and the data actually collected. A fit index is a statistical indicator of how well the theoretical structural model fits the data. The commonly used fit indices are GFI, RMR, RMSEA, AGFI, NFI, and CFI. The evaluation criteria for the absolute and relative fit indices are shown in Fig. 9.

Evaluation criteria for the model fit indices.
The above two experiments show that, for A2C and PPO, a small number of processes results in low sampling efficiency, which increases the training time. Increasing the number of processes shortens the agent’s training time, and the effect becomes more pronounced as the number of processes grows. However, the sample efficiency experiment also shows that on some tasks a large number of processes causes a decrease in performance. Therefore, the number of processes needs to be balanced against the specific task and the researcher’s resources. Moreover, although the number of processes is not part of the reinforcement learning algorithm itself, it has a large impact on the algorithm’s performance. Under the same computational budget and the same reinforcement learning algorithm, different settings of the number of processes may ultimately lead to a large difference in model performance. This deserves researchers’ attention.
We first briefly introduced the background of reinforcement learning and deep learning, which leads to deep reinforcement learning methods. On this basis, we conducted a systematic investigation of deep reinforcement learning methods for machine games, including methods based on value functions and methods based on policy gradients. Hierarchical reinforcement learning and multi-agent reinforcement learning were also introduced. In the experimental research section, we briefly introduced the current mainstream code bases and standard test platforms for deep reinforcement learning. To address the problem of baseline failure in the domain, this article gives the latest baseline results based on the standard experimental and evaluation methods of the domain. Finally, experiments on mainstream reinforcement learning algorithms were carried out. The performance of A2C and PPO at different computational budgets and their training curves are given in Table 2, and the experimental results are analyzed. In addition, the sample efficiency and the sampling efficiency of A2C and PPO were compared and analyzed, and the results were reported.
In the model of the status quo of learning investment, the standardized path coefficients of all paths between the observed variables and the latent variables are greater than 0.5, the critical ratios are greater than 1.95, and the significance levels are less than 0.001. This indicates that each path relationship is significant and that each observed variable adequately reflects the latent variable it measures. The model fit is good, and the results are shown in Fig. 10.

System performance results.
Detailed description of evaluation indicators
Web services are loosely coupled, reusable software modules. Once published on the Internet, they can be accessed by programs on different servers through standard Internet protocols. A Web service is a common means for application programs to communicate. An Agent is a software entity that runs in a dynamic environment with a high degree of autonomy. It can accept delegation from other entities and serve them, and it has the characteristics of learning, knowledge, initiative, and collaboration. These characteristics are particularly suitable for constructing a distance education system in a complex network environment. Combining Web services and agents in a modern distance education system can effectively remedy the deficiencies of traditional distance education, making it much better than the general distance education system in terms of system performance and teaching effects.
This paper studies and constructs a Web-based distance education system built on artificial intelligence Agent technology. Artificial intelligence and distribution will become the direction of future software development. Agent technology is an emerging technology for developing artificial intelligence software systems, while Web services provide a basis for building distributed systems. To address the problems of existing distance education systems, this paper proposes a modern distance education system architecture based on Web services and multiple Agents, which integrates the respective advantages of agent technology and Web services. Starting from the defects of the traditional Web-based distance teaching system, it strives to increase learners’ interest in self-directed learning and the exchange and sharing of knowledge among the various teaching agents, to realize on-demand learning for students and teaching according to aptitude for teachers, and ultimately to improve the system’s flexibility, personalization, and intelligence. By using Agent ideas to analyze the overall needs of the distance education system, the design can fully reflect the intelligence and autonomy of teaching. Web service technology can solve the problems of resource sharing and system interoperability. Combining the two in a modern distance education system can effectively remedy the deficiencies of traditional distance education, making it far superior to the general distance education system in terms of system performance and teaching effects.
