Abstract
In the process of informatization, there are also some new problems, mainly information can’t be shared and integrated, distributed resources can’t be used effectively, these problems make the industry face new challenges. The goal of this paper is to combine the grid technology and ontology organically, to build a unified information system integration and interoperation platform based on semantics, to realize information sharing and accelerate the pace of informatization. The method is to construct the whole structure of the system according to the actual needs of the system. This paper firstly analyzes the current research status and existing problems of semantic grid service matching, and proposes a semantic layered matching algorithm based on Massimo Paolucci elastic matching algorithm. To verify the feasibility and effectiveness of the hierarchical matching algorithm based on semantics, a prototype system named SGSM was designed and its functional model, matching process and performance were studied. Experimental results show that for the semantic-based hierarchical matching algorithm proposed in this paper, the threshold value of service semantic correlation degree is 0.84, the threshold value of service basic concept matching degree is 0.89, the threshold value of service comprehensive similarity degree is 0.66, and the threshold value of service quality matching degree is 0.78. Statistics through the experiment, the above three methods of recall, respectively, 33%, 62%, 85%, the precision is respectively: 29%, 57%, 88%, and illustrate the hierarchical matching algorithm based on semantic is feasible in practical application, compared with the traditional service based on keyword matching algorithm and Massimo Paolucci elastic matching algorithm on the recall and precision are improved significantly.
Introduction
Causes of the problem has two aspects: one is the informatization level is only limited in a small range, most of the information platform of information organization and use of basically still in a relatively isolated within the system, some of similar nature or function information platform as the data structure, the conceptual model and the difference of hardware and software environment formation heterogeneous distribution of information island, greatly hindered the information sharing and application; Second, people in different regions have different cognition and expression habits on the concept of “domain", resulting in inconsistent descriptions of concepts in the same domain and regional limitations in descriptions, resulting in differences in descriptions and semantic heterogeneity. Therefore, the semantic interpretation of the same concept by information platforms in different regions often varies greatly. If such semantic differences are not taken into account, the query accuracy will be seriously affected and users’ needs cannot be met.
Service matching is a prerequisite for realizing service discovery, invocation and sharing. Whether the services in the semantic grid environment can be effectively matched directly affects the degree of comprehensive utilization of services [1]. As the core of service discovery, service matching technology provides effective solutions for comprehensive utilization of services and semantic interconnection of network resources, which has attracted more and more attention. From the integration of heterogeneous databases for a specific field to the grid computing for a single country or even the whole world, all of these are aimed at fundamentally solving the interworking and sharing of resources, while the core problem of solving the interworking and sharing of resources is to identify and solve the semantic conflict of resources [2]. Both domestic and foreign academia and industry have invested human and material resources in service matching to carry out in-depth research. From the perspective of research status and application prospect, service matching still has many problems worth studying [3, 4]. Semantic service matching in a grid environment is more complex, involving the semantic grid of data mining and knowledge discovery, artificial intelligence theory, in natural language processing technology, such as different subjects, if you can solve the expression of semantic information, semantic information between service access and the problem of matching based on semantic is bound to fundamentally eliminate the information under the network environment, realize the online information resources based on semantic interconnection and resource sharing [5, 6]. Therefore, how to organize, manage and maintain mass information and provide effective services for users has become an important and urgent research topic.
Semantic Web technology is the first step to realize automated service discovery, and most current methods of Web service discovery are targeted at semantic Web services [7]. Nor Azizah Saadon et al. proposed an enhanced CMD is for cloud-based MWS discovery framework based on semantic matching, using semantic lightweight web service descriptions with grid-based architectures [8]. Nwe Htay Win et al. proposed a new QoS based adaptive service discovery mechanism, which can adapt the discovery process with the help of the ontology tree with semantic structure when encountering unexpected results [9]. Keyvan Mohebbi proposed a semantic Web services matcher and the performance evaluation of the information retrieval domain metrics showed that the combination of designed matching filters could significantly improve the semantic matching of owl-s services [10]. Simhadri Srujana et al. proposed an extensible semantic web services discovery method, which USES prolog and graph database (such as Allegro graph) for reasoning without reducing the expressive power [11]. To improve the accuracy and efficiency of traditional service discovery using the vector space model, Huiying Gao proposed a hierarchical clustering method for semantic Web service discovery, which reduced the dimension of semantic vector and satisfied the service demand of users [12]. Marcio Fuckner’s application of several semantic Web technologies in a service-oriented architecture enables explicit representation and reasoning of service operations, improving the accuracy of algorithms [13]. Durmus et al. proposed an intelligent discovery protocol (SDP), which proved the feasibility of integrating service semantic representation with other service discovery protocols [14]. Jin Li et al. proposed a p2p-based structured Web services organization model. Based on structured P2P technology, Web services can be scheduled in a variety of collaborative computing modes, improving performance [15]. Based on summarizing previous research results, Mahdi Bennara et al. proposed a semantically based BFS discovery algorithm, which calculated the semantic distance between resource description and user request concept, and then sorted resources accordingly to minimize the number of search links and maximize the diversity of results [16, 17]. CAO Jing took the transformer defect text as the research object, and based on the analysis of text features, established the defect text mining model based on semantic framework. This model solves the segmentation of sentence elements of defective text and the accurate extraction of digital information [18]. The traditional approach to semantic Web services discovery is centralized, leading to a single point of failure and performance bottlenecks that quickly become impractical. In response to this problem, SHI Min proposed a semantic Web services discovery algorithm in structured P2P networks. Compared with the precise discovery algorithm, the recall rate of this algorithm is significantly improved [19]. With the rapid adoption of the service-oriented computing paradigm, there is a growing need for efficient mechanisms for web services discovery. Wang Xin Ying proposed a semantic and improved web service discovery algorithm, which is superior in recall and precision [20]. Roberto De Virgilio proposed a service matching algorithm based on the SWS metamodel, which maximizes the effectiveness of the discovery process in terms of accuracy and recall rate by continuously relaxing matching constraints to make SWS comparisons [21].
The grid system gradually transforms into a dynamic service pool, providing users with access to heterogeneous resources that span multiple security domains [22]. As a new distributed system technology, Web services have been widely applied in enterprise application integration EAI, business process management BPM, virtual organization VO and other fields. However, the lack of semantics in current Web services standards has become a major obstacle to service discovery and composition [23, 24]. Semantic web services represent the potential of the web, they have an important impact on the discovery process, quality of service plays a crucial role, and it becomes a very important factor in discovering and selecting these candidate services to best meet the needs of users [25, 26]. This paper firstly analyzes the current research status and existing problems of semantic grid service matching, and proposes a semantic layered matching algorithm based on Massimo Paolucci elastic matching algorithm. To verify the feasibility and effectiveness of the hierarchical matching algorithm based on semantics, a prototype system named SGSM was designed and its functional model, matching process and performance were studied. Experimental results show that for the semantic-based hierarchical matching algorithm proposed in this paper, the threshold value of service semantic correlation degree is 0.84, the threshold value of service basic concept matching degree is 0.89, the threshold value of service comprehensive similarity degree is 0.66, and the threshold value of service quality matching degree is 0.78. Statistics through the experiment, the above three methods of recall, respectively, 33%, 62%, 85%, the precision is respectively: 29%, 57%, 88%, and illustrate the hierarchical matching algorithm based on semantic is feasible in practical application, compared with the traditional service based on keyword matching algorithm and Massimo Paolucci elastic matching algorithm on the recall and precision are improved significantly.
Semantic service discovery in a grid environment
Overview of semantic grid
With the rapid development of the Internet, the existing defects gradually exposed, such as the monotonous function of Web pages and the low intelligence degree of search engines, mainly because most of the content on the Web is designed for people to read, rather than let computer programs operate according to its meaning. The Semantic Web is proposed to make up for this deficiency by extending the Semantic information that can be processed by computers. In the semantic web, its internal resources are artificially endowed with a variety of clear semantic information, which can be distinguished and recognized by the computer and interpreted, exchanged and processed automatically. The so-called “semantic” is the meaning of the text. The semantic web is a network that can make judgments based on semantics. To put it simply, the semantic Web is an intelligent network that can understand human language. It can not only understand human language, but also make communication between people and computers as easy as communication between people.
Grid is to meet scientific research and engineering application of the demand for high-performance scientific computing, with the continuous development of grid technology, the grid is no longer confined to the category of scientific computing, but to develop in the direction of resource sharing and distributed collaboration, main show is in the form of a grid middleware to support large-scale data sharing and computationally intensive problem-solving. However, with the deepening of grid research, new applications require the grid to realize a higher level of service management, more flexible sharing and collaboration, to realize seamless integration and interaction of grid services. The grid is the improvement of the computing power of the Web, while the semantic grid is the extension of the semantic power of the grid. On the other hand, semantic Web enhances semantic capability on the existing Web, and the semantic grid is an extension of computing power of semantic Web.
The semantic grid is an extension of the grid, which defines the meaning of information and services on the grid and enables it to better support human-computer interaction. It implements semantic discovery, invocation and assembly of grid services by semantic description of services. Semantics in the grid is an important feature is the reference of the ontology technology, based on the ontology definite formal specification of related concepts, enables in the semantic grid resources from the semantic level, be in the form of unified, formal definition and description, and the characteristics and requirements from the conceptual level to realize intelligent information retrieval based on semantic retrieval coincides with mine. Therefore, the combination of semantic grid and information retrieval will make the retrieval system carry out at the semantic level in a machine-understandable way, thus greatly improving the recall and precision of retrieval.
Grid service technology
Web service is an interface proposed to enable information between previously isolated sites to communicate and share. From an external consumer’s perspective, a Web service is an object/component deployed on the Web that has the following characteristics: Perfect encapsulation: Web service is an object deployed on the Web with good encapsulation. For the consumer, only the list of functions provided by the object can be seen. Loosely coupled: this feature is also from the object/component technology, when a Web service implementation is changed, the caller will not feel this, for the caller, as long as the Web service call interface, implement any changes to a Web service for them is transparent, even when the Web service implementation platform from the J2EE migration into the. NET or opposite migration process, the user can all don’t know anything about it. For loose coupling, especially for Web services in the Internet environment, a message exchange protocol suitable for the Internet environment is required. XML/SOAP is currently the most suitable message exchange protocol. Use of standard protocol specifications: as Web services, all their common protocols are completely required to be described, transmitted, and exchanged using open-standard protocols. These standard protocols have completely free specifications that can be implemented by any party. In general, the vast majority of specifications will be published and maintained by W3C or OASIS as the final version.
Semantic grid services
Semantic Grid Service is an important applied basic research field based on Semantic Grid and ontology. First, it is a Web Service that provides a set of interfaces that are clearly defined and adhere to specific conventions to solve the problems of services, dynamic Service creation, lifecycle management, notification, and so on, thus simplifying the work of developers dealing with common problems and behaviors of distributed computing over large networks. The main goal of semantic grid services is to overcome the limitations of traditional Web services semantic operational capabilities and enable the discovery, execution, and dynamic composition of services to be accomplished intelligently.
The general framework of LF-grid
LF-grid is a service-oriented semantic Grid implementation, it according to the simple operation environment to build the entire Grid system, supplement many Grid operation required components, and from the semantic perspective of the entire framework to expand, its basic structure is divided into some clients, LF-grid core components and LF-grid resources.
LF-grid resources include high-performance computers in the Grid, relational databases, certain instruments and equipment and other physical resources, including network bandwidth, applications and other logical resources. Because these resources may belong to different organizations with independent access policies, the heterogeneity, distribution, and autonomy of an application environment are apparent. To solve this problem, we proposed resource service, a set of grid services for encapsulating various heterogeneous resources. From the perspective of the grid, the original various resources become a unified interface service, which is easier to manage and use. Resource services run on specific resource nodes with better performance.
Simulation analysis of service discovery based on semantics
Algorithm principle
Service matching is an important process and main content of service discovery, that is, the comparison between service requests and service. The service request is first defined. Service request R can be formalized as a triple: R = (I r O r φ).
Where, I r and O r respectively represent the input set provided by the user in the service request and the desired output set. Each set element in the input and output sets corresponds to a concept in the constructed ontology. φ is a matching threshold that represents the level of matching required between the service and the request. φ is an interval value, and 0≤φ≤1. The closer it is to 1, the higher the degree of matching between the service request and the service is required. Only when the match between the service and the service request reaches this threshold can the service be obtained, that is, a service is successfully discovered.
This paper provides semantic support for the concepts of elements in a service and elements in a service request through WordNet.
Define the matching degree of service s and service request R, that is, their similarity degree is Sim (s, R). Then according to the definition of service and service request in front, there are:
The result of equation (3) can be normalized to obtain a value between [0, 1], that is, 0≤Sim (s, R) ≤1, and if and only if Sim (s, R), the service s meets the service request R.
When calculating the similarity of each element, this paper takes Sim (I
s
I
r
) as an example according to the WordNet structure characteristics:
Synset (I s ) and Synset (I r ) represent the conceptual semantic set of each element in WordNet about the input set of service and the input set of service requests, namely the sense set about the input set element. The formula shows that the more sense lines I s and I r are similar, the higher the similarity between them will be.
By the above calculation, a service that satisfies the request can be successfully discovered as long as the Sim (s, R) ≥φ is satisfied. But to achieve better service quality, it is not enough to only find a service. To prevent sudden service failures, you also need to find a set of services of the right size for rapid migration in case of service failures. When dealing with this problem, traditional algorithms often compare the relationship between several services and requests multiple times. This article compares the similarities between the discovered services and other services so that more semantic support information can be obtained. If the similarity between a service in the alternative service set and a service that has been successfully matched is greater than or equal to a certain threshold, the service is considered to also satisfy the service request.
When comparing the similarity of two services, the algorithm in this paper mainly compares the similarity of the two service behaviors. The similarity of the two services is defined as follows:
The similarity degree of the two services’ behaviors will be calculated according to their input set, output set and operation set using the method similar to formula (4), pp.
The similarity results obtained from formula (6) were adjusted by using ω so that high similarity values could be obtained under the influence of strong correlation, while low similarity values could be obtained under the influence of weak correlation. The set of services that further meet the requirements is calculated by the following formula:
In this paper, only the Input similarity of the service provider (SP) and service requester (SR) is calculated, and the similarity calculation of Output, Precondition and Effect is similar to that of Input similarity.
Service provider SP and request service SR are both about the same concept Date, and the input number is the same, so their matching degree is the maximum value of 1;
The service providing SP is associated with the time point, while the request service SR is associated with the date. Since the date is a subclass of the time point, the matching degree between the two is the maximum value of 1.
The service provided by SP is associated with the calendar time, the request service SR is associated with the date, then: Sim (P S 1 P s 2 ) = 0.768, Sim (s1s2) = 0.563. The calculated results are shown in Table 1:
The calculation of matching similarity of two concepts
The calculation of matching similarity of two concepts
Resource status and level comparison table
Retrieval performance refers to the effectiveness of the retrieval results of the model and reflects the retrieval ability of the model. There are many indexes to describe the retrieval performance, which can be divided into efficiency index and effect index. The effect index includes recall, and the effect index includes response time and throughput. The recall rate and accuracy rate are the most important indexes and the two key indexes of traditional information retrieval performance evaluation.
Recall refers to the percentage of relevant documents retrieved in the total number of relevant documents in the document set, or the concept that relevant documents can be retrieved by the retrieval system, reflecting the ability of the model to retrieve relevant results. The search accuracy is the percentage of the number of relevant documents retrieved and the total number of documents retrieved, or the probability that a given search result set is related to the user’s query, reflecting the accuracy of the search result. The calculation formulas of recall and accuracy are as follows:
Service discovery data may be described in a discrete manner or an interval manner. In discrete description methods, is further divided into two cases, one is the semantic full recognition part, basic recognition, recognition, a small amount of recognition, not identify five discrete to describe the work of manufacturing resources status, another is to use specific numerical description, such as the requirement of service response time is to use specific numerical description. The values set in this article are shown in Table 2 and Fig. 1.

Resource status and level comparison table.
In this paper, QoS is used to describe the ability of a service to meet the needs of consumers. QoS matching is the matching of service response time, service cost and service reliability in OWL Profile. This paper provides a function to match the quality of service with the three, which is proposed based on the MAE (Mean Absolute Error) algorithm. MAE is a calculation method to measure the degree of difference (error) between two objects. Its calculation is simple and easy to understand. It is widely used in various fields of statistical accuracy. MAE reflects the difference between the actual value and the target value. The smaller MAE, the closer the actual value is to the target, and the smaller the deviation; On the contrary, the more the actual value deviates from the target value, the greater the deviation. In this paper, the cost is taken as an example, and its QoS is shown in Table 3 and Fig. 2:
Provides a grid quality of service description for services and request services
Provides a grid quality of service description for services and request services

Provides a grid quality of service description for services and request services.
This paper mainly takes recall and precision as the main basis for evaluating the performance of service matching. The precision ratio is the ratio of the number of Web services associated with the query in the query result to the total number of Web services returned by the query result. To verify the performance of the semantic-based layered matching algorithm designed in this paper, the keyword-based matching algorithm, Massimo Paolucci elastic matching algorithm and the semantic-based layered matching algorithm designed in this paper are compared.
For the semantic-based hierarchical matching algorithm proposed in this paper, the threshold value of service semantic correlation degree is 0.84, the threshold value of service basic concept matching degree is 0.89, the threshold value of service comprehensive similarity degree is 0.66, and the threshold value of service quality matching degree is 0.78. The recall rates of the above three methods are shown in Table 4 and Fig. 3.
Recall of three algorithms
Recall of three algorithms

Recall of three algorithms.
Because the algorithm is first from the different aspects of the service to the calculation of matching degree, finally computing services overall compatibility, USES a modular way of matching algorithm is of good extensibility and set up in the process of semantic similarity computation adjustable parameters, can through dynamic changes in the algorithm implementation process parameters to adapt to specific situations.
For the semantic-based hierarchical matching algorithm proposed in this paper, the threshold value of service semantic correlation degree is 0.84, the threshold value of service basic concept matching degree is 0.89, the threshold value of service comprehensive similarity degree is 0.66, and the threshold value of service quality matching degree is 0.78. The accuracy of the above three methods is shown in Table 5 and Fig. 4. It is shown that the semantic-based layered matching algorithm is feasible in practical application. Compared with the traditional keyword service matching algorithm and Massimo Paolucci elastic matching algorithm, the recall rate and precision are significantly improved.
Precision of three algorithms
Precision of three algorithms

Precision of three algorithms.
This paper analyzes the current research status and existing problems of semantic grid service matching, introduces and compares three technologies of the semantic web, grid and semantic grid, and mainly discusses the basic functional modules of grid and three mainstream grid architectures (five-layer hourglass structure, OGSA and WSRF). This article focuses on how to define and represent grid services. Service description is intended to provide a standard way of describing service providers and service requesters and is the basis of service discovery. Based on the requirements of service discovery, this paper analyzes the service description languages and proposes the semantic description of service description documents to expand the service functional information, laying a good foundation for service discovery. To effectively publish the expanded service description, this paper utilizes the characteristics of the data entities to realize the use of the expanded service description in the UDDI center by registering the semantic information of the service description in the form of tModel. It provides the basis for better efficient and accurate service discovery. To provide effective semantic support for semantic grid service matching, ontology technology is introduced in this paper. Firstly, the ontology modeling method, description language, description logic and owl-s language technology are described in detail. Secondly, the ontology description framework of the semantic grid, the construction rules and methods of domain ontology, the extension of grid service ontology description to owl-s, and how to map semantic information to UDDI are introduced. According to the requirements of intelligence and efficiency, this paper studies how to conduct service discovery of advertisement service description and request service description based on semantic description and ontology. Service matching is a key problem in service discovery. At present, to improve the ability of service matching in the process of service discovery, many methods consider the effective use of ontology technology for semantic matching of services. Based on the extended service description, the semantic matching principle and algorithm are proposed in this paper. This matching method matches the linguistic features and background features of ontology concepts and calculates the semantic similarity. Then based on the matching algorithm, three-phase grid service discovery algorithm is proposed to make full use of the potential of grid service semantics, in addition to the operation of the service function, semantic similarity matching output and input parameters, and increased the premise condition and the effect of semantic matching, greatly improving the service retrieval precision rate and recall rate. Based on Massimo Paolucci elastic matching algorithm, a semantic layered matching algorithm model is proposed. The model is mainly divided into the following three layers: the first layer is the matching of the basic concept of service. Based on the category of the conceptual model to which the attribute value belongs, the basic concept of service is matched. This layer mainly matches the service category, service name and text description. The second layer is the matching of service semantic relational degree. This layer is the matching of service functional attributes, including input, output, preposition and effect matching. The third layer is the matching of quality of service, which is mainly the matching of non-functional attributes, including service response time, service cost and service reliability. To verify the feasibility and effectiveness of the Semantic-based layered matching algorithm, a prototype system named SGSM (Semantic Grid Service Matcher) was designed. In this paper, the functional model, matching process and performance of the system were studied. Experiments show that the semantic-based layered matching algorithm is feasible in practical applications, and has significantly improved recall and precision compared with the traditional keyword service matching algorithm and Massimo Paolucci elastic matching algorithm.
Footnotes
Acknowledgments
This work was supported by the Key Scientific and Technological Research Projects of Henan Province, China (182102210416).
