Abstract
Recently, machine learning (ML) techniques have been introduced into various domains. This study focuses on projects for the development of ML-based service systems, in which ML techniques are applied to enterprise functions. Constructing reusable knowledge is important for implementing such projects effectively. Here, the collection of insights and the development of architecture and design patterns for ML-based service systems are considered. We propose a method for collecting insights by referring to a development model based on project practices and for developing patterns for ML projects as an enterprise architecture model. Through a practical application, we attempt to collect insights as best practices and construct design patterns for ML projects using the proposed method.
Introduction
Currently, numerous machine learning (ML) techniques are available as application programming interfaces (APIs). Therefore, ML techniques can be used for practical business applications. Accordingly, enterprises have begun to implement such techniques in their business functions. Here, we consider projects for developing service systems in enterprises using ML APIs. New features have emerged in service systems that use ML techniques (ML-based service systems). Furthermore, when ML techniques are applied to these business functions, acquiring training data on the target business domain is important. Thus, sufficient knowledge of, or prior experience in, the target business domain is essential. Therefore, representatives of the IT division and relevant business divisions are required to participate in the project. Consequently, numerous challenges arise in ML service system development projects (ML projects) in terms of requirements, design, implementation, and test phases [1, 2].
Therefore, collecting insights from project practices and creating reusable knowledge is necessary for the effective undertaking of ML projects. Here, we focus on the patterns of ML projects as reusable knowledge. Practitioners are considered to obtain insights through ML project practices and share them within their organizations. However, such insights are described in different formats and are not shared beyond organizations.
We propose a method for constructing reusable knowledge for ML projects as patterns based on project practice. To this end, we implement a development model for ML-based service systems, such that practitioners can provide insights from projects by referring to the model. Moreover, we define steps according to which practitioners construct patterns from the collected insights, using enterprise architecture (EA)-based generic ML architecture and a design pattern model. By applying the proposed method, we attempted to confirm that practitioners can effectively collect insights on ML projects and construct patterns as reusable knowledge to conduct ML projects.
The remainder of this paper is organized as follows. In Section 2, we describe related studies. We introduce the ML-based service system and architecture and design patterns for ML service systems in Section 3. Moreover, we define the research hypothesis of this study. In Section 4, we present our method for constructing the reusable knowledge for ML projects from project practices. In Section 5, we demonstrate the practice in which we apply the proposed method. Finally, Section 6 discusses our results, and Section 7 summarizes key points and presents directions for future studies.
Related work
By reviewing the literature and surveying several actual projects, the authors of [1, 2] identified many software engineering challenges that arise in projects for developing ML systems and emphasized the extraction of knowledge for conducting ML projects.
The best practices in ML projects for constructing knowledge were collected through a literature survey [3]. Furthermore, [4] introduced a general workflow for the development of ML systems as reusable knowledge for ML projects. The role of data scientists in ML system development projects was discussed in [5]. By combining these findings, a project model that represents the relationships among project activities, stakeholders, and project goals was subsequently proposed [6]. An architecture for representing an entire system is required for knowledge in practical projects in which big data analytics or ML techniques are applied [7]. In addition, reference architectures for development teams have been proposed [8, 9]. In [10], a reference architecture for intelligent systems was presented, which combines digital strategies and architectures [11] with artificial intelligence.
The development of architecture and design patterns has been considered a knowledge resource for software system architecture, and several studies have addressed this. In [12], several patterns focusing on the operational stability of ML systems were introduced. Moreover, through a systematic literature review, software engineering patterns for ML systems were identified and formalized in [13, 14]. These patterns are typically described as itemized documents and primarily target data scientists and ML application developers. Some surveys have been conducted to clarify how ML developers perceive these patterns [15]. The representation of ML patterns as EA-based models was investigated in [16]. In addition, code smells [17] and data smells [18] for ML systems were collected through literature reviews.
Research subject and hypothesis
ML-based service system
In our research, we considered developing a system using ML techniques for business functions that either support or are substitutes for human activities.
For given input data, ML techniques predict and output the optimal option from a predefined set. In a workplace, these techniques are used in routine activities such as replying to service queries or conducting business assessments based on customer-provided information. To develop systems using ML techniques, the options for the target business domain are first defined; subsequently, example inputs related to each option are collected. An ML model is generated (trained) from a training dataset containing such pairs of options and examples. This model is then deployed into a runtime ML engine, which obtains the input data and provides the output data using this prediction model.
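The train-then-deploy flow described above can be illustrated with a minimal sketch. It assumes a toy word-overlap classifier in place of a real ML API; the options, the example inputs, and the function names `train` and `predict` are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict

# Hypothetical training dataset: pairs of (option, example input).
TRAINING_DATA = [
    ("billing", "how do I pay my invoice"),
    ("billing", "my invoice amount is wrong"),
    ("shipping", "where is my package"),
    ("shipping", "package delivery is late"),
]


def train(pairs):
    """Build a per-option word-frequency model from (option, example) pairs."""
    model = defaultdict(Counter)
    for option, example in pairs:
        model[option].update(example.lower().split())
    return dict(model)


def predict(model, text):
    """Return the option whose training words overlap most with the input."""
    words = text.lower().split()
    scores = {
        option: sum(counts[word] for word in words)
        for option, counts in model.items()
    }
    return max(scores, key=scores.get)


# Training phase: generate the model from the training dataset.
model = train(TRAINING_DATA)

# Runtime phase: the deployed model maps new input to one predefined option.
answer = predict(model, "my package is late")
```

The separation between `train` and `predict` mirrors the separation between model generation and the runtime ML engine; a real project would replace both with calls to the chosen ML API.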
In this study, we represent an ML-based service system development project by employing an EA modeling approach, with ArchiMate [19] as the EA modeling language. The EA representation of the ML-based service system, illustrated in Fig. 1, uses the three business layer concepts and three application layer concepts listed below.
Fig. 1. ML-based service system represented by ArchiMate.
Business layer concepts:
Business service: An explicitly defined and executed business activity.
Business process: A sequence of business activities that produce a planned outcome.
Business object: A set of concepts used within a particular business domain.
Application layer concepts:
Application service: An explicitly defined and exhibited application behavior.
Application component: An element of application functionality aligned to the implementation structure.
Data object: Data structured for automated processing.
Software design patterns are a form of reusable knowledge in software engineering. In software design patterns, best practices are formalized such that engineers can use them to solve typical problems that occur when designing an application or system. Although no standard format exists for many patterns, the following items are defined to describe software design patterns:
Intent: Objective of the pattern.
Problem: Forces that the pattern seeks to resolve.
Solution: Suggested activities to solve the problem.
Context: Environmental information on the system.
Discussion: Pre-conditions or limitations for applying the pattern.
Several security patterns [12] and architecture and design patterns [13, 14] have been introduced for ML service systems. For example, [14] describes an ML system architecture pattern known as the “data flows up, model flows down with federated learning” pattern. Table 1 lists the elements of this pattern. Figure 2 shows the solution proposed by the pattern.
Table 1. Data flows up, model flows down with federated learning pattern.
Fig. 2. Solution proposed in the “data flows up, model flows down with federated learning” pattern.
As indicated in the context field, the “data flows up, model flows down” pattern is implemented in ML systems that use mobile/edge devices such as mobile phones, cameras, and IoT devices. For example, a major retail company in the USA introduced such an ML system to identify traffic problems in the parking lots of its stores.
Federated learning is a special example of this pattern. It was implemented in Google Keyboard on Android, known as Gboard.
As previously mentioned, the best practices, architecture, and design patterns of ML-based service systems are based on literature surveys. This implies that experts with sufficient knowledge of software engineering and ML techniques must analyze the literature and construct patterns. Moreover, insights obtained through project practices are not systematized as best practices or patterns unless published. Although a large organization conducting various types of ML projects has published its insights as patterns [20], how the insights were systematized into such patterns is unclear.
Fig. 3. Overview of proposed method.
Fig. 4. Generic agile development model.
Fig. 5. Workflow for using ML techniques.
Within this context, we considered the following research question (RQ):
How can practitioners construct reusable knowledge for ML projects from real project practices?
For this RQ, we propose a method for collecting insights from project practices and constructing reusable knowledge as patterns from the insights. Furthermore, we confirm the effectiveness of the proposed method in practice.
Overview
In this study, we consider knowledge construction based on ML project practices. Figure 3 overviews the proposed method.
The proposed method consists of the following steps:
1. Prepare a development model based on ML project practices.
2. Derive insights from ML projects by referencing the development model.
3. Construct patterns from the collected insights.
In the proposed method, a development model for collecting insights is prepared as the first step. In this study, we used the agile development model for ML service systems proposed in [21]. This model was extended from the general agile development model and ML workflow model based on actual ML project practices.
A generic agile development model was proposed in [22] and is represented as shown in Fig. 4.
A workflow model for ML-based service systems was proposed in [4] and is represented as shown in Fig. 5.
Figure 4 shows that the work items are defined by specifying requirements, and iteration backlogs are specified from the work items. In Fig. 5, detailed activities for specifying the requirements for ML-based service systems, work items, and iteration backlogs are not represented. In ML projects, practitioners in the user segments are assigned to development activities and must clearly understand the items they are responsible for in each activity. Therefore, in [21], a reference agile development model for ML projects was extended from existing models based on the project practice data.
As project data, we used data on 23 ML projects collected in [23]. In these data, each ML project is represented as an ordered list of project activities defined in the ML project canvas [24]. We compared the ML projects in these data to identify their common activities, focusing on the activities conducted before data collection. Table 2 lists the analysis results. For example, the purpose of the project was considered before data collection in 20 out of 23 projects (87.0%).
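The analysis described above can be sketched as a simple count over ordered activity lists. The project data and activity names below are hypothetical placeholders, not the 23 projects from [23]; only the 20-out-of-23 figure is taken from the text.

```python
from collections import Counter

# Hypothetical project data: each project is an ordered list of activities.
PROJECTS = [
    ["purpose", "user segments", "data collection", "modeling"],
    ["purpose", "metrics", "data collection", "user segments"],
    ["data collection", "purpose", "modeling"],
    ["user segments", "purpose", "data collection"],
]


def activities_before(projects, marker="data collection"):
    """Count, per activity, the projects in which it precedes `marker`."""
    counts = Counter()
    for activities in projects:
        if marker not in activities:
            continue  # no data collection step, so nothing precedes it
        cutoff = activities.index(marker)
        # Use a set so an activity is counted once per project.
        counts.update(set(activities[:cutoff]))
    return counts


def share(count, total):
    """Percentage of projects, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


counts = activities_before(PROJECTS)
table = {name: (n, share(n, len(PROJECTS))) for name, n in counts.items()}

# The figure quoted in the text: 20 out of 23 projects.
quoted = share(20, 23)
```

Activities that precede data collection in most projects are the candidates for the common requirement-specification activities.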
Table 2. Activities conducted before data collection.
From Table 2, the following common activities are derived to specify requirements:
1. Consider the purpose or goal.
2. Consider the action based on the prediction.
3. Determine the user segments.
4. Define the metrics of success.
5. Determine the algorithms and infrastructure.
By combining the first and third activities, we defined a new activity: “Consider the user segments and their goals.” The requirements for the entire system and the ML model were specified as work items based on the resulting four activities. To execute the development tasks, we defined the metrics for the ML model and the entire system as iteration backlogs from these work items. Using these derived activities, work items, and iteration backlogs, we extended the existing models shown in Figs. 4 and 5. Consequently, using ArchiMate [19], we represented a practice-based reference model for the agile development of ML-based service systems, as shown in Fig. 6.
In the second step, practitioners provide insights derived from ML projects by referencing this development model. In this study, we arranged a workshop in which practitioners participated and shared their insights.
Table 3. Subdivided pattern elements.
Fig. 6. Agile development model for ML service systems.
Fig. 7. Generic ML architecture and design pattern represented using ArchiMate.
In the third step, patterns are constructed from the insights collected in the second step. The insights provided by practitioners describe recommended activities for conducting ML projects effectively, together with the project phases during which the activities are conducted. For example, the insight “We should consider the metrics on the reliability, safety, or fairness, as well as that on the accuracy when defining the metrics of project success” is recommended for the project planning phase. In this section, we outline the steps for constructing patterns from insights with this formatted information.
We used an EA model for the ML architecture and design patterns [16]. The relationships among key elements are sometimes not clearly described in the pattern documents, which hinders the common understanding of the patterns among stakeholders. By contrast, the EA model using ArchiMate represents the relationships among the pattern elements. To represent the ML design patterns as EA models, we first attempted to identify the common elements described in each field in the ML design pattern documents. By observing existing ML design patterns, we determined the following elements in the problem field.
Situation to be improved
Assessment result of the situation to be improved
Expected goal achieved by applying the pattern
In addition, the following elements are typically described in the context field.
Phase in which the pattern is applied
Device where the system is running
System user
For these subdivided elements and other existing elements, such as intent, solution, and discussion, we can assign the model elements defined in ArchiMate. Table 3 lists the mapping between the pattern elements and EA model elements.
Next, we attempted to represent the relationships between the elements in Table 3. For example, the solution represented as a principle is considered to realize the objective of the pattern, which is represented as an outcome. This relationship can be represented using the realization notation defined in ArchiMate. Using relationships such as realization, composition, access, flow, serving, assignment, and association, the pattern elements in the ML architecture and design patterns can be connected. Consequently, as shown in Fig. 7, we can represent a generic ML architecture and design pattern using ArchiMate.
Table 4. Summary of collected insights.
Table 5. Collected insights.
In this model, the pattern elements in the ML architecture and design patterns can be connected with relationships, such as realization, composition, access, flow, serving, assignment, and association.
The descriptions in the collected insights correspond to the “Solution in the pattern” and “Phase in which the pattern is applied” in the generic model. From the generic model, we construct specific pattern models according to the following steps:
1. Obtain the “Objective of the pattern” by analyzing the “Solution in the pattern” described in the insight.
2. Analyze the issue to be solved using the solution and current situation and derive the “Goal achieved by applying the pattern,” “Situation to be improved,” and “Assessment result of the situation to be improved.”
3. Analyze the exceptions in the solution and identify the pre-condition or limitation in the pattern.
4. Assess whether the solution can be applied only to the specific ML service system and derive the “System user” or “Device where system is running” if necessary.
Each model element corresponds to a description of the pattern documents. Therefore, the constructed pattern model can be converted into a pattern document.
We hosted an online workshop at the Working Conference on Machine Learning Software Engineering (MLSE2021) in Japan on July 2, 2021. A total of 12 practitioners with some experience in ML projects participated in the workshop. All participants accessed an online canvas where the reference ML development model was presented, and posted their insights that were obtained through the ML projects on the canvas. As a result, we collected 28 insights and categorized these based on the viewpoints extended from those in [3]. Table 4 displays a summary of the collected insights.
Table 6. Constructed pattern (“Check the origin of the data”).
Fig. 8. Pattern model constructed from collected insights.
In Table 5, we show the collected insights as best practices.
From Table 5, we selected S09 “Confirm the processes and methods by which data are collected” as an example and applied the proposed pattern construction method. This insight corresponds to the description in the solution element of the pattern. By analyzing the purpose of the activity described in this insight, we found that certain fields in the training data could not be used for the prediction because the runtime input for the prediction was not clearly defined when the training data were collected. This issue is known as “target leakage.” Therefore, the intent of the pattern is to avoid target leakage, and the goal of the pattern is to improve the accuracy. As a result, we obtained the pattern model for this insight, as illustrated in Fig. 8, and assigned the pattern name “Check the origin of the data.”
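The issue behind this insight can be illustrated with a minimal sketch that compares the fields of the training data with the fields available as runtime input. The field names and the loan-assessment setting are hypothetical; the study does not prescribe a specific check.

```python
def leaked_fields(training_fields, runtime_fields):
    """Return training fields that are not available as runtime input."""
    runtime = set(runtime_fields)
    return [name for name in training_fields if name not in runtime]


# Hypothetical schema for a loan-assessment service.
# "days_overdue" is only known after the outcome being predicted has occurred.
TRAINING_FIELDS = ["age", "income", "loan_amount", "days_overdue"]
RUNTIME_FIELDS = ["age", "income", "loan_amount"]

suspects = leaked_fields(TRAINING_FIELDS, RUNTIME_FIELDS)
if suspects:
    print("Fields to review before training:", ", ".join(suspects))
```

Such a check only becomes possible once the runtime input has been defined, which is why the insight recommends confirming how the data are collected.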
The pattern description was converted from the pattern model. Table 6 presents the constructed pattern. Through this example analysis, we confirmed that the RQ can be addressed by the proposed method.
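The conversion from a pattern model to a pattern description can be sketched as follows. The class `PatternModel`, its field names, and the helper functions are assumptions made for illustration and do not reproduce the ArchiMate model. The four values filled in are those stated in the text for insight S09; the remaining elements are left unset rather than guessed.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class PatternModel:
    """Pattern elements named in the text; None marks an element not yet derived."""
    name: Optional[str] = None
    intent: Optional[str] = None
    solution: Optional[str] = None
    goal: Optional[str] = None
    situation: Optional[str] = None
    assessment: Optional[str] = None
    phase: Optional[str] = None
    discussion: Optional[str] = None


def missing_elements(model):
    """List the elements that still need to be derived from the insight."""
    return [f.name for f in fields(model) if getattr(model, f.name) is None]


def to_document(model):
    """Render every element as one 'Label: value' line of a pattern document."""
    lines = []
    for f in fields(model):
        value = getattr(model, f.name)
        shown = value if value is not None else "(not yet derived)"
        lines.append(f"{f.name.capitalize()}: {shown}")
    return "\n".join(lines)


s09 = PatternModel(
    name="Check the origin of the data",
    intent="Avoid target leakage",
    solution="Confirm the processes and methods by which data are collected",
    goal="Improve the accuracy",
)

print(to_document(s09))
print("Still to derive:", missing_elements(s09))
```

Because the document is generated from a fixed set of model elements, any element that has not been derived is visible immediately, which reflects the claim that the description can be obtained without missing elements.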
In the proposed method, practitioners provide insights on their ML projects by referencing the development model. This reference model is based on the project practice, and insights are collected as best practices. In the implemented practice, 12 practitioners discussed their ML projects for two hours, based on which we obtained almost as many insights as those in the literature survey-based methods [3]. Therefore, project insights can be expected to be effectively collected by practitioners using the proposed method.
Table 5 demonstrates that we can collect insights that differ from those obtained using the literature survey-based method. This is because the detailed activities during the project planning stage are represented in the reference model. For example, we obtained the following insights in the project-planning stage:
Agree with the business division on the business goal and goal to be achieved by the ML-based service system.
Agree with the goals and available computing resources.
Confirm both potential users of the ML-based service systems and the number of such users.
The proposed method is expected to be usable in conjunction with the literature survey-based method. However, whether insights from ML project practices can be exhaustively collected using the proposed method is unclear. Furthermore, whether the number or quality of the collected insights depends on the experience or skill of the practitioners remains unconfirmed. These items must be investigated through continuous insight collection, and we plan to address this in future studies.
Through the implemented practice, we confirmed that the patterns of ML projects can be constructed from insights collected with the proposed method. Moreover, we can obtain a pattern description without missing any elements by converting the constructed model. This implies that practitioners can systematize the reusable knowledge of ML projects as patterns from the collected data without the support of experts with strong software engineering and ML skills. However, constructing the pattern models of ML projects requires knowledge of the quality characteristics required for ML-based service systems. Thus, determining the typical issues or risks in ML-based service systems and systematizing them as knowledge should be investigated in future studies.
In this study, we focused on projects for the development of ML service systems in which ML techniques are applied to enterprise functions. We considered a method for collecting insights on ML projects from project practices and constructing reusable knowledge as patterns from the collected insights. To this end, we proposed a reference development model for ML projects by extending a generic agile development model and an ML project workflow model. We also presented the steps for constructing the patterns as models from the EA-based generic ML architecture and design patterns. We collected 28 insights as best practices using the proposed method and confirmed that practitioners could collect insights effectively and that the patterns of ML projects could be successfully constructed based on the collected insights. Future studies should focus on investigating the quality or coverage of the collected insights and on systematizing the typical issues or risks in ML-based service systems as knowledge required for pattern development.
Acknowledgments
This work was supported by a JSPS Grant-in-Aid for Scientific Research (KAKENHI), Grant No. JP19K20416, and the JST-Mirai Project (Engineerable AI Techniques for Practical Applications of High-Quality Machine Learning-based Systems), Grant No. JPMJMI20B8.
