Abstract
A scientific and reasonable indicator system is the key to measuring the level of data governance in manufacturing enterprises (DGME). Drawing on a review of the literature and policy documents, this study classifies manufacturing enterprise data governance into five dimensions and constructs an indicator system with 5 primary indicators and 23 secondary indicators. Study 1 used Exploratory Factor Analysis (EFA) to determine that the data governance indicator system of manufacturing enterprises presents a five-factor structure; Study 2 used Confirmatory Factor Analysis (CFA) and Average Variance Extracted (AVE) to determine that the overall fit of the model and the reliability and validity of the measurement items at each level were at the desired level. The proposed indicator system can provide a reference for data governance in manufacturing enterprises.
Introduction
As the main body of the national economy, the manufacturing industry is the foundation on which a country is built, the instrument of its development, and the basis of its strength. To respond effectively to the demand for managing massive multi-source heterogeneous data 1 and mining its value in the big data era, manufacturing enterprises need to manage their data assets by means of data governance, 2 that is, to control the whole process of data generation, processing, transmission, use, and destruction, to solve the problems encountered in releasing the value of data, and to make data available and accessible in a systematic way. Over time, data governance has acquired distinct industry characteristics, and research on data governance in manufacturing enterprises 3 has gradually become an important field. A review of the literature on data governance in manufacturing enterprises identifies four strands of research, summarized as follows:
Research on the concept of data governance. “Information Technology Service Governance Part 5: Data Governance Specification” (hereinafter referred to as the “Data Governance Specification”) defines data governance as the collection of control activities, performance, and risk management associated with data resources and their applications. Janssen et al. 4 define data governance as a model of an organization and its personnel, applications, control rules, and permissions that guides the normal operation of the entire life cycle of data and algorithms within and across organizations and safeguards the supporting institutional system. Bozkurt et al. consider data governance as the integration of strategies for data-related structures and entities with data processes, participants, architecture, and overall data management. 5 Mercuri and Emerson 6 argue that data governance is about who should have access, how the data will be used, how the products generated by the data will be shared, and how to ensure that the privacy of the individuals to whom the data relates is respected. Data governance and information governance are closely related concepts, and many scholars use the two terms interchangeably. Information governance is the broader of the two, defined from the research perspective of information technology issues. 7
Research on how to conduct data governance. Wei et al. 8 established a manufacturing supply chain data governance platform using secure and reliable blockchain technology. Zorrilla and Yebenes 9 constructed framework models for industrial data governance practices. Janssen et al. 4 established a data governance framework for trusted Big Data Algorithmic Systems (BDAS) to enable external review, trusted information sharing within and between organizations, risk-based governance, system-level control, and data control through shared ownership and autonomous identity. Mao et al. 10 constructed a new government data governance framework based on the concept of a data intermediary platform in order to understand the detailed requirements and functions of such a framework. In response to the dramatic increase in data volume and the accelerating pace of data assetization, Zhang et al. 11 developed a framework that explains how organizations configure data governance activities and take relevant strategic actions, based on an in-depth case study of a Chinese gold mining company. Li et al. 3 proposed a blockchain-based “value-standard-process” collaborative framework for data governance in manufacturing organizations, which helps to ensure a high level of data security, high reliability of collaborative tasks, and high transparency of value transformation.
Research on the role of data governance. Abueed and Aga 12 applied PLS-SEM 13 to model how corporate data governance contributes to sustainable knowledge creation in companies listed on the Amman Stock Exchange, using survey data from 180 listed companies obtained through judgement sampling. Yin and Li 14 explored the relationship between corporate data governance and green technology innovation performance by drawing on knowledge management theory and dynamic capability theory. Gegenhuber et al. 15 proposed a three-dimensional distributed Open Social Innovation (OSI) data governance framework consisting of openness, accountability, and power and applied it to reflect on OSI projects.
Research on building a data governance indicator system. Abraham et al. 7 analyzed 145 research papers and practitioner publications published between 2001 and 2019 to identify the main building blocks of data governance and decompose them into six dimensions. They also identified five research areas and formulated a total of 15 research questions to support future research on data governance. Janssen et al. 4 state that data governance indicators include assessment of data quality and bias, data sharing, data separation, distributed data storage, data collection, and usefulness of data. These indicators provide a comprehensive framework for building data governance for trusted AI systems. Smania et al. 16 conducted a systematic literature review of 191 articles on digital servitization. Their findings yield six dimensions of data relationships in the digital servitization ecosystem: data strategy, data partner management, data reliability, data security and privacy, data interoperability, and data lifecycle. These dimensions cover key areas of data governance and provide a comprehensive data management framework for organizations in the digital servitization ecosystem.
The above literature analysis shows that few scholars study data governance from the perspective of the overall application scenarios of manufacturing enterprises, and that research on a data governance indicator system for manufacturing enterprises remains at the stage of conceptual discussion. When implementing data governance, manufacturing enterprises face problems such as insufficient awareness of data governance, fuzzy concepts, unclear boundaries, unclear perceptions, and inconsistent standards, all of which seriously impede data management and application. To measure the level of data governance development and industrial maturity of manufacturing enterprises effectively, and to grasp the current status and problems of data governance in these enterprises in a timely manner, there is an urgent need for an indicator system that transforms data governance from a concept into operable variables. Although some scholars have proposed conceptual models of data governance, these models lack systematic scientific validation. Quantitative measurement is an indispensable part of this process: purely qualitative methods can be used to construct a theory but not to validate its concepts, so they need to be combined with quantitative analysis to test the conceptual validity of the theory.
Therefore, drawing on research results and policy documents on data governance in manufacturing enterprises, this paper builds a manufacturing enterprise data governance indicator system consisting of 5 level-1 indicators and 23 level-2 indicators. EFA and CFA are used to clarify the dimensions of manufacturing enterprise data governance and the specific measurement indicators, in order to provide a reference for evaluating and enhancing data governance in manufacturing enterprises and to support the transformation and upgrading of the manufacturing industry.
Predefined indicator system for data governance in manufacturing enterprises
Data governance needs to be analyzed in detail based on the specific governance subjects, governance objects, governance goals, and cultural environment. Because of differences in industry background, research motivations, focus objects, and target expectations, the data governance indicator systems proposed by different scholars or institutions differ. Each has its own advantages and disadvantages and none is universal, but they still provide an important reference for research on data governance in manufacturing enterprises. Drawing on the academic literature on data governance and on documents such as the “Data Governance Specification” and the “Data Governance Standardization White Paper,” and taking into account the practical needs of data governance in manufacturing enterprises, this paper divides manufacturing enterprise data governance into five dimensions: data strategy, data asset management, data infrastructure, organizational guarantee, and institutional protection. In particular:
Data strategy (DS)
DS 17 refers to determining the goals and direction of enterprise data governance in terms of systems and processes, clarifying the task blueprint, and evaluating the implementation status and implementation results of the strategy. There are four secondary indicators under this dimension, namely, strategic objectives, business process management, 18 strategy implementation, and monitoring and evaluation.
Data asset management (DAM)
DAM 19 refers to planning, controlling, and providing data as an important enterprise asset, thereby controlling, protecting, delivering, and improving the value of data assets. There are seven secondary indicators under this dimension, namely, metadata, 20 master data, 21 data quality management, 22 data application, 23 data sharing, 24 data security, 25 and data life cycle. 26
Data infrastructure (DI)
DI 27 is the technical support for enterprise data governance. DI refers to the platforms, tools, and software systems used to standardize the entire life cycle of data, such as data aggregation, storage, development, application, operation and maintenance, and security. There are four secondary indicators under this dimension, namely, database, big data platform, 28 data analysis and mining, 29 and data integration and fusion. 30
Organizational guarantee (OG)
According to stakeholder theory, 31 manufacturing enterprise data governance involves the participation of multiple entities such as enterprises, industries, and governments. In the process of data governance, attention should be paid to the responsible parties and their rights and responsibilities, the allocation of duties, staffing, communication and collaboration, and leadership and decision-making bodies. There are four secondary indicators under this dimension, namely, authority and accountability system, people, data culture, 32 and continuous improvement and optimization. 33
Institutional protection (IP)
Institutional protection is a protective barrier for data governance in manufacturing enterprises. It refers to enterprises developing and implementing data management processes under the guidance of the government and other management authorities, and systematically carrying out data management tasks so that the resulting data conform to standard specifications such as industry standards and national standards. 34 There are four secondary indicators under this dimension, namely, standard norms, management system, laws and regulations, 35 and policy leadership.
In this study, a manufacturing enterprise data governance indicator system is established based on the above five dimensions. As shown in Figure 1, the system contains five first-level indicators and twenty-three second-level indicators.
Figure 1. Data governance indicator system for manufacturing enterprises.
Design of questionnaire scale
Questionnaire scale on the degree of data governance implementation in manufacturing enterprises.
Exploratory factor analysis
This study uses exploratory factor analysis 36 to examine the measurement items and factor structure and to provide an initial verification of the structural dimensions of the manufacturing enterprise data governance indicator system. The analysis was carried out in SPSS 27.0, 37 and the 23 items, one for each secondary indicator, were selected through the model generation method.
Sampling and measurement
Basic demographic characteristics (n = 109).
Item analysis
Item analysis was conducted with the extreme-groups method. Respondents were ranked by total score, and the top 27% and bottom 27% formed the high and low groups. For each item, the difference between the high and low groups was tested for significance; items that discriminated well were retained, and the remainder were considered for deletion. The results showed that all 23 items reached the level of significance, that is, all 23 items of the scale discriminated well between high and low scorers.
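The extreme-groups procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the SPSS procedure used in the study: the function name `extreme_group_item_analysis`, the choice of Welch's t-test, and the 0.05 significance level are assumptions, and the demo responses are randomly generated, not the survey data.

```python
import numpy as np
from scipy import stats


def extreme_group_item_analysis(scores, fraction=0.27, alpha=0.05):
    """Compare each item's mean between high and low total-score groups.

    scores: (respondents x items) array of Likert responses.
    Returns per-item t statistics, p-values, and a boolean "retain" mask.
    """
    scores = np.asarray(scores, dtype=float)
    totals = scores.sum(axis=1)
    order = np.argsort(totals)                 # ascending by total score
    k = int(round(fraction * len(totals)))     # size of each extreme group
    low, high = scores[order[:k]], scores[order[-k:]]
    # Welch's t-test per item (assumption: equal variances are not required)
    t, p = stats.ttest_ind(high, low, axis=0, equal_var=False)
    return t, p, p < alpha


# Synthetic demo only: 109 respondents, 23 items driven by one latent trait.
rng = np.random.default_rng(0)
latent = rng.normal(size=(109, 1))
noise = rng.normal(scale=0.5, size=(109, 23))
demo = np.clip(np.rint(3 + latent + noise), 1, 5)
t_stats, p_values, retain = extreme_group_item_analysis(demo)
print(retain.sum(), "of", retain.size, "items discriminate at alpha = 0.05")
```

An item whose high-group and low-group means do not differ significantly would come back with `retain` set to `False` and would be a candidate for deletion.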
Result analysis
Before performing EFA, the suitability of the data for factor analysis was tested. Exploratory factor analysis requires the sample data to pass the Kaiser-Meyer-Olkin (KMO) measure and Bartlett’s test of sphericity, with the decision rules being KMO > 0.7 and a significance level of p < 0.05 for Bartlett’s test. The KMO value of the data collected in this study was 0.885, and Bartlett’s test of sphericity on the correlation matrix of the variables (χ² = 1843.002, p < 0.001) indicated the presence of common factors among the variables, that is, the sample data were suitable for exploratory factor analysis.
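For readers who want to reproduce these two suitability checks outside SPSS, the following sketch computes Bartlett's test of sphericity and the overall KMO measure from their textbook definitions. The function names are hypothetical and the demo data are synthetic, so the printed values are unrelated to the 0.885 and 1843.002 reported above.

```python
import numpy as np
from scipy import stats


def bartlett_sphericity(data):
    """Bartlett's test that the correlation matrix is an identity matrix."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)        # log of |R|, numerically stable
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi2, df, stats.chi2.sf(chi2, df)


def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial (anti-image) correlations
    off = ~np.eye(corr.shape[0], dtype=bool)   # off-diagonal mask
    r2 = (corr[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)


# Synthetic demo only: 109 respondents, 23 items sharing one common factor.
rng = np.random.default_rng(1)
factor = rng.normal(size=(109, 1))
demo = 0.8 * factor + 0.6 * rng.normal(size=(109, 23))
chi2, df, p_value = bartlett_sphericity(demo)
print(f"Bartlett chi2 = {chi2:.1f}, df = {df}, p = {p_value:.3g}")
print(f"KMO = {kmo(demo):.3f}")
```

With 23 items the test has 23 × 22 / 2 = 253 degrees of freedom, whatever the sample size.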
Results of EFA of data governance index system of manufacturing enterprises (n = 109).
Reliability test of the data governance secondary indicator scale for manufacturing enterprises (n = 109).
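The text does not name the reliability coefficient it uses. Assuming it is Cronbach's alpha, the usual measure of internal consistency, the coefficient for the items of one primary indicator can be sketched as follows. The function name and the four-respondent toy data are illustrative only.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items
    item_var = items.var(axis=0, ddof=1).sum()       # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of total score
    return k / (k - 1) * (1 - item_var / total_var)


# Toy data only: four respondents answering two items.
toy = [[1, 1], [2, 3], [3, 2], [4, 4]]
print(round(cronbach_alpha(toy), 3))
```

Applied separately to the items of each of the five primary indicators, the function would return one coefficient per dimension.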
Confirmatory factor analysis
The quality of a measurement model has two aspects: external quality and intrinsic quality. External quality concerns the overall fit of the model and is assessed mainly through the model's fit indices. Intrinsic quality is reflected in the reliability and validity of the measurement items, where validity refers to their convergent validity and discriminant validity. In this paper, AMOS 28.0 was used for confirmatory factor analysis, the model was evaluated using the values of several types of fit indices, and AVE 40 and composite reliability (CR) 41 were calculated from the CFA results to assess convergent validity and discriminant validity.
Sampling and measurement
Basic demographic characteristics (n = 309).
Second-order confirmatory factor analysis
The five first-level indicators were used as first-order latent variables and the overall “manufacturing enterprise data governance” (DGME) as the second-order latent variable in a second-order CFA. The items corresponding to the 23 secondary indicators were used as measurement items to test whether the sample data support the theoretical model obtained by EFA. As described above, the model is evaluated by analyzing the fit index values obtained. The fit of the theoretical model to the questionnaire sample was assessed using absolute fit indices: the chi-square to degrees-of-freedom ratio (χ²/df), 42 root mean square error of approximation (RMSEA), 43 goodness-of-fit index (GFI), and adjusted goodness-of-fit index (AGFI). The improvement in fit of the theoretical model relative to the null model is judged by the normed fit index (NFI), incremental fit index (IFI), and comparative fit index (CFI). The parsimony of the model is judged mainly by the parsimony goodness-of-fit index (PGFI) and parsimony normed fit index (PNFI).
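Several of the indices listed above can be computed directly from the chi-square statistics of the fitted model and of the null (independence) model. The sketch below uses the standard textbook formulas; the function name is hypothetical, the input values are invented for illustration and are not the study's results, and GFI, AGFI, and PGFI are omitted because they require the observed and model-implied covariance matrices.

```python
import math


def fit_indices(chi2_m, df_m, chi2_0, df_0, n):
    """Fit indices from model (m) and null-model (0) chi-squares, sample size n."""
    nfi = (chi2_0 - chi2_m) / chi2_0
    excess_m = max(chi2_m - df_m, 0.0)          # model noncentrality estimate
    excess_0 = max(chi2_0 - df_0, excess_m, 1e-12)
    return {
        "chi2/df": chi2_m / df_m,
        # RMSEA with the (n - 1) convention; some software divides by n
        "RMSEA": math.sqrt(excess_m / (df_m * (n - 1))),
        "NFI": nfi,
        "IFI": (chi2_0 - chi2_m) / (chi2_0 - df_m),
        "CFI": 1.0 - excess_m / excess_0,
        "PNFI": (df_m / df_0) * nfi,
    }


# Hypothetical inputs for illustration only (not taken from the study).
demo_fit = fit_indices(chi2_m=300.0, df_m=225, chi2_0=5000.0, df_0=253, n=309)
for name, value in demo_fit.items():
    print(f"{name}: {value:.3f}")
```

Keeping the null-model chi-square alongside the model chi-square is what allows the incremental indices (NFI, IFI, CFI) to be recomputed for each competing model.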
Fit indices of this measurement model.
In addition to the fit indices meeting the required standards, the factor loadings of the measurement items must also meet certain requirements. A factor loading greater than 0.71 means that the factor explains about 50% of the variance in the observed variable, a loading greater than 0.63 means that it explains about 40%, and a loading greater than 0.55 means that it explains about 30%. A loading greater than 0.55 is regarded as good. The factor loadings of the model are shown in Figure 2. The standardized factor loadings of “data strategy,” “data asset management,” “data infrastructure,” “organizational guarantee,” and “institutional protection” are 0.61, 0.69, 0.70, 0.65, and 0.57, respectively. The standardized factor loadings of the 23 secondary indicators range from 0.79 to 0.88, all at desirable levels.
Figure 2. Results of confirmatory factor analysis of the data governance indicator system for manufacturing enterprises①.
Note 1: Figure 2 is reproduced from the AMOS output. Ovals and rectangles represent latent variables and measurement items, respectively. e1 to e23 and e24 to e28 represent the residual terms of the measurement items and latent variables, respectively. The numbers on the arrows are the standardized factor loadings.
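The loading thresholds quoted in this section follow from the fact that, for a standardized loading λ on a single factor, the share of an observed variable's variance explained by the factor is λ². As a worked check, using only the thresholds and loadings reported in the text:

```latex
\[
R^2 = \lambda^2:\qquad
0.71^2 \approx 0.50,\qquad
0.63^2 \approx 0.40,\qquad
0.55^2 \approx 0.30 .
\]
\[
\text{Primary indicators: } 0.57^2 \approx 0.32 \ \text{to}\ 0.70^2 = 0.49,
\qquad
\text{secondary indicators: } 0.79^2 \approx 0.62 \ \text{to}\ 0.88^2 \approx 0.77 .
\]
```

On this reading, the second-order factor accounts for roughly a third to a half of the variance of each primary indicator, and each primary indicator accounts for more than 60% of the variance of each of its items.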
Convergent and discriminant validity
Convergent validity
Results of AVE and CR analysis of the five first-order CFA tests.
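AVE and CR are both functions of the standardized loadings of the items that measure one latent variable. The sketch below applies the usual formulas, with each item's error variance taken as 1 − λ². The function names are hypothetical, and the loadings in the demo are made-up values chosen inside the reported 0.79–0.88 range, not the study's estimates.

```python
import numpy as np


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    lam = np.asarray(loadings, dtype=float)
    true_part = lam.sum() ** 2
    error_part = (1.0 - lam ** 2).sum()        # error variance of each item
    return float(true_part / (true_part + error_part))


# Hypothetical loadings for one four-item dimension (illustration only).
demo_loadings = [0.79, 0.82, 0.85, 0.88]
print(f"AVE = {average_variance_extracted(demo_loadings):.3f}")
print(f"CR  = {composite_reliability(demo_loadings):.3f}")
```

Commonly cited rules of thumb treat AVE above 0.5 and CR above 0.7 as evidence of adequate convergent validity.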
Discriminant validity
Descriptive statistics and discriminant validity of data governance primary indicators of manufacturing enterprises② (n = 309).
Note 2: In Table 8, *** indicates p < 0.001, the numbers in [ ] on the diagonal are the AVEs of the corresponding primary indicators, and the upper-right part of the matrix contains the squared correlation coefficients between the scores of each pair of primary indicators. Each AVE is greater than the values above it in the same column and to its right in the same row, indicating that each primary indicator has sufficient discriminant validity.
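The comparison described in Note 2, each AVE against the squared correlations involving the same indicator, is commonly known as the Fornell-Larcker criterion. A minimal sketch of the check follows; the function name, AVE values, and correlation matrix are hypothetical and do not reproduce Table 8.

```python
import numpy as np


def fornell_larcker(ave, corr):
    """Return True per construct if its AVE exceeds every squared correlation
    between that construct and the others."""
    ave = np.asarray(ave, dtype=float)
    r2 = np.asarray(corr, dtype=float) ** 2
    off = ~np.eye(len(ave), dtype=bool)        # ignore the diagonal
    max_shared = np.where(off, r2, -np.inf).max(axis=1)
    return ave > max_shared


# Hypothetical values for three constructs (illustration only).
demo_ave = [0.60, 0.70, 0.65]
demo_corr = [[1.0, 0.5, 0.4],
             [0.5, 1.0, 0.3],
             [0.4, 0.3, 1.0]]
print(fornell_larcker(demo_ave, demo_corr))
```

A construct that shares more variance with another construct than with its own items would be flagged `False`, signalling insufficient discriminant validity.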
Comparison of competitive models
To further confirm that the measurement model is optimal, the second-order five-factor model was used as the baseline model, and four competing models were formed based on the intrinsic relationships among the five factors and the results of empirical research. The first competing model is a first-order five-factor model, which treats the five factors as five latent variables that are only correlated with each other and do not belong to “manufacturing enterprise data governance.” The second competing model is a single-factor model, which skips the first-level indicators and attributes the 23 second-level indicators directly to the latent variable “manufacturing enterprise data governance.” The third competing model is a first-order four-factor model in which organizational guarantee and institutional protection are merged into one factor that is only correlated with data strategy, data asset management, and data infrastructure. The fourth competing model is a first-order three-factor model that combines data strategy, organizational guarantee, and institutional protection into one latent variable, so that the model consists of three latent variables (the merged factor, data asset management, and data infrastructure) that are only correlated with each other and do not belong to a higher-order latent variable.
Comparison of the fit indexes of the baseline and competitive models.
Research conclusion
Based on an analysis of the literature and relevant policy documents, this paper summarizes the composition of data governance indicators in manufacturing enterprises as “one body, five sides.” That is, data governance in manufacturing enterprises consists of five dimensions, namely, data strategy, data asset management, data infrastructure, organizational guarantee, and institutional protection, with secondary indicators under each dimension. A manufacturing enterprise data governance indicator system containing 5 first-level indicators and 23 second-level indicators was constructed, and a questionnaire containing 23 items was designed based on the second-level indicators. To examine the reliability and validity of the scale, EFA was conducted on 109 questionnaire responses using SPSS 27.0. The results show that the data governance indicator system of manufacturing enterprises presents a five-factor structure, with a cumulative variance contribution rate of 76.306%. The reliability analysis also shows that the internal consistency reliability of each first-level indicator is higher than 0.79, indicating that the measurement items of the five first-level indicators are consistent. CFA was then carried out on 309 questionnaire responses using AMOS 28.0. First, the second-order CFA found that the overall model showed good fit. Second, AVE and composite reliability (CR) were used to test convergent validity and discriminant validity. The results show that the measurement items corresponding to the five first-level indicators all have good convergent validity and discriminant validity, and that the five first-level indicators jointly load on the higher-order factor of manufacturing enterprise data governance.
Finally, comparison with the fit indices of the competing models confirmed that the second-order five-factor model of manufacturing enterprise data governance is optimal. This conclusion provides a reference for evaluating data governance and improving data governance capability in manufacturing enterprises, and supports their digital transformation.
Shortcomings and prospects
The data governance indicator system constructed in this paper can help improve the data governance level of manufacturing enterprises and provides a reference for promoting their digital construction. Its shortcomings are as follows: (1) Data governance is not a one-off project but a continuing effort that changes dynamically in practice. As practice evolves, other elements will certainly emerge that belong among the data governance indicators of manufacturing enterprises, so fixing a specific indicator system model risks “cutting the feet to fit the shoes.” (2) The manufacturing industry covers a wide scope, yet most of the data in this study were collected from questionnaires distributed by team members attending manufacturing-related conferences, so the sample size is small; more data will be collected in the future to verify the validity of the model. In addition, a complete and practical indicator system needs to determine the weights of all levels and indicators, which is one of the next research directions.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the University Synergy Innovation Program of Anhui Province, “Research and Development of Embodied Intelligent Composite Robot for AI Popularization of Science Education” (Project No.: GXXT-2023-108), and the General Project of the National Social Science Foundation of China, “Research on the Theoretical Construction and Operational Mode of Scientific and Technological Security Intelligence in the New Era” (Project No.: 19BTQ089).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
