Abstract
Data quality issues remain a challenge in data migration. Existing methods focus mainly on technical issues; however, data quality is not only a technical problem, and its main causes are non-technical. This paper aims to identify the role of data governance in the data migration process. The qualitative research method uses two primary references, namely the high-level data migration approach of Kazimir et al. and the DAMA-DMBOK framework. Both fit data migration processes that focus on data quality and reference governance roles. The case study was conducted in a central government institution migrating data to the new One Data application. The results show that eight data governance roles are involved in 20 of the 31 migration processes (65%), so non-technical activities related to data quality are involved in more than half of the data migration process. This paper proposes data governance roles for the non-technical aspects of the migration process and presents a case study of the high-level data migration process and how data governance comes into play. This research can help the government and practitioners overcome problems that often occur in data migration, especially those related to data quality.
Introduction
The One Data Initiative, usually called “Satu Data Indonesia (SDI)/One Data Indonesia (ODI),” is an Indonesian government initiative that tries to fix problems in the administration and management of government data (Rahmatika et al., 2019). Presidential Regulation No. 39 of 2019 establishes ODI as Indonesia’s data management policy. ODI is an effort to provide credible, accountable, and up-to-date data by building a government data center that can be used as a reference in all policy-making and implementation. Through ODI, all government data and data from other related agencies can be brought together in the One Data Indonesia Portal (data.go.id). A data portal is a place where data can be searched, explored, connected, downloaded, and used (Regulation of the Minister of National Development Planning/Head of Bappenas No. 16 of 2020 concerning Electronic-Based Government System Data Management). The ODI application is a platform for managing and sharing data between regional apparatus within the government. All Ministries, Institutions, and Regional Governments are required to implement ODI. The implementation of One Data is also expected to accelerate the implementation of the Electronic-Based Government System (SPBE/E-government), whose regulatory aspects and operational stages are currently being finalized by several related agencies, including the Presidential Staff Office, the Ministry of National Development Planning/Bappenas, the Ministry of State Apparatus Utilization and Bureaucratic Reform, the Ministry of Communication and Information, and the State Administration Institute of the Republic of Indonesia.
Each Ministry/Institution/local government agency (M/I/LGA) must have its own “One Data” Portal Application, integrated with the One Data Central Portal Application. The data provided must comply with data quality requirements, standards, metadata, and architecture, which each agency must also determine. At present, the government has many applications, which hinders the development of data centers (Soegiono, 2018). For this reason, the applications need to be streamlined, and data must be transferred to the Satu Data portal application to make data management more effective and efficient. To manage this data, each M/I/LGA concerned needs to migrate data.
Previous research stated that: “As an industry, we face terrible data migration. Research results show a range of 40% to more than 50% of data migration projects over time, one over budget or failing altogether.” (Morris, 2020). According to Morris, this problem occurs because IT people think of data migration as a purely technical issue, when it is in fact a business issue. Data migration analysts often have only a technical understanding and lack the skills to hold discussions with the business, skills they need because they must collaborate with business units.
According to Azeroual and Jha (2021), data migration failures are due to a lack of business involvement, a lack of knowledge about old data, and data quality issues. As a result, 83% of data migration projects exceed their deadlines and budget and sometimes fail. A key component of many projects is migrating legacy data to a new database environment, including the necessary data cleanup and reformatting. This component is a significant effort that requires collaboration among data architects/analysts familiar with the legacy and target data models, DBAs, business users, and developers familiar with legacy applications (DAMA International, 2017a).
Data quality problems persist (Ouafiq et al., 2022), so it is vital to check data quality before data migration. The problems include inconsistencies between data sources, such as format, syntax, and semantic inconsistencies, as well as poor ETL processing and data mapping (AbuHalimeh, 2022). These data problems complicate the data migration process. Getting good quality data requires the data governance (DG) role (AbuHalimeh, 2022). Cleaning and reformatting old data for a new database environment is a significant effort that requires collaboration among data architects/analysts familiar with the legacy and target data models, Database Administrators (DBAs), business users, and developers familiar with legacy applications (DAMA International, 2009).
Previous studies rarely discuss high-level data migration; most discuss technical data migration (Amin et al., 2021; Matthias et al., 2021). Only one study discusses the high-level data migration process involving non-technical activities, namely the approach designed by Kazimir et al. (Kažimír et al., 2012), but it has yet to identify a Data Governance role. There has been no research on the role of data governance in the data migration process. The purpose of this research is to identify the Data Governance role in the data migration process.
In addressing the challenges highlighted by previous research, this study focuses on unraveling the intricacies of data migration, particularly emphasizing the often overlooked role of Data Governance (DG). The prevalent issues in data migration projects, as underscored by Morris (2020), are frequently rooted in the misconception that data migration is solely a technical concern. Such a perspective leads to a lack of collaboration between technical experts and business units, exacerbating the risk of projects going over budget or failing outright. The importance of understanding data migration as a business issue rather than purely technical cannot be overstated.
As Ouafiq et al. (2022) have pointed out, data quality problems persist, necessitating a thorough examination before the migration process begins. Inconsistencies across data sources, encompassing format, syntax, and semantic variations, coupled with challenges in ETL processing and data mapping, contribute to the complexity of data migration (AbuHalimeh, 2022). Given these hurdles, the data migration process becomes a multifaceted endeavor that extends beyond technical aspects.
A critical aspect of successful data migration projects is the migration of legacy data into a new database environment, entailing comprehensive data cleaning and reformatting. This monumental effort requires the involvement of various stakeholders, including data architects/analysts familiar with legacy and target data models, DBAs, business users, and developers well-versed in legacy applications (DAMA International, 2017a). The significance of data governance in ensuring the quality of data cannot be overstated in this context (AbuHalimeh, 2022).
Despite the wealth of research on technical aspects of data migration, high-level discussions on data migration, especially those involving non-technical activities, remain scarce. A notable exception is the approach proposed by Kazimir et al., which covers non-technical aspects but has yet to explore the specific role of Data Governance. This study aims to fill that gap by explicitly identifying the role played by Data Governance in the data migration process. The ultimate goal is to contribute insights that can guide organizations through the complexities of data migration. The primary research question guiding this study is: “What is the role of Data Governance in the data migration process, and how does it contribute to the success of data migration projects?” To address this question, the paper is organized as follows: relevant theories (data migration, data governance, data governance roles based on DAMA-DMBOK); methodology; results and discussion; conclusion.
Data migration and data governance
“Data migration is the selection, preparation, extraction, transformation and permanent movement of appropriate data that are of the right quality, to the right place, at the right time, and the decommissioning of legacy data stores, to deliver the business transformation aspirations of the organization.” (Morris, 2020). The primary reference for the migration process in this research is the high-level data migration approach of Kazimir et al. (Figure 1). Kazimir et al. divide the approach into six stages, including define migration approach, plan and conduct data cleansing, design migration system, construct and unit test migration system, and convert data into production.
Figure 1. A high-level approach to data migration (Kažimír et al., 2012).
Based on the DAMA Data Management Body of Knowledge (DAMA-DMBOK), data governance is the exercise of authority and control (planning, monitoring, and enforcement) over data assets. Another definition is that “data governance refers to the overall rights and responsibilities of decisions related to data management” (Al-Ruithe et al., 2018). Cheong and Chang stated that data governance is how companies manage the quantity, consistency, usability, security, and availability of data (Cheong and Chang, 2007). Data governance includes the responsibilities of legislative functions (defining policies, standards, and Enterprise Data Architecture), judicial functions (issue management and escalation), and executive functions (protect and serve, administrative responsibilities) (DAMA International, 2017a). In addition, data governance encompasses governance mechanisms, organizational scope, data scope, domain scope, antecedents, and outcomes of data governance implementation. Data governance aims to maximize the value of data assets while also addressing data-related risks (Abraham et al., 2019). Improved data quality (Abraham et al., 2019), increased security (Felici et al., 2013), lower data-related costs (Abraham et al., 2019), and better decision-making (Al-Ruithe et al., 2019) are all advantages of these efforts.
The scope and focus of data governance include (1) establishing and enforcing policies related to data management, covering Metadata, access, use, security, and quality, and (2) establishing and enforcing Data Quality and Data Architecture standards (DAMA International, 2017b). Data governance involves many parts of an organization and many individuals, an arrangement commonly called data stewardship. Each business has its own set of requirements and priorities. As a result, each organization has a unique approach to data governance functions and operations, as well as to individual roles and responsibilities. Data stewardship is the assigned accountability for business responsibilities in data management. The following data stewardship activities are closely related to data migration activities (DAMA International, 2009):
Creating and managing core Metadata (business glossary, valid data values, and other critical Metadata);
Documenting rules and standards (business rules, data standards, and data quality rules);
Managing data quality issues (identifying and resolving data-related issues or facilitating their resolution);
Executing operational data governance activities (ensuring that data governance policies and initiatives are adhered to).
In data stewardship, the following roles are related to data migration:
(1) Data Governance Council (DGC): the highest authority for data governance in an organization. It includes senior managers serving as executive data stewards, along with the Data Management Leader and the CIO.
(2) Enterprise Data Stewards (EDS): have oversight of a data domain across business functions.
(3) Business Data Stewards (BDS): knowledge workers and business leaders (business professionals) recognized as subject matter experts and assigned accountability for the data specifications and data quality of specifically assigned business entities, subject areas, or databases. They work with stakeholders to define and control data.
(4) Data Owner (DO): a BDS who has approval authority for decisions about data within their domain.
(5) Technical Data Stewards (TDS): IT professionals operating within one of the Knowledge Areas, such as Data Architects, Database Administrators, Data Quality Analysts, or Metadata Administrators.
Technical Data Stewards involved in data migration activities are as follows: (1) Database Administrator (DBA) (2) Software Developer (SD) (3) Data Architect (DA): responsible for data architecture and data integration. (4) Data Modeler/Data Analyst (DModeler): responsible for capturing and modelling data requirements, data definitions, business rules, data quality requirements, and logical and physical data models. (5) Data Quality Analyst (DQA): responsible for determining the fitness of data for use. (6) Metadata Administrator (MA): responsible for integration, control, and delivery of metadata.
Methodology
This paper aims to identify the role of data governance in the data migration process. The research method is qualitative, and the research process has four main steps. The first is assessing the as-is data migration process, policy, roles, data, and applications. The second is assessing the problems that occur in the data migration process. The third is designing a mapping between the high-level data migration processes of Kazimir et al., the current data migration processes, the data problems, and the proposed roles. The fourth is validating the data governance roles in the data migration process using the Delphi method. The case study was conducted in one of the ministries in Indonesia, which migrated data to an open data application to comply with Indonesia’s One Data regulation.
The study focuses on challenges related to data quality. Conducting it as a case study in an Indonesian ministry, where data were migrated to an open data application to align with Indonesia’s One Data regulation, allowed a real-world assessment of the proposed mapping and validation of the data governance roles.
Results and discussion
The first stage is to assess the current condition. Ministries are required to comply with Presidential Regulation No. 39/2019 concerning Indonesia’s One Data. Indonesia’s One Data, or Satu Data Indonesia (SDI), is Indonesia’s data governance policy, namely, an effort to provide credible, accountable, and up-to-date data by building a government database that can be used as a reference in all policy-making and implementation (Indonesia, 2019). Through SDI, all government data and data from other relevant agencies can be brought together in Indonesia’s One Data Portal (data.go.id). Data.go.id is the official portal of Indonesia’s One Data and operationalizes the release and utilization of open data; it is not limited to ministries, agencies, or local governments but also covers all other agencies that produce data related to Indonesia. The One Data Portal is a platform for managing and sharing data between Regional Apparatuses in the Government Environment (Indonesia, n.d). The application for managing open data in the ministry in the case study is called the One Data application. All Ministries, Institutions, and Local Governments must comply with the implementation of SDI.
On the policy aspect, the assessment of current conditions in the ministry found two regulations concerning the organization and work procedures of ministries and one ministerial regulation regarding One Data at the ministerial level. However, there is not yet a derivative policy on the implementation of data migration. There are six regulations on data standards and master data, but they are not up to date and do not accommodate all data in the ministry.
For the migration process, proper planning is an essential requirement of data migration projects, because several risks may cause a project to run over budget, lose data, exceed deadlines, or even fail outright (Zaw, 2019). The current data migration process has 11 stages, as shown in Figure 2. Measured against the high-level data migration process approach, four data migration processes in the organization have not yet been implemented: processes 1.3, 2.5, 3.4, and 5.2. The other processes have been carried out, but not by the right roles and without adequate documentation, which makes migration work difficult. Based on the DAMA-DMBOK functions (DAMA International, 2017a), the data security function has not been carried out, nor is it identified in the approach of Kazimir et al. Data security is vital for trust, for managing access rights, and for keeping the data safe during migration (Onkar Raut, 2022).
Figure 2. The as-is data migration process (Source: Author).
On the data aspect, data quality needs to be checked during data migration (Chirkova et al., 2021; Ouafiq et al., 2022). The data problems found include outdated data standards, a lack of metadata, incompatibility in the master data, and data duplication. The data migrated to the open data application (One Data application) at the ministerial level in the case study come from three different applications and databases (Figure 3). The data model also changed from the old One Data application to the new one, and this change is not yet documented. Data quality work is still carried out manually and incidentally, and there are no policies and procedures for changes to data standards. The organization still needs to acquire a data quality and migration tool (Cheng et al., 2022). There are also no reference and master data tools, data integration and interoperability tools, or data security tools to support the data migration process. The software developer has developed only a data migration tool, written in Java.
Figure 3. One Data application system architecture (Source: Author).
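The data problems named above (duplication, missing metadata, and master data incompatibility) can be screened before migration with a few simple checks. The following is a minimal sketch, not the ministry's tooling: the function names, the `REQUIRED_METADATA` fields, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical metadata fields that each migrated dataset is assumed to need.
REQUIRED_METADATA = ("title", "owner", "standard_version")


def find_duplicates(records, key_fields):
    """Return the key tuples that occur more than once in the records."""
    counts = Counter(tuple(r.get(f) for f in key_fields) for r in records)
    return [key for key, n in counts.items() if n > 1]


def missing_metadata(dataset_meta, required=REQUIRED_METADATA):
    """Return the required metadata fields that are absent or empty."""
    return [f for f in required if not dataset_meta.get(f)]


def unknown_master_codes(records, field, master_codes):
    """Return values of `field` that do not appear in the master data."""
    found = {r.get(field) for r in records}
    return sorted((v for v in found if v not in master_codes), key=str)


# Hypothetical sample data standing in for rows from a legacy application.
records = [
    {"region_code": "31", "year": 2022, "value": 10},
    {"region_code": "31", "year": 2022, "value": 10},
    {"region_code": "99", "year": 2022, "value": 7},
]

print(find_duplicates(records, ("region_code", "year")))
print(missing_metadata({"title": "Population", "owner": ""}))
print(unknown_master_codes(records, "region_code", {"31", "32"}))
```

Checks of this kind only flag candidates; deciding which record is authoritative or which code list is current remains a decision for the data owner and business data stewards.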
Analysis and mapping of the data governance role in data migration problems
Table 1 highlights the challenges faced during the data migration processes and recommends appropriate data governance responsibilities. It emphasizes the need for policies, roles, data owners, and lifecycle procedures in Process No. 1; consistency and coherence in Process No. 2; defined and enforced data policies in Processes No. 3-5; and the resolution of data quality issues in Process No. 6. It also emphasizes the importance of data governance in developing data standards, establishing metadata, and identifying data owners, and it highlights the need for data migration tools, infrastructure capacity, and data security across all processes.
The next step is mapping the as-is process to the approach of Kazimir et al. and to the DG roles based on DAMA-DMBOK (Figure 4). Roles are categorized using the RACI Matrix: Responsible (R), Accountable (A), Consulted (C), and Informed (I). The mapping shows that the Pusdatin team and software developers are the executors of the current data migration. However, their roles and responsibilities have yet to be defined in detail, and business units and the data governance team are not formally involved. Four processes of the approach of Kazimir et al. have yet to be implemented, namely Processes No. 1.3, 2.5, 3.4, and 6.2.
Figure 4. The mapping of high-level process and proposed data governance role (Source: Author).
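A RACI mapping of this kind can be held as a simple table keyed by process number and checked mechanically. The sketch below is illustrative only: the assignments in `example_matrix` are hypothetical and are not taken from the paper's figures, and the rule that each process should have exactly one Accountable role and at least one Responsible role is a common RACI convention assumed here, not something the study prescribes.

```python
RACI_CODES = {"R", "A", "C", "I"}

# Hypothetical assignments for illustration only (not the validated mapping).
example_matrix = {
    "1.1": {"DGC": "A", "BDS": "R", "DA": "C", "DModeler": "C"},
    "2.1": {"DO": "A", "DQA": "R", "BDS": "C", "MA": "I"},
    "4.1": {"SD": "R"},  # deliberately lacks an Accountable role
}


def validate_raci(matrix):
    """Return a list of problems found in a process-to-roles RACI matrix."""
    problems = []
    for process, roles in matrix.items():
        codes = list(roles.values())
        for code in sorted(set(codes) - RACI_CODES):
            problems.append(f"{process}: unknown code {code!r}")
        accountable = codes.count("A")
        if accountable != 1:
            problems.append(
                f"{process}: expected one Accountable role, found {accountable}"
            )
        if "R" not in codes:
            problems.append(f"{process}: no Responsible role")
    return problems


def processes_for(matrix, role):
    """Return the processes a role takes part in, with its RACI code."""
    return {p: roles[role] for p, roles in matrix.items() if role in roles}


print(validate_raci(example_matrix))
print(processes_for(example_matrix, "BDS"))
```

Keeping the matrix as data makes gaps such as a process with no accountable party visible before the migration starts, which is the kind of undefined responsibility the as-is assessment reports.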
The analysis based on the DAMA-DMBOK identifies seven DG activities and eight DG roles required for data migration. The activities and roles are as follows: (1) Define the DG Operating Framework by DGC and team, BDS, and TDS; (2) Develop Goals, Principles, and Policies by DGC and team; (3) Underwrite Data Management Projects by DGC and team; (4) Engage in Issue Management by DGC, DO, BDS, DQA, and TDS; (5) Sponsor Data Standards and Procedures by DG team, DO, and BDS; (6) Develop a Business Glossary by DG team, DO, BDS, and MA; (7) Coordinate with Architecture Groups by DG team, DA, and DModeler. These roles are proposed for involvement in the high-level data migration process. They are involved in 20 of the 31 processes (65%); the remaining eleven processes do not involve data governance. Across these 20 processes, the DG roles carry out all seven DG activities.
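The reported share can be reproduced from the two counts given in the text, 20 processes with a DG role out of 31 in total. A minimal sketch of the arithmetic follows; the function name `coverage` is introduced here only for illustration.

```python
def coverage(involved, total):
    """Return the share of processes that involve at least one DG role."""
    if total <= 0:
        raise ValueError("total must be positive")
    return involved / total


TOTAL_PROCESSES = 31   # processes in the high-level approach, as reported
WITH_DG_ROLE = 20      # processes that involve a DG role, as reported

share = coverage(WITH_DG_ROLE, TOTAL_PROCESSES)
without_dg = TOTAL_PROCESSES - WITH_DG_ROLE

print(f"{share:.1%}")   # 64.5%, reported as 65% after rounding
print(without_dg)       # 11
```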
The last stage is validating the design of the DG role mapping, based on DAMA-DMBOK, for the high-level data migration process. The validation was carried out by three experts, each required to hold at least a Master’s degree and have at least ten years of experience: a government agency director who understands data management and DG, a DG and data management consultant, and an IT manager and practitioner from a state-owned telecommunication company who regularly performs data migration. Figure 5 shows the role mapping design that the experts validated.
Figure 5. The final mapping of DG roles for the high-level approach to data migration processes of Kazimir et al. (Source: Author).
Figure 1 presents a structured model for data migration, emphasizing collaboration between technical experts and data owners. The model assigns roles at each stage and sub-stage, ensuring a holistic approach. It integrates technical and business expertise, ensuring collaboration for successful migration. The model also considers data quality throughout, with roles like Data Quality Analyst and Business Data Steward actively involved in data cleansing, testing, and resolution. The model’s adaptability allows for a tailored approach to data governance, evolving based on the organization’s existing processes and challenges. This model is unique in its granular assignment of roles and its ability to adapt to unique organizational contexts.
The final aspect is enhanced collaboration and accountability. The delineation of roles enhances collaboration and accountability by clearly defining who is responsible and accountable for each component of the data migration process. This explicit assignment of roles helps minimize ambiguity and ensures that stakeholders are aware of their contributions and responsibilities. The model’s novelty lies in its detailed and stage-specific assignment of roles, acknowledging the diverse expertise required throughout the data migration process. This approach ensures a more comprehensive and collaborative data governance framework, addressing technical and non-technical aspects and adapting to the unique challenges and context of the organization undergoing data migration.
The validation resulted in the addition of C (Consulted) and I (Informed) roles in process 2.1; the placement of BDS in processes 1.1, 1.2, 1.3, 1.4, 4.2, all stages 5 and 6.3; and the addition of data architects and data modelers as consulted roles in processes 1.1 and 1.5. The approach of Kazimir et al. is a good fit for data migration. It describes in detail the high-level processes that are often not carried out in data migration, even though they are key to successful data migration.
Conclusion
The key to successful data migration lies in understanding how data migration will transition the organization into one with a robust data governance framework, articulating roles and responsibilities for data management and accountability, and creating a proactive data quality assurance culture. The study results show that the current data migration process does not involve DG. In addition, there are no formal regulations, documentation, or business processes regarding data migration to the One Data Portal. Four processes from Kazimir et al. still need to be implemented in the ministry, and a data security function based on DAMA-DMBOK should be added. The results show that eight DG roles are involved in more than half of the data migration processes (65%), which confirms that non-technical activities are required in the data migration process. Applying DG to data migration can be expected to result in more effective business and technology integration and better organizational collaboration and productivity, and the DG role should ultimately improve data quality and data migration success. The limitation of this study is that it covers only one ministry. Future research can examine data migration processes in other government institutions and in state-owned or private companies that routinely migrate data to obtain more in-depth results and analysis.
In conclusion, this study underscores the pivotal role of data governance (DG) in ensuring the success of data migration processes. The findings reveal a significant gap in the current data migration practices, as the existing process lacks integration with DG principles. The absence of formal regulations, documentation, and established business processes related to data migration to the One Data Portal further highlights the need for a more structured and governed approach.
Measured against the approach of Kazimir et al., four data migration processes remain to be implemented in the ministry, and the study recommends adding a data security function. The research highlights the importance of data governance (DG) roles, which are involved in 65% of the data migration processes. Non-technical activities also play a crucial role in data migration, supporting business and technology integration, organizational collaboration, and productivity. The study is limited to one ministry, but it contributes to the growing body of knowledge on data migration and serves as a foundation for informed decision-making.
Acknowledgments
We would like to thank Direktorat Riset dan Pengabdian Masyarakat (DRPM) from University of Indonesia for funding this research through the PUTI Q2 2024-2025 Nomor: NKB-596/UN2.RST/HKP.05.00/2024.
Author contributions
The first five authors are the main contributors and contributed equally to this paper. Each author's contribution is as follows: Alivia Yulfitri: data collection, analysis, validation, and paper writing. Dana Indra Sensuse: supervision, methodology review, conception and design of the study, and funding acquisition. Deden Sumirat Hidayat: Identification of research gap based on literature review, analysis, validation, and paper writing. Ryan Randy Suryono: Draft review, verification, and correction. Kautsarina: Draft review, validation, and review. Anton Satria Prabuwono: Supervise and design the analysis.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Direktorat Riset dan Pengembangan, Universitas Indonesia; PUTI Q2 2024-2025 No: NKB-596/UN2.RST/HKP.05.00/2024.
