Abstract
AI solutions can significantly leverage open government data (OGD) ecosystems in public governance. For that, it is important to design effective and transparent governance mechanisms that create value in an OGD ecosystem through AI solutions. This article develops a conceptual model for a systematic design of an OGD governance model, which adopts a platform governance approach and integrates the governance needs derived from the use of AI. The purpose of the conceptual model is to systematically identify and analyze the interrelationships among multiple change factors on OGD governance design and to project available AI-based solutions for the OGD ecosystem by assessing the managerial, organizational, legal, technological, moral, and institutional variances. The proposed ‘6-step model’ suggests that an AI-compatible OGD ecosystem design requires (i) identifying contingencies, (ii) identifying data prosumers, (iii) assigning data governance roles, (iv) identifying design values, (v) designing the governance of AI, and (vi) designing the governance by AI. Through the recursive and reflexive analysis of each step, policymakers and system designers can develop reliable strategies in leveraging AI solutions for the use of OGD in public governance.
Introduction
The data-driven transformation in public administration hinges on the theoretical premises of open government data (OGD) (Van Ooijen et al., 2019). OGD refers to open data that are produced or commissioned by public bodies (Ubaldi, 2013). Open data is defined as data or content that can be freely used, reused, and distributed by anyone, subject only to the requirement that users attribute the data and make their work available to be shared as well. Governments, civil society organizations, and private sector representatives consider OGD a building block for open government, as they see it as a key enabler of improved service delivery, transparency, and public engagement and, as a result, of better relations between governments and citizens (Ubaldi, 2013).
OGD initiatives are expected to foster democratic and economic processes by promoting transparency, participation and collaboration, and provide opportunities for the development of new products and services (Ruijer & Meijer, 2020). However, current OGD practices suffer from technical, social, institutional/organizational, legal/ethical, economic, operational, political/policy/strategic challenges (Hossain et al., 2016). Additional governance challenges emerge when considering the actual use of OGD (Reggi & Dawes, 2016). On the supply side, OGD programs are often designed not for citizens but for technical experts and intermediaries, and the lack of institutional processes for dialogue prevents the integration of public feedback into existing strategies and programs (Janssen et al., 2012). On the demand side, the lack of incentives, interpretive tools, and contextual and technical knowledge among users (e.g. lack of citizen skills, digital divide) can prevent meaningful data use (Criado & Gil Garcia, 2019; Barry & Bannister, 2014).
Recently, the governance challenges of OGD have begun to be assessed in the larger institutional landscape where government organizations operate. This approach has been conceptualized by the ecosystem metaphor to assess the complex dynamics among different actors and concerns in the public governance domain (Maretti et al., 2021; Bonina & Eaton, 2020; Safarov, 2019; Millard, 2018; Lee et al., 2018; Donker & Van Loenen, 2017; Van Schalkwyk et al., 2016; Dawes et al., 2016; Reggi & Dawes, 2016). In this article, OGD governance refers to the formal and informal arrangements that determine how public decisions are made and how public actions are carried out in an OGD ecosystem.
In a similar vein, the advancements in big data and artificial intelligence (AI) solutions call for the reevaluation of effective models for OGD governance. Studies show that the combination of AI with OGD has a huge potential to improve efficiency, innovation, and crime prevention in public governance, but AI is hardly used by the public to create value from open data (Gao & Janssen, 2020). Furthermore, risks of data privacy, and arriving at biased or wrong conclusions undermine the usability of AI solutions in OGD ecosystems (Gao & Janssen, 2020). Therefore, it is important to design effective and transparent governance and control mechanisms for policymakers to create value in an OGD ecosystem through AI solutions.
Yet, design challenges for AI-compatible OGD governance are not limited to the OGD ecosystem but also stem from the socio-technological dimensions associated with the use of digital technologies in the public governance domain. A particular challenge for the use of digital technologies in public governance processes is the ‘governance of’ and ‘governance by’ configurations (Tan et al., 2021; Ølnes et al., 2017). ‘Governance of’ refers to the design choices associated with digital technologies in accessing and using the underpinning data infrastructure. The ‘governance of’ dimension focuses on how the data infrastructure affects the usage of the technology and links the data acquisition and data processing mechanisms in overall public policy processes. The rules and standards for acquiring and processing the available data, and the way AI affects the data processing and exploitation stages, fall under the ‘governance of’ considerations. As such, the ‘governance of’ AI is closely linked to the supply side of OGD ecosystems. ‘Governance by’ refers to the use of AI in policymaking and policy implementation. Unlike the technical emphasis of the former approach, ‘governance by’ prioritizes the techno-social power dynamics and control mechanisms in the use of digital technology in public policy processes by looking into the role ascriptions of automated systems and human agents in overall public governance. ‘Governance by’ configurations are closely linked to the demand side of the OGD ecosystem. A model for OGD governance that is compatible with AI solutions should comprise both aspects of AI governance.
In the literature, we lack a comprehensive, methodical tool that brings these wider dimensions together in designing effective OGD governance models. Hence, the central research question is ‘how can public sector organizations design an AI-compatible OGD ecosystem for public governance?’ Consequently, the central objective of the paper is to provide a design artefact for OGD governance that complies with the particularities of AI governance. This central objective is divided into three consecutive sub-objectives. First, by elucidating the theoretical approaches to the OGD ecosystem, to pinpoint the design principles for OGD ecosystem governance. Secondly, based on the theoretical and empirical cases, to identify the key challenges of AI governance, namely the ‘governance of’ and ‘governance by’ configurations for AI-based solutions in the public sector. Thirdly, to develop a systematic decision-making tool to design a governance model for an OGD ecosystem, which integrates the governance needs derived from the use of AI. The purpose of this design artefact is to systematically identify and analyze the interrelationships among multiple change factors on governance design and to project the available design options based on the managerial, organizational, legal, technological, moral, and institutional variances in the OGD ecosystem.
Methodology
Design science research is a major component of information sciences (Hevner et al., 2004; March & Smith, 1995). Design science research creates and evaluates IT artefacts intended to solve identified organizational problems (Hevner et al., 2004). Such artefacts may include constructs, models, methods, and instantiations with an embedded solution to an understood research problem (Peffers et al., 2007; Hevner et al., 2004).
The research objective of this paper is to develop a design artefact that can help policymakers to design an AI-compatible OGD ecosystem for public governance. Design science research methodology (DSRM) is a specific framework developed by Peffers et al. (2007) for the design of such artefacts in information sciences. DSRM incorporates principles, practices, and procedures to meet three objectives of the design artefact: it is consistent with prior literature, it provides a nominal process model for doing design science research, and it provides a mental model for presenting and evaluating design science research in information sciences. To achieve the first objective, the proposed design artefact needs to be consistent with the concepts in prior literature. In this paper, the first objective is achieved by elucidating the theoretical and methodological approaches to OGD governance and AI governance to pinpoint the key elements for the research artefact. Section 3 provides an overview of different OGD governance models in the literature and concludes by highlighting the commonalities in the design of OGD governance. The models are selected based on their differences in the design approaches.
The second objective of DSRM is to provide a nominal process that describes research entry points in the design of the artefact. Peffers et al. (2007) argue a nominal process helps researchers ‘to understand the essential elements of empirical information science research’. In this paper, this second objective is achieved by analytically categorizing different essential elements in the design of OGD governance, and by analyzing the key components of AI governance through the categories of ‘governance of AI’ and ‘governance by AI’. The essential elements that are used in the design of OGD governance models are presented in Section 3, and the essential elements that are used in the design of AI governance are presented in Sections 4 and 5.
The third objective of DSRM is to provide a mental model for the presentation of research outcomes. A mental model refers to “small-scale [model] of reality … [that] can be constructed from perception, imagination, or the comprehension of discourse. [Mental models] are akin to architects’ models or to physicists’ diagrams in that their structure is analogous to the structure of the situation that they represent” (Johnson-Laird & Byrne, 2000). A mental model should provide us with some guidance, as reviewers, editors, and consumers, about what to expect from design science research outputs (Peffers et al., 2007). The mental model is constructed through the operational integration of essential elements identified in Sections 3, 4, and 5, and by defining the decision-making processes in designing an AI-compatible OGD ecosystem for public governance (see Fig. 1).
The final objective, the evaluation of the model, is achieved by demonstrating the application of the model through a project conducted in Belgium. The main task of the project was to construct an OGD governance model for the Belgian federal government that can utilize AI solutions such as predictive analytics in the fight against taxation and social security fraud. The project was conducted from 2020 to 2022 through a team of Belgian researchers from political sciences, legal studies, and information sciences. Section 6 elaborates on the different methods used to collect and integrate empirical data in the design of the governance model in Belgium.
By following this methodological approach, the structure of this paper is as follows. Section 3 categorically presents the existing theoretical approaches to OGD ecosystem governance. Sections 4 and 5 elaborate on the governance challenges associated with AI solutions in public administration following the ‘governance of’ and ‘governance by’ dimensions. Section 6 presents the systematic decision-making tool for the OGD ecosystem design in public governance and elaborates on the design choices through lessons learned from a similar project conducted for the Belgian federal government. The conclusion section summarizes the main findings of the paper and shares recommendations for further research.
Conceptual approaches to OGD governance
The literature provides different conceptual models and approaches to the OGD governance design. Based on the existing models in the literature, seven different approaches to OGD governance design are identified. The end of the section presents the common denominators of these approaches and highlights the key determinants.
The first approach to OGD governance design is based on policy processes in the OGD ecosystem. Reggi and Dawes (2016) identify two policy cycles in the governance of OGD. One policy cycle addresses the innovation potential of OGD, and the other addresses how OGD might support democratic values of participation and accountability. The model integrates the diverse goals of actors in the open data ecosystem and allows the assignment of different role definitions for the intermediaries between data providers and the beneficiaries of OGD products for innovation, participation, and accountability purposes. This model allows the integration of techno-social power dynamics in the assessment of governance design and tracing the influence of actors on the policy processes.
The second approach to OGD governance design is based on contingencies. For instance, Lee et al. (2018) distinguish the external and internal contingencies for decentralized and centralized governance approaches in terms of the architecture design of OGD governance. External contingencies refer to the environmental context of a public sector organization, such as regulative framework, market structure, social and economic dynamics, political and institutional factors. Internal contingencies refer to the design choices associated with the platform governance conditions such as degree of control, type of control and strategies for governance. The strength of the model is that it creates a dynamic link between the policy goals and the underlying system infrastructure in the data governance. As such, the model allows estimations of policy outcomes based on the changes in the data governance architecture and changes in the external and internal contingencies (e.g. organizational, regulation, or policy changes).
The third approach in the literature is based on system design thinking. Systems thinking is a holistic approach that focuses on the way that a system’s constituent parts interrelate and expounds on how systems work over time and within the context of larger systems (Abbas et al., 2018). Following the systems thinking approach, Millard (2018) defines open governance as linking and integrating the worlds inside the government as well as linking and integrating these with the worlds outside government for the specific purpose of creating public value. Accordingly, he identifies three main components under open governance systems: open assets, open services, and open engagement. Being built on the intersections of open assets, open service, and open engagement, open government is expected to act as a platform, where the government can support a range of actors to collaborate with each other, as well as with the government itself, by facilitating and orchestrating engagements, managing assets, and providing tools to generate public value (Millard, 2018).
The fourth governance approach to the OGD ecosystem focuses on the operational processes in the production and reuse of open data. In this approach, open data is linked to an open governance strategy in which the government builds an open system that interacts with its environment. In an exemplary model developed by Maretti et al. (2021), the governance choices at the macro-level pertain to the operational processes that are the basis of the open data system, including the digital strategy of the public administrations and the problem of protection and use of data. At the meso-level, governance choices pertain to the administrative processes of digital public administration. At the micro-level, governance choices pertain to the concrete use and reuse of open data in formal and informal teams among organizations in public administration and civic hackers. In this model of OGD governance, the ultimate governance design purpose of the government is to structure a participatory system that creates economic, political, social, operational, and technical benefits for participants.
The fifth approach to governance design of an OGD ecosystem is based on an organizational or institutional perspective. This approach relies on the platform theory from strategic management and information systems in the identification of the governance design constructs and distinct approaches for the implementation of OGD (Bonina & Eaton, 2020). The conceptual model developed by Bonina and Eaton (2020) for the governance of an OGD ecosystem brings together those constructs (i.e., core architecture, peripheral architecture, platform owner, contributors, developers, tools, rules) to analyze the organizational tools and resources in the supply and demand sides of open governance, as well as the institutional factors affecting the overall platform performance. In another study, Safarov (2019) summarizes these institutional dimensions as policy and strategy, legislative foundations, organizational arrangements, relevant skills and educational support, and public support and awareness concerning open data.
The sixth possible approach to governance design is based on the management perspective. In an exemplary model for an open data assessment framework, Welle Donker and van Loenen (2017) identify governance as a framework of policies, processes, and instruments that structure the interaction between public sector entities and/or private sector entities to enable parties to reach their common goals. In their model, they identify five elements (i.e. vision, leadership, communication, self-organizing ability, and long-term financing) for assessing data governance in open data ecosystems. These elements also influence the data supply, and along with the user characteristics, they constitute the input to the ecosystem for the successful reuse of OGD. Welle Donker and van Loenen (2017) underline that other important aspects, such as the legal and policy frameworks and a clear demarcation between public tasks and private sector activities, additionally affect the performance of the overall ecosystem.
The last approach to governance design is based on the commons approach. Commons are based on the principles of bottom-up self-organization, the freedom of collective agency, polycentrism (multiple loci of governance), and subsidiarity (management at the lowest feasible level) (Bollier, 2019). According to Ostrom (1990), the world of natural resources is divided along the axes of rivalry and excludability, where the common goods refer to the rivalrous and non-excludable resources. More recently, the governance of online communities has attracted the attention of researchers (Hess & Ostrom, 2007), who treat open data as a common good. Fuster Morell (2014) identifies eight critical aspects that define the direction of online creation communities (OCC) governance: collective mission or goal of the process; cultural principles/social norms; design of the platform of participation (where regulation is embedded in the code); self-management of contributions; formal policies applied to community interaction; license; decision-making and conflict resolution systems concerning community interaction; and infrastructure provision. These eight dimensions are linked to each other through the infrastructure provision. According to Fuster Morell, infrastructure provision can be modeled across two axes: open versus closed to community involvement in infrastructure provision, and freedom and autonomy versus dependency on infrastructure. Based on the empirical analysis of fifty statistical cases and four case studies, Fuster Morell clusters four provision models for OCC governance: corporation service (which is the case of Flickr), mission enterprise (wikiHow), autonomous representational foundation (Wikipedia), and assembly or assemblarian self-provision models (openesf.net).
Despite some overlapping dimensions, each conceptual approach emphasizes unique aspects based on the scope of governance activities. Notwithstanding the conceptual differences, we can highlight the following common denominators from the existing models on the design of OGD governance:
OGD governance design depends on contextual factors such as the regulatory framework, organizational capacities, organizational culture, policy domain, ethical principles, public policy objectives, etc. The available design choices for OGD governance are contingent upon these contextual factors.
OGD governance is most suitable for a platform ecosystem model where government and non-governmental actors can share and reuse the data through the platform.
Not only the capacities of actors involved in the platform ecosystem but also regulations, institutional design, and the market structure influencing OGD are important for the effectiveness and sustainability of the governance design.
Policy goals, principles, and strategies on data governance, as well as the managerial, technical, and administrative capacities in the ecosystem, determine the characteristics of the platform ecosystem and the system architecture design for the use of digital technologies.
OGD governance design needs to address the role of the actors in the platform ecosystem separately as individuals and in communities, as well as holistically assess all the constituent parts of the ecosystem in setting the rules of engagement for the use, reuse, and sharing of data.
Governance of AI
This section expands on the theoretical basis of OGD ecosystem governance with the governance requirements resulting from the configurations of AI-based systems in public administration. Particularly, the section elucidates the governance of AI-based solutions in addressing data acquisition and data processing challenges in the OGD ecosystem.
Data acquisition
The AI system acquires data by either conducting a primary data ascertainment through sensor systems and human data input, or by accessing available databases (e.g. machine logs, clouds, or Internet databases) in a secondary data ascertainment (Wirtz & Müller, 2019). A data acquisition system samples the data input and transforms it into machine-readable data, while the software processes the acquired data for storage or presentation. The data feeding the system and the medium technologies and storages integrated into the system are key governance considerations in data acquisition. Challenges concerning the database size, data integration, data quality, and data standards (i.e. how and what data is collected, and what format it is stored in) can undercut the effectiveness of data processing and AI predictions.
For a successful implementation of AI strategies and programs, organizations must have access to a base set of data and must maintain a constant source of relevant data to ensure that AI can be useful in the selected policy domains. The input data can be in a multitude of formats such as text, audio, images, and videos. The wide range of sources to collect and store these data adds to the governance challenges. For successful predictions, all the relevant data must be integrated in a manner that the AI can understand and transform into useful results. A technological challenge for AI-based systems is to analyze unstructured data. For example, in healthcare, medical imaging represents a large share of relevant data, which even the most advanced AI-based systems (e.g. Watson) cannot read directly (Sun & Medaglia, 2017). This means that depending on the data source, the AI system may need to be complemented with human experience.
Data quality is another core challenge in data acquisition. AI performs best when a sufficient amount of high-quality data is available to it, which is why AI solutions built on big data tend to produce better predictions. However, big data pools contain different types of data from different origins, which need contextualizing for analyses and reports. Greg Hanson from Informatica argues that, for well-curated data, enterprises should build a catalog of data assets and use engineering tools with AI built into the backend (Ross, 2019).
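As an illustration of what a basic data quality assessment in the acquisition stage might look like, the following minimal sketch computes three simple indicators (completeness, conformity to a date standard, and duplicate identifiers) for a small set of records. The field names, the ISO date convention, and the sample records are hypothetical assumptions made for this example and do not stem from any dataset discussed in this article.

```python
from datetime import datetime

# Hypothetical schema: the fields every record is expected to carry.
REQUIRED_FIELDS = ("id", "date", "amount")


def is_iso_date(value):
    """Return True if value is a string in YYYY-MM-DD form (assumed standard)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


def quality_report(rows):
    """Compute simple quality indicators for a list of dict records."""
    total = len(rows)
    # A record is complete when no required field is missing or empty.
    complete = sum(
        all(row.get(field) not in (None, "") for field in REQUIRED_FIELDS)
        for row in rows
    )
    valid_dates = sum(is_iso_date(row.get("date")) for row in rows)
    ids = [row.get("id") for row in rows if row.get("id") not in (None, "")]
    return {
        "rows": total,
        "completeness": complete / total if total else 0.0,
        "date_validity": valid_dates / total if total else 0.0,
        "duplicate_ids": len(ids) - len(set(ids)),
    }


# Hypothetical sample records, for illustration only.
sample_rows = [
    {"id": "1", "date": "2021-03-05", "amount": 120.0},
    {"id": "2", "date": "05/03/2021", "amount": 80.0},  # non-standard date
    {"id": "2", "date": "2021-03-07", "amount": None},  # duplicate id, missing amount
    {"id": "4", "date": "2021-03-08", "amount": 45.5},
]

report = quality_report(sample_rows)
```

Indicators of this kind could feed a data catalog, although which checks matter in practice depends on the policy domain and on the data standards agreed within the ecosystem.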
In the public sector, another challenge for data quality is the rules and standards that apply to publicly available data. For example, the GDPR imposes the purpose limitation principle on data acquisition, which may limit the pattern recognition functions of machine-learning systems. Therefore, not only the quality of data but also the variety of available data can affect the performance of AI applications.
The performance of AI is also related to the quality of the training data. Here, bias embedded in the training data is one of the biggest challenges that AI faces (Rosso, 2018). Often, data sources are laced with racial, gender, communal, or ethnic biases (Manyika et al., 2019). Biases embedded in the training data could easily lead to discriminatory and unjust consequences in policymaking and implementation processes. For example, the “Correctional Offender Management Profiling for Alternative Sanctions” (COMPAS), a system to predict whether a defendant would re-offend, was found to be only as successful (65.2% accuracy) as a group of random humans (Dressel & Farid, 2019) and to produce more false positives and fewer false negatives for black defendants (Müller, 2020).
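The notion of unequal error rates across groups can be made concrete with a short sketch. The code below computes the false positive rate and the false negative rate separately for each group in a set of predictions. The function name, the group labels, and the toy records are hypothetical and purely illustrative; they are not COMPAS data and do not reproduce the cited findings.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Each record is a tuple (group, predicted_positive, actual_positive).
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, predicted, actual in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # missed an actual positive
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # flagged an actual negative
    rates = {}
    for group, c in counts.items():
        fpr = c["fp"] / c["neg"] if c["neg"] else float("nan")
        fnr = c["fn"] / c["pos"] if c["pos"] else float("nan")
        rates[group] = (fpr, fnr)
    return rates


# Hypothetical toy predictions for two groups, for illustration only.
toy_records = [
    ("A", True, False), ("A", True, True), ("A", False, False), ("A", True, False),
    ("B", False, True), ("B", False, False), ("B", True, True), ("B", False, False),
]

rates = error_rates_by_group(toy_records)
```

In this toy example, group A has the higher false positive rate and group B the higher false negative rate, which is the kind of asymmetry that an overall accuracy figure alone would conceal.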
Furthermore, data privacy in data acquisition, and how to achieve an appropriate balance between privacy and data acquisition is another pressing issue in AI adoption (Begg, 2009). Hence, the ethical challenges (e.g. protection of privacy rights, self-sovereign identity, transparency in data provenance) accompanying data acquisition processes need to be addressed for the wider adoption of AI-based systems in public services.
Data processing
The acquired data is stored in data servers for data processing. For security or privacy reasons, some organizations may need to store data on in-house data servers. Those organizations need to bear the cost of in-house data servers and a technical support team. Cloud-based alternatives, centralized government data silos, or digital crossroad data centers offer more cost-effective solutions and allow AI-based systems to be scaled up effectively. However, the interoperability considerations and administrative burden in data acquisition can undermine the appeal of these alternative data storage options for organizations.
The acquired data is processed by algorithms and machine-learning techniques. The quality of the human resources and the available software solutions in data analysis are key considerations for public sector organizations. The massive computing power necessary to process big data to build an AI system, and to utilize data-intensive machine learning systems such as deep learning, can also challenge organizations technically and financially. The effectiveness of AI-based solutions requires high-end processing power, and the cost of large infrastructure requirements is considered an impediment to the adoption of AI technology (Roberts, 2017). In addition, there is a high demand for a limited number of AI experts, which is associated with the increasing cost of education and salaries (Bughin et al., 2017). Cloud computing environments and outsourcing can mitigate some maintenance costs. Nevertheless, organizations need to plan in advance the cost and technical requirements to maintain higher computational speed along with higher availability of data to scale up their AI-based systems.
The output produced by the machine needs to be presented in a way that is easily interpretable by the user. A caveat on data visualization is the choice of wording and visuals in the analysis results. The confidence intervals of the results may vary depending on the employed algorithm or statistical techniques, but studies on human cognitive bias show that the framing of information and behavioral factors tend to affect the interpretation of results by the users (Kahneman, 2013). There are ongoing technical efforts to detect and remove bias from AI systems, but these efforts are considered to be in their early stages (Yeung & Lodge, 2019). A mathematical notion of fairness and technological fixes have their limits in overcoming systematic and cognitive biases in each social context (Selbst et al., 2019).
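The dependence of confidence intervals on the chosen statistical technique can be illustrated with a brief sketch. For the same observed proportion, the code below computes a Wald (normal approximation) interval and a Wilson score interval. The 95% confidence level and the example counts (18 correct predictions out of 20) are assumptions chosen for illustration and are not taken from any study cited here.

```python
import math

# Standard normal quantile for a two-sided 95% interval (assumed level).
Z_95 = 1.959963984540054


def wald_interval(successes, n, z=Z_95):
    """Normal-approximation interval, clipped to the range [0, 1]."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return (max(0.0, p - half_width), min(1.0, p + half_width))


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (centre - half_width, centre + half_width)


# Hypothetical example: 18 correct predictions out of 20 cases.
wald = wald_interval(18, 20)
wilson = wilson_interval(18, 20)
```

With a small sample and a proportion close to one, the two techniques yield noticeably different bounds for the same data, so the way a result such as "90% accurate" is worded and visualized can convey more certainty than the underlying evidence supports.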
Governance by AI
‘Governance by’ refers to the use of digital technology in policymaking and policy implementation. The following subsections will elucidate the governance challenges associated with AI-based solutions in addressing policymaking and policy implementation challenges in the OGD ecosystem.
Policymaking
Eggers et al. (2017) presume that AI can change the role of humans in the policymaking processes in four ways. First, AI can relieve public workers by taking over repetitive tasks, allowing public servants to focus on more valuable tasks. Second, AI can help to break up a job into smaller tasks and can take over as many of them as possible, improving the efficiency of policymaking processes. Third, AI can replace human agents by automating policymaking processes. Fourth, AI can augment the performance of public servants by complementing their skills and improving the effectiveness of policymaking processes. While each scenario implies different efficiency gains in policymaking, their disruptive effect on human agents in administrative and policymaking processes varies drastically.
By the same token, according to Scherer (2015), AI presents two risks in policymaking, namely the loss of predictability and the loss of control. Loss of predictability is caused by large gaps in processing capability between AI and human agents: AI can process huge amounts of data at speeds beyond the capabilities of human agents, making the results no longer comprehensible and verifiable for humans. Thierer et al. (2017), therefore, define AI-led information processing systems as a ‘black box’ for human end-users, reducing their role to that of data feeders and recipients of results without the ability to check the validity of the methods and criteria used in policymaking. Loss of control refers to the dislocation of human control in influencing the system operations. The self-learning mechanisms of AI allow it to reprogram itself for process optimization, and the personnel responsible for maintenance and monitoring of the system can partially or completely lose the ability to realign the system with policy objectives.
Furthermore, AI-led policymaking is constrained by legal, moral, and ethical framework conditions. Human decisions in public policy are political in nature, and soft skills such as ethical trade-offs, social rules, empathy, humanity, and conscientiousness have a conclusive influence on the outcome of decision-making processes and their subsequent evaluation (Wirtz & Müller, 2019). For the moment, AI technology lacks these human cognitive qualities and is limited in its ability to take over human roles in public decision-making processes (Andersson et al., 2012).
The use of AI in policymaking also raises concern about its behavioural impact on human-led decision-making processes. For instance, a study on an automated profiling system for unemployment claims in Poland has found that less than 1 in 100 decisions made by the algorithm have been questioned by the responsible clerk (Misuraca & Van Noordt, 2020). Behavioural factors such as lack of time to ponder the details, fear of repercussions from the supervisors, and a belief in the objectivity of the process appear as driving influences behind the behaviour of clerks, making human-led decision-making processes practically automated systems (Kuziemski & Misuraca, 2020).
Taking these potential challenges into consideration, Janssen and Kuk (2016) envisage that governance by AI can at best be used only for mundane tasks. Similarly, Eggers et al. (2017) underline that AI is most suited to handling repetitive, highly structured, and regulated work processes. They recommend that organizations regularly carry out a work structure and process analysis to map the respective areas of application for AI in policymaking. However, this caveat does not necessarily pertain to the capabilities of AI technologies, but rather to the task encroachment and accountability risks imposed by advanced AI solutions in public administration, as a growing number of use cases show that AI solutions are becoming increasingly capable of handling complex and cognitive tasks (Susskind, 2020).
Policy implementation
Not only in policymaking but also in policy implementation, AI can replace human agents in the delivery of public services and engage with end-users. AI-led bots and AI-powered digital interfaces can replace public servants especially for repetitive and predictable tasks. Many authors even highlight the risk that human agents will increasingly be replaced in healthcare, education, logistics, and organizational processes, which raises the threat of mass unemployment in various public sector areas (Susskind & Susskind, 2016).
Nonetheless, the wider usage of AI in policy implementation can also create new jobs and roles in public services. Skills in algorithms and the use of AI in systems have already become highly demanded skills in the job market (CEDEFOB, 2020). Not only technical staff but also public managers are required to enlarge their working capabilities through AI usage and to develop a better understanding of how AI can supplement the workforce to achieve better results faster (Eggers et al., 2017). The challenge, however, is that technical and managerial staff, as well as the people working in the jobs at risk, do not necessarily have the required skills and formal training, which adds to the pressure on human resource management in public administration.
The impact on the workforce is only one governance consideration for AI usage in policy implementation. AI safety and end-user behaviours are other concerns in the governance by AI. AI safety refers to assuring the secure performance and impact of AI (Boyd & Wilson, 2017). These safety concerns relate not only to issues of information security but also to complex and safety-critical situations in which the AI may learn negative behaviour from its environment and misunderstand its surroundings (Conn, 2017). Taking the necessary precautions and safety measures is important for the scope of AI applications. Bostrom and Yudkowsky (2014) underline the necessity for AI technology to be resilient against adverse manipulation by humans. For AI applications based on reinforcement learning and automated execution, safety measures need to be in place to avoid catastrophic consequences. It is also important to avoid negative side effects on the working environment during the learning process of AI applications (Amodei et al., 2016).
Lastly, studies show that the uptake of new services by AI-bots has not been particularly high, leading the authorities to believe that some form of proactive marketing would be necessary to change citizens’ behaviour (Kuziemski & Misuraca, 2020). Kuziemski and Misuraca (2020) assert that to pursue such projects further, local authorities need to develop user experience and awareness. However, user experience and awareness cannot be enhanced solely by marketing practices, and trust in public administration is important to facilitate the transition. In particular, recent studies suggest that public service processes have a significant impact on citizens’ trust in public administration, and that the absence of corruption is a strong institutional determinant of trust in public administration regarding the use of digital technologies (Migchelbrink & Van de Walle, 2019). Therefore, holistic approaches are needed in service design to leverage AI technology in the interactions between service providers and citizens.
A conceptual model to design AI-compatible OGD governance
This section presents a conceptual model to design an OGD governance model, which adopts a platform governance approach and integrates the governance needs derived from the use of AI.
The purpose of this tool is to systematically identify and analyze the interrelationships among multiple change factors on OGD governance design and to project available AI-based solutions for the OGD ecosystem by assessing the managerial, organizational, legal, technological, moral, and institutional variances. Through the recursive and reflexive analysis of each step, policymakers and system designers can develop reliable strategies in leveraging AI solutions for the use of OGD in public governance. The methodological description of each step is elucidated below through the practices undertaken during the development of the OGD governance model for the Belgian federal government in the fight against taxation and social security fraud.
Figure 1 presents the six steps in the design of an AI-compatible OGD governance. Each of these steps is elaborated on below.
Figure 1. The 6-step model in designing AI-compatible OGD governance.
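The recursive character of the model can be illustrated with a minimal sketch. It assumes, for illustration only, that the six steps are traversed in order and that the feedback loop leads from the last step back to the identification of contingencies; the names `Step`, `next_step`, and `run_cycles` are hypothetical and not part of the model itself.

```python
from enum import IntEnum


class Step(IntEnum):
    """The six steps of the model, in the order given in the text."""
    IDENTIFY_CONTINGENCIES = 1
    IDENTIFY_DATA_PROSUMERS = 2
    ASSIGN_GOVERNANCE_ROLES = 3
    IDENTIFY_DESIGN_VALUES = 4
    DESIGN_GOVERNANCE_OF_AI = 5
    DESIGN_GOVERNANCE_BY_AI = 6


def next_step(current: Step) -> Step:
    """Return the step that follows `current`.

    Assumption: after step 6 the design process returns to step 1,
    so that experience with AI use feeds the next contingency analysis.
    """
    if current is Step.DESIGN_GOVERNANCE_BY_AI:
        return Step.IDENTIFY_CONTINGENCIES
    return Step(current + 1)


def run_cycles(n_cycles: int) -> list:
    """List the steps visited over `n_cycles` full design iterations."""
    visited = []
    step = Step.IDENTIFY_CONTINGENCIES
    for _ in range(n_cycles * len(Step)):
        visited.append(step)
        step = next_step(step)
    return visited
```

Calling `run_cycles(2)` yields twelve entries, with the seventh entry again being the identification of contingencies, which reflects the periodic re-evaluation that the model asks for.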
Step 1: Identifying contingencies
According to Lee et al. (2018), the characteristics of platform governance are influenced by external and internal contingencies. External contingencies refer to the environmental context of a public sector organization, such as regulative framework, market structure, social and economic dynamics, political and institutional factors. For example, in the OGD context, some external contingencies can be related to the legislation and regulations governing the collection and reuse of publicly available data. External contingencies also pertain to the data feeders in the ecosystem. For instance, the sources of data and its formation (single or multiple sources), the role of public and/or private organizations in the ecosystem, and the data quality and control standards are some aspects of market structure.
External contingencies also influence the choices in internal contingencies. Compared to external contingencies, internal contingencies are the more immediate and direct causes of the characteristics of the platform. Internal contingencies are platform strategy, multi-homing strategy, governance configuration, open strategy, and platform maturity (Lee et al., 2018). Platform strategy pertains to the rules on access, control, data provenance, conformance, and monitoring of the quality of OGD. Multi-homing and open strategy respectively pertain to the rules of access and the reuse of OGD by the platform users. Governance configuration pertains to how a desirable behaviour in the platform is achieved based on authority- or trust-based formations. Lastly, platform maturity pertains to the level of participation and the critical mass of data accumulated in the platform.
Hence, the first step in OGD governance design is the identification of external contingencies that may constrain the available design choices in platform governance. External contingencies are used to assess the impact of internal contingencies such as existing strategies, internal and external rules, and structures in the data ecosystem. In the Belgian case, 66 expert interviews were conducted with various stakeholder organizations in the data ecosystem. Additionally, vignette studies were conducted with citizen panels to understand the trust determinants in data sharing and the use of data in predictive analysis. Following the data analysis, various ‘trust’ and ‘interoperability’ conditions were identified as contingency factors in the design of platform governance. It is important to highlight that the contingencies also depend on previous experiences with data-driven decisions and the use of AI applications in government (see the feedback loop in Fig. 1). For example, political scandals and previous cases with biased decisions in AI applications (e.g. SyRi Dutch child benefit scandal)1
Rechtbank Den Haag, 5 februari 2020, Zaak no C-09-550982-HA ZA 18-388, ECLI:NL:RBDHA:2020:865 (ECLI:NL:RBDHA:2020:1878 for the English version).
Step 2: Identifying data prosumers
Data prosumers are the organizations, individuals, and automated agents supplying raw and/or processed data in the platform and reusing the data for public, commercial, social, or academic purposes. In a platform ecosystem, the data prosumers can be public or private, and they can be communities or individual users that access and reuse the data available in the platform. This step is not only about identifying data providers (and beneficiaries of OGD) but also about assessing the existing capacities among prosumers, and about cultivating the data for the growth of the platform ecosystem. For example, in a system dominated by a few main data providers, the data storage and processing capacities of those organizations would eventually delimit the growth and the big data potential of the ecosystem. Depending on the technological choice in AI-based solutions, the role of data prosumers may vary. For example, in big data-led AI systems, private sector organizations holding large amounts of data may have a more important role in data production. In the Belgian case, a living lab approach with exploratory and co-creation processes was used to identify the key data prosumers and related capacity challenges in the platform governance. Eighteen participants from the key stakeholder organizations in the taxation and social security domains in Belgium discussed the main challenges and proposed solutions concerning the integration of big data and AI in the data ecosystem of the federal government.
The ‘exploratory’ phase of the living lab helped identify the challenges that the existing data ecosystem poses to the implementation of AI-led solutions, and the type of data prosumers needed to address various capacity challenges such as lack of resources and data expertise within the administrations. The co-creation phase was supported by three 4-hour-long collaborative workshops among the identified stakeholders and included both plenary and sub-group discussions to maximize free speech for all participants. Two scenario workshops were organized in which scenarios about possible governance designs were used as a catalyst to stimulate debate among stakeholders. These two workshops focused on the identification of the main challenges to the integration of AI and advanced algorithms in federal public organizations. Participants’ arguments were thematically analyzed, and the results fed directly into a third, solution-oriented workshop, which gathered the participants from the two previous workshops. This last workshop focused on the identification of solutions to the main challenges highlighted by the scenario workshops. A thematic analysis of the collected data allowed for the construction of the governance model prototype.
Step 3: Assigning data governance roles
Lee et al. (2018) identify four main data governance roles in a platform ecosystem: data committee, data manager, data owner, and data subject. The data committee is responsible and accountable for clarifying the role of data in the platform ecosystem (Khatri & Brown, 2010). It makes decisions about the purpose of data use, desirable behaviours, and appropriate governance mechanisms, in alignment with business goals. The role is generally taken by the orchestrator of the platform owner. The data manager refers to the role of data management in the platform ecosystem, including data collector, data steward, and data custodian. It is responsible for implementing data management tasks and verifying conformance with data governance rules. The role can also be shared with platform users, as they can monitor or audit the use of data. The data owner is an individual (or company) who owns data by uploading user content or a profile, or by providing the results of analytics jobs to the platform. The data subjects are the people whose personal data are used in the ecosystem and who are identified or identifiable (Art. 4.1 of the GDPR). The process of assigning data governance roles in an OGD ecosystem requires interactions between public and private stakeholder organizations. The level of trust vested in public and non-public actors, as well as the ownership of AI technologies, may determine the distribution of roles and power in platform governance. A key challenge for system designers is to create a transparent and inclusive governance mechanism for the distribution of data governance roles that engenders legitimacy and trust in public governance processes. In the Belgian case, the co-creation phase of the living lab was used to develop solutions to the previously identified challenges, and to match the solutions with specific roles in data governance based on the data prosumer roles and contingency factors (e.g. interoperability conditions) identified in previous stages.
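As an illustration of this step, the four roles can be represented as a simple mapping from roles to actors, which makes it easy to see which roles are still unassigned during a design workshop. This is only a sketch: the class `RoleAssignment` and the actor names in the example are hypothetical and do not describe the Belgian case.

```python
from dataclasses import dataclass, field

# The four roles named by Lee et al. (2018), as cited in the text.
ROLES = ("data committee", "data manager", "data owner", "data subject")


@dataclass
class RoleAssignment:
    """Records which actors hold which data governance role."""
    holders: dict = field(
        default_factory=lambda: {role: set() for role in ROLES}
    )

    def assign(self, role: str, actor: str) -> None:
        """Add `actor` to `role`; a role may be shared by several actors."""
        if role not in self.holders:
            raise ValueError(f"unknown role: {role}")
        self.holders[role].add(actor)

    def unassigned(self) -> list:
        """Return the roles that no actor holds yet."""
        return [role for role in ROLES if not self.holders[role]]


# Hypothetical example: the data manager role is shared with platform users.
assignment = RoleAssignment()
assignment.assign("data committee", "platform owner")
assignment.assign("data manager", "federal data steward")
assignment.assign("data manager", "platform users")
print(assignment.unassigned())  # ['data owner', 'data subject']
```

Storing a set of actors per role, rather than a single actor, reflects the point made above that a role such as data manager can be shared among several participants.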
Step 4: Identifying design values
Design values stem from the internal and external value propositions associated with the use of OGD and AI in public governance. Internal values represent the value of OGD and AI-based solutions to a particular organization based on organizational, technical, and managerial investment requirements and expectations. External values stem from the expectations of the platform users and beneficiaries regarding the (re)use of OGD and AI in public policy processes and service provision. For example, improving the efficiency and effectiveness of policymaking processes concerning tax fraud detection by leveraging machine learning algorithms is an internal value proposition. By contrast, protecting the privacy rights and business interests of platform users is an external value proposition. An effective governance design process needs to identify the value propositions associated with the way OGD is used and align them with data governance goals, in order of importance.
Undoubtedly, values are subjective to the actors involved in data governance, and the choice of particular value propositions might incur trade-offs and political costs. Therefore, it is not an easy task for policymakers to rank value propositions. It is also possible that different prosumers have contradictory value propositions or expectations in data governance. Therefore, the involvement of the prosumers at this stage is necessary to understand their value hierarchies and thereby to design a more legitimate platform governance. In the Belgian case, we utilized three sources of information to map out the various design values stemming from citizens, public sector organizations, and stakeholder organizations. Experiments were conducted to understand the impact of trust, privacy, and transparency considerations on the adoption of big data and AI solutions by the Belgian government. Thematic analysis was conducted on interview data with Belgian civil servants and stakeholders to understand the key value propositions for the adoption of big data and AI solutions. Living labs were utilized to identify value-related challenges and to rank solutions for the identified challenges.
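To make the idea of combining the value hierarchies of different prosumers more concrete, the sketch below aggregates several ranked lists into one ordering using a Borda count. The article does not prescribe any aggregation method, so the Borda count, the function name `borda_ranking`, and the example groups and values are all assumptions chosen purely for illustration.

```python
from collections import defaultdict


def borda_ranking(rankings: dict) -> list:
    """Aggregate ranked value propositions with a Borda count.

    `rankings` maps a prosumer group to its values, most important first.
    A value in position p of a list of n items receives n - 1 - p points.
    Ties in total points are broken alphabetically.
    """
    scores = defaultdict(int)
    for ordered_values in rankings.values():
        n = len(ordered_values)
        for position, value in enumerate(ordered_values):
            scores[value] += n - 1 - position
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical value hierarchies of three prosumer groups.
example = {
    "citizens": ["privacy", "transparency", "efficiency"],
    "tax administration": ["efficiency", "transparency", "privacy"],
    "stakeholder organizations": ["transparency", "privacy", "efficiency"],
}
print(borda_ranking(example))
# [('transparency', 4), ('privacy', 3), ('efficiency', 2)]
```

In this hypothetical example no group's first choice is shared by another group, yet transparency ranks first overall because every group places it high. Any such mechanical aggregation hides the trade-offs and political costs discussed above, so it can support, but not replace, deliberation with prosumers.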
Step 5: Designing the governance of AI
Preferences in the design values directly influence the choice of AI-based solutions to process the data in the OGD ecosystem. AI-based solutions (e.g. machine learning, advanced analytics techniques) may have different governance attributions in the system architecture choices, with different value propositions and trade-offs influencing the architecture design of the platform ecosystem. Therefore, the design principles and data governance goals set in the previous steps delimit the choice of AI solutions and the type of data (e.g. closed or private) brought into data acquisition and data processing. Additionally, considerations about technology readiness and maturity, associated organizational and human resource capacities, and the transparency, explainability, and interoperability of AI solutions are other important issues that can delimit the available choices in AI solutions and their inclusion in the OGD ecosystem. In the Belgian case, the solutions and choices made in the previous steps were tested through additional experiments with citizen panels and through online surveys with prosumers and key experts, using the Delphi methodology to test the feasibility, efficacy, suitability, sustainability, and simplicity of governance design choices.
Step 6: Designing the governance by AI
The last step in the governance design of an AI-compatible OGD ecosystem is evaluating the adaptability of policymaking and policy implementation processes to AI solutions. Techno-social power dynamics among data prosumers, control and incentive mechanisms to share and (re)use data in public policy processes, safety and privacy considerations, and the distribution of roles among automated and human agents in public governance must be considered. The policy choices on the use of AI solutions in the OGD ecosystem and their policy outcomes are expected to affect, in time, the roles of prosumers and their involvement in platform governance. It is vital to keep a citizen-centric view in addressing the governance challenges concerning the imbalance in access to the platform, to ensure the interpretability of open data, and to ensure the role of citizens as the controllers of how open data is used in AI-led policy processes. Eventually, it is important to envisage effective risk and change management mechanisms to adjust to changes in user behaviours. For that, there is a need for periodic re-evaluation of the relevance of the existing policy goals and contingencies by strategic planning and management teams, to ensure the sustainability of the OGD governance design.
The integration of OGD ecosystems with AI-based solutions can significantly improve the effectiveness of data-driven policies in public governance. This article first identified the existing governance approaches and key design considerations in the governance of an OGD ecosystem, and then the governance design challenges derived from the technological properties of AI-based solutions in the public sector domain. Through the synthesis of governance challenges associated with OGD ecosystem design and the use of AI in the public domain, a conceptual model was developed to act as a design artefact for policymakers to develop reliable governance strategies for exploiting AI and OGD in public governance. Lessons learned and practices used in the Belgian case were utilized to demonstrate the practical implementation of the conceptual model to design an OGD governance compatible with AI solutions. With this tool, policymakers and system designers can systematically analyze the compatibility of available AI solutions with the needs and requirements of data prosumers in an OGD ecosystem. Policy choices regarding values and contingencies depend on the area of application, the maturity of existing data governance systems, and institutional contexts.
AI systems in public governance have the potential to benefit all stakeholders, including practitioners, policymakers, and citizens. However, there may be an imbalance in the allocation of benefits if certain groups are not adequately considered during the design and implementation of these systems. For example, citizens who do not have internet access, digital skills, or social capital within their community may be at a disadvantage when it comes to using and interpreting data in a way that reflects their needs and contributes to improved public services. It is important to take steps to address these digital divides, such as providing training and resources to help individuals acquire the necessary digital skills and internet access, in order to ensure that AI systems in public governance are beneficial for all citizens. Additionally, involving a diverse range of stakeholders in the co-creation, co-production, and co-design of AI systems in public governance can help to ensure that citizens’ needs and perspectives are taken into account. To ensure the effectiveness of policy design processes using the proposed model, the implementation of effective feedback mechanisms and periodic assessments with relevant stakeholder organizations is important. For that, public sector organizations need to develop the necessary consulting and deliberation mechanisms.
Several issues require further deliberation regarding how AI can best be used to leverage OGD in public value creation. First, AI adoption in the public sector is influenced by various institutional, legal, cognitive, political, and technical factors that create a distinction between available and desirable technological solutions. Concerns about the transparency and accountability of machine learning systems, task encroachment on the role of humans in public administration, the lack of data analytics skills in the public domain, and biased datasets feeding learning mechanisms put a strain on the use of advanced AI solutions in the public sector. These ethical and moral concerns inevitably limit the effective integration of available AI technologies within OGD platforms.
Secondly, both OGD and AI governance are contingent upon the (mis)match of value propositions imposed by the various actors participating in platform governance. These value propositions are not fixed and are subject to changes in user behaviours and consequential events that can undermine trust in the role of public and private sector organizations in data governance. Previous research shows that inclusiveness, the digital divide in society, and power structures inside administrations are important factors in assessing conflicting value dimensions and the use of OGD in public governance (Meijer & Potjer, 2018; Altayar, 2018). Furthermore, the quality of existing datasets in the OGD ecosystem, and especially whether the gender data gap (Criado Perez, 2019) in publicly available datasets is addressed, can determine the trust in AI solutions. Such biases embedded in OGD datasets can have drastic implications for public value creation unless appropriate control mechanisms are set in place. Further research is needed to better understand the effective governance mechanisms in the use of AI and OGD solutions that engender citizen trust and support public value creation.
Third, for the moment, most data-sharing services are derived from centralized servers in the public sector domain and/or big data repositories controlled by profit-based organizations. This centralized constellation might be beneficial for the implementation of AI technologies to improve the efficiency of public service processes, but in the long run, might undermine the wider adaptability of AI solutions in public governance. The inclusion of other digital solutions such as self-sovereign identity and blockchain technologies might improve the quality of OGD and facilitate the adoption of more advanced AI solutions in public governance. However, for the moment, we lack an empirical and theoretical basis on how best to introduce these various technologies to leverage OGD in public governance. Further theoretical and empirical research is needed to understand the governance implications of these decentralized technologies in the use of OGD and AI in public governance.
Funding
The research has received funding from the Belgian Federal Science Policy (BELSPO) under the contract number B2/191/P3/DIGI4FED, as part of the BRAIN-be 2.0 program.
Author biography
Evrim Tan is a post-doctoral researcher at the KU Leuven Public Governance Institute. His research focusses on the use of digital technologies such as blockchain and AI in public governance processes. He is the research lead of several projects funded by the EU, Belgian federal government and KU Leuven research fund investigating the application of new digital technologies in different areas of public services. He is one of the Belgian representatives in European Blockchain Partnership, member of the European Blockchain Services Infrastructure for Belgium (EBSI4BE) and member of the academic advisory board of International Association for Trusted Blockchain Applications (INATBA).
