Abstract
The last decade has witnessed the development of Open Government Data programs with the promise of improved transparency, accountability and innovation. Capitalizing on those benefits translates into the development of better public policy and the promotion of economic development. Research in the domain has emphasized technical issues, and we still lack a clear understanding of the main conditions to promote successful Open Government Data programs. Using the experience of the US Federal Government, including projects in 5 federal agencies, we contribute to the literature by stressing the importance of OGD policies, stakeholder communities and data management practices. Future research should focus on understanding the governance and leadership models that enable effective implementation of the programs and engagement with relevant stakeholders and domain-specific communities.
Introduction
Access to government information to promote transparency and accountability has a long tradition in the US (AbouAssi & Nabatchi, 2019; Afful-Dadzie & Afful-Dadzie, 2017; Parks, 1957). About 10 years ago, however, President Obama’s Memorandum on Transparency and Open Government (Obama, 2009) and his Executive Order on “Making Open and Machine Readable the New Default for Government Information” (2013) created a new movement that promotes additional values such as collaboration and innovation through new uses of data (OMB, 2013). The main rationale behind the policy was to promote an environment where government agencies, working together with entrepreneurs, developers and citizens, would constitute an innovation ecosystem that would produce new data-driven products and services, promote economic development, and contribute to the solution of important problems affecting US citizens (Dawes et al., 2016; Pollock, 2011; Styrin et al., 2017; Zuiderwijk et al., 2014).
As a result, Federal Agencies in the US have made about 247,000 datasets available to the public through the data.gov portal (number of Federal datasets and other documents available at data.gov by February 2019).
Although these new developments are promising, understanding the conditions for the creation of successful open data programs for data-driven innovations is still a work in progress (Khayyat & Bannister, 2017). Moreover, research in open government data has favored the understanding of technical factors, paying less attention to other important aspects (Afful-Dadzie & Afful-Dadzie, 2017). In this paper, then, we contribute to the literature by responding to the research question: what are the main conditions to promote open government data programs? To answer the question, we use the experiences of five US Federal Agencies, and their reflections with open data users and NGOs, on the conditions to develop the US open data program: the US Department of the Treasury, the US Department of Transportation, the US Department of Energy, the US Department of Agriculture and the US Department of Labor. Primary data includes a series of Roundtables with the Office of Management and Budget (OMB), program managers from each agency and other stakeholders, facilitated by the Center for the Open Data Enterprise.
The paper is organized in 6 sections. Following this introduction, Section 2 provides an overview of the main concepts and principles used as a framework for comparison among cases. Section 3 describes the research methods. Section 4 introduces the main findings of the research. Section 5 discusses the cases presented in the paper. The sixth and last section presents conclusions and recommendations.
Data management and governance in the public sector constitute a profession and a field of study positioned at the intersection of the technical, managerial and policy domains (Andersen & Dawes, 1991). However, as suggested by a recent literature review, the Open Government Data (OGD) literature has been biased towards technical issues (Afful-Dadzie & Afful-Dadzie, 2017). In this section of the paper, we start with a definition of OGD, and continue with a discussion of a framework to understand the success of OGD programs that considers influential factors from the technical, managerial and policy domains. Concepts briefly presented in this section are intended to establish a framework to analyze experiences in the US Federal Government’s open data program.
The nature of Open Government Data programs
Open Government Data programs are value-oriented initiatives involving goals and objectives associated with the improvement of government services (Meijer & Thaens, 2009), promoting innovation and scientific development (Bean, 2018; Janssen et al., 2017), increasing transparency and accountability (Peled, 2011), as well as promoting economic growth (Zeleti et al., 2016). Data is conceptualized as a key resource that will create value through its use and reuse. OGD is commonly defined as “data produced with public resources and made publicly available with a license that allows for reuse and repackaging in innovative applications” (Janssen et al., 2012).
Benefits from open data are usually expected to come from innovation, social and commercial value, improved government transparency and accountability, or increased participation (Attard et al., 2015). Perceptions of potential benefits from OGD are not always aligned. Some scholars believe that opening government data can result in a more equitable and sustainable society (Davies & Perini, 2016), while a few of them fear that opening data would benefit the already empowered sections of society, widening the gap between the privileged and the marginalized population (Meijer et al., 2014).
Previous research suggests the existence of a multitude of challenges and impediments to OGD initiatives as well as several success factors (Conradie & Choenni, 2014; Janssen et al., 2012; Zuiderwijk et al., 2014). In general, the literature identifies three broad categories of barriers to OGD programs: data access, data use and data deposition impediments (Zuiderwijk et al., 2012). Data access impediments refer to problems associated with data availability and findability; data use impediments are related to problems with usability, understandability, quality, compatibility and metadata; and data deposition impediments refer to difficulties in the interaction with the data provider and challenges in releasing the datasets. A main issue in this last category is poor communication between the end-user and the data hosting agency.
Other scholars include organizational and policy factors such as the lack of resources, legal issues, rigid hierarchical structures, or political control as additional barriers to open government data programs (Sandoval-Almazan & Gil-Garcia, 2016). Managing access rights for government data can be a difficult task. Depending on the type of information, mistaken decisions can lead to violations of privacy, business secrecy, national security, or a combination of all (The Open Data Institute, 2015).
Towards a framework to understand OGD success
The literature identifies six conditions for successful use of OGD: quality of data, legislation and policy, skills, infrastructure, availability and privacy (Safarov et al., 2017). In the following paragraphs, we will discuss key issues related to these conditions.
Data quality is a major potential factor in the success of open data initiatives from both practitioner and academic perspectives (Martin et al., 2017). In fact, data quality elements such as completeness, timeliness, machine-readability or accessibility are key components of practitioner-oriented definitions of OGD.
Data quality has long been an important area of research in the information systems literature. Traditional dimensions include intrinsic factors of data quality such as accuracy or believability, contextual elements such as relevancy or timeliness, representational factors such as interpretability, and access issues such as availability (Wang & Strong, 1996). The OGD literature adopts these classic dimensions, in particular intrinsic and contextual factors of data quality (Martin et al., 2017). Given its relevance for interpretability, metadata quality is a recurrent topic of data quality in the OGD literature (Martin et al., 2017; Safarov et al., 2017).
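To make the contextual and metadata dimensions above more concrete, the following minimal sketch scores a single dataset record for metadata completeness and timeliness. The list of required fields, the one-year freshness threshold and the sample record are illustrative assumptions, not part of any official OGD metadata standard or of the studies cited above.

```python
from datetime import date

# Hypothetical required metadata fields (an assumption for illustration,
# not taken from any official schema).
REQUIRED_FIELDS = ("title", "description", "publisher", "modified", "license")


def completeness(record: dict) -> float:
    """Share of required metadata fields that are present and non-empty."""
    filled = sum(
        1 for field in REQUIRED_FIELDS if str(record.get(field) or "").strip()
    )
    return filled / len(REQUIRED_FIELDS)


def is_timely(record: dict, today: date, max_age_days: int = 365) -> bool:
    """True if the record was modified within max_age_days before `today`."""
    modified = record.get("modified")
    if not modified:
        return False
    age = (today - date.fromisoformat(modified)).days
    return 0 <= age <= max_age_days


# Made-up record: the description is empty, so one of five fields is missing.
sample = {
    "title": "Monthly spending summary",
    "description": "",
    "publisher": "Example Agency",
    "modified": "2019-01-15",
    "license": "CC0-1.0",
}

score = completeness(sample)
timely = is_timely(sample, today=date(2019, 2, 28))
print(f"completeness={score:.2f}, timely={timely}")
```

A score of this kind only captures whether metadata exists and is recent; intrinsic dimensions such as accuracy or believability would require comparison against an external reference.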
Legislation and Policy refers to plans, strategies, practices, laws and regulations to control the creation, processing, transfer, distribution, use and destruction of data (Braman, 2009). OGD policies and regulations also shape the interrelationships between the different actors and stakeholders (Dawes et al., 2016; Styrin et al., 2017). Government information policy can be categorized in three ideal types: value-oriented policies, which operationalize fundamental principles of information flows in society (e.g. freedom of information); instrumental policies, which employ information as a means to achieve other policy goals (e.g. environmental policy); and managerial policies, which specify rules and procedures for managing information (e.g. information security) (Dawes, 2017).
Information policies look for a balance between the meta-principles of ‘stewardship’ and ‘usefulness’ (Dawes, 2010). Information stewardship emphasizes the responsibility of public officials and government organizations for handling information with care and integrity, while information usefulness “recognizes that government information is a valuable asset that can generate social and economic benefits through active use and innovation” (p. 380), and thus leverages public access to government information, encourages public-private partnerships, and allows information reuse for new purposes. In this sense, OGD policies look to establish a balance between openness and other information values such as privacy or national security, rather than allowing one to dominate the other (Dawes & Helbig, 2015; Etzioni, 2000). In the context of OGD, information openness conflicts with other information values such as privacy at the individual level, business secrecy at the corporate level, and even national security. There is continued contention about which data can be fully open and which data represent risks of privacy violations or threats to national security. These discussions are not likely to be over, and different stakeholders will continuously push the conversation in the direction aligned with their values and interests (Galvin, 1997).
The third condition for OGD success involves the necessary skills to curate and publish datasets, as well as the necessary skills to re-use them in innovative applications. Such skills are built around actors and stakeholder communities (Gascó-Hernández et al., 2018). In the context of OGD, the term ecosystem has been used to describe such communities (Harrison et al., 2012; Najafabadi & Luna-Reyes, 2017; Styrin et al., 2017). The OGD Ecosystem perspective is trying to capture the big picture in which all the important open-data actors are interacting in an interrelated system. It captures interactions between OGD community actors in a chain of value-creating interactions, developing the necessary skills to take advantage of open data. This chain starts from the open datasets residing in a governmental agency, all the way to the end-user beneficiaries in the society, and feeds back to the actors in the governmental sector. Through this feedback loop, the aforementioned OGD goals can be met (Helbig et al., 2012; Zuiderwijk et al., 2014).
In other words, instead of a one-way perspective on OGD initiatives, scholars have employed the OGD Ecosystem perspective to suggest that as society and the government interact, the benefits for both the government and society can be leveraged (Harrison et al., 2012). Visions of innovation in these open ecosystems involve what Pollock (2011) called “data cycles”, implying the processing and use of OGD by some community members in the ecosystem, who then make the new versions of the data resources available to the community as open data again, to be improved and re-used by other members of the community. There is a learning curve for all actors in the open data community. Thus, the more open data applications are created and used, the more knowledgeable people become on how to take advantage of them (Lee et al., 2016).
Technical structures and infrastructures constitute yet another important condition for the success of OGD programs (Safarov et al., 2017). They include technical platforms, software, hardware and standards that both enable and constrain the ways in which information can be made available and exchanged among different actors and stakeholders. Basic exchange of data over the Internet, for example, is enabled by hardware and data exchange protocols and standards. Technical infrastructures also enable (and constrain) government information policies. Many OGD projects, for instance, are enabled (or challenged) by the status quo of the existing technologies and standards. Socrata, CKAN, DKAN and OpenDataSoft are examples of platforms that provide different capabilities for the final user and also a variety of configurations for managing open datasets on the administration side.
A simplified framework to understand Open Government Data programs.
In this paper, we propose that OGD programs are at the intersection of the four components represented in Fig. 1 and discussed in this section: Technology, Community, Policy and Data. Such components constitute basic conditions for OGD success. Technology includes the structures and infrastructures required to establish the OGD program, community involves a network of stakeholders and government organizations developing the necessary skills to take advantage of OGD, policy includes the laws and regulations to establish a balance between stewardship and usefulness, and data involves the aspects associated with data quality and accessibility.
The research method used in this paper is based on the Case Study Approach (Yin, 2003). Case studies are rich, empirical descriptions of particular instances of a phenomenon, which provide the required empirical evidence for theoretical constructs, propositions and the development of midrange theory (Eisenhardt & Graebner, 2007). They aim to uncover the dynamics of a single phenomenon or a set of interrelated phenomena in a single setting. However, unlike the laboratory experiments that aim to isolate the phenomena from their context, case studies observe the phenomena in the context of the real-world environment.
In case study research, data analysis starts as soon as data collection begins, in an iterative manner, to make sense of the data as soon as possible, but also to collect data on emerging themes more effectively. This alteration in the data collection method is legitimate as the researcher aims to understand the case “in as much depth as is feasible” (Eisenhardt, 1989). The cases are selected based on their theoretical relevance, and the concepts and constructs from the existing theory guide the coding process (Eisenhardt, 1989; Glaser & Strauss, 1967). Case study research is most appropriate either where the theory is in its early stages or where the researcher aims to provide a fresh perspective on an established research topic (Eisenhardt, 1989; Yin, 2003). This type of case-based research typically answers ‘how’ and ‘why’ research questions in unexplored areas particularly well, while it is ill-equipped to address ‘how often’, ‘how many’, and the relative empirical importance of constructs (Eisenhardt & Graebner, 2007). It can produce “concepts, conceptual frameworks, and propositions or possibly mid-range theory” (Eisenhardt, 1989).
The case studies in this research involve US Federal Agencies that engaged in Open Government roundtables with the Office of Management and Budget and facilitators from the Center for the Open Data Enterprise. Primary data for each case was the report produced as a result of the roundtable. Each roundtable included representatives of the agency OGD program, representatives of OMB, representatives from key NGOs involved in the promotion of OGD use and other stakeholders interested in a specific domain area in each agency. Data from each report was complemented with data from agency websites and open data portals. Reports were scanned to select for the research those cases that included the most interesting information and projects. The cases studied in this paper are the US Department of the Treasury, the US Department of Transportation, the US Department of Energy, the US Department of Agriculture and the US Department of Labor. In addition to these five Federal agencies, we used published reports, previous research and government documents to better understand the evolution of the OGD initiative in the US with a focus on Technology, Policy, Community, and Data. Those categories guided data gathering and analysis across all cases.
Findings: Experiences on Open Government Data in the US
In this section of the paper we will introduce our main findings including the experiences of the five agencies included as cases. We will start by describing the context of the OGD initiative in terms of the main categories described in the previous sections.
Although the OGD movement has only emerged in the last couple of decades, open government has a longer tradition in US history, both through laws and regulations and through other policy documents such as executive orders. In terms of Federal Statutes (see Table 1), OGD can be traced back to the Constitution and the Bill of Rights, which established a principle of openness by both Congress and the Executive Branch. The Congressional Record, established in 1873, constitutes a daily registry of Congress activity. The Federal Records Act established principles for data management at the Federal level, explicitly including electronic records as of 2014. The 1960s and the 1970s witnessed the publication of several Freedom of Information Acts with the purpose of increasing transparency in the government. These acts were amended in 1996 to include the use of the Internet in the process of requesting access to government information.
Selected US statutes related to Open Government Data
The Government Performance and Results Act of 1993 was the first effort to introduce data-driven performance management and decision making. Another important statute was the Data Quality Act of 2001, which gives the Office of Management and Budget the capacity to develop guidelines to improve data quality in Federal Agencies. The Digital Accountability and Transparency Act (known as DATA) has the goal of making government expenditures more accessible and transparent. Finally, the latest approved statute relevant for OGD was the Foundations for Evidence-Based Policymaking Act, which gives the force of law to the directive on OGD and requires federal agencies to establish senior leaders on evaluation and data management who promote the curation of datasets and innovation in the evaluation of public policy. The Act also emphasizes data management practices and data quality, as well as the analytic capabilities of each agency.
Selected directives and OMB Guidance related to Open Government Data
In terms of executive directives and OMB guidance related to OGD, the history is not as long (see Table 2). The selected directives included in the table are exemplars that show the tension between openness and privacy or national security, which are all important social values. We would also like to note that, although the Open Government Directive (M-10-06) had a wide scope on Transparency, Participation and Collaboration, the Open Data Policy (M-13-13) and the Open Data Action Plan of 2014 emphasized entrepreneurship and economic development, which appears to remain the main emphasis of current OGD policy (Howard, 2017). The President’s Management Agenda of 2018 has resulted in a process to develop a Federal Government Data Strategy to improve coordination in the use of data to serve the public.
In terms of technology, the OMB has developed a metadata standard and a web platform to provide links to all open government datasets that use the same metadata standard. Data.gov – the federal government open data repository – continuously harvests the World Wide Web, looking for datasets stored on agency websites, and then publishes links to all open data websites and portals to improve the accessibility and findability of data (see Fig. 2). Data.gov also includes a set of Application Programming Interfaces (APIs) that developers can use to connect to the datasets or their metadata. Another important resource developed by the Federal Government is the repository of OGD tools and best practices called Project Open Data.
Data.gov, the Federal data repository.
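As an illustration of how a developer might work with the metadata APIs mentioned above, the sketch below builds a dataset search request and extracts titles from a response. It assumes that the data.gov catalog exposes a CKAN-style `package_search` action at the URL shown; that endpoint, the helper names and the sample response are assumptions for illustration and should be checked against the current data.gov documentation. No network call is made here.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: CKAN "package_search" action on the data.gov catalog.
BASE_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query: str, rows: int = 5) -> str:
    """Build a search URL for datasets matching `query` (hypothetical helper)."""
    return f"{BASE_URL}?{urlencode({'q': query, 'rows': rows})}"


def extract_titles(response_text: str) -> list:
    """Return dataset titles from a CKAN-style JSON search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [item["title"] for item in payload["result"]["results"]]


# Made-up response that follows the general CKAN response shape.
sample_response = json.dumps(
    {
        "success": True,
        "result": {
            "count": 2,
            "results": [
                {"title": "Example energy dataset"},
                {"title": "Example transportation dataset"},
            ],
        },
    }
)

url = build_search_url("energy consumption")
titles = extract_titles(sample_response)
print(url)
print(titles)
```

In practice the URL would be fetched with an HTTP client and the returned text passed to `extract_titles`; keeping the parsing separate from the request makes the logic easy to test offline.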
In terms of community, the US Federal government is also a member of the Open Government Partnership (OGP), a global multilateral initiative that groups 75 national governments and 15 sub-national governments.
In the following sections, we briefly describe the experiences of 5 federal agencies. There are common features to all projects (see Table 3); the most obvious relate to the importance of EO 13642 and M-13-13 as key enabling policies. In addition, leadership from OMB and the National CIO has also been key for the implementation of the OGD directives. The Project Open Data platform is an example of the ways in which OMB and the CIO keep leading the federal OGD initiative. Another commonality of all projects is the interest in using open software, open standards and open APIs. All projects in the table subscribe to these same principles. There is also an interest in engaging with relevant communities, although to different extents in each agency. Engagement is present in simple forms such as a feedback form in the data repository, but also involves hackathons, workshops or roundtables.
Key dimensions of OGD across US federal agency projects
The US Department of the Treasury is one of the federal agencies in the US with the longest tradition in opening data. In 1789, the Department of the Treasury for the first time published the Monthly Treasury Statement (MTS), a summary of the receipts and outlays of the Federal Government. The Department of the Treasury publishes other periodical reports such as the Daily Treasury Statement and the Monthly Statement of the Public Debt, among others. A main source of data for these periodic reports is the Central Accounting System, which in turn receives information from all Federal Agencies. Some of these reports are mandated by Congress, which uses them to monitor actual spending, find data discrepancies and make better analysis and decisions on budget proposals every year.
Government financial data has the potential to create value by promoting a more transparent and accountable government, making it possible for users and innovators to monitor the different ways in which tax money is being invested and spent. Spending data can be useful for all types of recipients of federal funds, including state and local governments, NGOs and private contractors, who can better understand trends in federal spending, sources of grants and other federal funding.
The Digital Accountability and Transparency Act of 2014 (DATA Act) is the first open data law in the United States to make federal spending data open. DATA expands the Federal Funding Accountability and Transparency Act of 2006, with the main purposes of (1) disclosing federal agency expenditures and linking them to specific programs, (2) establishing government-wide standards to report financial information, (3) simplifying reporting requirements and (4) improving the quality of the data.
At the technical level, and following the mandate of the DATA Act, OMB has successfully developed the DATA Act Information Model Schema (DAIMS). DAIMS is a set of standard definitions that are already being used by federal agencies to report their spending information at the USASpending.gov platform (see Fig. 3). The development of both platforms has been a participatory process involving feedback from federal agencies and open to feedback from users of the platforms themselves. The community around this project includes state and local governments, NGOs and private companies that are recipients of federal funds.
Beta USASpending.gov site. Current working site can be visited at www.usaspending.gov.
The US Department of Transportation was established in 1967 with the mission of “serving the United States by ensuring a fast, safe, efficient, accessible and convenient transportation system that meets our vital national interests and enhances the quality of life of the American people, today and into the future.”
Transportation data has the potential of creating economic value in many different forms, including logistics and route planning, as well as important social value by contributing to policy makers’ understanding of safety and the development of a more robust transportation system. The Department of Transportation open government plans reflect a clear understanding of the value of the data and the importance of building standards and platforms to improve data sharing.
Unlike the Department of the Treasury, there is no specific legislation regulating the opening of data in any of the areas of the Department of Transportation. In this sense, the Department has only been guided by the presidential initiative and guiding memoranda for open government data. Following this guidance, the Department of Transportation developed its Open Government plans until 2016, the last year in which the plan was formally required by a presidential executive order. Because of the emphasis of the Department of Transportation mission on safety, it has played a key role in many of the projects and communities promoted by President Obama.
The Department of Transportation has been involved in many different technical projects associated with OGD. For example, the Department of Transportation was in charge of leading the development of the National Address Database, which was conceptualized as a key resource to manage emergency services and to develop the new generation of 911 services. The project was one of the commitments of the Open Government Partnership. Other technical projects in which the Department of Transportation has been involved include the Permitting Dashboard, an online tool to track permit and review processes of large infrastructure projects.
Collaborative Geo-spatial platform of the US Federal Government.
In terms of community engagement, the Department of Transportation has been one of the leading agencies in building a community around safety data. The community was very active in the development of public engagement events named “Safety Datapalooza,” as well as the development of the safety themes in data.gov. The community has been successful in publishing relevant datasets for citizens such as the dataset on product recalls. The level of activity of the community has declined in the last couple of years, but the data and resources are still accessible through the data portal.
The Department of Energy was founded in 1977, consolidating in a single agency all energy programs in the federal government. The Department was created as a result of a series of re-organizations that responded to the Energy Crisis of the 1970s. The mission of the Department of Energy is “to ensure America’s security and prosperity by addressing its energy, environmental and nuclear challenges through transformative science and technology solutions.” The Department of Energy also has a longstanding tradition of information dissemination through the Energy Information Administration, which is committed to gathering, analyzing and disseminating energy data to improve policy making in the area of energy. Its information and analyses are commonly used by industry, academia and nonprofits.
Data made available through the Department of Energy projects and partners is valuable to guide policy-making in energy production, and creates value for private organizations that work in the development of energy technologies as well as in the production and transmission of energy. As already demonstrated by projects in the Department of Energy, data is also valuable to help final users (consumers) improve their energy use behavior, ultimately also reducing the environmental impact of our daily activities.
In terms of specific information policy relevant for the Department of Energy, efforts also respond to the Executive Orders and OMB memoranda on open data and smart disclosure. Smart disclosure was another initiative of President Obama that looked to open data from both public and private sources in cases where these data could help consumers make better decisions, while taking care to protect consumer privacy as well as industry proprietary data (for a description of Smart Disclosure, see OMB, “MOU for the Heads of Executive Departments and Agencies – Informing Consumers through Smart Disclosure,” and Sayogo et al., “Going Beyond Open Data”).
OpenEI wiki platform.
The Department of Energy has used technical infrastructures and platforms in innovative ways. Data is organized around communities associated with different types of energy sources such as geothermal, solar, wind, etc. For example, the Department of Energy has supported the National Renewable Energy Laboratory in developing OpenEI – a wiki platform to share energy information and data. Another community is organized around the regulatory process for alternative energies: the Regulatory and Permitting Information Desktop (RAPID) Toolkit. The RAPID Toolkit is another wiki environment to share permitting guidance, regulations, contacts, and other relevant information for energy projects. Finally, the US Energy Information Administration also uses open data and an API to make energy data more useful and accessible to a variety of users.
Besides the communities described above, the Department of Energy has played a leading role in one of the most interesting and successful cases of open data in recent years: the Green Button initiative. The Green Button initiative is an industry-led smart disclosure initiative. Its main assumption is that value can be extracted from individual energy consumption patterns by using them to give consumers specific recommendations on how to change those patterns and save energy. The value can be extracted only if individual data is aggregated in a single repository with data from other individuals and analytical methods are applied to understand patterns. The Department of Energy acted as one of the facilitators of the process, helping to find solutions to problems related to consumer privacy and utilities’ proprietary data. Consumers download their personal energy usage data from their utility company portal, and then share their data with a solution provider, who aggregates individual data to give advice to their clients.
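The aggregation step described above can be sketched as follows: individual usage readings are pooled, and each consumer's total is compared with the average across all consumers to produce a simple recommendation. This is only an illustration under stated assumptions; the data layout, the 20% threshold and the function names are hypothetical, and the sketch does not reproduce the actual Green Button data format or any solution provider's method.

```python
from statistics import mean


def peer_comparison(usage_kwh: dict, customer: str) -> dict:
    """Compare one customer's total usage with the average over all customers.

    `usage_kwh` maps a customer id to a list of monthly readings in kWh.
    The average includes the customer being compared.
    """
    totals = {name: sum(readings) for name, readings in usage_kwh.items()}
    peer_average = mean(totals.values())
    own_total = totals[customer]
    return {
        "own_total": own_total,
        "peer_average": peer_average,
        "ratio": own_total / peer_average,
    }


def recommendation(ratio: float, threshold: float = 1.2) -> str:
    """Return a simple message; the threshold is an arbitrary assumption."""
    if ratio > threshold:
        return "Usage is well above the pooled average; consider an energy audit."
    return "Usage is in line with the pooled average."


# Made-up monthly readings for three anonymous households.
pooled_usage = {
    "household_a": [300.0, 320.0, 310.0],
    "household_b": [200.0, 210.0, 190.0],
    "household_c": [250.0, 260.0, 240.0],
}

result = peer_comparison(pooled_usage, "household_a")
advice = recommendation(result["ratio"])
print(result)
print(advice)
```

The example also shows why aggregation matters: a single household's readings carry no benchmark on their own, so the recommendation only becomes possible once data from several consumers is pooled.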
The Department of Agriculture (USDA) was founded in 1862 by President Lincoln with a very wide mission in the areas of food, agriculture, economic development, science, and natural resource conservation. USDA research and reporting have played an important role in supporting universities, research centers and farmers in developing better agricultural practices and innovation. Data and information resources associated with the mission areas of the Department create value for the development of rural areas, increasing food safety as well as resiliency in case of natural disasters. Agricultural research data has a long tradition of being transformed into practice in the US through the extension function of the land-grant universities, which were created in the late 19th century.
Data gathering and dissemination initiatives at the Department of Agriculture have been motivated by a series of laws and regulations related to the land-grant institutions, such as the Smith-Lever Act of 1914, which established the extension services in land-grant universities, and the USDA Reorganization Act that created NIFA in 1994. More recently, as in any other federal agency, open data initiatives have responded to the executive orders of President Obama. The Department of Agriculture's OGD activities have also been guided by President Obama's participation in, and commitments to, the development of the Global Open Data for Agriculture and Nutrition (GODAN), an international initiative to promote open data to ensure world food security.16
The USDA has developed a series of technical platforms to share OGD relevant for its mission. Three platforms are at the core of their OGD program. The first platform, named Discovery Tool for New Farmers,17
Ag Data Commons at the National Agricultural Library.
These three platforms also suggest the type of community engagement involved in the USDA OGD initiatives. The Discovery Tool represents engagement with farmers and final users of USDA data, and the Ag Data Commons reflects engagement with the research community and other funding agencies. Each platform is at the center of a different type of OGD ecosystem. It is also important to note that USDA is involved in an international open data community through the Global Open Data for Agriculture and Nutrition.
The Department of Labor was founded by President William Taft in 1913 as a result of pressure by organized labor to have a voice in the cabinet.19
Data gathering and reporting is an essential part of the Department of Labor's mandate. In this sense, the Open Data Directive served only as a catalyst for activities that were already part of the Department's work. Besides the information published at Data.gov, as every other federal agency does, the Department of Labor maintains a couple of other interesting open data platforms. The Bureau of Labor Statistics page is of course one of these specific projects at the Department.20
My Next Move Web portal, a project of the Department of Labor Employment and Training Administration.
Natural users and potential members of the community associated with these technology platforms include employers, employees and job-seekers, and intermediaries such as counselors and educators. National and international users make intensive use of the technical platforms described in the previous paragraph. However, the most explicit effort of the Department of Labor to create a community is through the WorkforceGPS platform.22
The cases introduced in the previous section represent a set of examples of the type of policies, projects and communities that have been formed or strengthened through the OGD initiative in the US. Although some of these practices can be traced back to movements that promoted a more transparent, accountable and democratic government in the late 19
The new regulation, on the other hand, introduces the figure of the Chief Data Officer in all federal agencies and creates an Advisory Committee on Data for Evidence Building. Leadership provided by senior managers of this nature is likely to revitalize the program and promote better data management practices and policy innovations. Moreover, the formulation of the federal data strategy, which includes the promotion of OGD, will help create momentum in this new round of innovation.
Another finding from the reflection on the 5 projects is that OGD is not a monolithic program, but a diverse set of communities and groups organized around topics and domain-specific problem areas. For example, USASpending.gov is a community of federal agencies, recipients of federal funds, policy makers, watchdog organizations and citizens interested in different aspects of US federal spending. The DATA Act, DAIMS and USASpending.gov, absolutely key for this community, have little relevance to the community of new farmers gathered around the Discovery platform of the USDA, or to the private partners and citizens around the Green Button initiative. In this sense, the OGD ecosystem can be understood as the aggregate of all these small (or not so small) communities. Given the diversity of datasets that can be made public in each federal agency and across the US Federal Government, understanding the structure of such communities is a key success factor for OGD programs. One of the weaknesses of the US OGD initiative, however, is the lack of a more coordinated engagement by government agencies with each of these communities to better understand data needs and requirements. Nonetheless, some NGOs have played an important role in promoting these communities, as well as conversations and collaboration among government and stakeholders in the user community: the GovLab,23
To date, the OGD information policy in the US has promoted the development of an ecosystem in which federal agencies have contributed to the development of a shared repository of data resources, data.gov. The Director of OMB and the Federal Chief Information Officer have played a key leadership role in developing the necessary technology standards for data.gov, and the General Services Administration has successfully managed the data portal. In developing this repository, the US Federal Government has consistently favored open standards and open source software.
Besides their contribution to the data.gov repository and their inventories of data assets, each federal agency has engaged in specific OGD projects. The exemplars included in this report suggest that projects share common features, such as the preference for the use of open standards and open platforms.
In this final section of the paper, we would like to summarize our main contribution by answering the question guiding the research: what are the main conditions to promote open government data programs? In addition, we would like to offer our own reflection on the road ahead for the open data policy in the US and its practical implications. We finish the paper with some recommendations for future research.
There are four conditions for successful OGD programs: Policy, Community, Data Quality and Technology. Policy is most effective when introduced as a combination of formal laws and executive directives. Effective policy also provides the governance and leadership needed to make decisions and execute the plan. In terms of community, the US experience suggests that it is most useful to think of domain-specific groups with the interests and needs to produce innovations and solutions to their main problems. Data quality is at the heart of any OGD application, and it is the main concern of every community. Finally, technological platforms and standards are basic infrastructures that enable the OGD program. Although all cases presented in the paper include the four elements to a certain degree, the most successful cases are those that involve the most developed policy frameworks as well as those with the most engaged stakeholder communities.
Current developments in regulation, policy and strategy at the US Federal Government are promising and have the potential to revitalize the OGD program, which has experienced a period of stagnation in the last couple of years. The definition of governance bodies such as the Advisory Committee on Data for Evidence Building, as well as the creation of the position of Chief Data Officer in all government agencies, has the potential to provide the necessary leadership for this renewed program. The federal data strategy may provide the basis for the data management principles and practices needed to promote data quality. Nonetheless, all these developments are still at a very early stage and constitute more of a promise than a reality. The road ahead involves several challenges that need to be overcome in the following areas:
Data management. The quality of published data is a reflection of data management practices. Continuously improving and updating these practices is a challenge for every agency.
Human capital. Data management and analysis require a workforce with a combination of technology and data analysis skills that is scarce in the market.
Data quality. Establishing and implementing processes to ensure data timeliness, accuracy and completeness.
Data integration and interoperability. Linking data increases its value, but requires effort in developing standards and curating data.
Lack of resources. No new resources to manage OGD projects have been added to any federal agency. At least one of the projects reported here did not advance because of a lack of resources.
Developing and updating standards. Data and metadata standards are not easy to develop and enforce in environments with multiple stakeholders. Moreover, in changing environments, standards need to be continuously updated.
Making data more discoverable and accessible. Making data available in a way that a variety of users can find and use, and developing the right filters and visualizations to facilitate navigation.
Engaging with data users. Communicating with key users about data requirements and applications, as well as finding effective ways of getting their feedback, has been perceived as a challenging task.
Data volume and velocity. The amount of data and research being produced across the projects reported makes it a challenge to keep data timely and updated.
Data digitization. There are a number of data resources in older media or in unstructured form that need to be digitized and made available in machine-readable formats.
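As an illustration of what processes for data completeness and timeliness might look like in practice, the following minimal sketch computes two simple indicators for a dataset. The record layout, field names, and age limit are assumptions made for the example; they do not describe the practices of any of the agencies discussed here.

```python
from datetime import date


def completeness(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(record.get(field) not in (None, "") for field in required_fields)
        for record in records
    )
    return complete / len(records)


def is_timely(last_updated, today, max_age_days):
    """True when the dataset was updated within the allowed number of days."""
    return (today - last_updated).days <= max_age_days
```

Indicators of this kind could be computed routinely and published alongside each dataset, which would give user communities a simple basis for feedback on data quality.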
Future research should focus on identifying effective governance and leadership models, as well as on the impact of data management practices on the quality of data. Finally, research should continue to focus on identifying better ways of engaging in the development of domain-specific communities of stakeholders and users of OGD. Models promoted by NGOs such as the GovLab or the Center for Open Data Enterprise should be further evaluated and explored.
Acknowledgments
The research reported here was partially funded by the IBM Center for the Business of Government. Any opinions expressed in this material are those of the author and do not necessarily reflect the views of IBM.
