Abstract
Data analytics (DA) use has frequently been considered a transformational practice in the public sector, particularly in terms of its potential for data-driven decision making by local governments. Despite growing interest from academics and practitioners, empirical research on what cities are actually doing with DA, and how, is still relatively scarce, particularly research focused on specific activities and processes. Based on a socio-technical view, this paper examines a local government’s experience with DA use and how DA can be seen as a process and transformational practice. Results reveal that DA practices go beyond the use of specific technologies and involve collaborative efforts to formulate meaningful problems and prepare data for specific uses. Indeed, problem formulation is an essential step where collective discussions and assessment by technical and nontechnical DA practitioners occur. Our findings suggest that (1) data analytics viewed as a socio-technical process involves data management and analysis, but also collaborative processes between multiple city agencies; (2) collaborative efforts go beyond data analytics and include data collection, data representation, and problem formulation; and (3) formulating the problem in a collaborative manner could be considered an important first step when using data analytics for decision-making in local governments. Furthermore, collaborative problem formulation in DA seems to be affected by specific variables related to the collaborative effort such as leadership, governance, and technology.
Introduction
Data analytics (DA) has attracted the interest of academics and practitioners over the last few years. From business (Côrte-Real et al., 2019) to public policy (Gil-Garcia et al., 2018), that interest has recently been nurtured by emerging analytical technologies that can help handle data more effectively and transform it into information for decision-making purposes. However, this potential is not new; it is tied to more than 20 years of information systems literature dedicated to maximizing the value of data in information environments (Davenport & Prusak, 1997). The topic is also noted in the public sector literature, where coverage is more interdisciplinary and relates to public administration concerns like information management and governance (Andersen & Dawes, 1991; Fountain, 2001), information sharing and integration (Gil-Garcia & Sayogo, 2016), and information policy (Braman, 2011), among others.
Increasing attention has been given to local governments, where the need to find “smart” ways to address ever-growing public issues is indeed pressing. To respond to specific needs, which can include, for example, emergency preparedness (Chaudhari et al., 2019) and transportation challenges in metropolitan regions (Kaptan et al., 2018), reliance on data and evidence-based decision-making has become an important, even a necessary, trend. That interest is reflected in practical efforts for data-driven policymaking in smart cities (Moustaka et al., 2018), in open data initiatives (Puron-Cid et al., 2016), and in attempts to find better ways of using technology and consuming information (Matheus et al., 2018). However, the increasing interest in data analytics by practitioners contrasts with a relative lack of research on specific activities and processes, such as problem formulation and collaborative efforts in local government DA cases, where more consistent uses of data are expected to develop more responsive cities (Cronemberger & Gil-Garcia, 2019).
In addition, DA practices in smaller jurisdictions have yet to be explored as a socio-technical process in the context of collaborative practices and fully framed within data management and the data life cycle. The data life cycle framework, already established for information systems implementation (Ohr et al., 2010) and often referred to through acronyms, such as DMBOK, identifies important practices for making data use more efficient and effective (Harrison et al., 2019). Those practices are intended to add structure to the process of organizing and curating data, so the data can then be analyzed to specifically address a given problem. Later stages in that process seem instead to overlap in the literature with DA and data science and are commonly mentioned in the context of data warehousing and business intelligence (Larson & Chang, 2016). Practices that focus on collaborative ways of addressing problem formulation, however, seem to be generally overlooked, not explored empirically, or approached as a purely technical issue, instead of as a socio-technical process or a comprehensive strategy. This could be an issue for public organizations that are embracing the development of capabilities for data-driven problem-solving, but where problems are not necessarily approachable from an enterprise perspective, either because of resource scarcity, a lack of adequate data, limited capacity to leverage the existing data, or the “wicked/tangled” nature of the problems they have to deal with (Boyd & Crawford, 2012; Grommé & Ruppert, 2021; Black, 2013; Dawes et al., 2009).
Based on concepts of data management and the data life cycle, this paper extends previous research (Cronemberger & Gil-Garcia, 2020; Iqbal et al., 2020; Hagen et al., 2019; Matheus et al., 2018) and provides a comprehensive socio-technical view of data analytics as a process and transformational practice for public organizations at the local level. The experience of Syracuse, New York, a city that became part of Bloomberg’s What Works Cities and has been engaging in data-driven practices in its policy-making endeavors, is used to illustrate some of the benefits and also the challenges. One of the main findings is the foundational importance of collaborative problem formulation not only as one of the first steps in preparation for data analytics, but also as an effective way to frame an issue even if at the end the intended data analytics process is not feasible for that specific problem. In addition, the case shows that problems are not always established from the beginning of a DA effort; they are emergent in nature and should be seen as the result of a collaborative approach to data analytics.
This paper is organized into six sections, including the foregoing introduction. Section two offers a review of recent literature on data analytics in the public sector. It also includes a description and explanation of the data life cycle and how, jointly with a collaborative approach, it can be used as a framework for studying data analytics as a socio-technical process and transformational practice. Section three describes the research design and methods used in this study, which is based on the case of Syracuse, a small-to-medium-sized city in the state of New York. This section also contains a brief description of this case. Section four presents the analysis and main results organized by the main concepts proposed in the literature review and highlights the importance of collaborative problem formulation as a foundational step in data analytics efforts. Section five discusses our findings and provides some implications for research and practice. Finally, section six presents our conclusions and suggests ideas for future research on this topic.
Towards a socio-technical view of data analytics as a collaborative practice
This section reviews recent literature focusing on data analytics in the public sector and the data life cycle as a way to frame our proposal of understanding data analytics as a socio-technical process and a collaborative transformational practice. Although more general and comprehensive than our current study, this literature provides the theoretical and conceptual basis for our focus on collaborative problem formulation for data analytics in local governments. Our overall approach was a narrative literature review (Grant & Booth, 2009), since it is particularly useful for summarizing a body of literature and identifying gaps or inconsistencies (Onwuegbuzie & Frels, 2016). We selected this literature because we wanted to have the right mix of data analytics research from multiple perspectives. By bringing together both established high impact research and emerging research in other domains, we feel that this paper benefits from a comprehensive and multidisciplinary approach to the data analytics topic. Specifically, we are including literature on the use of data analytics in local governments as well as literature on different approaches to understand data analytics, including the data life cycle management lenses as a way to better understand data analytics as a process and practice.
Data analytics in the public sector and the importance of collaborative efforts
Data analytics has been explored in multiple ways and contexts. It is commonly referred to as a group of technologies or a framework for intelligence practices that include statistical analysis, data mining, and machine learning methods (Chen et al., 2012; Gil-Garcia et al., 2018; Iqbal et al., 2020). Those practices can help analyze data in more elaborate ways and help organizations extract further helpful insights (Hagen et al., 2019; Chaudhari et al., 2019). The practice is also centered on the notion of “actionable data” (Chong et al., 2018; Bibri, 2019), which are data capable of being truly transformative as an enabler for organizational practices, especially for decision-making processes. This notion is not novel and has appeared quite extensively, through different terms and narratives, in the knowledge management literature for the past two decades (Davenport et al., 2001; Pee & Kankanhalli, 2016). Analytics has also been associated with several organizational practices and contexts, including marketing (Deng et al., 2015), human resources (Marler et al., 2017), finance (Cokins, 2014), operations (Dubey et al., 2019), and a variety of related disciplines and subdisciplines. The increased interest in analytics appears to stem from recent advances in data processing technologies, which have been able to define new horizons in terms of problem-solving capabilities (Puron-Cid et al., 2016). Incorporating data-driven approaches into the routine has thus become a modus operandi for the modern organization across multiple sectors, industries, and disciplines.
In existing public sector research, data analytics appears to have many definitions. Besides being referred to simply as analytics, frequently in the context of big data, there are a few specific situations where adaptations to contextual terminologies such as “policy informatics” (Dawes & Helbig, 2015) or “policy analytics” (Höchtl et al., 2016) also occur. Quite often in those contexts, the technical analytic solutions do not appear to be the most important element, but rather the governance challenges that emerge from their use for public policy and public management issues. Consequently, the use of data and information in public administration research appears to follow its own paradigm, often directing the discussion more toward data governance (Wang et al., 2018) and the socio-technical complexities that emerge from that governance than to the technologies that are involved (Fountain, 2001; Andersen & Dawes, 1991; Wang et al., 2018). Among the complexities inherent in the public sector, inter-organizational information sharing and collaboration (Dawes et al., 2009; Gil-Garcia & Sayogo, 2016) and interoperability (Henning, 2018) remain two of the most persistent challenges and are yet to be deeply discussed in the context of the relatively more recent data analytics research agenda.
Attention to data analytics in the public sector is not uniform across different levels of government, i.e., local, state, and federal. At the local level, attention has been given to its promise in enabling the achievement of smart city goals (Cronemberger & Gil-Garcia, 2019; Iqbal et al., 2020). The promises and hopes attached to analytics go beyond the perception of it as a technological novelty; rather, recent research has presented caveats such as the risks of wrongfully embracing it as a simple solution for complex problems (Mergel et al., 2016) or treating it as an aspect of information use that is not central to public administration practices, yet still a potentially relevant tool (Hagen et al., 2019).
In the light of persisting challenges such as data availability (Welch et al., 2016), research on promising uses of data has focused on making it more accessible through open data initiatives (Zuiderwijk et al., 2015) and on governance challenges, particularly those associated with larger or growing repositories (Janssen et al., 2017). In this context, the technical infrastructure for data collection, management and governance has also been studied, including the growing interest in IoT frameworks to make data collection more efficient (Jararweh et al., 2020) and the possibilities of applying emerging technologies such as blockchain (Fan et al., 2020). The studies have also explored technological improvements that make cities smart (Sun et al., 2016) through the development of capabilities (Lam & Ma, 2019) and drivers or forces (Angelidou, 2015). In data analytics research, those capabilities may refer to both “human” and “intangible” resources (Gupta & George, 2016), including concerns with finding the right talent to correctly use data (Ekbia et al., 2015) and assessing the relationships of data analytics with factors such as “organizational culture” and “top management commitment” (Wamba et al., 2017).
Those focuses may be especially necessary in a context where multiple stakeholders may need to collaborate to understand public issues and design a solution (Tucker et al., 2017). While not directly referred to as data analytics in local government literature, previous research suggests that collaborative efforts appear to play a crucial role in helping public leadership and citizens to work with data (Susha et al., 2017; Ruijer & Meijer, 2020; Meijer & Potjer, 2018) or engaging stakeholders in sharing data for the public good (Susha et al., 2019). While most previous research has focused on developing a vision for data analytics use in smart city development contexts or on identifying factors that can be used to build that vision, few studies have considered that different cities may be at different stages of development in their data analytics agenda (Harrison et al., 2018). Different organizational realities, as different local governments across the world are likely to represent, may require specific assessments on existing capabilities. As they do evolve and develop towards becoming more data-driven at their own pace, their ability to leverage data analytics may then follow (Janssen et al., 2017). However, empirical evidence on how local governments collaboratively formulate problems for the use of data analytics is still limited.
Existing research can then be categorized into three streams. The first stream is focused on applied studies, dedicated to techniques and technological applications of analytical technologies to problems that local governments face. Examples include the study of data science practices and algorithms (Engin & Treleaven, 2019), artificial intelligence frameworks (Criado & Gil-Garcia, 2019) and applications that help governments understand a variety of challenges such as urban mobility (Wahid et al., 2018) or healthcare systems bottlenecks (Anisetti et al., 2018). The second stream is more theoretical and focuses on understanding how analytics, as an emerging socio-technical practice, influences both management and governance in organizations. It also examines to what extent the effects produced are in accordance with the goals stipulated by government leadership considering public expectations and existing data handling routines. In this research realm, aspects such as smart governance (Barns, 2018), public value generation goals (Yu et al., 2019) and data management practices (Bibri, 2019) are some of the most covered. A third stream of research gives centrality to collaborative efforts (Broccardo et al., 2019) and their arrangements as an organizational practice. Research in this realm is highly interdisciplinary, relying on a multidimensional view of collaboration, its drivers and forces. According to Bryson et al. (2015), for example, there is an interplay between collaboration processes and collaboration structures that is mediated by leadership, governance, capacity and competencies, and technology. These dynamics appear to be affected by political and power conditions, as well as influenced by strategic contingencies, yet not actually determined by them, suggesting that DA practices could be affected by both social and technical variables, which is consistent with previous research about the use of information technologies (Orlikowski, 2000).
While these three streams are committed to providing practical and theoretical instruments for the study of data analytics and commonly overlap in the way they conceptualize data-oriented practices, they do not often provide empirical evidence on the practitioners’ experiences and specific collaborative mechanisms that are leveraged when formulating problems to be potentially solved using DA. The current study could thus be situated at the intersection of the second and the third streams, where collaboration practices and the use of data analytics are expected to be affected by multiple social and technical variables. However, our focus here is on the problem formulation process in local governments. By focusing on problem formulation and the collaborative practices around it, this paper has an intentionally narrower scope and attempts to specifically contribute to the understanding of collaborative problem formulation for data analytics use in local governments.
A socio-technical view of data analytics as a collaborative process in local governments
This subsection argues that the data life cycle can be used to frame data analytics as a collaborative process from a socio-technical view. It focuses on the use of data and the concepts that are related to the different stages in the data life cycle. Due to the variety of applications and contexts in which data can be used, the study of data use spans experiences in business, public administration and a variety of disciplines where certain particularities can be observed. The interdisciplinarity of the topic makes it more challenging to study, but also enriches our current understanding and expands the potential for further research. Data life cycle frameworks are commonly used to comprehensively study practices and processes where data are used. Those frameworks help to design goals, define stages for data management, and also map expected outcomes from data use. Research has approached the topic in two ways. The first approach encompasses comprehensive research, mostly dedicated to refining established models and enhancing the explanatory power of existing theories in the light of new empirical evidence. That is accomplished, for example, through proposing new frameworks that provide a more holistic view of data uses (Ofner et al., 2013). The second approach involves the scrutiny of specific stages of the data life cycle, where research explores and expands the current view on particular concepts and relationships such as data quality (Otto et al., 2012), data collection (Ku & Gil-Garcia, 2018), or the dynamics of data sharing (Cronemberger & Gil-Garcia, 2019).
Overall, existing frameworks outline four similar stages for data use as well as their definitions. Those stages are generally referred to as 1) data collection, generation or creation (Poe et al., 1997); 2) data cleaning and curation (Jagadish et al., 2014); 3) data analysis (Wang et al., 2018) and 4) data management (Babar et al., 2019). As explained before, each of these stages is explored in research to varying levels of depth and examined in specific contexts from distinctive theoretical angles. Such endeavors suggest that, given the multitude of constructs and meanings present in this dynamic and still evolving research domain, examinations of data analytics practices and processes are typically fragmented or not explicitly made. Therefore, we argue that studying DA in the context of the data life cycle in the public sector could help to provide a needed comprehensive perspective. Much is yet to be learned from the different stages and experiences, particularly those in which data use is directly linked or referred to data analytics practices and technologies. Ku and Gil-Garcia (2018), for instance, found that data collection is a critical part of data analytics practices in local governments and argued that only when the necessary and adequate data are collected, can the analysis become feasible and more useful. Hopes of such improved data collection are often associated with technological advancements, including the Internet of Things. However, as expected, not every local government has fully explored these emergent technologies. In addition, according to Gupta and Panagiotopoulos (2019), assessments on operational capabilities, as well as on the extent to which a culture of data collection and use is in place, are still key.
Therefore, we argue that the concepts related to the data life cycle could be very useful to more comprehensively describe data analytics use in local governments. The specific activities, tools, and routines are expected to vary greatly according to the specific characteristics of each local context. For example, some cities may be efficient at collecting data through legacy systems, but may not have structured data management practices in place (Poltie et al., 2020). Others may present solid frameworks for data handling and processing with limited resources, but, in the face of increasing external pressure to achieve smart city goals (De Guimarães et al., 2020), they may face capacity issues and struggle to collect data for specific tasks or problems. Being aware of those differences suggests that the capabilities may differ (Pelton & Singh, 2019), thereby requiring different governance arrangements (Bergan et al., 2020) and revealing the need for better adherence to context-specific techniques (Iqbal et al., 2020).
The data life cycle approach helps put the local government experiences with data at different stages in perspective and identify important differences and similarities. In addition, that approach could help to adopt a process view of data analytics, which is adequate to position different local government realities in a broader context. Research on data analytics has focused on both the relevance of data analytics to achieve smart governments and the factors that may influence the success of those data-driven initiatives (Cronemberger & Gil-Garcia, 2019). Across different articles, the general view is that, prior to successful execution of analytics at the technical level, when all the data is in place and ready to be used, organizational factors, such as leadership support and governance efforts to collect data, are critical (Chatfield & Reddick, 2017). Mergel et al. (2016) argue, for instance, that analytical capability development in the public sector can be linked to three goals: a) managing and processing large amounts of unstructured, semi-structured, and structured data; b) analyzing those data into meaningful insights for public operations, and c) interpreting those data in ways that support evidence-based decision making. According to these authors, those capabilities are especially needed in “small jurisdictions” (Mergel et al., 2016). Finally, as mentioned before, collaboration is the key to data analytics success and, therefore, data analytics should be seen as a collaborative process facilitated by leadership, governance, and technology (Bryson et al., 2015; Bowker & Star, 2000).
Based on this review of existing literature, we propose a socio-technical framework that considers data analytics as a collaborative process and a transformational practice, particularly in local governments (see Fig. 1). The proposed framework highlights social and organizational aspects, but also includes those variables related to technical aspects, such as the technology used for data collection, analysis, and representation. Technology infrastructure is not explicitly included but is assumed to be the foundation for the use of DA-related software and applications. Using this view, data analytics practices involve three stages or processes within the data life cycle, namely, (1) data collection and preparation, as the input side of DA; (2) data analysis; and (3) data representations and visualizations, as the output of DA. These processes are practical and operational, and they are the main focus of the first research stream mentioned earlier. As part of the data analytics processes and the outputs that enable decision-making, they are critical to understanding data analytics in a comprehensive manner, but will only happen after the collaborative practices for problem formulation, the main focus of the empirical part of this study, as shown in Fig. 1.
Fig. 1. Data analytics as a collaborative process in local governments (based on Bryson et al., 2015 & Ofner et al., 2013).
When discussing DA practices as a collaborative process, existing research suggests that these practices are likely influenced by three key constructs that are more generally related to collaboration: leadership, governance, and technology. Collaboration is a foundational practice and plays a central role in enabling not only inter-organizational knowledge and information sharing, but also DA. Leadership is known to foster the strategic use of information in the public sector. Governance is considered to play a central role in orchestrating socio-technical efforts in terms of rules and arrangements that can maximize benefits from the strategic use of information. Finally, technology acts as both an infrastructure, enabling information exchange at the organizational level, and as an artifact that supports the collaborative effort and, in this case, also facilitates the data analytics process within a local government. Each of these variables (leadership, governance, and technology) is related to collaboration and is expected to influence all stages of the data life cycle in the context of DA in local governments.
Local government realities, however, are likely to vary greatly because specific jurisdictions confront policy issues that are relatively unique. Equally specific are the contexts in which problems are examined and formulated, indeed often a dynamic and complex endeavor that involves public leadership and community involvement (Kitchin et al., 2017; van Zoonen, 2020) and sets out to ultimately establish public value (Cronemberger & Gil-Garcia, 2019). While it is self-evident that problem formulation is a critical first step, the evidence backing up this important stage is still limited and domain-specific (Passi & Barocas, 2019). In particular, collaboration often appears as the critical sustaining factor for living labs (Gascó, 2017) and co-producing processes (Grommé & Ruppert, 2021), while socio-technical aspects of collaborative practices in the early stages are not explored empirically very often in local government research. Similarly, data life cycle frameworks focus on operational aspects of data use and tend to overlook problem formulation and the contextual aspects that may influence that phase, clearly an agenda explored mostly by the systems and simulation modeling literature (Black, 2013; Cronemberger et al., 2017). Acknowledging the foundational role of these data life cycle frameworks and the emerging and dynamic nature of policy issues faced in small jurisdictions, this paper contributes by positioning collaborative problem formulation in local governments as significantly important for effective data analytics efforts.
Research design and methods
This study is based on a local government case: the City of Syracuse in the state of New York. The city was selected because it has come to prominence for engaging more systematically with data analytics practices, particularly due to its participation in Bloomberg’s What Works Cities program. Important milestones include the release of open government datasets and a publicly available blog on how the city is using data. For this study, data were collected in three main ways: semi-structured interviews, the observation of a session with community leaders, and an analysis of available documents. This section also includes a brief description of the case.
Data collection and analysis
First, interviews with the stakeholders involved in data analytics were conducted. They were the central source of information for the analysis, contextualizing the case and its specific efforts. The profile of respondents included analysts, policymakers, champions, and local government leaders. The questions followed semi-structured procedures, asking interviewees about the importance of specific factors related to a digital government framework (Gil-Garcia & Pardo, 2005). For example, the respondents were asked to elaborate on points that they considered relevant to data analytics use in the scope of data and information attributes, technological capabilities, organizational and institutional environments, and some external factors, such as citizen participation and political forces.
In addition, a session with community leaders involved in code enforcement for housing provided a topic-centered example of how City Hall goes about approaching problems of public interest analytically and with a participatory approach. During that session, attention was given to the discussion, to the interactions and iterations, as well as to how the topic was being approached. The structure of the meeting and the way different people involved in data analytics discussed each issue and formulated the problem were also observed. Overall perceptions were supplemented with document analysis of openly available electronic sources, such as the What Works Cities website (What Works Cities, 2017) and specific information about the i-Team (Innovate Syracuse, 2017). The concepts presented in the literature review section were used to guide the exploration of the case. The data analysis considered pre-established categories based on the literature, particularly the proposed concepts. However, new concepts and relationships that emerged from the qualitative data were also considered, as they pertained to the particularities of local governments as the research setting for this study. Specifically, we found that collaborative problem formulation was fundamental for the use of data analytics in local governments.
Brief description of the case: Data analytics in Syracuse, New York
Syracuse, New York, engaged in a nationwide program called Bloomberg’s What Works Cities initiative (WWC). Through that program, the city became part of a network of 100 municipalities that committed to “enhance their use of data and evidence to improve services, inform local decision-making, and engage residents” (What Works Cities, 2017). Under the motto of “Build a government that residents can count on” (What Works Cities, 2017), this endeavor focused on fostering “best practices” across selected cities, “helping local leaders identify and invest in ‘what works’” (What Works Cities, 2017). The selected cities are granted a formal certification and receive funding and support to foster data-driven initiatives from practitioners and researchers affiliated with centers such as the Government Performance Lab at the Harvard Kennedy School of Government and organizations such as the Sunlight Foundation, Results for America, GovEx (Johns Hopkins University), and the Behavioral Insights Team.
The WWC program uses a framework that is grounded in four pillars: 1) Commit with goals; 2) Measure; 3) Take Stock; and 4) Act. Under each pillar, several steps to achieve key goals are outlined. Participants are expected to embrace the WWC mission and comply with those standards. Under those guidelines, DA sponsors and champions from each city reconvene at a yearly summit where practices and experiences are shared and discussed. Each city has its own goals and priorities. For instance, whereas Hayward, CA has a “focus on improving safety and quality of life by curbing illegal dumping,” Athens, GA is interested in measuring and communicating “progress on economic prosperity goals”, and Gainesville, FL wants to work “toward transportation and business life cycle improvements.” Syracuse’s involvement with WWC aimed to “improve open data practices and establish and improve performance management programs to improve results for residents” (What Works Cities, 2017). The mayor emphasized the need to “break down data silos that exist in different city departments” and identify when performance “outcomes are not being met”. As information produced by data analytics initiatives comes to life, taking steps to put it into use includes not only benefiting one specific department, but also ensuring that multiple departments can learn from what others are doing as they start tackling their own data challenges. As the city progresses toward that view, Syracuse has positioned itself as a flagship local government in New York, receiving, in 2019, a $500,000 grant to “buy and then replace all of the city’s streetlights with energy efficient light-emitting diodes”, a technology that is known to reduce greenhouse gas emissions.
In 2015, the City of Syracuse established the Innovation Team (i-Team). Under the motto “changes call for innovation, and innovation leads to progress,” the Innovation Team focuses on initiatives to address public infrastructure problems and foster local economic development. As observed by Bousquet (2017), “the i-Team has a 360-degree view of city projects, serving as a central hub for coordinating initiatives to produce depth, efficiency, and better outcomes for residents”. Initiatives include the installation of street quality electronic devices that address infrastructure issues, such as street potholes, and the development of an early detection system for water infrastructure problems. Often, those problems are interrelated. For example, whenever roads in bad condition are found, the Department of Public Works may use that information to assess the quality of the water infrastructure underneath (Syracuse i-Team: Upgrading Infrastruc …). The city has also piloted the use of sensors to collect data on potholes, work that was done manually in the past. These goals and efforts have opened a window of opportunity for using data to frame and understand problems that are brought to City Hall, a process that encompasses either further exploration of available data sources or the pursuit of new sources.
Data analytics practices in Syracuse, New York
The i-Team is structured into three divisions: (1) the Data division, dedicated to working with departments to understand different ways of working with data to support decision-making; (2) the Innovation division, which focuses on partnering with departments to understand operational challenges and design solutions; and (3) the Accountability division, which monitors the work being performed by the data-driven programs and policies. The i-Team also created a publicly available blog to share information about current and past projects. In one segment of a post, the Innovation Team acknowledged that the City of Syracuse “will need much more than to simply buy technology” to become a smart city (Innovate Syracuse, 2017). These goals have been outlined as 1) helping departments to better understand and use data to drive decision-making; 2) thinking differently and creatively to innovate around specific challenges; and 3) ensuring the ongoing advancement of programs and policies (Innovate Syracuse, 2017). Governance efforts to coordinate the goals across those divisions and the different departments they support were still in their early stages, with several respondents acknowledging the ambiguity of preliminary steps and the need for more sense-making. Under a section entitled “Challenges,” the Innovation Team website mentions specific points they consider very important as they move forward with their initiatives, as well as the questions they considered to be relevant to the work they do.
These questions include: “Does the community understand what data are being collected and how they will be used?”, “Do people support this data collection?”, “Do our operating departments understand the outputs of data analytics and are they able to make decisions based on them?” and “Does the data being collected help solve a problem we are currently facing?” These questions are asked to clarify the path for data use ahead, and they also help the involved agencies and the i-Team to better understand and collaboratively formulate problems and prioritize both tasks and resources to manage competing demands being put in place by an ever-growing list of public challenges (see Fig. 2).
This section presents the main findings of our analysis in two subsections based on the evidence found in the case and in the following order: 1) Collaborative Practices in Data Analytics and 2) Data Analytics and the Data Life Cycle. A new concept emerged through the analysis of this case, namely, Problem Formulation for Successful Data Analytics Practices. Evidence about this emerging concept is presented in more detail in Subsection 3, including specific aspects observed in the case evidence.
Collaborative practices in data analytics: The role of leadership, governance and technology
The interviewees mentioned the role of leadership as having two fundamental purposes. First, at a more macro, political level, leadership support helped kick the WWC initiative off, sending a strong message to the public and the public servants on the extent to which the city government could prioritize data-oriented approaches to policymaking. Most interviewees indicated that the very existence of the i-Team depended on that focus. Second, leadership was understood as a proxy for being “self-driven” and “entrepreneurial” in their relation to data analytics. Given the often-stated limitation of information resources and the difficulties in having good access to them, the respondents seemed to hold to a high level of independence and accountability in the projects they were already conducting. The interviewees revealed that they enjoyed the freedom to go after the data they needed by visiting communities, holding interview sessions, and thinking about innovative ways of addressing stated problems. While many of the interviewees confirmed that such a level of engagement was encouraged from the top of the organization, a few respondents did observe that a collective sense of involvement with the problems the team was facing was critical to move things forward positively.
Collaboration was often mentioned as a crucial element for effective data analytics work in City Hall. The respondents stated that the nature of their work is indeed collaborative and relies on the participation of both the analysts and the stakeholders who are involved in or impacted by the issue being addressed with data. By having people collaborate around that data, new understandings of existing data and of the problem itself could then emerge. While mentioned as being key to the effort, such collaboration was not referred to as a practice that always comes naturally. Rather, it is a result of clearly concerted efforts to bring people to the same room and get them involved. That effort was not without its hurdles because the people and the data needed to address a particular problem were often scattered across different organizations and had to be mobilized in one venue. Therefore, that participation and mobilization required leadership and resources, a condition that was believed to be at least partially mitigated by the jump start provided by the WWC initiative.
Finally, it is worth noting that governance did not emerge as, and was not cited as being, crucial to the people who were directly involved in data analytics. However, two elements that were observed by respondents could be indirectly related to that task. First, the need to have better institutional mechanisms that can bring people together and make data available for use can make data analytics more effective. Details on what those mechanisms could be, however, were not mentioned. Second, coordination emerged as being an important factor. The respondents observed that the unstructured nature of their work demanded some level of self-coordination and a go-getter attitude toward the data and its relevant information for a subsequent effective analysis.
Data analytics and the data life cycle
This subsection presents the results of our case with a focus on the different stages of the data life cycle and how the activities in each stage were important for the data analytics efforts in Syracuse, New York. It is important to emphasize that the activities in each stage of the data life cycle all contributed in one way or another to a better way to use data for decision-making in the city. Also, it was learned that the data life cycle was a useful lens to better understand data analytics needs, actions, and results.
Data collection efforts and the operational side of data analytics
Syracuse, New York, seems to have a highly collaborative and interdisciplinary team that understands data management and analytics issues. From self-starting leadership to policy-design and analysis, this team benefits from a blend of data analytics practitioners and researchers from academia who work in a consulting capacity and are well coordinated by a Chief Data Officer. Syracuse’s Innovation Team operates inside the City Hall with a structure that is similar to a think-tank. People there approached problems with autonomy, conceptualizing issues as the data was collected and sharing discoveries as the analyses were being conducted. Some respondents observed that analytics in their realm follows a treasure-hunt scenario, where different pieces of the puzzle are collected and then matched as new pieces are added.
The topic of the attended session was housing code enforcement. For the i-Team, that topic was discussed in a broader context as a “housing instability” issue. In the context of the City of Syracuse, it translated into the “high frequency of forced moves” faced by citizens, leading to such consequences as chronic homelessness and damaging financial and health impacts to the community (What is housing stability?, 2018). On the analytical side of that process, the i-Team was dedicated to collecting housing and code enforcement data to get a more complete understanding of the problem. Besides working toward a better understanding of the causes of evictions, for example, the analysts, policymakers, and members of the community put some effort into knowing what data resources were available, how useful they were, and where they should go, both inside the City Government and outside, to collect more information about different aspects of the problem. Syracuse’s process approach to the problem clearly recognized that data collection and management were critical to ensure that the efforts were going in the right policy direction and effectively so.
For this case, data analytics endeavors could be divided into two processes. First, statistical data on occupation and eviction rates were collected, including data sampled across different regions, but mostly from within the city of Syracuse. That data was expected to help with problem definition and point to the precise directions to be followed and to specific policy alternatives. Second, when transitioning from the “what” to the “why” questions, sessions with code enforcement personnel were expected to help the team understand the reality of the citizens. To listen to the “voice of the residents”, the team would hold meetings with community members in their office or actually visit sites to personally collect data and become informed about the issues that needed to be addressed. They did seem to be actively involved with in-person data collection efforts, often producing data analytics products and consuming them as they learned about the problems and formulated them in different ways. According to the respondents, the data were scattered across different local government organizations and thus often remained uncollected or inaccessible to those who needed to use them. Much of the Innovation Team’s efforts thus went to learning what kind of data exists and to what extent inter-organizational partnerships can help them access the needed datasets for potential solutions.
Data analysis and data representation: Adding meaning to the craft
In practice, the data analytics efforts appeared to rely on multiple iterations so the desired results could be achieved. Data analytics practitioners seemed to value the tools that helped them with the numbers, but their tasks also appeared to be based on a dynamic data collection-analysis routine. In that sense, the more data they had or analyzed, the more they needed to dig deeper into the questions and issues being investigated. That part of the work appeared to be both qualitative and quantitative, with different proportions across the distinct DA roles. For instance, analysts working on similar or on the same projects were found to have quantitative and qualitative skills and use them as needed as a project evolved.
One aspect that emerged is that data were not easily accessible for every problem and, when retrieved, were not easily manipulated. On the analysis side of the process, the hurdles with formats and missing data seemed to require manual adjustments, as well as constant validation with the data sources. Analysts were often holding meetings and collaborating with relevant stakeholders, so the information could be corrected or better contextualized for subsequent analysis. As one interviewee noted, it is a matter of verifying if their understanding of the information they do have access to is correct, and, if not, identifying what is missing and who can help address that issue. Among these missing parts, data or historical knowledge of a certain issue, such as when the problem started to be a problem, who dealt with it before, and what solutions had been considered already, were frequently cited. These were the key pieces of the puzzle that needed to come together as a consistently formulated problem that was clearly identified before any kind of analytics could become meaningful.
DA practices in Syracuse were not centered on any specific type of technology. Most of the interviewees implied that they used both structured and unstructured data, as well as information artifacts that were created as they went about understanding the problems to be solved. Data science technologies were used by analysts and leaders, but mostly in an ad hoc fashion. As three interviewees stressed, many answers are really about data, and most are either found in non-computerized form or are yet to be collected. Such understanding, as pointed out by both leaders and analysts, is what justifies the DA initiatives as field work, where investing in ties with communities as invaluable sources of data becomes a fundamental part of the work, which also relied on artifacts that were not always digital, such as Post-it notes, charts, graphs and maps spread across the walls and then constantly referred to when discussing specific problems.
As supporting elements for precise storytelling, analysts could be constantly reminded of their challenges and goals by looking at the data and information being openly displayed in their work environments. Those elements were not only positive visual references for data-driven problem solving, but also artifacts that stimulated debate and reflection among the team members. Analysts could constantly revisit those visual elements to check their collective understanding of issues and occasionally make new observations as their thinking evolved. Through the flexibility of this iterative, investigative process, data analytics practitioners could expand on the existing collected data, extend the scope of their information sources, and more clearly formulate the problems to be addressed.
Problem formulation in data analytics practices
Evidence appears to suggest that problem formulation takes place in multiple ways and as a result of different efforts from stakeholders involved in data analytics use. Those efforts include steps related to data management and data analytics practices, but also organizational forces and context-specific demands related to the problem and the data available or still needed to tackle the problem. Therefore, it seems clear that problems are emergent issues and the result of a collaborative approach to data analytics, rather than simply a premise. Two sub-dimensions that appear to affect clear problem-framing are detailed below.
Data seeking and availability
Interviewees pointed out that problem formulation is not only an important pre-step to data analytics practices, but also an ongoing practice during the whole DA process. Mainly, this practice is nurtured by information needs or gaps that need to be addressed to move forward with data analytics and the policy issue at hand. This situation was observed by respondents who categorically classified “data analytics” as an “information availability problem” that had not been addressed “historically”. In this context, it also was observed that resources are needed to identify where the available data are coming from and collect more. In certain cases, not only does the data collection part of the process appear to be demanding, but skepticism about the dataset’s completeness or usefulness also emerges. Some respondents indicated that they “knew” what they wanted but were sure that it was not “exactly what they wanted”. Specific information needs also exist at an inter-organizational level, where questions beyond the i-Team’s organizational boundaries were also asked (“Hey, this data question …do you guys have this dataset? Do you have the information on it? Is there any way you can look and figure it out for me?”).
Information seeking also involves revisiting or questioning existing resources to ensure the data collection process can achieve the established objectives in a timely and efficient manner. According to several interviewees, that task involves not only going after the data, but also understanding how data generation processes occur and the contexts that shape those processes (“we do make a big effort to actually go and see stuff happening …just knowing that first of all we don’t know everything …so get to know more than we do …and then also like seeing for ourselves so that we can see the processes happening …”). For one interviewee, in particular, a lot of their “time ends up being essentially being like ‘hey, we have no idea what we’re talking about here, but we want to learn …”. Finally, for problem formulation practices, the i-Team undertook data integration efforts by looking for tacit and tentative relationships that could emerge from scattered or unexplored datasets. These practices uncovered problems that are “multi-headed” in nature and ask people “to find connections with data from different areas”. Also, as identified by one respondent, “(…) to effectively implement a policy with a complex problem, you have to be able to see how the different aspects of that problem affect each other …and that is what data allows you to do …everything is interconnected …”.
Problem selection and answerability
The interviewees were vocal about the importance of prioritizing and selecting the problems to solve. This process involved intense questioning, engagement from multiple stakeholders as well as multiple iterations. One interviewee observed that the process involves “having the people in the room talking to each other” and “your different data sets talking to each other, (because) that helps you, again, see patterns that you wouldn’t necessarily see, which allows you to find the best policy answer to a problem (…)”.
It also was observed that this is not a frictionless process, but a crucial part in defining what is worth being addressed, why and how. According to one interviewee, “there are always certain differences in opinions on the right way to go with a project …all across the board in the team and with other departments …(…)”. This process is about getting “deeper” insight into the problem and asking questions such as ‘is it fair for you to make that assumption?’, ‘are we thinking about ethics related to some of these things?’ and ‘should the government be making decisions on this sort of thing?’. According to a respondent, this process “does not really care about what is going on into the analysis in Excel” but is more focused on “sparking discussions”. Selection and prioritization practices in data analytics endeavors actually took place in different ways and were driven by different forces and constraints. Forces were observed that related to leadership (“if you choose a leader who wants to understand the complex relationships, who wants to work towards using data to prioritize, that’s going to have an effect on the community around you …so that is where politics plays a role …”), and to the i-Team’s internal drive to become more data-oriented and spread this vision across other departments (“…and that really should be a data-driven decision …it shouldn’t be one that’s decided based on whims or based on, you know, human beings thinking this looks like it needs attention …It should really be based on something more empirical than that …and I think the departments want that as well …it’s a question of getting to that point with our system …”).
Constraints appeared to influence problem formulation in different ways. While resource availability was outlined as being a concern (“(…) we at the Innovation Team have tried to tackle with infrastructures …how do we prioritize which water means, which sewer means or which blocks of road get the limited funding that we have”), it has been observed that the extent to which a question is considered answerable should be taken into consideration. Excerpts such as “if the question is answerable, then I can get there or can talk someone through how to get there as well” and “…never want to be feeling like the question that I’m answering is a halfway answer” illustrate a definite concern with structuring and formulating the problem in a way that a solution can be derived and implemented. A related concern was the extent to which the answers obtained could then be considered valid (“in reality you are just kind of scratching the surface because of all these other different things that are in there”) or definitive (“we have to be really careful giving that answer and feel good about, (…) but not overloading them so much that he loses track of what he needs to answer the question”).
Lastly, perceptions of how the audience is going to react to data-driven answers appeared to be relevant on multiple occasions. Awareness of multiple points of view required data analytics practitioners to revisit assumptions (“…most of the time is like “well, just tell me where I’m wrong here …” …“Does this not look right?” …“Because it’s totally reasonable that I do not get it because I’m not filling the potholes”.) and the models used to produce those analytical outputs. Respondents identified the relevance of knowing “who” is asking the questions and what their expectations are. This focus is important to make sure recommendations are made in a targeted and prescriptive way, clearly a concern because certain assumptions and answers “do not always make sense to people”, so it is really important to “feel confident” about “counting things right” and providing a “simple” answer. Ultimately, when the data collection process involves the community being served, the community’s perspective is also taken into the problem formulation effort. According to one interviewee, “…showing, presenting and explaining what it is that they’re looking at and what it is that they’re reading and what they can do with it (data)” are a “challenge sometimes”. When visiting neighborhood groups, for instance, citizens get “rightfully frustrated” when “someone from the city shows up and they don’t get the answer that they want”.
Discussion and implications
This section discusses the main results of this study in the context of previous studies and presents a few implications for ongoing research and future practice. It is divided into three subsections. First, it briefly discusses data analytics as a collaborative practice and the role of leadership, governance and technology in that process. Then, it argues that data collection and data representation are strategies used to give more structure to the collaborative data analytics effort and better understand the problems. Finally, it proposes that problem formulation is a collaborative effort, and that effort should be part of any data analytics process, particularly in local governments.
Leadership, governance, and technology as enablers of collaborative efforts in data analytics
Evidence of the importance of leadership across City Hall’s DA efforts is consistent with previous research that established the connection between leadership and effective use of information (Cronemberger et al., 2017; Gil-Garcia & Sayogo, 2016). This time, such connection was considered in the light of data analytics as an unstructured set of practices sponsored externally by the WWC initiative and internally by political and organizational buy-in. This finding also echoes the literature on the importance of leadership when addressing information needs (Cronemberger et al., 2017) and may raise important new questions about how to better nurture those forces and keep them active. Levels of leadership support could also be assessed whenever cities attempt to achieve specific results by effectively using data analytics.
As part of positive leadership roles and governance concerns, the findings of this study suggest that establishing a data analytics agenda is central to getting stakeholders and resources both involved in specific data problems. In that regard, WWC seemingly contributed as a critical first step toward successfully institutionalizing DA as a formal policy tool in Syracuse (Hagen et al., 2019), indeed one that both symbolically and in practical terms is capable of creating momentum and fostering intra-organizational collaboration to address city challenges. Similar to triggering efforts, such as open government and open data initiatives (Puron-Cid et al., 2016; Ruijer & Meijer, 2020), these findings imply that data analytics and related topics, such as data management (Harrison et al., 2019; Harrison et al., 2018), are central and needed for public leaders who are willing to engage with DA and want to successfully push a smart city agenda (Poltie et al., 2020; Sun et al., 2016).
As far as these leaders’ technical experience with DA use is concerned, one important aspect relates to the qualitative nature of their DA work. Interviewing and coding of unstructured data were common and central aspects of the conceptualization and formulation of problems. That was evident through many Post-it notes used for group dynamics and “human-centered design approaches” that were found in the rooms where policy meetings were held (see Fig. 2). Data analytics in Syracuse appeared to be highly interdisciplinary, human-centered, and focused on diverse information artifacts, including spreadsheets, maps, and drawings, as well as formal and informal conversations that helped guide the analysis. Much of the data collected informed overall policy-design as well as specific processes through which more data could be collected. That focus could be observed in a session with code enforcement personnel, which was held to not only revisit and reframe residential problems in the city, but also to identify what information could be missing for better analysis of their data collection processes. Following a debate-mediated structure, the i-Team attentively listened to concerns and opportunities for improvement and took notes on perceptions. These notes were later used to define what the next steps for data collection could or should be.
Additional evidence that suggests the foundational importance of problem formulation in data analytics practices comes from the fact that the interviewees were very clear about the scarcity of the information necessary to do their analytical jobs. For many of them, that meant that parts of the problem to be solved were not fully understood and needed further investigation. They attempted to better conceptualize the problem and characterized the process as more of a “learning-on-the-go”, a strategy that involved interviewing citizens and organizing information obtained from workshops run in City Hall to obtain further information about a particular problem. Finally, information scarcity also created incentives for in-person visits to partnering agencies, where key staff members would potentially have important data sources or have access to that data in legacy IT systems. According to some respondents, the opportunity to understand information generation processes also contributed to adding context to available sources and, in doing so, facilitated problem formulation. Data analytics outputs would further benefit from this effort because a more informed analysis could lead to richer perspectives even with limited data.
Data collection and representation as structuring collaboration practices
The results suggest that data management in data analytics does not necessarily follow a structured process. Rather, it is a result of dynamic practices that unfold and adapt over time. That development occurs as data analytics practitioners explore existing and new sources of data. With that exploration can come a greater level of understanding of the problem being solved. It has been suggested that data analytics in small jurisdictions is participatory, closer to co-creation (Yu et al., 2019) and innovation practices observed in smart cities research rather than mostly technological as explored in literature on analytics from a technical perspective. From a public policy design standpoint, data management processes in data analytics use in local government may need to rely more on the ability to deal with unstructured data than on the assumption that all the data needed to solve a particular problem will be found in a single place and be ready to be used for a clear purpose. The scattered nature of information resources echoes research on the role of information sharing and integration in successful digital government projects (Dawes et al., 2009; Gil-Garcia & Sayogo, 2016).
It is also important to highlight the centrality of data in collaborative processes (Susha et al., 2017). Although participation seemed to be a key element, information produced and manipulated by data analytics practitioners was the core output and, as such, that information needed to be systematically documented. Such documentation was reflected in the number of Post-it notes used during data collection interactions and the subsequent documents created from the information gathered in those sessions. The literature refers to those artifacts as “boundary objects” (Black, 2013; Cronemberger et al., 2017), but the research discussing this information science concept in the context of data analytics is still scarce. Finally, the more qualitative approach portrayed by Syracuse’s experience also appears to contrast with research that has been more focused on technical and quantitative capabilities than on the ability to create important information from soft, non-digital data. That scenario could be the case for unstructured innovative practices, such as co-creation frameworks for smart cities (Yu et al., 2019), but still always considering the specificities of data analytics in local government contexts.
Adding collaborative problem formulation to data analytics practices in local governments
Syracuse’s experience is enlightening for several reasons. First, the evidence suggests that DA is not centered on technological artifacts only, but also occurs in collaborative practices through which raw data and information on public problems are shared and used. In that process, the formulation of problems helps define the guiding steps for data collection and, subsequently, for data analytics outputs. Such a process is markedly iterative, with multiple sessions dedicated to knowledge and information sharing, which then lead to a more complete understanding of specific issues. There is also evidence that collaborative problem formulation is a central practice and could be considered a foundational step of DA as a process. Such efforts are normally qualitative in nature, relying on multiple iterations through unstructured data, and they facilitate collective sense-making and alignment with regard to defining the exact problem that needs to be addressed. This finding is consistent with previous research that acknowledges the importance of problem formulation and how the process can affect the overall results of data analytics (Passi & Barocas, 2019).
According to the findings, the problem formulation phase of DA still needs examination at different levels, including the role of political leadership and the public in setting priorities for data analytics use. Externally oriented initiatives, such as WWC, seem to have influence and offer political endorsement at the local level. Internally, a focus on problem formulation and data needs (Harrison et al., 2019) seems to prevail over technology, but it also seems to rely on important soft skills and specific organizational characteristics, including the ability to bring different stakeholders to collaborate around data and produce innovative and useful information. A theoretical view that positions DA as a socio-technical process and a transformational practice may lead to a variety of workable and useful models and frameworks. For instance, a comprehensive framework of DA use based on the Syracuse experience should be attentive to such concepts as collaborative practices, including the role of leadership, governance, and technology, as well as the importance of collaborative problem formulation for data analytics in local governments. This framework could help researchers who are interested in data analytics in local governments to take a more comprehensive view and understand data analytics as a socio-technical process and transformational practice, in particular by including the important role of collaborative problem formulation (see Fig. 3).
Fig. 3. A revised framework of data analytics as a collaborative process in local governments.
First, in line with the literature on digital government, innovation, and smartness (Barns, 2018; Gil-Garcia & Sayogo, 2016), leadership was found here to be a driving force in setting the direction for data analytics use. As observed, in Syracuse such leadership starts with political commitment to the DA agenda and is supported along the way by the WWC partners (external stakeholders) and City Hall champions (internal stakeholders) who are involved in the different efforts. Since members of the Innovation Team are part of a highly collaborative environment and appear to enjoy autonomy in their responsibilities within the project, leadership in DA could also be interpreted as a transformational force that is triggered by a few actors and stewarded by collaborators both inside and outside City Hall.
Second, as mentioned earlier, problem formulation should be clearly identified as an important initial step for defining what needs to be addressed through DA. It also seems appropriate to link collaboration, a construct that is not new to digital government literature and continues to be studied, to actual problem formulation efforts. Since research on problem conceptualization and formulation in the context of data analytics is still limited, those theoretical linkages should be further explored empirically. One possibility is to further understand collaborative problem formulation by explaining which stakeholders and what conditions should be in place to help ensure a more effective DA process in local governments.
As suggested by the Syracuse experience, the existence of ad hoc approaches to DA indicates that some operational flexibility is needed to accommodate multiple ways to define and address any problem with the data. Given the relative newness of the topic, particularly in terms of DA practices in local governments, research could benefit from more flexibility in defining the models and frameworks for DA use and specifically for actual collaborative problem formulation. Also, more categories or lenses could be added considering new technical or theoretical advancements. New concepts and their relationships should also be considered for revising our proposed framework and, more generally, adding to our knowledge about DA and its impacts on government innovations in both delivery of services and implementation of policies at the local level.
By applying a qualitative approach, this paper explores the enablers of data analytics practices in a local government and proposes how better to understand the underlying mechanisms of data analytics as a collaborative process in light of the gathered empirical evidence. It proposes to expand this view by adding the role of problem formulation and explaining its importance in the context of collaborative data analytics in local governments. Collaborative problem formulation for data analytics has not been extensively treated in previous research, but it emerged as very important for the case analyzed in this paper. It must not be ignored, especially since multiple studies have indirectly explored problem formulation as a topic. Hackathons, co-creation practices, and data collaboratives are constantly referred to as problem-solving practices, and part of what they do is help to frame and conceptualize a problem. Nonetheless, we argue that not enough emphasis is given to the data processes and practices through which such problems are framed or to how those conceptualization efforts specifically relate to data analytics. In addition, some dimensions of these problem formulation practices, such as prioritization and answerability, have not been sufficiently highlighted in prior studies about data analytics in local governments. In the case we studied, analytical technologies were not found to be as central to data analytics as data-driven collaborative arrangements, frequently supported by city leadership and motivated stakeholders with a problem-solving mindset. It is nevertheless important to clarify that our findings are not necessarily generalizable to other cases and more research is still needed to understand both their applicability and usefulness.
The evidence and discussion presented in this paper contribute to the existing literature in several ways. First, our study acknowledges the importance of previous theory in interdisciplinary research, particularly on socio-technical systems, public administration, and digital government, where constructs such as collaborative governance have been introduced and developed. These constructs are key to understanding data analytics as a collaborative process in local governments. Second, it proposes the data life cycle framework as a meaningful way to better understand the operational aspects of data analytics routines, a topic that is still relatively underexplored in public management, but clearly emerged as relevant in this study. The focus on the data life cycle helps us better understand how each of the different stages/activities contributes to data analytics practices in local governments, including data collection, analysis, and representation. Third, our study introduces collaborative problem formulation as a component and critical principle in data analytics practices, which has remained underexplored from both technical and social perspectives, particularly with respect to local governments. Collaborative problem formulation seems to be as important to data analytics in local governments as any of the other stages in the data life cycle, if not more so. Finally, the results of our study provide a positive foundation for future studies on additional ways to add value through transformational data analytics practices in the public sector. Value creation in that sense would not necessarily need to be leveraged through technical solutions or policy goals, but via a better understanding of persisting roadblocks on the way toward achieving effective data analytics practices.
In the case presented here, collaborative problem formulation emerged as being particularly important, but there may be different and new concerns in other local governments, and these of course deserve a similar in-depth analysis.
Given that recent research has been avidly exploring smart city practices and the promises of having new available data, it is surprising that not enough studies exist with a focus on the specific strategies, mechanisms, and practices of data analytics in local governments. Such a smart city research agenda can benefit from more cases, comparative studies, and longitudinal studies that examine, for example, the dynamics of data analytics as a collaborative process or the extent to which the absence or presence of certain factors, as outlined in this paper, can affect data analytics efforts in local governments that have different characteristics and a variety of contexts. Such a focus on variation could inform the development of taxonomies of different forms of data collaboratives in local governments and be of clear practical importance for data analytics maturity assessments, especially within the discussions related to a smart city agenda.
The proposed model also offers a set of propositions that could be further scrutinized. These efforts could be enriched by using concepts from mature Information Systems frameworks, such as UTAUT2. Each construct could be studied in isolation in other local government cases, and that research could capture, for instance, the extent to which those stages (from the data life cycle) vary according to the problems being formulated and how such variations can be explained. Answers to these questions are relevant to the data life cycle and government information management theories, but also of practical importance to public managers who are engaged daily in data use and data analytics in local governments. A deeper exploration of the data life cycle model and its stages related to data analytics in local governments can generate new insights into this topic. Similarly, the connections noted here to the concept of data collaboratives could also be explored in future research. Another avenue for future research would be paying attention to a specific data analytics technique, such as sentiment analysis, and how it is used in local governments. Finally, a single case study is always enlightening, but some of the experiences are expected to be context-sensitive, so more research on similar realities, particularly in small and medium cities that are already using or planning to use DA as a strategy to promote innovations in services and policies, should be part of any future research agenda.
Authors’ Bios
Felippe A. Cronemberger is a research fellow at the Center for Technology in Government, University at Albany, State University of New York. He has worked in technology and consulting for more than 15 years, for both the private and public sectors, and has taught courses in web development, emerging trends in information technology, design thinking, and business intelligence. He was a research assistant through the SUNY Research Foundation and a visiting research fellow at ITMO University, in Saint Petersburg, Russia. His dissertation project received the University at Albany Dissertation Research Fellowship Award to explore factors influencing data analytics use in local governments. Felippe’s research interests are interoperability, information sharing, and simulation modeling.
J. Ramon Gil-Garcia is a Professor of Public Administration and Policy and the Director of the Center for Technology in Government, University at Albany, State University of New York (SUNY). Dr. Gil-Garcia is a member of the Mexican Academy of Sciences and in 2013 he was selected for the Research Award, which is “the highest distinction given annually by the Mexican Academy of Sciences to outstanding young researchers.” He is also a member of the Mexican National System of Researchers as Researcher Level III. In 2009, Dr. Gil-Garcia was considered the most prolific author in the field of digital government research worldwide, and in 2018 and 2019 he was named “One of the World’s 100 Most Influential People in Digital Government” by Apolitical, which is a nonprofit organization from the United Kingdom. More recently, in 2021, Dr. Gil-Garcia was one of the recipients of the two inaugural Digital Government Society (DGS) Fellows Awards. Currently, Dr. Gil-Garcia is also a professor of the Business School at Universidad de las Américas Puebla in Mexico, a Faculty Affiliate at the National Center for Digital Government, University of Massachusetts Amherst, and an Affiliated Faculty member of the Information Science Doctorate Program at the College of Emergency Management, Homeland Security and Cybersecurity, University at Albany. Dr. Gil-Garcia is the author or co-author of articles in prestigious international journals in Public Administration, Information Systems, and Digital Government, and some of his publications are among the most cited in the field of digital government research worldwide. His research interests include collaborative digital government, smart cities and smart governments, data and data analytics for decision making, artificial intelligence in government, adoption and implementation of emergent technologies, digital divide policies, information technologies in the budget process, digital government success factors, and multi-method research approaches.
