Abstract
Large-scale education interventions aimed at diminishing disparities and generating equitable learning outcomes are often complex, involving multiple components and intended impacts. Evaluating implementation of complex interventions is challenging because of the interactive and emergent nature of intervention components. Methods that build from systems science have proven useful for addressing evaluation challenges in the complex intervention space. Complexity science shares some terminology with systems science, but the primary aims and methods of complexity science are different from those of systems science. In this paper, we describe some of the language and ideas used in complexity science. We then offer a set of priorities for evaluating complex interventions that build on those ideas, along with methodologies aligned with the priorities.
Interventions designed to produce impact in education systems are not simple (Snyder, 2013; Supovitz & Taylor, 2005). They typically involve multiple intervention components aimed at multiple targets in multiple locations intending to influence multiple outcomes. For example, launching a new curriculum throughout a school district to reduce early literacy disparities is not a simple intervention, nor is the introduction of federal education policy to promote accountability and reduce academic achievement gaps (Snyder, 2013). Each intervention involves implied change models for individuals (e.g., students, teachers, principals) and higher order groups of individuals within the system (e.g., schools, school districts, state departments of public instruction). As such, interventions that aim to produce an impact in education systems are complex.
Evaluating the implementation of complex interventions poses challenges such as identifying the full set of intervention components and implementers, determining how implementers interact within and across organizations in ways that drive intervention impacts, and locating the contextual enablers and barriers to implementation and impact in the system. Implementation evaluators must address these challenges not only to understand whether an intervention was implemented as intended but also to begin to understand how the system functions as a whole to maintain or potentially disrupt the current status.
Frameworks exist for identifying aspects of complex interventions to support implementation and evaluation. For example, Glouberman and Zimmerman (2002) have developed a formal typology of simple, complicated, and complex interventions (for an applied example, see Table 1). The typology indicates that simple interventions produce reliable effects by following an established protocol, and the protocol can be followed in most cases, allowing for straightforward evaluation of implementation fidelity. Complicated interventions include more components and as such require additional expertise, but still build on cause and effect relationships known to support reliable predictions and causal claims of intervention impact. As such, complicated interventions present greater challenges for implementation evaluation, but fidelity can be evaluated using models that incorporate the complications. However, unlike simple and complicated interventions, complex interventions cannot rely on preestablished cause-and-effect specifications for implementation. Rather, how and why a complex intervention works—including how implementation and impact are interrelated with context factors—must be learned during the implementation process.
Table 1. Conceptualizing an Early Literacy Intervention as Simple, Complicated, and Complex.
In other work, evaluators have drawn from systems science to identify focal points for the evaluation of complex initiatives. To aid these efforts, M. B. Hargreaves and Podems (2012) have provided an informative review of texts that address evaluation through complex systems lenses. The texts highlight the interactive, nested, and emergent nature of phenomena in complex systems and suggest useful methods for investigating and learning within evaluations of complex systems. Evaluations that build from complex systems frameworks yield important insights into the networked nature of systems change (M. Hargreaves et al., 2013).
Although there has been an increased push for complex systems thinking in evaluation, a push that we consider instructive and useful, the evaluation field could benefit from additional attention to work that highlights major foci of complexity science. There is some overlap in the terms and ideas used in systems and complexity science. For example, dynamic processes, emergence, and whole-part distinctions are relevant to systems and complexity science. Nonetheless, the fields of systems and complexity science—their methods and objectives—are not identical. Authors who have sought to distinguish the fields demonstrate that complexity science is organized around the study of complex adaptive systems distinguished by a focus on agent behavior (Phelan, 1999). Additionally, complexity science seeks to identify simple rules that govern agent behavior and adaptation within complex phenomena (Holland, 1992; Holland et al., 2005).
In this article, we do two things to build the capacity of evaluators to engage with complexity science concepts. First, we articulate three priorities to guide evaluation in complex interventions that we synthesized from review of implementation, evaluation, and complexity science literatures. As a foundation for the priorities, and especially for their relevance to evaluating implementation in complex interventions, we preface the priorities with a summary of some of the primary concepts related to complex adaptive systems as used by complexity scientists. Second, we describe methods that we have used when studying implementation of complex interventions that align with priorities for implementation evaluation in complex adaptive systems.
Priorities for Implementation Evaluation in Complex Adaptive Systems
A Note on the Review Method
Much of the initial review process leading to the articulation of evaluation priorities was utilitarian in nature and conducted over a period of 5 years. During that time, we searched for examples of projects or scientific recommendations that could address questions we encountered while conducting systems change evaluations or proposing methods for evaluations of systems change. The literature reviewed initially included publications on implementation practice and science as well as publications documenting the challenges of identifying causal relations and impacts in field experiments.
The literature search was complemented with the information one author gained from coursework completed through the Santa Fe Institute. Consequently, the literature on complexity science that we reviewed came from biological, physical, and social scientists affiliated with the Santa Fe Institute. Although the focus on work emanating from the Santa Fe Institute did not lead to an exhaustive review of literature related to complexity science, the review reflects ideas central to the first institute dedicated to the study of complex adaptive systems. In borrowing heavily from this community of interdisciplinary scientists, we aimed to expand what we were learning from the review of the implementation and evaluation literature, so that we might generate new insights around the questions we were encountering in evaluations of complex adaptive systems.
We followed recommendations for integrative review methods (Torraco, 2005) in an effort to generate priorities for implementation evaluation in complex adaptive systems. These recommendations included guides for synthesizing what can be learned from different fields into new and compelling research agendas. What follows is a preface on concepts from the science of complex adaptive systems gleaned from the review process and a set of three priorities for evaluation that map onto key aspects of complex adaptive systems, namely the focus on agents, interactions among agents, and the study of agents’ responses to environmental signals.
Complex Adaptive Systems
The term complex adaptive systems refers to groups of interacting individual components, known as agents, that learn and change on an aggregate level in response to experiences encountered in the environment (Holland, 1987/2018, 1992, 2006). Examples of complex adaptive systems from animal sciences include the murmuration of starlings that fly in groups of thousands, exhibiting elaborate patterns and shifts of flight without any central leadership or coordination. Large schools of fish swim similarly, with the group formation moving collectively and adjusting to predators without requiring individual fish to possess knowledge of the whole formation. The science of biological complexity seeks to understand how simple rules underlying individual agent behaviors unfold over time to produce elaborate group patterns. Patterns are not only rule dependent but also highly sensitive to initial conditions—an idea referred to as chaos—constraining the ability of scientists to predict ultimate formations despite identifying simple rules.
Early specifications of complex adaptive systems indicated that they shared four properties: hierarchical organization; system–environment interactions; subsystem interactions; and internal models (Holland, 1987/2018). Like the whole-part distinction in systems science, complexity science recognizes that complex adaptive systems are formed by building blocks that aggregate into higher orders revealing a hierarchical and/or networked structure. Building blocks interact with each other and with the environment in ways that are guided by internal models. Internal models exist as rules guiding system responses to the environment.
More recent specifications of complex adaptive systems identify four major features that are not entirely different from the four properties named above but are more specific about agent behavior (Holland, 2006). First, complex adaptive systems are characterized by parallelism, where agents send and receive signals or otherwise transmit information. Next, agents not only act but act conditionally based on the signals they receive. Further, the actions that agents perform can be organized into subroutines triggered by environmental cues. Finally, the agents change over time. That is, their composition and behavior change as a result of experience acting in the environment.
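To make these features concrete, the following is a minimal toy sketch in Python, not a model drawn from Holland (2006) or from any evaluation reported here. The `Agent` class, its `threshold` rule, the `step` function, and all numeric values are hypothetical choices made for illustration. The sketch shows agents receiving a shared signal, acting conditionally on it, and changing their rule with experience; subroutines are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agent:
    """Hypothetical agent whose only rule is a threshold on the incoming signal."""
    threshold: float
    active: bool = False

    def act(self, signal: float) -> None:
        # Conditional action: act only when the signal exceeds the threshold.
        self.active = signal > self.threshold

    def adapt(self, signal: float, rate: float = 0.5) -> None:
        # Change over time: experience pulls the threshold toward the signal.
        self.threshold += rate * (signal - self.threshold)


def step(agents: List[Agent]) -> float:
    """Run one round in which all agents receive the same signal, act, then adapt."""
    # Signal exchange: the signal is the share of agents active in the last round.
    signal = sum(a.active for a in agents) / len(agents)
    for a in agents:
        a.act(signal)
    for a in agents:
        a.adapt(signal)
    return signal


# Illustrative starting conditions (assumed values, not data).
agents = [Agent(0.1, True), Agent(0.4, True), Agent(0.9, False)]
signals = [step(agents) for _ in range(3)]
```

Even in this small example, no agent holds a picture of the whole group. Each responds only to the aggregate signal, and the group pattern is a product of those local responses.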
Lessons learned in the study of complex adaptive biological and physical systems have generated important insights for the study of complex adaptive human systems. There are active research programs identifying the complex nature of cities (Bettencourt, 2013; West, 2017), economies (Battiston et al., 2016; Farmer, 2016), and human behavior (Jackson, 2019; Kohler & Smith, 2018). Traditionally, much of complexity science has relied on the mathematization of complex phenomena as well as modeling techniques, such as agent-based modeling and network analysis, that shed light onto the interactions among individual agents, the groups they form, and their responses to environments. Increasingly, qualitative methods and historical analysis serve as useful tools for examining complex phenomena.
Research agendas emanating from complexity science embrace the notion of sensitivity to initial conditions, acknowledging that small variations in the starting conditions of an intervention might dynamically evolve over time to produce substantive and substantial differences in processes and outcomes. From this perspective, evaluation methods that employ control or comparison groups or make strong assumptions about fidelity might fail to recognize complex features of implementation.
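Sensitivity to initial conditions can be illustrated with the logistic map, a standard textbook example from the study of chaos. The sketch below is a stand-in chosen for brevity and is not a model of any education intervention; the function name `logistic_trajectory`, the parameter `r = 4.0`, the starting value, and the size of the perturbation are all assumptions made for illustration.

```python
from typing import List


def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 50) -> List[float]:
    """Iterate the simple rule x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two runs that follow the same rule but differ by one billionth at the start.
baseline = logistic_trajectory(0.2)
perturbed = logistic_trajectory(0.2 + 1e-9)

# Absolute difference between the two runs at each step.
gaps = [abs(x - y) for x, y in zip(baseline, perturbed)]
largest_gap = max(gaps)
```

Inspecting `gaps` shows the two runs starting out nearly identical and then separating until they bear little resemblance to each other, even though the rule itself never changes. This is the sense in which identifying a simple rule does not guarantee the ability to predict where a system will end up.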
Essentially, the study of complex adaptive systems is the study of phenomena that are hidden from easy understanding because of their nonlinear, random, hierarchical, and emergent nature (Krakauer, 2018, p. xxviii). As such, complex adaptive phenomena require us to develop new models and frameworks to reveal the hidden nature of systems functioning.
Emerging Priorities
Given that complex interventions might not conform to commonly held assumptions about comparison and fidelity, new models and frameworks are needed for studying implementation and impact. In this section, we propose three priorities for evaluating implementation in complex interventions. The priorities build from the review of concepts related to complex adaptive systems as they enable the study of implementation agents, interactions among agents, and their response to environments in complex interventions. Moreover, the priorities are intended to facilitate the creation of models that illuminate hidden aspects of implementation in the complex intervention space.
Priority 1: Embedding. As an outside and infrequent observer, or when faced with a rigid data collection protocol that excludes tools for examining emerging properties of an intervention, it can be difficult to identify the intervention’s complex aspects. In this light, it is beneficial for the evaluation team to embed within the local context, interact frequently with stakeholders, and adapt methods based on information learned from these interactions. Strategic engagement of multiple stakeholders—those who affect and are affected by the intervention—can improve evaluators’ knowledge of how the intervention works. Evaluators can design methods for exploring with stakeholders how implementation agents are behaving, how the environment affects their behavior, and what guidelines are emerging based on evidence of impact. Sufficient time embedded in the implementation context is required to support this work.
Priority 2: Sensemaking. Even when embedded, evaluators and other stakeholders might need structured opportunities and dedicated time for sensemaking to understand interactive intervention components adequately. Marchal and colleagues (2014) have learned that in complex interventions, it is very difficult to predict system functioning and outcomes, and instead, evaluators must make sense of processes and their relation to outcomes retrospectively. They advise that the primary task of sensemaking is to build plausible explanations, and not predictive theories, through a process of learning while doing. In partnership with stakeholders, evaluators can design data collection strategies and routines for learning from evidence. These strategies and routines support sensemaking and theorizing not only about the rules that govern implementation agent behaviors within the intervention but also about whether the current rules are sufficient for producing desired outcomes.
Priority 3: Operationalizing. Sensemaking can be improved by operationalizing intervention components as thoroughly as possible. From the outset, a major aim of the implementation evaluation can be to identify the active ingredients of implementation and how they exert their effect on proximal and distal outcomes (Craig et al., 2008). Clear operationalization of intervention components is a traditional aim for intervention evaluation (Shadish, 2010). The challenge in complex interventions arises due to unspecified or underspecified intervention components that are critical for implementation and impact. For example, communication strategies, team learning resources, and practice protocols might support effective implementation of a systems intervention yet not be specified as intervention components at the launch of the intervention. Rather, the need for these implementation supports arises during implementation. Operationalizing emerging implementation supports can help evaluators make sense of how implementation agents interact in specific environments and how those interactions do or do not support desired outcomes.
Summary
The three priorities outlined above (embedding in context, sensemaking routines, and operationalizing strategies) reflect a set of driving questions pertinent to evaluating implementation in complex adaptive systems. Embedding: How can evaluators begin to identify the full set of stakeholders in the current intervention and how stakeholders influence implementation and outcomes through their knowledge and practice? Sensemaking: What do stakeholders need to know about practice, context, and outcomes in order to implement effective practice? What are useful routines for generating learning among stakeholders? Operationalizing: What actions can stakeholders take to learn about and generate desired outcomes in various settings?
In the next section, we describe methodologies for identifying and working with responses to the driving questions.
Methodologies for Implementation Evaluation in Complex Adaptive Systems
In the following sections, we describe three methodologies that we have used to study implementation of complex interventions. The first, agile implementation research, supports embedding and sensemaking through four phases of structured facilitation techniques (Kainz & Metz, 2019). The second, developing a theory of action, uses participatory methods for embedding and operationalizing, so that stakeholders articulate and understand their significant roles and responsibilities within the intervention. The third, practice profiles, promotes deeper sensemaking and operationalizing through participatory methods for identifying key components of and effective practices for the diverse roles in complex interventions (Metz, 2016).
Agile Implementation Research for Embedding and Sensemaking
The term agile comes from the software development field (Kitzmiller et al., 2006; Martin, 2003), where it is used to describe a research culture that prioritizes and supports evidence use and adaptation to achieve desired outcomes. In health care fields, agile implementation research engages implementation support professionals, stakeholders, and evaluators in cycles of inquiry that drive the creation and use of local evidence among implementers for the purpose of achieving desired outcomes. Traditional models of implementation science build from the assumption that evidence-based practices will drive desired outcomes when implemented effectively, with fidelity. More recent work in implementation science promotes continual adaptation as information emerges about which mechanisms drive effective practice in dynamic contexts (Aarons et al., 2019). Evaluating practice in light of new information about strategies for optimizing feasibility, acceptability, and impact can be described as principled implementation or the identification of practices that are well suited to the context and lead to desired outcomes. Agile implementation research supports principled implementation by (1) generating local evidence of the relations between implementation and outcomes, (2) creating routines for stakeholders to make sense of emerging evidence, and (3) launching feedback processes for using the results of sensemaking activities to drive implementation practices, even through adaptation.
Agile implementation approaches shift focus away from bringing evidence-based interventions to scale and instead focus on using evidence in situ to spread effective practice and desired outcomes in dynamic contexts. This method of continual evidence use begins with a shared understanding of the temporal and contextual nature of effective practice. Effective practice encompasses a suite of human actions and interactions that (1) occur in the present time and context, (2) yield compelling evidence that they produce expected and desired outcomes in the present time and context, and yet (3) in many cases are based on research evidence produced in the past and in other contexts (Kainz & Metz, 2019). The shift toward principled implementation is a shift toward time- and context-specific effective practice, which is by definition an agile process.
An agile research agenda responds to calls for rigorous and relevant science that can guide implementation and adaptation in complex settings (Chambers et al., 2013; Ghate, 2015; Glasgow & Chambers, 2012). As part of an agile research agenda, evaluators facilitate stakeholder groups through four interrelated and iterative phases designed to support evidence generation, sensemaking, and implementation/adaptation: (1) convening relevant stakeholders, especially implementing practitioners, (2) articulating models and expectations for action and outcomes, (3) conducting frequent inquiries into expected and unexpected aftershocks in complex systems, and (4) managing knowledge and adapting models and practices to achieve more desirable outcomes.
Convening relevant stakeholders and practitioners
Successful uptake of evidence requires strategic and guided interactions among evaluators, service providers, and other key stakeholders (Palinkas et al., 2011). Evaluators facilitating agile implementation must develop methods for convening relevant stakeholders and practitioners, engaging them in articulating expectations, and revising expectations based on emerging evidence. Successful convening requires particular attention to emerging evidence of how and under what conditions stakeholders and practitioners work together to recognize and sustain shared motivations and drivers of high-impact practices (Rycroft-Malone et al., 2015).
Articulating models and expectations for action and outcomes
Evaluators can facilitate stakeholder meetings with the purpose of identifying models for action and expectations for results in the collaborative implementation setting. Although it might seem unnecessary to articulate models for action, our experience suggests that stakeholders arrive in the planning space with diverse backgrounds, knowledge, and understandings of the intervention rationale and purpose. Articulating models and expectations—especially visual representations of models—can catalyze stakeholder and practitioner knowledge in ways that lead to shared understandings and productive refinements of the implementation strategy. Models that represent the program theory can be useful for guiding change in complex systems (Koleros et al., 2018).
Conducting frequent inquiries with attention to expected and unexpected aftershocks in complex systems
Ongoing learning is a core value of the implementation setting (Chambers et al., 2013; Damschroder et al., 2009), and agile research conducts incremental inquiries designed to yield evidence and promote continual learning. To be sure, gathering and learning from evidence in real time can be challenging for professionals whose roles might not allow the time or flexibility for sensemaking (Antonacopoulou, 2006). Moreover, in many complex systems, the political and authority dynamics of the practice setting can constrain the potential for learning from evidence and adapting interventions accordingly (Lemay & Sa, 2012). For this reason, agile research includes methods for detecting and studying the expected and unexpected aftershocks that might arise while intervening with complex adaptive systems (Walton, 2016). Intentionally studying aftershocks can improve researchers’ models of the system factors that relate to causal mechanisms in context.
Managing knowledge and adapting models and practices to achieve more desirable outcomes
Increasingly, implementation scientists are documenting the techniques and conditions that support adaptations in systems, organizations, and programs that lead to improved outcomes (Aarons et al., 2012, 2019). Evaluators can design methods for creating learning routines for stakeholders, coming to agreement about and archiving decisions based on learning, and using decisions to guide subsequent action and adaptation. Such methods that elicit, structure, and use stakeholder knowledge are referred to as knowledge management techniques. A host of resources for designing participatory methods for knowledge management and intervention adaptation are available to researchers (Castelloe et al., 2002; Chen et al., 2013; Hislop, 2013; Kainz & Metz, 2019; Nicholas et al., 2019; Sterman, 2006). Managing knowledge and implementing adaptations often requires extensive time, and it is important to consider the capacities and constraints of participating stakeholders (Rycroft-Malone et al., 2016). In particular, soliciting practitioners’ perspectives when determining what is possible and desirable can serve as a positive and trust-building foundation for agile implementation work.
Theories of Action for Embedding and Operationalizing
An initiative’s theory of change outlines the predicted causal linkages between the activities conducted as part of the initiative and the expected short-term, intermediate, and long-term outcomes for the population of interest. Ideally, the theory of change also details researchers’ assumptions about how and why they expect a desired change in outcomes to occur in a particular context. In simple interventions involving a single organization, a theory of change is typically sufficient for defining causal linkages and guiding research questions concerning the efficacy and effectiveness of the intervention.
In complex systems interventions, by contrast, a theory of change is necessary but likely insufficient for defining causal linkages and guiding impact evaluations. Instead, a more detailed approach is needed to understand who is responsible for conducting the various components of the intervention, how the work is contextually situated, and what infrastructure and other supports are needed to ensure that change is realized.
We use the term theory of action to describe a tool for articulating roles and responsibilities in complex interventions. Borrowing from the work of Patton (2008, 2010), we start with the claim that a theory of action includes stakeholders’ rationales for what they are attempting to do, how implementation will drive outcomes, and what supports are needed to drive implementation and impact. Expanding from Patton, we have identified a set of intervention components and guiding questions to facilitate participatory action theorizing with stakeholders (see Table 2). Compared to a theory of change, we believe a theory of action is more capable of tracking multiple actors’ shared contributions to complex change processes and outcomes in a systems intervention.
Table 2. Theory of Action Elements and Guiding Questions.
Developing a theory of action at the beginning of an initiative or as an initial evaluation step will maximize its impact on the intervention. In fact, forming a theory of action is ideally the first step in creating an evaluation framework and an evaluation plan. This process requires a strong facilitator, input from key stakeholders from various levels of each organization involved in the initiative, adequate time for sensemaking, and tools and space to enable participants to visualize theory elements (e.g., rooms that accommodate small group interactions; large wall spaces for mapping with sticky notes). It can also be helpful for participants to complete a preliminary survey to start critical thinking about causal linkages, roles, context, and infrastructure in order to expedite the creation of a theory of action later on.
We recommend that work on a theory of action begin by identifying and convening the key stakeholders who will be involved in the process, including leaders and staff involved in implementing the components of the intervention and members of community groups that will benefit from the work. The goal of convening key stakeholders is to understand the range of perspectives and opinions of those affecting and affected by the proposed intervention. This process is not intended to develop a theory of action that perfectly meets all parties’ needs but to encourage participants to ask and answer questions, listen to and learn from others, and build a detailed, living map of the system change initiative.
There are different ways to sequence the work required to develop a theory of action, but the meeting process, occurring over several sessions, will typically address six aspects of the systems intervention: (1) vision, (2) outcomes (long- and short-term), (3) key actions, (4) links between actions and outcomes, (5) change partners/actors, and (6) context. Table 2 provides sample questions to guide the theory of action development process. Additional guiding questions can be added based on an initiative’s particular agenda. For example, if an initiative has a social justice agenda, then researchers can ask questions about equitable actions and outcomes in building the theory of action. The Collaborating for Equity and Justice materials (Wolff et al., 2017) contain useful tools for interventions charged with promoting and evaluating equity dimensions.
For a theory of action to be more than just a one-time mapping exercise, the facilitator/evaluator should pass along tools and build the capacity of local community leaders to lead subsequent discussions as well as future theory development and revision. Ideally, the theory of action will become a dynamic part of an ongoing initiative and serve as a tool for planning sustainable intervention impact.
Practice Profiles for Operationalizing and Sensemaking
Complex challenges often call for complex solutions that do not simply expand or modify preexisting interventions. In this regard, communities seeking equitable and improved outcomes are often unable to use existing manualized programs to address the complex and emerging challenges confronting this goal. In these cases, communities use available knowledge to develop feasible, evidence-based, context-specific interventions that meet the unique needs of a target population. However, these communities regularly begin with strategies that are defined only conceptually. When such interventions lack specification, it is often challenging for community actors to implement the intervention with quality, to improve it over time, and to sustain and scale it to achieve population-level impact (e.g., Hall & Hord, 2006).
Practice profiles help operationalize a conceptually defined strategy through community engagement and research methods, ensuring that it is clear what practitioners will do as they carry out a given intervention. Once an intervention is described in sufficient detail, implementation methods can help to accustom staff to the new organization of work, to use data to continuously improve the intervention, and to ensure that leadership and administrative practices align with new expectations. Creating enabling contexts that leverage and build hospitable funding, acknowledge regulatory and policy environments, engage key stakeholders, and promote ongoing learning will also assist the operationalization process.
Practice profiles are informed by three different disciplines: implementation science, co-creation, and improvement science. Implementation science seeks to understand which approaches best translate research findings into real-world applications and deploys those approaches in different contexts to achieve a range of outcomes (Ramaswamy et al., 2019). A core principle of implementation science is that an intervention must be teachable, learnable, doable, and assessable for those using, evaluating, and supporting the intervention in real-world settings. Practice profiles enhance interventions in each of these respects, promoting shared understandings of an intervention’s core components and providing measurable indicators of success for each component to help researchers and stakeholders support and evaluate the intervention. Co-creation occurs when stakeholders actively engage in all stages of the production and implementation processes, resulting in service models, approaches, and interventions that are optimally contextualized (Metz & Bartley, 2017; Vargo & Lusch, 2004). Contextualizing an intervention means creating an intervention in light of its local delivery setting, including those who deliver the intervention, the system stakeholders, and the children and families expected to benefit from the intervention. Practice profiles seek to engage all individuals with a stake in the intervention to contribute to its development and implementation.
Improvement science emphasizes multiple rapid tests of change by various individuals working under different conditions. Practice profiles employ usability testing, a process for producing rapid cycles of learning from practice (i.e., use of the core components described in a practice profile) in order to answer fundamental questions that drive improvement work: What are we trying to accomplish? How will we know a given change is an improvement? What change can we make that will result in improvement? (Cohen-Vogel et al., 2015; Langley et al., 2009; Lewis, 2015). When usability testing of a practice profile is shaped by causal thinking that links hypothesized solutions to data, we accelerate contributors’ learning and improvement of the intervention and increase the likelihood of sustainability and large-scale implementation. Operational learning (Chambers et al., 2013) is a core value of the practice profile methodology. As described by Damschroder and colleagues (2009), successfully implementing innovations requires “dedicated time for reflecting or debriefing before, during, and after implementation as one way to promote shared learning and improvements along the way” (p. 11).
Practice profiles include the following components:
The philosophy, values, and principles that underlie the intervention.
These guide the practitioners’ decisions and ensure consistency, integrity, and sustainable effort across all practitioners.
A clear description of the core components.
These define the role of practitioners and inform activities in each phase of work. Core components (sometimes called active ingredients) clearly describe the features that must be present to say that the intervention is being used to achieve outcomes.
Operational definitions of the core components.
These describe the activities associated with each core component and allow the intervention to be teachable, learnable, doable, and assessable across a range of contexts. Operational definitions promote functional consistency across practitioners at the service delivery level.
Practical assessments of performance.
These assessments determine whether the intervention is implemented as intended. Fidelity assessments are used to improve practitioner competency and implementation supports such as training and coaching.
Practice Profile Methodology
Developing practice profiles requires a specific methodology (see Figure 1) that ensures the inclusion of research or best practices, the alignment of competencies with the intervention’s theory of change, and the recognition of what works according to the experiences of communities, practitioners, service recipients, and key stakeholders. Teams with diverse perspectives conduct the following interrelated (sometimes overlapping) steps in an iterative process to identify the principles, core components, and activities of practitioners: (1) document review, (2) semi-structured interviews, (3) a systematic scoping review, (4) a vetting and consensus process, and (5) content validation and usability testing. An initial prototype emerges after the first three steps and is refined through the final two steps. We describe these steps in greater detail in the following section.

Figure 1. Practice profile methodology.
Document review
Existing documentation of the intervention is reviewed. This can include descriptions of the program’s theory, logic model, protocols, goals, communication plans, and other tools and resources (e.g., monitoring tools, quarterly reports, and site visit reports). Documents are prioritized and selected for review based on the amount of information they provide about the intervention’s principles and core components. The document review has two purposes: (1) to begin the task of describing the intervention in greater detail and (2) to inform the development of the interview protocol for the semi-structured interviews. Findings from the document review are coded for themes that will help researchers draft a description of the practice profile.
Semi-structured interviews
Individual interviews are conducted with a sample of stakeholders from three categories: (1) individuals who helped to develop the intervention or the underlying theory of change for the intervention, (2) individuals who deliver the intervention, and (3) individuals who experience the intervention. The goal of these interviews is to identify which intervention principles guide successful work with children, youth, adults, and families, as well as which specific activities of the practitioners bring these principles to life. Interview participants are asked to (1) provide examples from the field to illustrate the use of guiding principles and core activities related to the intervention, (2) describe successes and challenges in implementing the innovation, and (3) assess the benefits and challenges of the innovation in supporting their desired outcomes. Findings from the interviews are thematically coded and integrated with findings from the document review to produce an updated draft description of the practice profile.
Systematic scoping review
Scoping reviews (Levac et al., 2010) allow for a rapid systematic review of published work in a broad thematic area. In building a practice profile, the goal of the scoping review is to access and review published research that identifies competencies related to the innovation. The scoping review includes six stages (Arksey & O’Malley, 2005): (1) identifying the research question for the scoping review to address, (2) identifying relevant studies and reports, (3) selecting studies and reports using post hoc inclusion and exclusion criteria based on increasing familiarity with the literature, (4) extracting data to capture process-oriented information, (5) summarizing and reporting results, and (6) consulting community members and key stakeholders for additional insights. Studies and articles are identified through literature searches and a snowballing technique involving key sources such as developers or implementers of the innovation. Themes are summarized and integrated with findings from the document review and interviews to inform the first prototype of the practice profile.
Vetting and consensus building
Community members, practitioners, leadership, and key stakeholders vet the initial prototype of the practice profile, guided by facilitators and prepared questions. This process happens in two phases, typically over the course of several meetings. The first phase of vetting and consensus building focuses on content validation, providing an opportunity for stakeholders who participated in document reviews and interviews to make any needed corrections to the findings incorporated into Prototype 1. The second phase of vetting further integrates the findings across data sources (viz. document review, interviews, and systematic scoping review). For example, when the scoping review yields research evidence incongruent with interview findings, stakeholders are asked to determine how to integrate the findings for each of the principles and core components. Both research and practice evidence are considered in these discussions.
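As an illustration only, the cross-source integration in the second vetting phase can be sketched as a simple tally that flags components on which the three data sources disagree, so that facilitators can bring those items to stakeholders for discussion. The component names, the supports/does-not-support coding, and the function name below are hypothetical; they are not part of the practice profile methodology itself, which relies on facilitated discussion rather than an automated rule.

```python
from typing import Dict, List

# The three data sources named in the text.
SOURCES = ("document_review", "interviews", "scoping_review")


def flag_incongruent(findings: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return components whose sources do not all agree (hypothetical rule)."""
    flagged = []
    for component, by_source in findings.items():
        # Collect the distinct support codes across the three sources.
        votes = {by_source[source] for source in SOURCES}
        if len(votes) > 1:  # at least one source disagrees with another
            flagged.append(component)
    return flagged


# Invented example data, for illustration only.
example_findings = {
    "family engagement": {
        "document_review": True, "interviews": True, "scoping_review": True,
    },
    "weekly coaching": {
        "document_review": True, "interviews": False, "scoping_review": True,
    },
}

items_for_discussion = flag_incongruent(example_findings)
print(items_for_discussion)
```

A flagged component is not resolved by the tally; it simply marks where stakeholders would be asked how to integrate research and practice evidence.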
Testing, validating, and evolving the practice profile
Usability testing uses rapid cycle detection of strengths and gaps related to the evolving innovation with a small sample of cases. By testing the innovation as it is expected to be implemented with only a few examples (e.g., three to five practitioners initiating new services) across agencies/counties/regions, improvements can be made quickly from one cycle to the next. Data synthesized from these practitioners provide feedback on the practice profile’s overall usability. Typically, four or five usability testing cycles involving four or five practitioners each will produce sufficient information to refine the profile and lay a foundation for continuous quality improvement strategies as the innovation is scaled within the practice setting. When consistent challenges occur, implementation supports or the profile itself can be adjusted. Reflection, problem solving, and small cyclical tests of change are the hallmarks of this phase of practice profile development.
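The cycle logic described above can be sketched in a few lines, purely as an illustration. The sketch assumes a hypothetical rule in which a challenge counts as consistent when at least half of the practitioners in a cycle report it and this happens in two or more cycles. The source does not specify such a threshold, and the challenge labels and data are invented.

```python
from collections import Counter
from typing import List, Set

# One cycle = a list of practitioners; each practitioner = a set of reported challenges.
Cycle = List[Set[str]]


def consistent_challenges(
    cycles: List[Cycle], min_share: float = 0.5, min_cycles: int = 2
) -> List[str]:
    """Challenges reported by >= min_share of practitioners in >= min_cycles cycles.

    Both thresholds are assumptions made for this sketch.
    """
    cycles_hit: Counter = Counter()
    for cycle in cycles:
        if not cycle:  # skip a cycle with no practitioners
            continue
        reports = Counter(ch for practitioner in cycle for ch in practitioner)
        for challenge, n in reports.items():
            if n / len(cycle) >= min_share:
                cycles_hit[challenge] += 1
    return sorted(ch for ch, n in cycles_hit.items() if n >= min_cycles)


# Invented data: three cycles with four practitioners each.
example_cycles = [
    [{"unclear referral step"}, {"unclear referral step", "time burden"},
     set(), {"unclear referral step"}],
    [{"time burden"}, {"unclear referral step"}, {"unclear referral step"}, set()],
    [{"time burden"}, {"time burden"}, set(), set()],
]

recurring = consistent_challenges(example_cycles)
print(recurring)
```

In practice, a challenge identified this way would prompt the team to adjust implementation supports or the profile itself before the next cycle.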
Conclusion
In this article, we claimed that many education interventions intending to produce system impacts involve multiple components and intend to impact multiple targets and are, as such, complex interventions. Evaluating the implementation of complex interventions requires attention to unique challenges such as identifying the full set of implementers and interactions among implementers and their contexts. In order to match evaluation complexity to implementation complexity, we provided a set of priorities for evaluating implementation of complex interventions and examples of aligned methodologies. The priorities emerged from our review of the features and properties of complex adaptive systems. Our motivation for examining and incorporating ideas from complexity science was to bolster evaluators’ capacities to identify and describe aspects of systems change. Careful study of systems change through a complexity lens might support greater success achieving desired outcomes, especially those related to promoting education equity as depicted in Table 1.
The priorities identified here are consistent with and build on emerging views on producing change and promoting equity in complex systems. For example, the Institute of Medicine (2014) of the National Academies of Sciences, Engineering, and Medicine in the United States convened a workshop exploring evaluation methods uniquely designed for complex global health interventions. Recommendations synthesized from the workshop presentations included introducing evaluative thinking early in project development; creating and evolving theories of change as lessons are learned; replacing “Did it work?” questions with more nuanced questions about what worked, what worked less well, what could be scaled, and what could be sustained; creating an ecology of evidence use that includes strategies for learning from evidence to support systems change; and investing sufficient resources, time, commitment, trust, and relationships to ensure improvement.
Better research methods can give us more reliable answers to our current questions. Better models can help us ask the right questions (Smaldino, 2019). A complex systems model for education systems intervention would offer a promising tool for examining intervention capacities to improve long-term learning outcomes for children in varying contexts of the United States. The mechanisms necessary to improve learning outcomes will most likely extend beyond classrooms and include factors related to the environments in which young children develop and the policies, social structures, and history that shape those environments. New models about such mechanisms can in turn promote the development of effective policy and practice for equitable learning opportunities and outcomes.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research and/or authorship of this article: Duke Endowment (Grant No. 18-06-SG0-A).
