Abstract
Past research has shown student-teacher relationships (STRs) are associated with student outcomes, including improvements in academic achievement and engagement and reductions in disruptive behaviors, suspension, and risk of dropping out. Schools can support STRs universally and systematically by implementing universal, school-wide, and class-wide programs and practices that aim to facilitate high-quality STRs. This study applied meta-analytic and common element procedures to determine effect sizes and specific practices of universal approaches to improving STRs. The universal programs with the largest effects were Establish-Maintain-Restore and BRIDGE. Other programs demonstrated moderate effects in one study, with combined effect sizes revealing smaller effects. The common elements procedure identified 44 practices teachers can implement to promote positive STRs, with 14 proactive and direct practices. Programs with the largest effects, in general, contained the most proactive and direct practices for improving STRs. Implications of these findings and future research recommendations are discussed.
Schools are inherently social environments with a myriad of opportunities to build relationships. Past research has shown student-teacher relationships (STRs) are associated with a variety of positive student outcomes, including increases in academic achievement and engagement and reductions in disruptive behaviors, suspension, and risk of dropping out (Cornelius-White, 2007; Quin, 2016; Roorda et al., 2011). Schools can support STRs universally and systematically by implementing school- and class-wide programs and practices that facilitate positive, high-quality STRs. These universal programs can leverage preventative practices that foster positive STRs instead of relying on reactive strategies, such as reprimands and removal from the classroom, to address behaviors manifesting from poor STRs. Moreover, evidence-based practice should be informed by the highest level of research: synthesis and meta-analytic procedures (Gersten et al., 2005). The present study systematically analyzed school- and class-wide programs that aim to improve STRs utilizing two procedures, meta-analysis (Cooper et al., 2009) and the common elements approach (Chorpita et al., 2007), to evaluate universal programs linked to improved STRs, determine which programs have been associated with the largest effect sizes, and distill practices commonly seen across effective programs.
The Student-Teacher Relationship: Conceptualizations and Theoretical Underpinnings
The dominant paradigm in research used to conceptualize STRs is grounded in attachment theory (Pianta, 2001). Much of the research on attachment has focused on caregiver-child relationships (Ainsworth, 1989; Bowlby, 1988). Bowlby posited these early relationships between caregivers and children inform subsequent relationships, including STRs (Pianta, 1999; Riley, 2010). Other researchers argued that these inner working models of relationships are dynamic and subject to change throughout the lifespan (Fonagy et al., 1996; Fraley & Shaver, 2000; Riley, 2010). For teachers, school psychologists, and other educational professionals, the idea of dynamic attachment is preferred, as educators can provide security and trust in their relationships with students to adapt their internal working models. This research substantiates the main attachment-driven dimensions of Pianta’s (2001) conceptualization of the STR: closeness, conflict, and dependency, and the importance of STRs in child development. As conceptualized by Pianta, conflict encompasses the teacher’s perspective of whether their relationship is negative, strenuous, and ineffective. Conversely, closeness is defined as whether the teacher perceives their relationship as warm, affectionate, and effective. The third component, dependency, can be conceptualized as the degree to which a student is overreliant on a teacher, struggles with separation, and has inappropriate boundaries with asking questions (Pianta, 2001).
Self-determination theory (also known as self-motivation theory or self-systems theory) offers an additional perspective on the importance of STRs. Student engagement in classrooms has been linked to higher academic achievement and social outcomes (e.g., Dogan, 2017). Self-determination theory, grounded in basic needs theory, assumes that three psychological needs must be addressed for a person to be intrinsically engaged in a task: (1) autonomy, (2) competence, and (3) relatedness (Niemiec & Ryan, 2009; Reeve, 2012). The component most closely related to the STR, relatedness, has also been proposed as a critical component of student engagement (Pianta, 1999). Researchers have theorized that the association between STRs and student outcomes is mediated by student engagement (Appleton et al., 2008; Diperna, 2006) and that student engagement is inherently a relational process (Pianta et al., 2012). Moreover, previous research has suggested that a sense of belonging is a core human need (Lambert et al., 2013) that facilitates engagement and responsiveness in a given setting. In a school setting, lack of belonging is likely to manifest in terms of disengagement, which could involve student withdrawal, truancy problems, and/or acting out behaviors (Battistich & Horn, 1997).
Outcomes Associated With Student-Teacher Relationships
The importance of the STR in affecting student outcomes has been demonstrated across dozens of studies synthesized within three meta-analyses (Cornelius-White, 2007; Quin, 2016; Roorda et al., 2011). High-quality positive STRs have been found to have medium to large positive relationships with student engagement, small to medium positive relationships with academic achievement (Roorda et al., 2011), a positive relationship with students’ sense of belonging (e.g., Birch & Ladd, 1997), and a positive relationship with students’ self-esteem and social skills (Cornelius-White, 2007). From a longitudinal perspective, STRs predict later reading achievement (Valiente et al., 2019) and lead to reductions in problem behavior during the middle school years (Pakarinen et al., 2017). High-quality positive STRs are also inversely related to students’ problematic behavior (Brewster & Bowen, 2004; Quin, 2016; Silver et al., 2005), dropout (Quin, 2016), and suspension (Green, 1998; Quin, 2016).
The associations between STRs and student outcomes have been established for numerous demographic groups and contexts including early childhood (e.g., Pianta & Stuhlman, 2004) and adolescents and young adults in high school (Roorda et al., 2011; Wang et al., 2013). It is critical to begin establishing close relationships between students and teachers at a young age (i.e., kindergarten or early primary school) because it may affect future student outcomes (e.g., social skills, academic achievement; Hamre & Pianta, 2001; Pianta & Stuhlman, 2004) and serve as a protective factor for students who are at risk academically or behaviorally (Baker, 2006; Burchinal et al., 2002; Dearing et al., 2016).
Importance of Prevention
With state and federal education spending continuing to be a point of concern and debate, efficient resource allocation for educational initiatives is increasingly important (Brown et al., 2017; Ladson-Billings, 2006). The public health model of prevention can be utilized to frame schools’ resource allocation and educational programming (Strein et al., 2003). Interventions and supports are delivered across three tiers of support: (1) universal core curriculum and prevention supports for all students, (2) targeted intervention for some students with needs that go beyond the universal supports, and (3) more intensive intervention for a few students with higher levels of need. For the purpose of this review, universal programs are defined as programs implemented at the school- or class-wide level to all students. The general idea is that intensive services are costlier per student, and by investing in prevention (i.e., Tier 1 services), school leaders can limit the number of students who need more intensive, costly services. One example of a school-wide tiered framework that is closely related to STRs is Positive Behavioral Interventions and Supports, because it is a proactive, research-based approach that supports students and teachers by encouraging positive behaviors (e.g., Sugai & Horner, 2006). As stated previously, high-quality strong STRs have been shown to predict later academic, social, emotional, and behavioral adjustment in school (Cornelius-White, 2007; Pianta & Stuhlman, 2004; Quin, 2016; Roorda et al., 2011). Therefore, schools should consider universal interventions that aim to build and maintain strong STRs as an investment in promoting positive student outcomes and preventing problems that warrant more intensive intervention.
Existing Gaps in STR Research
Numerous studies have analyzed the relationship between school- and class-wide programs and STRs (e.g., Baroody et al., 2014; Bierman et al., 2017). Previous conceptual chapters have overviewed factors related to STRs (Pianta et al., 2002). These lower levels of research evidence are important building blocks of evidence-based practice; however, research must be collectively and systematically reviewed. Given the importance of literature consolidation, two gaps in this research area remain: (1) there has been no systematic meta-analysis that examines the magnitude of effect of different universal approaches to improving STRs and (2) it remains unclear what the common practice elements of effective universal approaches to improving STRs are.
While a meta-analysis can be helpful to understand the impact of STR interventions on student outcomes, it does not elicit information specific to overlapping or distinct practices of different STR interventions. A common elements approach was an important feature of the current study to identify commonalities across universal programs by extracting discrete practices that are common or overlapping across STR interventions, which builds on the strength of meta-analysis to identify STR interventions and aggregate findings. In other words, evidence-based programs can be distilled down to a smaller number of specific practice elements (Chorpita et al., 2005; Chorpita et al., 2007). This allows professionals to understand and use the practice elements more common across effective programs. Previous researchers have effectively used the common elements procedure targeting other outcome variables, including the social, emotional, and behavioral outcomes of students (e.g., Sutherland, Conroy, McLeod, et al., 2018). The benefits of understanding and using specific practice elements include the following: (1) researchers gain understanding of the practice elements most commonly seen across effective programs, (2) practitioners can understand and implement practice elements if implementation barriers exist for expensive and extensive manualized programs, (3) researchers and practitioners can glean if specific practice elements are more effective for specific populations, and relatedly, (4) through identifying practices of programs associated with positive outcomes, school leaders can better select manualized programs that match their school or classroom’s individualized needs. This knowledge can inform resource allocation (i.e., time and money) to the most effective practices for improving relationships.
Purpose of the Present Study
In light of the gaps in the current STR research, the purpose of this study was twofold. First, this study involved conducting a systematic meta-analysis of universal programs that lead to improved STRs. Second, this study used common practice elements procedures to better understand the specific practices of effective universal programs that lead to improved STRs. This study was guided by the following two research questions:
Method
Search Strategy
A multimodal search strategy was used to collect peer-reviewed articles, theses, and dissertations, utilizing the following techniques: (1) searches in databases using a combination of key terms, (2) ancestral searches (i.e., footnote chasing), (3) forward citation searching (i.e., searching articles that cite included articles), and (4) email inquiries to the authors of included studies to capture any other unpublished work. More specifically, the process of collecting articles began with searches of five databases: three within EBSCOhost (Academic Search Premier, Educational Source, and ERIC), PsycInfo via Ovid, and ProQuest Dissertations and Theses. Search terms were applied to article titles, abstracts, and search headings particular to each database (see the appendix).
Articles found in the databases were first screened by the lead author at the title and abstract level to determine if the purpose of the study matched the independent (school- and class-wide program) and dependent (STR) variables guiding this meta-analysis. Across all five search engines, 10,229 articles were identified. There were 2,771 duplicates across search engines. An additional 6,690 articles were excluded at the title and abstract levels because they were not relevant to this study. This resulted in the retention of 768 articles at the title and abstract levels that were further examined for eligibility and inclusion in this study (see Figures 1 and 2).

Figure 1. PRISMA flow diagram of database results. Note. PRISMA = Preferred Reporting Items for Systematic Reviews and Meta-Analyses; STR = student-teacher relationship; RCT = randomized controlled trial.

Figure 2. Backward and forward citation search results. Note. STR = student-teacher relationship; RCT = randomized controlled trial.
Inclusion and Exclusion Criteria
The criteria used to determine eligibility for inclusion in this meta-analysis were as follows: (1) included participants in grades preK–12, (2) primarily conducted in an educational setting (public, private, and charter schools included), (3) included a comprehensive outcome measure of the STR, (4) written in the English language, (5) examined a school- or class-wide program that aims to improve STRs, (6) included an effect size or enough information to calculate an effect size, and (7) utilized a randomized controlled trial (RCT) or a quasi-experimental design.
Definitions
Universal Tier 1 programs were defined as core supports that all students in a given setting receive for the purposes of preventing problems from emerging and promoting success-enabling factors, such as engagement and motivation (Fuchs & Fuchs, 2006; Shapiro, n.d.). Within this study, “all students” was further defined as programs implemented at the school- or class-wide level. Therefore, studies in which researchers implemented pullout programs for small groups of students or individual interventions for students were excluded. For the purpose of this study, any program that included school- or class-wide practices that could be implemented in general education classes by the general education teacher with numerous students was considered. If a study analyzed the effects of a discrete practice (e.g., greetings at the door) instead of a school- or class-wide program, the study was excluded, as the purpose of this study was to analyze and distill effective programs. The STR outcome variable was defined as teachers’ and/or students’ perceptions of the quality of their relationships. STR quality is conceptualized with different theories and components; this review did not exclude specific measures based on underlying components. An acceptable measure deemed to assess the construct of STR included items that measured multiple aspects of a relationship based on Pianta’s (2002) STR conceptualization (i.e., features of the individuals, perceptions, and interactions). To help ensure studies included in this meta-analysis used acceptable measures capturing teacher and/or student perceptions of the STR, studies that used a minimal set of items to represent the STR within a broader school climate survey, impairment survey, or conceptual attitudinal survey were excluded.
Moreover, even though student-teacher interactions (STIs) are one component of assessing the STR, they do not capture the perceptual element of STRs that affects how a student or teacher thinks and feels about their relationship based on their shared experiences (Lippard et al., 2017; Pianta et al., 2002). Thus, studies using STIs as the sole outcome measure tapping STRs were excluded. Last, an RCT was defined as a study in which participants were allocated to the program at random, and a quasi-experimental design was defined as a study in which there was both an intervention and comparison control group; however, the study lacked random assignment to group type. Only these types of studies were included to ensure high-quality research studies with stronger internal validity.
A total of 16 studies met the above inclusion criteria. The other 752 studies retained at the title and abstract level were excluded for the following reasons: lacked an STR outcome variable (n = 229), not conducted in the United States (n = 54), not written in English (n = 137), not an RCT or a quasi-experimental design (n = 61), did not provide enough information to calculate an effect size (n = 17), age range of participants was not preK–12 (n = 34), did not include a school- or class-wide program (n = 54; excluded if just a practice or if Tier 2/3 program), not implemented in an educational setting (n = 1), should have been excluded at title/abstract level (n = 138), and had duplicates within database searches (n = 27).
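The screening counts reported above can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the figures stated in the text; the variable names and dictionary keys are illustrative labels, not terms from the original coding materials.

```python
# Counts reported for the database search and title/abstract screening.
identified = 10_229
duplicates = 2_771
excluded_title_abstract = 6_690

# Articles retained for further eligibility review.
retained = identified - duplicates - excluded_title_abstract

# Reasons for exclusion among the retained articles, as reported in the text.
exclusion_reasons = {
    "no STR outcome variable": 229,
    "not conducted in the United States": 54,
    "not written in English": 137,
    "not an RCT or quasi-experimental design": 61,
    "insufficient information for an effect size": 17,
    "participants not preK-12": 34,
    "no school- or class-wide program": 54,
    "not an educational setting": 1,
    "should have been excluded at title/abstract level": 138,
    "duplicates within database searches": 27,
}

excluded_after_review = sum(exclusion_reasons.values())
included = retained - excluded_after_review

print(retained, excluded_after_review, included)
```

Run as written, the totals reproduce the 768 retained articles, the 752 exclusions, and the 16 studies that met the inclusion criteria.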
Next, a backward searching procedure was used by searching the reference lists of the 16 studies to find studies prior search techniques may have missed. In addition, a forward searching process was performed by using Google Scholar’s “cited by” function to identify any later studies that cited the included studies. Authors continued with forward and backward citation searching within all articles that matched the inclusion criteria until no new studies emerged. These procedures resulted in the following articles being identified and retained: (1) backward search technique: 1,379 articles identified and two retained; and (2) forward search technique: 741 articles identified and none retained (see Figure 2). Thus, there was a total of 18 studies retained. Last, 17 authors who emerged within the 18 included studies were contacted via email to inquire about any unpublished results. Of the 17 authors contacted, eight responded. One new article was captured via this search strategy; however, this article lacked enough information to calculate an effect size and was excluded.
A retrospective decision to include studies conducted outside of the United States was made, as STR programs and practices were determined to be relevant across educational settings in different countries. Of the 67 studies initially excluded because they were conducted outside of the United States, 64 did not meet inclusion criteria. Therefore, three additional studies conducted outside of the United States were included in this meta-analysis. Authors of these three additional studies were contacted and identical backward and forward citation searching followed. No new articles were found using these techniques. The inclusion of these three studies resulted in a final pool of 21 articles for coding and analysis.
Meta-Analysis
Coding Scheme
An a priori coding scheme was utilized with the 21 included articles. The variables coded were as follows: (1) study characteristics including author, year, title, journal name, publication type, and search strategy; (2) participant demographics including number of students and teachers, grade, program, age of students, geographical region, percentage of students who are on free or reduced-price lunch, race of students and teachers, gender of students and teachers, and the education and experience of teachers; (3) study quality including design, attrition, measurement tool used, the validity and reliability of the measurement tool used, and the analysis used; and (4) effect variables including the type of effect size, the effect size coefficient (for conflict, closeness, or an overall STR), standard errors of the effect size, mean STR pretreatment, mean STR posttreatment, and standard deviations of the means. Effect sizes were reported directly from the study or calculated if needed. A statistician assisted with each step of the meta-analysis, including coding, converting effect sizes, and reporting and interpreting findings. For the meta-analysis, articles were coded twice, first by the main author and second by an advanced, trained graduate student. Disagreements were handled by double-checking the codes within articles for consensus. After double-checking the codes within articles, there was 100% consensus.
Converting and Reporting Effect Sizes
If available, Cohen’s d effect sizes were coded as the effect sizes. If means, standard deviations, and sample sizes were given at posttest, these were utilized to calculate effect sizes to increase the ability to compare effect sizes across studies. Some studies reported their own effect sizes (e.g., Hedges’s g); however, different studies controlled for different covariates, which makes it challenging to compare effects across studies. Most of the studies included in this meta-analysis were RCTs (n = 17); thus, means and standard deviations at posttest to directly compare raw effects were utilized. If means and standard deviations were not reported (n = 3), other effects were converted to Cohen’s d effect sizes utilizing the equations in Table 1. However, one study (Nix et al., 2016) reported effects with odds but did not report an odds ratio; thus, the log odds ratio and variance of the REDI program group having a high-stable STR compared to the control group having a high-stable STR were calculated. The log odds ratio was then converted to Cohen’s d (see Table 2).
Table 1. Effect size conversions
Table 2. Converting odds to odds ratio
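To illustrate the kinds of calculations described in this section, the following minimal Python sketch computes Cohen’s d from posttest means, standard deviations, and sample sizes, and converts a log odds ratio to d. The formulas are the standard textbook versions (pooled standard deviation; logit conversion d = ln(OR) × √3 / π) and are assumed here, as the exact equations used in the study are those given in its conversion tables. The function names and the input values are hypothetical.

```python
import math


def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference at posttest, using the pooled SD."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def d_standard_error(d, n_t, n_c):
    """Common large-sample approximation of the standard error of d."""
    return math.sqrt((n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c)))


def log_odds_ratio(odds_treatment, odds_control):
    """Log odds ratio from the odds of an outcome in each group."""
    return math.log(odds_treatment / odds_control)


def log_or_to_d(log_or):
    """Logit-method conversion of a log odds ratio to Cohen's d."""
    return log_or * math.sqrt(3) / math.pi


# Hypothetical posttest values for a program group and a control group.
d = cohens_d(mean_t=4.2, sd_t=0.8, n_t=50, mean_c=3.8, sd_c=0.8, n_c=50)
se = d_standard_error(d, n_t=50, n_c=50)

# Hypothetical odds of a high-stable STR in each group.
d_from_odds = log_or_to_d(log_odds_ratio(odds_treatment=2.0, odds_control=1.0))

print(round(d, 3), round(se, 3), round(d_from_odds, 3))
```

Computing d from raw posttest statistics, rather than taking covariate-adjusted effects as reported, keeps the effects on a common footing across studies, which is the rationale given in the text.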
Weighting Effect Sizes
Programs analyzed across numerous studies were aggregated to determine an average effect size for each program only if they utilized the same outcome variable and analyzed the same program. Results were also aggregated across studies to provide an overall, weighted effect size for each outcome variable. When aggregating studies, studies with more precision were weighted more highly utilizing the following inverse-variance weight (Cooper et al., 2009; Lipsey & Wilson, 2001): $w = 1 / SE^{2}$, where $SE$ is the standard error of the effect size.
The combined effects are provided in the Results section. All effect sizes were separated based on the outcome variable reported: STR closeness, STR conflict, and overall STR.
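As an illustration of precision weighting, the sketch below combines effect sizes using the standard inverse-variance weight (w = 1 / SE²) described by Lipsey and Wilson (2001). It assumes a fixed-effect combination, which the text does not specify, and the function name and input values are hypothetical.

```python
import math


def weighted_mean_effect(effects, standard_errors):
    """Inverse-variance weighted mean effect size and its standard error.

    Assumes a fixed-effect model: each study is weighted by 1 / SE^2,
    so more precise studies contribute more to the combined estimate.
    """
    weights = [1 / se**2 for se in standard_errors]
    total_weight = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / total_weight
    se_mean = math.sqrt(1 / total_weight)
    return mean, se_mean


# Two hypothetical studies of the same program and outcome variable.
mean_d, se_d = weighted_mean_effect(effects=[0.2, 0.6], standard_errors=[0.1, 0.2])
print(round(mean_d, 3), round(se_d, 3))
```

In this hypothetical case the more precise study (SE = 0.1) receives four times the weight of the less precise one, so the combined effect lies closer to 0.2 than to 0.6.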
Given the total number of studies and corresponding effect sizes included in this meta-analysis, adequately powered moderator analyses were not possible to perform. Instead, authors descriptively examined differences in effect size estimates according to specific variables of interest (e.g., program type).
Publication Bias
Funnel plots were created in Excel Version 16.2 with a template developed by Van Rhee and Suurmond (2015). This template included Egger’s test of asymmetry; however, readers are urged to interpret the results of Egger’s test with caution. Given the small number of studies in this meta-analysis, the probability of obtaining a significant Egger’s test (signifying publication bias) is low.
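For readers unfamiliar with Egger’s test, the following Python sketch shows the regression that underlies it: each study’s standardized effect (d / SE) is regressed on its precision (1 / SE), and an intercept that departs from zero suggests funnel plot asymmetry. This is a generic illustration, not the implementation in the Excel template; the function name and data are hypothetical, and a p value would additionally require a t distribution with n − 2 degrees of freedom.

```python
import math


def egger_regression(effects, standard_errors):
    """Ordinary least squares fit of (d / SE) on (1 / SE).

    Returns the intercept and its standard error. The ratio of the two
    is the t statistic usually reported for Egger's test.
    """
    y = [d / se for d, se in zip(effects, standard_errors)]
    x = [1 / se for se in standard_errors]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_var = sum(r**2 for r in residuals) / (n - 2)
    se_intercept = math.sqrt(residual_var * (1 / n + mean_x**2 / sxx))
    return intercept, se_intercept


# Hypothetical data in which less precise studies report larger effects.
intercept, se_intercept = egger_regression(
    effects=[0.40, 0.50, 0.55, 0.80],
    standard_errors=[0.10, 0.20, 0.25, 0.50],
)
print(round(intercept, 3))
```

With only a handful of studies, the standard error of the intercept is estimated from very few residual degrees of freedom, which is one reason the test has limited ability to detect asymmetry in small meta-analyses.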
Common Elements Data Extraction/Coding
The studies that demonstrated significant, positive results were further analyzed for common practice elements, which refer to discrete practices associated with universal approaches to improving STRs. Authors were contacted to obtain the manual for their program (n = 12 researchers). Manuals not obtained from authors were searched via Google. Only three manuals (Establish-Maintain-Restore [EMR], BRIDGE, and Banking Time [BT]) were obtained. Other program practices were gathered from the description of the program within the published manuscript. After all the relevant sections of the studies were compiled, each practice was highlighted from these sections. These highlighted practices were compared across programs to create a comprehensive list of all practices. Similar to previous studies, a practice was defined as “a specific behavior or action of a teacher that manipulates features of the physical, interactional, or instructional environment to promote child outcomes” (McLeod et al., 2017, p. 207).
The total list of practices was then distilled. Distillation refers to “the reduction of data to a simpler, smaller data set of meaningful units” (Chorpita et al., 2005, p. 13). First and foremost, there were some comprehensive programs that included multiple practices not related to STRs (e.g., academic content). These teaching practices aimed at improving academic knowledge (e.g., reading, math) were not included. Only program components related to social, emotional, behavioral, and/or relational development were included. This was determined because social, emotional, behavioral, and relational development can affect interactions between teachers and students (e.g., Conroy et al., 2015), and positive STIs can affect the STR (Hartz et al., 2017). In addition, similar practices were combined to create practice elements. This was completed by grouping practices that used the same mechanism (i.e., teacher behavior) to achieve similar goals into discrete practice elements. For example, “behavior specific praise” and “effective use of praise” were combined into one practice element, “praise.” This resulted in a list of 44 common practice elements.
After the first distillation phase, practice elements were categorized according to a 2 × 3 organizational scheme developed in consultation with experts in the field. The 2 × 3 organizational scheme was organized with the following categories: (1) directly or indirectly affecting the STR and (2) proactive, teaching content, or reactive strategy. Direct practices were defined as intentional interactions aimed at improving aspects of the STR, including perceptions and feelings of trust, connection, belonging, respect, and care. For example, if a teacher greeted their children at the door every day expressing care, this was conceptualized as a direct practice. Likewise, if a teacher asked students personal questions to build their relationship with and understanding of that student, this was also conceptualized as a direct practice. Indirect practices included altering external, environmental influences (e.g., classroom management) and students’ skills (e.g., emotion expression and understanding) to indirectly influence STRs through interactions. For example, if classroom rules and expectations were stated clearly and explicitly, students’ behavior may have subsequently improved, which may have influenced the interactions between students and teachers, which could have ultimately improved perceptions of relationship. Likewise, if students learned how to appropriately express their emotions through curriculum, this may have improved communication during interpersonal conflict between teachers and students, which could have ultimately improved relationships.
Last, the organizational scheme differentiated between proactive, teaching, and reactive strategies. The organization was based on antecedent, teaching, and consequent strategies derived from behavior analytic theory and was defined by the temporal occurrence of when the practice occurred in relation to a child’s behavior (i.e., before [proactive] or after [reactive] a child’s behavior). Moreover, proactive strategies were direct or indirect relational practices that were delivered noncontingently and aimed at preventing problem behavior or promoting greater engagement, whereas reactive strategies were practices that occurred contingently in response to an interaction between the teacher and child or a child’s behavior. A teaching practice was differentiated from a proactive practice because a teaching practice was aimed at specifically teaching a student a skill, typically as part of a curriculum or embedded within instruction, for example, teaching a student about how to appropriately communicate, how to self-regulate, or about emotion identification and the physiological bodily changes that occur with emotions. If not part of a curriculum or instruction, we considered the practice a proactive direct or indirect practice. For example, if a teacher utilized the practice of “precorrection,” that was coded as a proactive, indirect practice instead of a teaching practice. Once the practice elements were categorized, three expert consultants in the field specializing in programs designed to improve STRs reviewed the organization of practice elements. In addition, two advanced graduate students trained in the organizational scheme provided input regarding categorization of the common practice elements. Disagreements were discussed until consensus occurred. Operational definitions of each of the categories can be found in Table 3.
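The 2 × 3 organizational scheme can be represented as a small data structure. The Python sketch below is one possible encoding, offered only as an illustration: the class and function names are hypothetical, and the order in which the timing criteria are checked is an assumption rather than a documented coding rule. The example codes "precorrection" as a proactive, indirect practice, as described above.

```python
from dataclasses import dataclass
from enum import Enum


class Pathway(Enum):
    DIRECT = "direct"      # intentional interaction aimed at the STR itself
    INDIRECT = "indirect"  # alters environment or student skills


class Timing(Enum):
    PROACTIVE = "proactive"  # delivered noncontingently, before behavior
    TEACHING = "teaching"    # teaches a skill via curriculum or instruction
    REACTIVE = "reactive"    # delivered contingently, after behavior


def classify_timing(teaches_skill_via_curriculum, contingent_on_behavior):
    """Assign a timing category (assumed precedence: teaching, then reactive)."""
    if teaches_skill_via_curriculum:
        return Timing.TEACHING
    if contingent_on_behavior:
        return Timing.REACTIVE
    return Timing.PROACTIVE


@dataclass(frozen=True)
class PracticeElement:
    name: str
    pathway: Pathway
    timing: Timing


# Precorrection is not part of a curriculum and is delivered before behavior.
precorrection = PracticeElement(
    name="precorrection",
    pathway=Pathway.INDIRECT,
    timing=classify_timing(
        teaches_skill_via_curriculum=False, contingent_on_behavior=False
    ),
)
print(precorrection.pathway.value, precorrection.timing.value)
```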
Table 3. Definitions of practices
Note. These practices can be considered proactive and reactive. They are set up to emphasize expectations and negotiate an if-then arrangement; however, there are also inherent consequent strategies embedded, including feedback and incentives/rewards.
The final list of practice elements was then utilized to code each of the STR programs. For the common elements procedure, programs were coded for practices first by the main author and second by an advanced graduate student. Disagreements were discussed until consensus was reached. Given the number of programs (12) and practices (44) coded in the common elements procedure, there was a total possibility of 528 disagreements. Considering that there were only 23 disagreements, this equated to only 4% disagreements (23/528) across all codes and 96% agreement between coders. After discussions between the main author and graduate student, consensus occurred with 100% agreement. The end product was a program-by-program data set with each practice element coded 1 (present) or 0 (not present). This procedure allowed the researchers to determine frequency counts of each practice element across the effective STR programs. Although this procedure did not allow researchers to determine which practice elements were active ingredients, it did examine patterns of practice elements most common across effective programs.
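The agreement calculation and the frequency counts described above can be sketched in a few lines of Python. The agreement figures (12 programs, 44 practices, 23 disagreements) come from the text; the small program-by-practice data set is entirely hypothetical and is included only to show how 1/0 codes yield frequency counts. Function names are illustrative.

```python
def percent_agreement(n_programs, n_practices, n_disagreements):
    """Total number of codes and percentage of codes on which coders agreed."""
    total_codes = n_programs * n_practices
    agreement = 100 * (total_codes - n_disagreements) / total_codes
    return total_codes, agreement


def practice_frequencies(coding):
    """Count how many programs include each practice element.

    `coding` maps program name -> {practice element: 1 (present) or 0}.
    """
    counts = {}
    for practices in coding.values():
        for practice, present in practices.items():
            counts[practice] = counts.get(practice, 0) + present
    return counts


total_codes, agreement = percent_agreement(
    n_programs=12, n_practices=44, n_disagreements=23
)
print(total_codes, round(agreement))

# Hypothetical codes for three unnamed programs (not data from the study).
example_coding = {
    "Program A": {"praise": 1, "greeting at the door": 1, "precorrection": 0},
    "Program B": {"praise": 1, "greeting at the door": 0, "precorrection": 1},
    "Program C": {"praise": 1, "greeting at the door": 0, "precorrection": 0},
}
print(practice_frequencies(example_coding))
```

As noted in the text, such counts describe how common each practice element is across effective programs; they do not identify which elements are active ingredients.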
Results
Meta-Analysis
Study Characteristics
A total of 21 studies that analyzed 13 programs were included in the meta-analysis. Ranges of the number of schools, teachers, and students included in each of the studies were as follows: schools: 1 to 78 (M = 22.1, n = 17), teachers: 10 to 252 (M = 117.8, n = 17), and students: 50 to 3,331 (M = 500.7, n = 21). The grades of participants included in studies were as follows: preK (n = 10), early elementary (Grades K–2, n = 5), upper elementary (Grades 3–5, n = 2), mixed elementary school (Grades K–5, n = 3), and middle school (Grades 6–8, n = 1). Of the nine studies that reported urbanicity, six were urban and three were mixed (urban, rural, suburban). Of the 13 studies that reported U.S. region, the majority of studies (n = 7) were completed in the northeast region of the United States. Of the three studies completed outside of the United States, two were conducted in Norway and the third was conducted in Belgium. The demographic variables across studies for students and teachers can be found in Table 4. A limited number of studies reported teacher education; thus, education of teachers was not aggregated or reported. The study designs included 17 RCTs and four quasi-experimental designs. Quasi-experimental designs either had teachers report on program usage and predicted their STRs based on that dichotomous variable (yes/no) or did not randomly assign participants to intervention and control groups. Last, 14 of the studies mentioned the consideration of treatment integrity, and only seven studies reported actual percentages of adherence, ranging from 50% to 96.5%. No studies reported on the cost of their program.
Demographic variables (total N = 21 studies)
Note. Studies conducted outside of the United States did not report on student race/ethnicity.
Measurement Tools
All 21 studies captured in the meta-analysis utilized the Student-Teacher Relationship Scale (STRS; Pianta, 2001). The STRS is a 28-item self-report measure completed by teachers. It has 12 items related to conflict, 11 items related to closeness, and five items related to dependency (Pianta, 2001). A shortened version of the STRS is available, which includes only 15 items from the original measure (“Measures Developed,” Pianta, 2018). This shortened version only includes measures of closeness (8 items) and conflict (7 items). Of the 21 studies that utilized the STRS, six utilized the full version and 15 utilized the short version. Even though some studies utilized the full version of the STRS, all studies reported only closeness, conflict, and/or a combined STRS as the outcome variable. Thus, the dependency scale was not included in this meta-analysis. All of the studies reported internal consistency, which ranged across studies (α = .62–.95). Only one study reported moderate validity of the STRS.
Given all studies utilized the STRS, it is critical to analyze the psychometric properties of this measurement tool. Pianta (2001) first outlined the validity and reliability of the STRS in the program manual created for this tool. Test-retest reliability correlations were as follows: closeness (.88), conflict (.92), and dependency (.76). Internal consistency for the entire normative sample was .89. The factor structure of the STRS demonstrated adequate construct validity. The STRS also demonstrated adequate concurrent and predictive validity with behavioral and academic outcomes of students (e.g., behavior problems and competencies in elementary school). Last, the program manual outlines evidence for adequate discriminant validity, as this measure does not correlate with behavioral problems or social competence more than .58, suggesting variance that is explained by relationships and not the other variables. Outside of this program manual, the psychometric properties of the full version of this measure, as well as the short form, have been tested across numerous studies, countries, and languages (e.g., Germany, Turkey, Italy, Greece), suggesting adequate reliability (internal consistency, test-retest) and validity (construct, criterion, factorial) across numerous contexts (e.g., Milatz et al., 2013; Settanni et al., 2015).
Effect Sizes
Effect sizes of programs were separated based on outcome variable reported: closeness, conflict, and overall STR. Combined effect sizes for programs analyzed by multiple studies can be seen in Table 5. As a general guideline, Cohen’s d effect sizes fall within the following categories: small: 0.2 to 0.5, medium: 0.5 to 0.8, and large: 0.8 or greater (Cohen, 1977). Overall, across all STR programs effect sizes ranged from d = −0.11 to 0.65. This range was disaggregated according to the two dimensions of STR: (1) closeness with a range of d = −0.07 to 0.65 and (2) conflict with a range of d = −0.56 to 0.06. The total, combined weighted effect sizes across outcome variables were as follows: closeness: d = 0.22 (SE = 0.03), conflict: d = −0.05 (SE = 0.03), and overall STR: d = 0.26 (SE = 0.03). Programs with the largest effect sizes were EMR (combined effect size d = 0.64) and BRIDGE (d = 0.65), with the Chicago School Readiness Project (CSRP) being associated with the smallest effect size estimate (d = −0.11). However, some programs were evaluated in multiple studies (e.g., EMR), while the effects for other programs were studied only once (e.g., BRIDGE). Other programs demonstrated moderate effect sizes in one study (BT d = 0.52; Responsive Classroom [RC] d = 0.63); however, when combined with other studies, the overall effect size for the program was smaller.
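To make the idea of a combined weighted effect size concrete, the following Python sketch pools study-level effects with inverse-variance weights and labels the result using the Cohen cutoffs quoted above. It rests on assumptions: this section does not state the weighting model, so a fixed-effect inverse-variance scheme is used here purely for illustration, the function names are hypothetical, and the input values are invented and do not come from Table 5.

```python
import math


def cohen_label(d):
    """Label |d| with the cutoffs quoted in the text (0.2, 0.5, 0.8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance weighted mean effect and its standard error.

    Assumption: a fixed-effect model; the study may have used another scheme.
    """
    weights = [1.0 / se ** 2 for se in std_errors]   # more precise studies weigh more
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical inputs for illustration only; NOT values from the meta-analysis.
example_d = [0.10, 0.30, 0.50]
example_se = [0.10, 0.10, 0.20]
pooled_d, pooled_se = pool_fixed_effect(example_d, example_se)
print(f"pooled d = {pooled_d:.2f} (SE = {pooled_se:.2f}), {cohen_label(pooled_d)}")
```

Because the weights are the inverse of each study's sampling variance, a single small study with a large effect moves the pooled estimate less than a large study does, which is one reason a program's combined effect can be smaller than its largest single-study effect.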
Effect sizes (Cohen’s d) of program on STR across studies
Note. RCT = randomized controlled trial.
Denotes studies that computed a standardized mean difference based on dichotomous regression coefficients. Standardized mean differences are based on posttest mean differences between intervention and control group.
Publication Bias
Publication bias was evaluated through visual analysis of funnel plots along with Egger’s test of asymmetry. Visual analysis of the funnel plots suggests that studies with large standard errors and small effects are missing from this meta-analysis, which indicates possible publication bias. However, Egger’s tests for the three funnel plots were not statistically significant: closeness: t = −0.02, p = .99; conflict: t = −0.73, p = .49; and overall STR: t = 1.62, p = .15. The power of Egger’s test to detect bias is low given the small number of studies captured in this meta-analysis. Thus, results should be interpreted with caution, considering the true program effect sizes may be lower.
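For readers unfamiliar with Egger’s test, the sketch below shows the regression in its common form: each study’s standardized effect (d/SE) is regressed on its precision (1/SE), and the intercept is tested against zero. This is a minimal illustration in plain Python, not the authors’ analysis script; the function name is hypothetical, and the input values are invented for demonstration.

```python
import math


def egger_test(effects, std_errors):
    """Egger's regression: return (intercept, t statistic, degrees of freedom)."""
    x = [1.0 / se for se in std_errors]                  # precision
    y = [d / se for d, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    if n < 3:
        raise ValueError("need at least three studies")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance of the ordinary least squares fit.
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept, n - 2


# Hypothetical effect sizes and standard errors; NOT data from this meta-analysis.
demo_d = [0.10, 0.25, 0.30, 0.45, 0.60]
demo_se = [0.05, 0.08, 0.12, 0.18, 0.25]
intercept, t_stat, df = egger_test(demo_d, demo_se)
print(f"intercept = {intercept:.2f}, t({df}) = {t_stat:.2f}")
```

The t statistic is referred to a t distribution with n − 2 degrees of freedom. With few studies the intercept is estimated imprecisely, which is why a nonsignificant result does not rule out bias.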
Common Elements Results
Only the programs that demonstrated statistically significant, positive results were included in the common elements procedure. The authors were primarily interested in determining the most common practice elements across effective programs. In sum, there were 12 programs coded for practice elements: EMR, BT, Playing-2-gether, RC, Head Start (REDI [Research-based, Developmentally-Informed]/Head Start), Tools of the Mind (TOTM), Best in Class, Starting Strong, Kindergarten Summer Readiness Classroom, BRIDGE, Incredible Years Teacher Management Program (IY-TCM), and INSIGHTS. CSRP was not included in the common elements procedure because it demonstrated an overall negative effect on the STR, and the purpose of the common elements procedure is to determine practices common across effective interventions. Furthermore, there were 44 total practices coded across all organizational categories with 14 proactive strategies that aimed at improving STRs (see Table 6).
Organizational scheme of practice components and frequencies across effective programs
Note. Definitions of practices can be found in Table 3.
These practices can be considered proactive and reactive. These practices are set up to emphasize expectations and negotiate if-then arrangements; however, they also embed consequent strategies, including feedback and incentives/rewards.
Of the proactive direct practices, the most common practices seen across effective programs were praise (n = 8), teachers demonstrating respect (n = 5), spending 1:1 time with students to build relationships (n = 5), coaching and validating emotions (n = 5), objective observations to change teachers’ internal representations of STRs (n = 5), getting to know students personally (n = 5), positive to negative ratio of interactions (n = 3), check-ins throughout the day (n = 3), reflective and supportive listening (n = 3), positive greetings at the door (n = 2), expressing care (n = 2), and child-led activities (n = 2). The definitions of these direct proactive practices can be referenced in Table 3. Moreover, the programs that demonstrated medium effect sizes for creating close, positive STRs (d = .50 or greater for closeness or overall STR) exhibited higher frequencies of proactive direct practices (see Table 7). This includes programs that demonstrated a medium or greater effect size in one study, with smaller effect sizes when combined across studies (i.e., BT, RC). The one exception is the program Playing-2-gether, which demonstrated a high percentage of proactive direct practices but a smaller effect size. However, it should be noted the authors of that study reported a larger effect size than what was gleaned in this meta-analysis because they considered the moderator variable of time (pre-/postintervention effects). Another noteworthy finding is the percentage of practices within each program that were categorized as proactive and direct practices in our coding scheme. Sixty-five percent and 89% of practices within EMR and BT, respectively, were categorized as proactive and direct, compared to 26% and 17% for BRIDGE and IY-TCM. Last, when comparing total practices across effective programs, the majority of practices fell within two domains: proactive/direct: 34% and proactive/indirect: 26%.
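The frequency counts and within-program percentages described above follow directly from a program-by-practice table coded 1 (present) or 0 (not present). The Python sketch below illustrates that bookkeeping on a toy coding sheet. Everything in it is hypothetical: the program names, the four practices, their category labels, and the codes are placeholders and do not reproduce Tables 6 or 7.

```python
# Hypothetical category assignment for four placeholder practices.
CATEGORY = {
    "praise": "proactive/direct",
    "one_to_one_time": "proactive/direct",
    "clear_rules_routines": "proactive/indirect",
    "time_out": "consequent",
}

# Hypothetical coding sheet: program -> {practice: 1 (present) or 0 (not present)}.
codes = {
    "Program A": {"praise": 1, "one_to_one_time": 1, "clear_rules_routines": 1, "time_out": 0},
    "Program B": {"praise": 1, "one_to_one_time": 0, "clear_rules_routines": 1, "time_out": 1},
    "Program C": {"praise": 0, "one_to_one_time": 0, "clear_rules_routines": 1, "time_out": 1},
}


def practice_frequencies(coding_sheet):
    """Count how many programs include each practice (the n values in the text)."""
    counts = {}
    for practices in coding_sheet.values():
        for name, present in practices.items():
            counts[name] = counts.get(name, 0) + present
    return counts


def category_share(practices, category_of, category):
    """Share of one program's present practices that fall in the given category."""
    present = [name for name, flag in practices.items() if flag == 1]
    if not present:
        return 0.0
    in_category = [name for name in present if category_of[name] == category]
    return len(in_category) / len(present)


freqs = practice_frequencies(codes)
shares = {
    program: category_share(practices, CATEGORY, "proactive/direct")
    for program, practices in codes.items()
}
print(freqs)
print({program: f"{share:.0%}" for program, share in shares.items()})
```

In this toy example a program's share is computed relative to its own coded practices, so a program with few practices overall can still show a high proactive/direct percentage. That is why counts and proportions are best read together.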
Many of these proactive, direct strategies are ways to make interactions more child-centered, while explicitly trying to build a relationship between a child and a teacher. By praising children, letting them lead games, narrating their actions or labeling their feelings, and getting to know them personally, teachers are expressing care and acceptance to the students. These types of strategies are assumed to improve STRs (e.g., Driscoll & Pianta, 2010).
Frequency of practice components and proportions across programs
Note. EMR = Establish-Maintain-Restore; BT = Banking Time; RC = Responsive Classroom; HSR = Head Start (REDI); TOTM = Tools of the Mind; BC = Best in Class; P2G = Playing-2-gether; SS = Starting Strong; KSRC = Kindergarten School Readiness Classroom; IY-TCM = Incredible Years Teacher Management Program.
Proactive indirect practices were also part of these programs. These are actions teachers can take before they interact with a student to structure the environment of their classroom to facilitate STRs. Through high-quality, well-managed classrooms, students know what to expect and what is expected of them, which affects interactions between teachers and students, and can ultimately improve relationships (Korpershoek et al., 2016). The most common proactive, indirect practices were as follows: establishing clear, predictable classroom rules and routines (n = 8); parental involvement (n = 5); student choice and empowerment (n = 4); clearly established transitions and down time (n = 3); peer-assisted learning strategies (n = 3); sending a positive note home to parents (n = 2); giving students a sense of responsibility (n = 2); teachers using scaffolding skills (n = 2); class-wide meetings (e.g., morning meeting; n = 2); and organizing the physical layout of the classroom to facilitate relationships (n = 2).
Additionally, teachers can proactively teach and bolster skills within students. If teachers instruct students on how to improve their social skills, self-regulation, and overall emotion understanding, this could affect interactions between students and teachers, which ultimately improves relationships. The following teaching content was found within the programs in this common elements procedure: teaching problem-solving skills (n = 8), social skills (n = 5), self-regulation/control (n = 5), emotion understanding (n = 4), emotion expression (n = 3), self-monitoring skills (n = 2), self-esteem (n = 2), and goal-setting (n = 2).
Last, teachers can utilize consequent strategies after a student behavior to change student behavior in the future and to repair STRs that have been damaged. Ways to change student behavior could include positive discipline strategies (n = 7), feedback (n = 6), incentives/rewards (n = 5), time-out (n = 4), daily report cards (n = 3), and behavior contracts (n = 2). One of the most interesting strategies proposed across these programs is how to repair relationships between students and teachers (n = 2). If teachers have interpersonal conflict with their students, they must take time to repair the relationship. The types of strategies proposed could include the teacher taking ownership of the problem, the teacher and the student working together to find a win-win solution, the teacher showing effort to understand the student’s perspective, and the teacher suggesting a “fresh start” and/or stating care for the student.
Although moderator analyses comparing programs could not be completed due to the small sample size of included programs, a descriptive analysis of programs with strong effects compared to those with nonsignificant or negative effects is warranted. The program with the largest effects was EMR, which included numerous direct relational practices such as expressing care, getting to know the student, conducting home visits, giving praise, and using restorative relational practices if there is a negative interaction. The programs with the smallest effects were CSRP and TOTM. The authors were unable to obtain a list of specific practices for CSRP, but broad content areas of program components have been discussed in previous studies (Watts et al., 2018), which included classroom management strategies, self-regulation teaching practices, and teacher stress and burnout consultation. TOTM included numerous indirect practices such as teaching students skills in problem solving, self-regulation, and self-monitoring, providing scaffolding of tasks, giving feedback, and having clear classroom rules and routines. TOTM included only two direct relational practices: checking in with the student and objectively observing the student. This brief descriptive analysis suggests that although indirect practices may facilitate positive relationships between students and teachers, direct relational practices may be more potent in improving STRs.
Discussion
The purpose of the present study was to advance the literature on the effects of universal programs promoting STRs by (1) conducting a meta-analysis to examine the effectiveness of programs overall, (2) identifying which programs were associated with the largest effects on STRs, and (3) determining the common practice elements associated with effective universal approaches. Overall, 21 studies were identified that met inclusion criteria, with a total of 13 unique STR programs. The total, combined weighted effect sizes across outcome variables were as follows: closeness: d = 0.22 (SE = 0.03), conflict: d = −0.05 (SE = 0.03), and overall STR: d = 0.26 (SE = 0.03). There was significant variability in effect size estimates for the different STR programs. The programs with the largest effect sizes were EMR (combined effect size d = 0.64) and BRIDGE (d = 0.65). However, as stated previously, EMR was evaluated in two studies, whereas the effect for BRIDGE comes from only one study. Other programs also demonstrated moderate effect sizes in at least one study (BT d = 0.52; Responsive Classroom d = 0.63), yet when combined with other studies, the overall effect size was smaller.
Once effective programs were identified, the authors distilled those programs into 44 unique practices that teachers deliver that can potentially promote positive STRs. Like previous research using distillation procedures to identify common elements (e.g., Sutherland, Conroy, McLeod, et al., 2018), the list of 44 common practice elements provides a taxonomy for the field to continue to investigate the precise ways in which teachers can enhance their relationships with students. Moreover, the organizational scheme used in this study provides researchers and professionals with opportunities to investigate and compare proactive strategies that aim to influence relationships versus reactive strategies or strategies that may indirectly affect STRs. Below, the authors highlight noteworthy findings, contrast the present findings with previous work, and describe the utility, implications, and limitations of these findings.
First, the studies that demonstrated effect sizes above a moderate level (d ≥ 0.50) for creating close, positive STRs (closeness and overall STR) exhibited higher frequencies of proactive, direct practices within their programs. These results do not suggest that these practices are definitively the features of these programs that cause improvements in STRs; rather, they pinpoint practices that may serve as the main active ingredients of effective programs. Future research will need to isolate the impact of these practices to gather evidence regarding whether specific proactive, direct relationship practices serve as active ingredients of effective STR programs. The second noteworthy finding was the percentage of practices within each program that were categorized as proactive and direct practices in our coding scheme. In other words, it is important to compare effective programs that have proportionally more proactive and direct practices (e.g., EMR and BT) with others that have proportionally more indirect practices (BRIDGE and IY-TCM). For example, 65% and 89% of practices within EMR and BT, respectively, were categorized as proactive and direct compared to 26% and 17% for BRIDGE and IY-TCM. This finding suggests that if educators are interested in improving relationships between students and teachers, they may not need to use more complex and potentially expensive programs that package numerous direct and indirect practices. It may be more cost-effective to focus on more feasible and affordable preventative practices that target interactions and relationships between students and teachers. However, the hypothesis that direct practices are likely a more potent active ingredient of STRs relative to indirect practices needs to be confirmed in future research.
Even though we are arguing direct practices may be a more potent practice, this does not mean indirect practices are any less important. Indirect, proactive practices may be creating an educational environment that facilitates direct, proactive approaches. The effects of indirect approaches are likely to be smaller because they are mediated by direct approaches, but they are an important building block for creating classrooms that bolster relationships.
Last, when comparing total practices across effective programs, the majority of practices fell within two domains: proactive/direct: 34% and proactive/indirect: 26%. This finding suggests out of the 12 programs that demonstrated positive effects and were included in the common elements procedure, the majority of practices in these programs were preventative in nature. These results are not surprising, as the purpose of this study was to analyze universal preventative programs implemented at Tier 1 in schools; however, they substantiate the importance of proactive approaches for improving STRs.
The overall effect size across studies for conflict was very small (d = −0.05). Programs that focused more on bolstering students’ social, emotional, and behavioral skills (e.g., IY), in general, demonstrated stronger effects for decreasing conflict between teachers and students. Overall, as previous authors have mentioned, teachers’ perceptions of conflicting relationships with students may be more challenging to change (Cappella et al., 2012). Newer studies have suggested ways for repairing relationships that have endured interpersonal conflict (e.g., EMR; Cook, Coco, et al., 2018). These strategies, such as having the teacher take ownership for part of the problem, having the teacher and the student work together to find a win-win solution to the problem, showing effort to understand the student’s perspective, and having the teacher suggest a “fresh start” and/or state they care for the student (e.g., “I know we had a rough day yesterday, but I am so glad you are in my class today”), should be evaluated in future research. A gap in the research still exists regarding how to decrease conflict between students and teachers, as opposed to increasing closeness.
Although many practices appeared frequently across programs, this does not indicate those practices are the most effective. It may indicate those practices are easier to implement in schools, cost less, or use fewer resources. For example, across the proactive and direct practices, praise was most commonly seen across programs (n = 8) compared to home visits, which were found in only one program. The higher count of programs including praise does not suggest that praise is more effective than home visits. A promising avenue for future research is to examine the differential effectiveness of more time-intensive yet potentially effective nonclassroom-based practices (e.g., home visits) against easier-to-implement practices delivered by teachers in the classroom.
Characteristics of Studies Captured in Meta-Analysis
Notable findings were present regarding the demographics and characteristics of the studies captured in this meta-analysis. First, all but one study analyzed programs for improving STRs for students in preK or elementary school, with the majority of studies looking at relationships for children in preK (n = 9). A notable research gap was the lack of research in the middle and high school settings. STRs remain important throughout secondary school (Wang & Holcombe, 2010). Understanding the importance of these STRs for adolescents and how to improve them could be addressed in future studies. In addition, as Pianta (1999) mentioned, relationships are complex systems more accurately conceptualized through patterns of interactions over time, across situations, and from multiple modes of analysis. Future studies should address the analysis of STRs through longitudinal research designs to more accurately capture the complexity of STRs.
All of the studies captured in this meta-analysis used the same outcome measure, the STRS (Pianta, 2001). Although this measure has demonstrated evidence of technical adequacy across numerous studies, languages, and cultures (e.g., Settanni et al., 2015), it lacks the student perspective of STRs, which is arguably the most important (Gage et al., 2016). This suggests a need for the creation of validated, standardized STR assessment tools that address the teacher and student perspectives. Utilizing a multimodal, multi-informant measurement process (e.g., interviews, observations, scales) should be the gold standard moving forward. Although using a single measure can be viewed as a limitation, many have argued that to advance science, researchers working in a similar area need to adopt a standard set of instruments so findings across studies can be more accurately compared (Robinson et al., 2009).
Findings Compared to Previous Studies
Other researchers have completed common elements procedures with social emotional learning (SEL) programs (e.g., McLeod et al., 2017; Sutherland, Conroy, McLeod, et al., 2018). McLeod et al. (2017) and Sutherland, Conroy, McLeod, et al. (2018) analyzed common practice elements across efficacious SEL programs for improving a variety of outcome variables, one of them being STRs. Considerable overlap exists among the practices found in these previous studies and the present study (McLeod et al., 2017, p. 208; see Sutherland, Conroy, McLeod, et al., 2018, pp. 81–82), suggesting some of the practice elements captured in this review have support for improving not only STRs but other student outcomes as well (e.g., engagement, social problem solving, problem behaviors: Sutherland, Conroy, McLeod, et al., 2018). Examples of practices that demonstrated overlap across the current study and these previous common elements procedures include, but are not limited to, altering how teachers respond (e.g., supportive listening, praise), teaching students skills (e.g., emotion regulation and self-management skills), classroom management strategies (e.g., establishing routines, student choice, opportunities to respond, precorrection), consequent strategies (e.g., time-out and rewards), and mesosystem factors (e.g., home-school collaboration). With regard to the meta-analytic findings, no previously published meta-analysis has synthesized the effects of programs for improving STRs, but an unpublished systematic review (Weiers, 2017) captured many of the same studies and came to similar conclusions.
Limitations
The findings of this study are tempered by its limitations. The search terms, databases, and inclusion criteria used in this study may not have yielded all relevant studies. The authors attempted to minimize the number of studies missed through use of a scoping search, an extensive application of terms in different fields and databases, and consultation with a library science expert to create a comprehensive search strategy that included all relevant search terms and databases, with inclusion of dissertations and theses to help combat publication bias. Egger’s test of asymmetry was not significant, but visual analysis of the funnel plots suggests publication bias is present in this meta-analysis. Thus, the effect sizes reported in this meta-analysis may not accurately reflect the true program effect sizes.
Another limitation is the subjectivity of the common elements procedure to distill programs into discrete practices. The 2 × 3 organizational scheme was created by the authors of this article to help the field conceptualize and categorize practices; however, other groups may categorize and define these practices differently. This limitation was addressed by having two additional experts in the field consult on the organization of these practices. All of the codes were double-coded by an advanced graduate student.
An additional limitation is that program authors were contacted to obtain access to program manuals; however, only three program manuals were obtained. These three program manuals may have provided more detailed information for coding practices compared to other programs that were analyzed through information provided in journal articles. Due to page limit restrictions in journal articles, a comprehensive list of practices may not have been obtained for these programs. The authors considered it important to provide a detailed list of practices from the available program manuals; however, this decision to code some programs using manuals and some without could have affected conclusions from the common elements procedure.
Another limitation is that the authors of this study only analyzed posttest differences in effect between the treatment and control groups. This was because most of the included studies were RCTs, and the quasi-experimental designs reported no pretest differences between groups. However, research suggests the effectiveness of relationship-building programs can depend on preintervention relationship status and the moderator of time. If preintervention STR status is more conflicting, these interventions are typically more effective. This is because STRs typically have a ceiling, making it difficult for close, positive STRs to become stronger. Some programs that provided effect sizes considered the moderator of time, and their effect sizes were reportedly larger than the effect sizes gleaned in this meta-analysis (e.g., Vancraeyveldt et al., 2015).
The last limitation that needs to be addressed is the current state of research that has been completed in this content area and, consequently, what can be appropriately done with these results. We suggest one of the main implications of these findings is to be able to train teachers on specific practices and to use modular approaches versus expensive, comprehensive, manualized programs. However, this study only identified high-frequency practices that occur across effective programs. This does not suggest that each practice identified in this review is an active ingredient or has direct effects on the STR by itself. Future research needs to determine which of these practices are active ingredients and which have the largest effects on the STR. Some studies have already begun looking at the effects of these discrete practices (e.g., greetings at the door; Cook, Fiat, et al., 2018); however, more extensive research needs to be completed before our suggested practice implications can be fully realized.
Implications for Research and Practice
The practice elements captured in this study may inform practice and offer content for inclusion in teacher professional development. Given teachers’ continued reports of feeling overwhelmed and unprepared to handle problematic behavior and the link between poor STRs and problem behavior (Begeny & Martens, 2006; Freeman et al., 2013), targeted training could fill this void. Indeed, discrete relationship practices, such as the 5:1 positive-to-negative STI ratio, have been shown to decrease disruptive behavior and improve student engagement (Cook et al., 2017). However, simply training teachers on these practices may not lead to sustained implementation (e.g., Collier-Meek et al., 2018). Schools need to provide a range of implementation strategies (e.g., ongoing training, consultation, audit, and feedback) that support teachers’ adoption, delivery, and sustained use of these practices (Cook et al., 2019).
Findings also have implications for manualized versus potentially more nimble, customizable modular approaches. Research has indicated allowing teachers flexibility and autonomy increases the implementation and sustainability of practices (e.g., Han & Weiss, 2005). Modular approaches enable individuals to select specific practices among an array to tailor to the context (i.e., environment and students). This suggests that schools may not need to train teachers on more time-intensive, strict, and potentially costly manualized programs; rather, teachers could utilize a more adaptable, modular approach to address their specific classroom needs and relationships, which may also increase buy-in, a common implementation barrier (Forman et al., 2009). The field must consider teacher buy-in as a potential implementation barrier of these practices. Some of the practices depicted in this study are easy and virtually free to implement (e.g., praise or restoring a relationship through skillful conversation), which reduces some of the known barriers to teacher implementation (e.g., teacher buy-in or workload stress). Research should continue to examine the differential effects and implementation of manualized and modular approaches to promoting STRs.
Moreover, this study provides a list of potential practice elements that could be implemented at a universal tier within multitiered systems of support. These practices are cheap, easy universal practices that could supplement, and be easily embedded within, other universal strategies and tiered systems of support (e.g., SEL curricula, positive behavioral interventions and supports). Integrated prevention has been offered as a more effective approach than stand-alone approaches (Domitrovich et al., 2010). Research has supported the notion that combining programs and practices from different theoretical perspectives produces better outcomes than any single program alone (Cook et al., 2015). The practices elucidated in this study can potentially prevent the need for more costly, intensive services considering positive STRs are inversely related to a variety of negative outcomes for students later on, including problem behavior, suspension, and dropout (e.g., Silver et al., 2005; Quin, 2016). Therefore, implementing evidence-based, universal practices to build high-quality STRs may serve as a protective factor for students and a cost-effective investment of school resources. Research that examines approaches that intentionally integrate STR practices with other programs represents an important avenue for future research.
This is the first study to consolidate universal programs that improve STRs. It offers the research field a list of practice elements to study in future research. This study offers the field new hypotheses: (1) direct and proactive practices found in this study are most effective at improving STRs and (2) a mediation relationship exists between indirect practices, STIs, and STRs. Future research can test these hypotheses, replicate this meta-analysis, continue to conduct RCTs of universal programs, and study the effects of discrete practices on the STR to determine the most potent practices for improving STRs in everyday practice.
Although this study identified STR programs and distilled them, this study did not systematically identify all studies empirically testing discrete STR practices, such as positive greetings at the door (Cook, Fiat, et al., 2018). Future research should build on the findings from this study by conducting a comprehensive review of studies examining the effects of discrete relationship practices. Such research will help provide a common nomenclature for the field of discrete STR practices, as well as apply innovative methods to examine different dimensions of the identified practices that have implications for producing change in real-world educational conditions outside of research, such as fidelity of implementation, feasibility, malleability, and impact.
Conclusion
In summary, this study represented an effort to meta-analyze the extant literature on universal approaches to promoting STRs and perform a distillation process to identify and categorize common practice elements across efficacious programs. This study identified 44 potential practices across programs found to be effective at improving STRs, with those programs with the largest effects using higher proportions of proactive, direct practices than other types of practices. A large percentage of the practices delineated from this study overlap with previous studies that sought to find common practices across evidence-based programs. School leaders and educational professionals may utilize this information to inform teachers’ practices. We hope this research will stimulate future research that endeavors to identify discrete relationship-building practices that are lower cost yet produce a high yield, as well as examine moderators of the potential impact of universal programs to develop a better understanding of with whom and under what conditions such programs work.
Authors
LAURIE KINCADE is a fifth-year doctoral candidate in the Department of Educational Psychology at the University of Minnesota, Twin Cities, 250 Education Sciences Building, 56 East River Road, Minneapolis, MN 55455.
CLAYTON COOK holds the John and Nancy Peyton Endowed Chair in Child and Adolescent Wellbeing at the University of Minnesota and is a professor of educational psychology in the College of Education and Human Development, Burton Hall, 178 Pillsbury Drive SE, Minneapolis, MN 55455.
ANNIE GOERDT is a third-year doctoral student in the Department of Educational Psychology at the University of Minnesota, Twin Cities, 250 Education Sciences Building, 56 East River Road, Minneapolis, MN 55455.
