Abstract
Potential bidirectional associations between preschool classroom overactive (or externalizing) and underactive (or internalizing) behaviors and language and literacy skills (i.e., vocabulary and listening comprehension) were examined in a sample of children enrolled in Head Start (N = 297). Cross-lagged panel models were estimated within a structural equation modeling (SEM) framework using data gathered through teacher ratings and direct assessments developed for use in preschool programs serving diverse populations of young children. Significant associations varied by type of behavior and language and literacy skill. Higher overactive behavior in the fall was associated with lower listening comprehension skills in the spring, whereas higher underactive behavior in the fall was associated with lower vocabulary skills in the spring. In addition, lower listening comprehension skills in the fall were associated with higher levels of underactive behavior in the spring. Implications for future research, policy, and practice are discussed.
Over the past decades, there have been heightened national concerns regarding young children’s social and emotional development. Early childhood research indicates that high proportions of preschool children exhibit social-emotional and behavioral needs (Feil et al., 2005) that are negatively associated with children’s early academic learning (e.g., Denham, 2006; Domínguez Escalón & Greenfield, 2009; Ladd et al., 2006; Raver, 2002; Thompson & Raikes, 2007). These concerns are greater for young children whose families live in poverty; studies with low-income preschool samples suggest that up to one third of the children served in public preschool or community-based programs serving children from low-income families exhibit moderate to clinically significant social-emotional or behavioral needs that can interfere with classroom learning (Barbarin, 2007; Feil et al., 2005).
Many early childhood researchers have identified significant associations between behavioral needs and academic underachievement (e.g., J. M. McDermott et al., 2013; Montes et al., 2012; Reyes et al., 2020), with many focusing on language and literacy outcomes (e.g., Bulotsky-Shearer & Fantuzzo, 2011; Chow et al., 2018; Chow & Wehby, 2018; Fantuzzo et al., 2003; McClelland et al., 2007). Early language and literacy skills have also been identified as important predictors of children’s future school success (Duncan et al., 2007; Justice et al., 2008; Whitehurst & Lonigan, 1998) and found to serve as protective factors for children at risk of academic difficulties in other domains (Burchinal et al., 2008). Behavioral needs early in childhood have been negatively associated with a host of language and literacy skills such as letter naming (e.g., Bulotsky-Shearer & Fantuzzo, 2011), receptive and expressive vocabulary (e.g., Bub et al., 2007; Bulotsky-Shearer, Bell, Romero, et al., 2012; Fantuzzo et al., 2003), phonemic awareness (e.g., Bulotsky-Shearer & Fantuzzo, 2011), and reading (e.g., Bulotsky-Shearer et al., 2011). These findings underscore the need for early identification and classroom-based interventions that attend to young children’s social-emotional and behavioral needs so that children can engage more successfully in classroom activities that foster these important language and literacy skills.
Although the evidence linking children’s early social and emotional adjustment to language and literacy skills is clear and convincing, it is important to keep in mind that “children’s early academic skills and emotional adjustment may be bidirectionally related” (Raver, 2002, p. 4) and that children’s language and behavior may also influence each other over time (Bichay-Awadalla et al., 2019). In addition to linking social and emotional adjustment to language and literacy outcomes, research has documented the negative influence language delays may have on children’s behavioral outcomes (e.g., Bornstein et al., 2013; Brownlie et al., 2004; Chow & Wehby, 2018; Petersen & LeBeau, 2020; Peyre et al., 2016; Qi et al., 2006, 2019). Together, these research findings suggest that these important areas of development—behavior and language and literacy—may mutually influence each other over time during preschool. Research efforts that examine the potential bidirectional associations between behavior and language and literacy skills would be beneficial in informing our understanding of children’s early engagement in classroom learning and the development of classroom-based interventions to promote early learning (Raver, 2002).
Snow (2007) provides a conceptual model for understanding the dynamic relations between classroom behavior and language and literacy development. He argues that children’s school readiness comprises multiple domains that are distinct but very much interrelated (Snow, 2007). In other words, he argues that children’s capacities are unique but dynamically and transactionally influence each other over time. The majority of research efforts to date, however, have focused on examining how capacities or skills in a certain domain influence development in other domains in a unidirectional manner. Examinations of dynamic interrelations among key school readiness competencies have been underexplored (Snow, 2007). With the advent of more sophisticated structural equation modeling (SEM) approaches, it is possible to examine some of these complex associations (Martens & Haase, 2006). Using a cross-lagged panel SEM design, the current study examines potential bidirectional associations between overactive (e.g., aggression, opposition, and inattention/hyperactivity) and underactive (e.g., shyness and withdrawal) classroom behavior and language and literacy skills within a sample of Head Start children. Overactive behavior, often referred to as externalizing behavior, includes overt behaviors (like pushing, noncompliance, and trouble regulating behavior or paying attention) that are disruptive to classroom routines as well as peer or teacher interactions. Underactive behavior, often referred to as internalizing behavior, is more difficult to observe, but may be displayed by socially withdrawn behavior, as well as fear of or difficulty initiating interactions with teachers or peers. In the sections below, we review relevant literature that explores the associations between these behaviors and language and literacy skills.
Preschool Classroom Behavior and Language and Literacy Outcomes
Developmental psychopathology theory suggests that children’s social and cognitive development is largely influenced by early behavioral adjustment patterns (Cicchetti & Sroufe, 2000). Children’s adjustment patterns within the classroom context—for instance, how well they are able to interact with teachers and peers, regulate their behavior, and focus their attention—influence children’s engagement in learning opportunities where other skills are taught (Bulotsky-Shearer, Bell, & Domínguez, 2012; Pianta, 2006). In addition, aligned with an ecological and transactional model, children who have difficulty navigating the demands of preschool social and learning settings may display overactive or underactive behavior (Sameroff & Fiese, 2000). These behaviors are seen as resulting from the interaction and potential mismatch between the cognitive or social-emotional demands of classroom settings, and they are seen as mutable and preventable, rather than as a static problem within the child (Downer et al., 2010; Lutz et al., 2002). In accord with this theoretical framework, in our study we use context-focused measures that assess children’s behavior within classroom contexts as it is observed by teachers.
Research has examined associations between classroom behavior and concurrent and long-term academic outcomes. Findings highlight the negative association between both overactive and underactive behavior and language and literacy skills early in childhood (Campbell et al., 2000; Fantuzzo et al., 2003, 2007).
Considerable research has examined the negative contributions of overactive behavior, probably in part due to the visibility and/or disruptiveness of such behavior in the classroom. Overactive behaviors, such as aggression and inattention, have consistently been associated with language and literacy difficulties and reading delays (Domínguez Escalón & Greenfield, 2009; Petersen & LeBeau, 2020). Findings suggest that children who exhibit behavior difficulties often receive less feedback from teachers and spend less time in important instructional activities (Vitiello et al., 2012; Williford et al., 2017), which may, in turn, negatively influence their ability to learn other important skills. Research also suggests that these children are often less likely to interact and collaborate with peers in socially mediated learning activities, which are often the focus of preschool environments (Bulotsky-Shearer et al., 2010; Haak et al., 2012).
Fewer studies have examined the associations between underactive behavior and children’s early language and literacy skills. Children who exhibit underactive behavior are less disruptive during classroom routines and, therefore, their needs are more likely to be overlooked. Unfortunately, however, existing studies have found that children who display underactive behavior in the preschool classroom are also at risk of learning difficulties (Bulotsky-Shearer, Bell, & Domínguez, 2012; Domínguez et al., 2011; Fantuzzo et al., 2003; J. M. McDermott et al., 2013; Reyes et al., 2020). For example, underactive behaviors, such as shyness and social withdrawal, have been linked to lower expressive and receptive vocabulary (Bub et al., 2007; Fantuzzo et al., 2003; J. M. McDermott et al., 2013). Shy children have also been found to have difficulty establishing close relationships with teachers and exhibit less social initiative with peers (Bulotsky-Shearer, Bell, Romero, et al., 2012; Rydell et al., 2005). Given the interactive and social nature of early learning, children who exhibit underactive behavior may have difficulty engaging in classroom activities focused on language and literacy.
Preschool Language and Literacy Skills and Behavioral Outcomes
In addition to examining the contribution of overactive and underactive behavior to language and literacy skills, early childhood researchers have examined how early language and literacy delays may contribute to classroom behavioral adjustment (Brownlie et al., 2004; Chow & Wehby, 2018; Qi et al., 2006, 2019). Raver (2002) suggests that young children who exhibit learning difficulties, such as difficulty reading, may become frustrated and, as a result, exhibit more disruptive classroom behavior. Similarly, Qi and colleagues (2006) argue that children with early language difficulties may display overactive behavior because of the frustration they experience by not being able to successfully use language to communicate with adults and other children in their classroom. In fact, elementary school research supports these claims (Chow et al., 2018; Chow & Wehby, 2018). Researchers have found that elementary school–age children with speech and language disabilities are more likely to exhibit difficulties interacting with their peers and often experience peer rejection, which in turn has been found to result in greater externalizing and internalizing behavior (Menting et al., 2011; Vallance et al., 1998).
Preschool studies examining concurrent associations also have found that children who exhibit language delays are more likely to display higher levels of both externalizing and internalizing behavior (Qi et al., 2006). Research links both receptive and expressive language delays to higher parent and teacher ratings of aggressive, withdrawn, or socially anxious behavior (Cohen et al., 1993). Receptive language delays that are linked to reading difficulties are most troublesome as they are associated with higher externalizing behavior but are less likely to be detected (Qi et al., 2006).
Interrelations Between Behavior and Language and Literacy
Researchers have advocated that during early childhood the relationship between behavior and language and literacy is bidirectional, or nonrecursive in nature, with both sets of competencies influencing each other over time (Bichay-Awadalla et al., 2019; Bornstein et al., 2013; Girard et al., 2016; Raver, 2002). A review of literature conducted by Benner et al. (2002), as well as recent meta-analyses (Chow et al., 2018; Chow & Wehby, 2018), suggests that the co-occurrence between the two is high and stable.
Findings from recent studies examining cross-lagged associations between behavior and language ability cite varied bidirectional interrelations. In a longitudinal study, Girard and colleagues (2016) found that expressive language delays predicted externalizing behavior and that this relationship was bidirectional over time. Bornstein and colleagues (2013) similarly found that lower language ability drives higher externalizing behaviors, but found no evidence of the opposite association. A recent study conducted in Head Start found a bidirectional relationship between internalizing behavior and expressive language, but a unidirectional relationship between receptive language and internalizing behavior (Bichay-Awadalla et al., 2019).
Given the increased behavioral and academic risks that children living in underserved communities face as they transition to school, research is needed to more fully understand the underlying dynamic processes by which early behavioral difficulties and language skills influence one another. Using an SEM cross-lagged panel design, the present study examined potential bidirectional associations between overactive and underactive behavior and language and literacy skills in a sample of children enrolled in Head Start. Children’s overactive and underactive behavior and vocabulary and listening comprehension skills were assessed at the beginning and end of the preschool year, using instruments developed specifically for use in preschool classrooms serving diverse populations of young learners. In addition, classroom behavior was assessed using a context-focused teacher report, validated and developed for use within Head Start classrooms. Based on conceptual models highlighting the dynamic relations that exist between classroom behavior and academic skills, we expected relations to be bidirectional. We hypothesized that overactive and underactive behavior at the beginning of the preschool year would be negatively associated with language and literacy skills at the end of the preschool year. We also hypothesized that language and literacy skills at the beginning of the year would be negatively associated with overactive and underactive behavior at the end of the preschool year. Analytic models following guidelines by Martens and Haase (2006) were conducted to examine these hypotheses and are described in more detail in the sections below.
Method
Participants
During the 2008 to 2009 academic year, as part of a larger project, six Head Start centers were selected from a pool of centers (N = 36) that met the following criteria: (a) were located within 20 mi of the university’s campus, (b) had at least two Head Start classrooms, and (c) completed the online version of the local Head Start Program’s system-wide school readiness assessment. A total of 30 classrooms across the six centers were selected and invited to participate in the study. Once center directors, teachers, and teacher assistants from the 30 classrooms consented to participate in the study, a random sample of children was selected from each classroom and parents were asked to consent and allow them to participate in the study. Participating children were stratified into four groups based on both age (3- vs. 4-year-olds) and sex (boys vs. girls). A total of eight to 10 children were selected from each classroom, with roughly equal numbers from each of these four groups. This resulted in a total sample of 297 children. The remaining children were assigned to “alternate” status. A total of 77 children were dropped from the initial sample: 38 due to low English proficiency, 12 because they were not able to engage in the direct assessments, eight due to chronic absenteeism/tardiness, seven due to lack of assent, and seven due to lack of parental consent or relocation to other programs. “Alternate” children of the same sex and age group were continuously selected to replace children who were dropped to ensure that the sample size was adequate for the analyses of the larger project.
The final sample included 297 children; 51.3% were girls. Children’s age at the beginning of the school year ranged from 36 to 59 months (M = 48.29, SD = 6.44). Ethnicity was reported for 99% of the sample; 63% of the children were Black or African American, 30% Latinx, and 7% other ethnicities. The lead teachers in the 30 participating classrooms were all female. Of the 97% of teachers who reported ethnicity, 66% were Latinx, 28% were Black or African American, 3% were Asian, and 3% were White or Caucasian. Of the 97% of teachers who reported education level, 38% had a child development associate (CDA) credential or other associate’s degree, 52% had a bachelor’s degree, and 10% had a master’s degree. Ninety-three percent of the teachers reported the number of years they had been a preschool teacher (M = 12.39, SD = 8.54).
Measures
Teacher ratings of classroom behavior
Children’s classroom behavior was assessed using the Adjustment Scales for Preschool Intervention (ASPI; Lutz et al., 2002). The ASPI is a teacher-report measure developed in collaboration with Head Start teachers and validated for use with low-income preschool populations. It is a context-focused measure and consists of 144 items that describe children’s behaviors across 22 routine classroom situations, including relationships with teachers, peers, and learning tasks. To complete the ASPI, teachers were asked to mark any description that applied to the child. In other words, each item or description was marked if the behavior applied to the child or left blank if it did not apply. Items for each of the five factors were summed to create subscale scores, which were then converted into T-scores (M = 50, SD = 10), based on the standardization sample.
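The T-score conversion described above can be illustrated with a minimal sketch. It assumes the standard linear transformation (z-score rescaled to M = 50, SD = 10); the function name and the norm values in the example are hypothetical and are not taken from the ASPI standardization sample.

```python
def to_t_score(raw_score, norm_mean, norm_sd):
    """Convert a raw subscale sum to a T-score (M = 50, SD = 10).

    norm_mean and norm_sd are the raw-score mean and standard deviation
    of the standardization sample (hypothetical values in this sketch).
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw_score - norm_mean) / norm_sd
    return 50.0 + 10.0 * z


# Illustrative only: a raw sum one SD above a made-up norm mean.
example_t = to_t_score(raw_score=9, norm_mean=6.0, norm_sd=3.0)
print(example_t)
```

Under this transformation, a child whose raw score equals the norm mean receives a T-score of 50, and each standard deviation above or below the mean shifts the score by 10 points.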
Exploratory and confirmatory factor analytic studies with urban, low-income preschool samples in the Northeast have revealed five valid and reliable dimensions: Aggressive, Inattentive/Hyperactive, Oppositional, Withdrawn/Low Energy, and Socially Reticent/Shy (Lutz et al., 2002). Each of the dimensions demonstrated adequate internal consistency, with Cronbach’s alpha coefficients of .92, .78, .79, .85, and .79, respectively. The dimensions have also been found to be replicable and generalizable to important subgroups of the standardization sample (i.e., younger and older children, boys and girls, African American, Latinx, and Caucasian ethnicities). Convergent and divergent validity for the five dimensions has been established with the constructs of interactive peer play, receptive and expressive vocabulary, learning behaviors, and observations of classroom externalizing and internalizing behavior (Bulotsky-Shearer & Fantuzzo, 2004; Fantuzzo et al., 2003, 2005, 2007).
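For readers less familiar with the internal consistency estimates reported above, the following sketch computes Cronbach's alpha from item-level data using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The item responses are invented for illustration and do not come from the ASPI.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of scores per child."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n_children = len(item_scores[0])
    if any(len(scores) != n_children for scores in item_scores):
        raise ValueError("every item needs one score per child")
    # Total score per child: sum across items.
    totals = [sum(child) for child in zip(*item_scores)]
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    item_variance_sum = sum(pvariance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical marked (1) / not marked (0) responses for five children.
hypothetical_items = [
    [1, 0, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 0, 1],
]
alpha_example = cronbach_alpha(hypothetical_items)
print(round(alpha_example, 2))
```

The same variance type (population or sample) must be used for both the items and the totals; the ratio, and therefore alpha, is unaffected by which one is chosen.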
Direct assessment of language and literacy skills
The Learning Express (LE; P. A. McDermott et al., 2009) is an item response theory (IRT)-based test designed to detect growth of cognitive competencies. The content of the test is based on national and regional standards for academic school readiness. A total of 325 items were created and divided between two equivalent forms, each containing items in four subscales: Vocabulary Knowledge, Math, Listening Comprehension, and Alphabet Knowledge. The LE requires children to be tested individually by a trained assessor using a large flip-book. One side of the book is oriented toward the child and presents pictures, letters, or numbers. The other side provides a prompt for the assessor, asking the child to respond either by pointing, verbalizing, or using manipulatives. Two subscales, Vocabulary Knowledge and Listening Comprehension, were used in this study. Items on the Vocabulary Knowledge subscale require children to point to a picture or verbally say what a picture represents. Items on the Listening Comprehension subscale require children to listen to one or several sentences and identify the picture that matches the verbal prompt. Items in each subscale are ordered by difficulty, according to results from two-parameter IRT analysis. The number of items administered to the child is determined by basal and ceiling rules.
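As a brief illustration of the two-parameter IRT logic mentioned above, the sketch below implements the standard two-parameter logistic (2PL) item response function and orders a few items by difficulty. The item identifiers and parameter values are hypothetical, expressed on a generic logit metric rather than the LE reporting scale, and are not drawn from the LE calibration.

```python
import math


def two_pl_probability(theta, discrimination, difficulty):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical items: "a" is discrimination, "b" is difficulty.
items = [
    {"id": "item_c", "a": 1.2, "b": 0.8},
    {"id": "item_a", "a": 0.9, "b": -1.0},
    {"id": "item_b", "a": 1.5, "b": 0.1},
]

# Order items from easiest to hardest, as in a difficulty-ordered subscale.
ordered_items = sorted(items, key=lambda item: item["b"])

for item in ordered_items:
    p = two_pl_probability(theta=0.0, discrimination=item["a"], difficulty=item["b"])
    print(item["id"], round(p, 2))
```

When a child's ability equals an item's difficulty, the model gives a 50% chance of a correct response; the discrimination parameter controls how sharply that probability changes around the difficulty point.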
In a large, ethnically diverse Head Start sample, the LE was demonstrated to have high internal consistency across subscales and across measurement occasions (composite internal consistency estimates were .96 for Vocabulary Knowledge and .93 for Listening Comprehension; P. A. McDermott et al., 2009). It was also shown to be sensitive in detecting both a wide range of individual differences among Head Start preschoolers and change within the course of one preschool year after controlling for children’s age, sex, language status, and prior experience. Concurrent validity for the measure was indicated by significant correlations between the four subscales and teacher ratings of related school readiness domains (Preschool Child Observation Record; High Scope Educational Research Association, 1992).
Procedures
Once approval was obtained from the university’s institutional review board (IRB), the local Head Start partners provided contact information for the selected centers and classrooms. Directors and teachers were then contacted and asked to participate. Once consent at the center and classroom levels was obtained, consent forms were sent to the parents of all eligible children in each classroom. Head Start program partners provided basic demographic data (date of birth, sex, and ethnicity) for all participating children in the study. Teachers were asked to complete the ASPI at the beginning and end of the school year. Participating teachers received packets containing the ASPI for each participating child in their classroom, with instructions on how to fill out the rating scale provided both in person and in writing. After completion of each packet, teachers were compensated with gift cards to buy educational materials for their classrooms.
Concurrent with teacher ratings, direct assessments of children’s language and literacy skills (LE) were conducted at the beginning and end of the school year. Independent assessors were trained to administer the LE according to the test manual. To reduce practice effects, each child was randomly assigned one of the two LE forms in the fall and administered the opposite form in the spring. Assessors tested children individually outside their classroom, in a designated area. The assessment battery took approximately 20 to 30 min. Children were given stickers after completion of the assessment.
Data Analytic Approach
A series of structural equation models was analyzed using Mplus Version 6 (Muthén & Muthén, 1998–2017). SEM was chosen as the most appropriate data analytic strategy because of its ability to create latent variables from a series of indicators or observed variables, incorporate multiple predictor and outcome variables into a single model, and examine cross-lagged panel designs. In addition, Mplus allows for models to be analyzed while accounting for the multilevel structure of the data (e.g., children nested within classrooms). Research has shown that ignoring the dependency inherent in multilevel data biases the estimation of standard errors, which increases the likelihood of Type I error (Raudenbush & Bryk, 2002). By using TYPE = COMPLEX in Mplus, the estimation of the standard errors for the parameters is adjusted to account for the multilevel structure of the data.
For all models, the chi-square value was assessed as an indicator of overall fit, with low, statistically nonsignificant values indicating good model fit (Kline, 2005). As the chi-square is known to be sensitive to sample size (Kline, 2005), three additional fit indices were examined. These included the Bentler comparative fit index (CFI; Bentler, 1990), the root mean square error of approximation (RMSEA; Steiger & Lind, 1980), and the standardized root mean square residual (SRMR; Bentler, 1990). CFI values greater than .90, RMSEA values equal to or less than .08, and SRMR values equal to or less than .10 were considered acceptable and indicated adequate model fit (Browne & Cudeck, 1992; Kline, 2005).
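The cutoffs above can be expressed as a small check, shown here alongside one common textbook formula for the RMSEA point estimate. This is a sketch under stated assumptions: the formula uses N − 1 in the denominator (some programs use N), it ignores any robust or clustering corrections, and the function names and example values are hypothetical rather than output from the study's models.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def meets_cutoffs(cfi, rmsea_value, srmr):
    """Apply the cutoffs named in the text: CFI > .90, RMSEA <= .08, SRMR <= .10."""
    return {
        "CFI": cfi > 0.90,
        "RMSEA": rmsea_value <= 0.08,
        "SRMR": srmr <= 0.10,
    }


# Illustrative values only.
example_rmsea = rmsea(chi_square=6.0, df=4, n=300)
example_check = meets_cutoffs(cfi=0.95, rmsea_value=example_rmsea, srmr=0.03)
print(round(example_rmsea, 3), example_check)
```

Because the numerator is floored at zero, any model whose chi-square is smaller than its degrees of freedom yields an RMSEA of exactly 0.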
Prior to testing the cross-lagged panel models, measurement models were specified to create and examine overactive and underactive latent variables. Separate measurement models were examined for the fall and spring. Next, a series of four SEM models was built, following guidelines on cross-lagged panel designs (Martens & Haase, 2006). To test cross-lagged models, Martens and Haase suggest testing (and comparing the fit of) the following four models (Models A to D): (a) a baseline model with only the autoregressive effects, (b) a model with the autoregressive effects and one latent variable predicting the other at later time points, (c) a model with the autoregressive effects and the other latent variable predicting the former at later time points, and (d) a fully cross-lagged model with the autoregressive effects and both latent variables predicting each other at later time points (Martens & Haase, 2006, p. 883).
Given that our analytic approach accounted for the multilevel structure of the data (using TYPE = COMPLEX in Mplus), the number of parameters that could be estimated in each model was limited by the number of clusters (classrooms) available in our data (Muthén & Muthén, 1998–2017). Therefore, we were unable to include all variables in a single model. Instead, we built separate models for each of the four relationships between behavior and language and literacy skills: overactive behavior and listening comprehension, overactive behavior and vocabulary, underactive behavior and listening comprehension, and underactive behavior and vocabulary. In all models, children’s age, sex, and ethnicity were included as covariates.
The baseline model (Model A) specified autoregressive effects (a path between fall and spring behavior and a path between fall and spring language and literacy), but did not specify paths across the different variables. In other words, in this baseline model no paths were specified between behavior and language and literacy skill. Models B and C built on Model A and specified paths between behavior and language and literacy. Model B included the autoregressive effects as well as a path between fall behavior and spring language and literacy skill. Model C included the opposite: the autoregressive effects and a path between fall language and literacy skill and spring behavior. Finally, the fully cross-lagged model (Model D) specified both paths in addition to the autoregressive effects—it included both a path between fall behavior and spring language and literacy skill, and a path between fall language and literacy skill and spring behavior. Demographic covariates (age, sex, and ethnicity) were included in all models.
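The structure of Models A through D can be summarized as sets of directed paths. The sketch below is only a schematic of the path specifications described above; the variable names are hypothetical labels, the covariate paths are omitted for brevity, and it does not reproduce the Mplus syntax used in the study.

```python
def cross_lagged_paths(model):
    """Return the (predictor, outcome) paths for Model A, B, C, or D."""
    autoregressive = [
        ("behavior_fall", "behavior_spring"),
        ("skill_fall", "skill_spring"),
    ]
    behavior_to_skill = ("behavior_fall", "skill_spring")
    skill_to_behavior = ("skill_fall", "behavior_spring")
    cross_lagged = {
        "A": [],                                      # baseline: autoregressive only
        "B": [behavior_to_skill],                     # fall behavior -> spring skill
        "C": [skill_to_behavior],                     # fall skill -> spring behavior
        "D": [behavior_to_skill, skill_to_behavior],  # fully cross-lagged
    }
    if model not in cross_lagged:
        raise ValueError("model must be one of 'A', 'B', 'C', 'D'")
    return autoregressive + cross_lagged[model]


for label in "ABCD":
    print(label, cross_lagged_paths(label))
```

Writing the models this way makes the nesting explicit: Model A's paths are a subset of those in Models B and C, which are in turn subsets of Model D, and this nesting is what allows chi-square difference tests between them.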
The fit of each of the models described above was compared using the Satorra–Bentler chi-square difference test (Satorra & Bentler, 2001), which is the recommended test when nested models are estimated with the maximum likelihood mean-adjusted (MLM) or maximum likelihood robust standard error (MLR) estimator (UCLA: Academic Technology Services, Statistical Consulting Group, 2012). Only the best fitting models (final models) for each of the four relationships were interpreted and presented in the results.
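For readers who want to see the arithmetic behind the scaled difference test, the following sketch implements the commonly used Satorra–Bentler (2001) formula from the scaled chi-square values, scaling correction factors, and degrees of freedom of two nested models. The function names and the example inputs are hypothetical; they are not values from the models reported here, and the p-value helper covers only the 1-df case.

```python
import math


def scaled_chi_square_difference(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled difference test statistic.

    t0, c0, d0: scaled chi-square, scaling correction factor, and df of the
                nested (more restricted) model.
    t1, c1, d1: the same quantities for the comparison (less restricted) model.
    Returns (scaled difference statistic, difference in df).
    """
    df_diff = d0 - d1
    if df_diff <= 0:
        raise ValueError("the nested model must have more degrees of freedom")
    cd = (d0 * c0 - d1 * c1) / df_diff  # difference-test scaling correction
    if cd <= 0:
        raise ValueError("non-positive scaling correction; statistic undefined")
    return (t0 * c0 - t1 * c1) / cd, df_diff


def chi_square_p_value_1df(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical inputs: baseline model versus a model adding one cross-lagged path.
trd, df_diff = scaled_chi_square_difference(
    t0=150.0, c0=1.10, d0=34, t1=127.0, c1=1.12, d1=33
)
print(round(trd, 2), df_diff, chi_square_p_value_1df(trd) < 0.05)
```

When both scaling correction factors equal 1, the statistic reduces to the ordinary difference between the two chi-square values.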
Results
Descriptive Statistics
Prior to modeling, we examined the data to determine whether they were normally distributed as well as to look for outliers; no assumptions were found to be violated. See Table 1 for descriptive statistics and Table 2 for bivariate correlations between variables.
Table 1. Descriptive Statistics.
Note. ASPI scores represent standardized T-scores (M = 50, SD = 10). Learning Express scores represent IRT-derived ability scores (M = 200, SD = 50). ASPI = Adjustment Scales for Preschool Intervention.
Table 2. Bivariate Correlations for Fall and Spring Scores.
Note. *p < .05.
Measurement Models
We first specified the measurement models for the latent variables. The fall model, in which we created latent variables for fall underactive and overactive behavior, resulted in good fit to the data, χ²(4) = 6.83, p = .15; CFI = 0.99; RMSEA = 0.05; SRMR = 0.02. The factor loadings for shy/withdrawn behavior and low energy were both above .40 and significant (B = 0.70, SE = 0.12, p < .001; B = 0.81, SE = 0.11, p < .001, respectively). The factor loadings for aggressive, oppositional, and hyperactive/inattentive behavior were also above .40 and significant (B = 0.95, SE = 0.04, p < .001; B = 0.73, SE = 0.06, p < .001; B = 0.72, SE = 0.04, p < .001, respectively).
The spring model, in which we created latent variables for spring underactive and spring overactive behavior, also resulted in good fit to the data, χ²(4) = 2.77, p = .60; CFI = 1.00; RMSEA = 0.00; SRMR = 0.02. Consistent with the fall measurement model, the factor loadings for shy/withdrawn behavior and low energy were significant (B = 0.58, SE = 0.17, p < .001; B = 0.94, SE = 0.23, p < .001, respectively). The factor loadings for aggressive, oppositional, and hyperactive/inattentive behavior were also significant (B = 0.93, SE = 0.05, p < .001; B = 0.71, SE = 0.05, p < .001; B = 0.77, SE = 0.04, p < .001, respectively).
Cross-Lagged Panel Models Examining the Association Between Overactive Behavior and Language and Literacy
Overactive behavior and listening comprehension
We first compared the baseline model (Model A) with the Overactive Behavior → Listening Comprehension model (Model B). Results from the Satorra–Bentler chi-square difference test used to compare these two models were significant, χ²diff(1) = 22.17, p < .001, suggesting that the Overactive Behavior → Listening Comprehension model provided a significantly better fit to the data than the baseline model. In contrast, the Listening Comprehension → Overactive Behavior model (Model C) did not provide a significantly better fit than the baseline model, χ²diff(1) = 1.08, p = .30. Because the Overactive Behavior → Listening Comprehension model was the best fitting model so far, we compared the fit of the fully cross-lagged model (Model D) with this model. The chi-square difference test, however, indicated that Model D did not provide a significantly better fit than the Overactive Behavior → Listening Comprehension model, χ²diff(1) = 1.22, p = .27. The best fitting model, Model B, was Overactive Behavior → Listening Comprehension, χ²(33) = 127.12, p < .001; CFI = 0.92; RMSEA = 0.10; SRMR = 0.04 (see Figure 1).

Figure 1. Final models examining associations between overactive behavior and language and literacy.
In this final model, both autoregressive paths were significant, indicating that fall overactive behavior was significantly associated with spring overactive behavior (B = 0.83, SE = 0.05, p < .001) and fall listening comprehension was significantly associated with spring listening comprehension (B = 0.38, SE = 0.05, p < .001). Age was significantly and positively associated with spring listening comprehension (B = 1.00, SE = 0.36, p < .01), indicating that older children had better listening comprehension skills than younger children. Latinx children were also found to be less likely to exhibit overactive behavior, relative to their African American peers (B = −1.58, SE = 0.68, p < .05). Finally, fall overactive behavior was significantly and negatively associated with spring listening comprehension (B = −1.16, SE = 0.29, p < .001), indicating that children who exhibited more overactive behavior at the beginning of the school year obtained lower listening comprehension scores at the end of the school year, after controlling for demographic covariates and initial listening comprehension skills.
Overactive behavior and vocabulary
We first compared the baseline model (Model A) with the Overactive Behavior → Vocabulary model (Model B). Results from the Satorra–Bentler chi-square difference test were not significant, χ2diff(1) = 0.82, p = .36, suggesting that the Overactive Behavior → Vocabulary model did not provide a significantly better fit to the data than the baseline model. We then tested the Vocabulary → Overactive Behavior model (Model C) and compared it with the baseline model. Results from the Satorra–Bentler chi-square difference test were significant, χ2diff(1) = 3.68, p < .05, suggesting that the Vocabulary → Overactive Behavior model provided a significantly better fit to the data than the baseline model. Because the Vocabulary → Overactive Behavior model was the best fitting of these models, we compared the fit of the fully cross-lagged model (Model D) with this model. The chi-square difference test, however, indicated that Model D did not provide a significantly better fit to the data than the Vocabulary → Overactive Behavior model, χ2diff(1) = 1.45, p = .23. The best fitting model, Model C, was Vocabulary → Overactive Behavior, χ2(33) = 141.04, p < .001; CFI = 0.92; RMSEA = 0.10; SRMR = 0.04 (see Figure 1).
In this final model, both autoregressive paths were significant, indicating that fall overactive behavior was significantly associated with spring overactive behavior (B = 0.84, SE = 0.05, p < .001) and fall vocabulary was significantly associated with spring vocabulary (B = 0.52, SE = 0.04, p < .001). Age was significantly and positively associated with spring vocabulary (B = 0.93, SE = 0.29, p < .001), indicating that older children had better vocabulary skills than younger children. Hispanic children were also found to be less likely to exhibit overactive behavior, relative to their African American peers (B = −1.55, SE = 0.66, p < .05). No significant associations were observed between overactive behavior and vocabulary.
Cross-Lagged Panel Models Examining the Association Between Underactive Behavior and Language and Literacy
Underactive behavior and listening comprehension
We first compared the baseline model (Model A) with the Underactive Behavior → Listening Comprehension model (Model B). Results from the Satorra–Bentler chi-square difference test used to compare these two models were not significant, χ2diff(1) = 1.86, p = .17, suggesting that the Underactive Behavior → Listening Comprehension model did not provide a significantly better fit to the data than the baseline model. In contrast, the Listening Comprehension → Underactive Behavior model (Model C) did provide a significantly better fit to the data than the baseline model, χ2diff(1) = 10.46, p < .001. Because the Listening Comprehension → Underactive Behavior model was the best fitting of the first three models, we compared the fit of the fully cross-lagged model (Model D) with this model. The chi-square difference test, however, indicated that Model D did not provide a significantly better fit to the data than the Listening Comprehension → Underactive Behavior model, χ2diff(1) = 1.93, p = .16. The best fitting model, Model C, was Listening Comprehension → Underactive Behavior, χ2(4) = 27.924, p < .05; CFI = 0.97; RMSEA = 0.06; SRMR = 0.04 (see Figure 2).

Figure 2. Final models examining associations between underactive behavior and language and literacy.
In this final model, both autoregressive paths were significant, indicating that fall underactive behavior was significantly associated with spring underactive behavior (B = 0.40, SE = 0.18, p < .05) and fall listening comprehension was significantly associated with spring listening comprehension (B = 0.40, SE = 0.05, p < .001). Age (B = 0.88, SE = 0.37, p < .05) and sex (B = 7.76, SE = 3.99, p < .05) were significantly and positively associated with spring listening comprehension, indicating that older children and girls had better listening comprehension skills relative to younger children and boys, respectively. Hispanic children were also found to obtain higher listening comprehension scores, relative to African American peers (B = 7.91, SE = 3.20, p < .01). Finally, fall listening comprehension was significantly and negatively associated with spring underactive behavior (B = −0.01, SE = 0.01, p < .01), indicating that children who obtained lower listening comprehension scores at the beginning of the school year were more likely to obtain higher underactive behavior scores at the end of the year, after controlling for demographic covariates and initial underactive behavior.
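The stepwise comparison reported for these models can be summarized as a simple decision rule. The sketch below is one possible reading of that procedure, not the authors' code: it assumes an alpha of .05, treats every difference test as having 1 degree of freedom, and, when both unidirectional models improve on the baseline, carries forward the one with the larger difference statistic (a tie-breaking choice the source does not specify). The function names are hypothetical.

```python
import math

ALPHA = 0.05  # assumed significance threshold


def p_value_1df(chi2_diff: float) -> float:
    """Upper-tail p value of a chi-square difference with 1 df."""
    return math.erfc(math.sqrt(chi2_diff / 2.0))


def select_model(diff_b_vs_a: float, diff_c_vs_a: float,
                 diff_d_vs_best: float, alpha: float = ALPHA) -> str:
    """Return the label (A, B, C, or D) of the model retained."""
    # Step 1: test each unidirectional model against the baseline (Model A).
    candidates = {"B": diff_b_vs_a, "C": diff_c_vs_a}
    improved = {m: d for m, d in candidates.items() if p_value_1df(d) < alpha}
    if not improved:
        return "A"  # neither cross-lagged path improves on the baseline
    # Step 2: carry forward the better of the improved models (assumed rule).
    best = max(improved, key=improved.get)
    # Step 3: retain the fully cross-lagged model only if it fits better.
    if p_value_1df(diff_d_vs_best) < alpha:
        return "D"
    return best


# Difference statistics reported for underactive behavior and listening comprehension.
print(select_model(1.86, 10.46, 1.93))
```

With the difference statistics reported for underactive behavior and listening comprehension, this rule retains Model C; with those reported for overactive behavior and listening comprehension (22.17, 1.08, 1.22), it retains Model B.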
Underactive behavior and vocabulary
We first compared the baseline model (Model A) with the Underactive Behavior → Vocabulary model (Model B). Results from the Satorra–Bentler chi-square difference test used to compare these two models were significant, χ2diff(1) = 4.23, p < .05, suggesting that the Underactive Behavior → Vocabulary model provided a significantly better fit to the data than the baseline model. We then compared the baseline model with the Vocabulary → Underactive Behavior model (Model C). Results from the Satorra–Bentler chi-square difference test used to compare these two models were not significant, χ2diff(1) = 0.06, p = .80, suggesting that the Vocabulary → Underactive Behavior model did not provide a significantly better fit to the data than the baseline model. Because the Underactive Behavior → Vocabulary model was the best fitting of these models, we compared the fit of the fully cross-lagged model (Model D) with this model. The chi-square difference test, however, indicated that Model D did not provide a significantly better fit to the data than the Underactive Behavior → Vocabulary model, χ2diff(1) = 0.12, p = .73. The best fitting model, Model B, was Underactive Behavior → Vocabulary, χ2(14) = 20.12, p = .13; CFI = 0.99; RMSEA = 0.04; SRMR = 0.02 (see Figure 2).
In this final model, both autoregressive paths were significant, indicating that fall underactive behavior was significantly associated with spring underactive behavior (B = 0.42, SE = 0.16, p < .01) and fall vocabulary was significantly associated with spring vocabulary (B = 0.51, SE = 0.05, p < .001). Age was significantly and positively associated with spring vocabulary (B = 0.89, SE = 0.29, p < .01), indicating that older children had better vocabulary skills than younger children. A trend was observed between fall underactive behavior and spring vocabulary; children exhibiting underactive behavior at the beginning of the year obtained lower vocabulary scores at the end of the year (B = −0.91, SE = 0.47, p < .055).
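For readers who wish to relate the reported RMSEA values to the model chi-square statistics, the sketch below applies the conventional point-estimate formula. It rests on several assumptions that the source does not confirm: that the full sample (N = 297) contributed to each model, that the formula uses N − 1 in the denominator, and that no robust scaling correction is applied to the RMSEA. The function name `rmsea` is hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Conventional RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


N = 297  # assumption: the full sample was used in every model

# Final model for underactive behavior and vocabulary: chi2(14) = 20.12.
rmsea_underactive_vocab = rmsea(20.12, 14, N)
# Final model for overactive behavior and listening comprehension: chi2(33) = 127.121.
rmsea_overactive_listening = rmsea(127.121, 33, N)

print(f"{rmsea_underactive_vocab:.2f}")
print(f"{rmsea_overactive_listening:.2f}")
```

Under these assumptions, the two estimates round to 0.04 and 0.10, matching the values reported for those models.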
Discussion
Guided by conceptual models that suggest dynamic relations between preschool classroom behavior and academic skills, our study examined the potential bidirectional associations between both overactive and underactive behavior and language and literacy skills. To do so, we gathered data using teacher ratings of children’s behavior within classroom contexts and direct assessments developed specifically for use with young children from underserved communities. We conducted a series of cross-lagged panel designs using SEM. The analytic approach employed involved building cross-lagged models progressively to identify models that best fit these data. Findings extend our understanding of the associations between early behavior and language and literacy skills.
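The progressive model-building approach can be written compactly as a pair of regression equations. The notation below is an illustrative assumption rather than the authors' specification: X denotes the behavior score, Y the language and literacy score, z the vector of demographic covariates, and intercepts are omitted. It also assumes that the baseline model contains only the autoregressive paths and covariates.

```latex
\begin{aligned}
Y_{\text{spring}} &= \beta_{Y}\, Y_{\text{fall}} + \gamma_{XY}\, X_{\text{fall}}
                     + \boldsymbol{\delta}_{Y}^{\top}\mathbf{z} + \varepsilon_{Y},\\
X_{\text{spring}} &= \beta_{X}\, X_{\text{fall}} + \gamma_{YX}\, Y_{\text{fall}}
                     + \boldsymbol{\delta}_{X}^{\top}\mathbf{z} + \varepsilon_{X}.
\end{aligned}
\qquad
\begin{array}{ll}
\text{Model A (baseline):} & \gamma_{XY} = \gamma_{YX} = 0\\
\text{Model B (behavior} \rightarrow \text{language):} & \gamma_{YX} = 0\\
\text{Model C (language} \rightarrow \text{behavior):} & \gamma_{XY} = 0\\
\text{Model D (fully cross-lagged):} & \gamma_{XY},\ \gamma_{YX}\ \text{free}
\end{array}
```

In this notation, Models B and C each free one parameter relative to Model A, and Model D frees one more parameter relative to either of them, which is why each of the comparisons reported in the results has 1 degree of freedom.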
Our first hypothesis, which predicted that children who were rated by their teachers as exhibiting behavioral needs at the beginning of the preschool year would also demonstrate lower language and literacy skills at the end of the preschool year, was partially supported. Higher levels of overactive behavior in the fall were associated with lower listening comprehension skills (but not lower vocabulary) in the spring. Higher levels of underactive behavior in the fall, by contrast, were associated with lower vocabulary skills (but not lower listening comprehension skills) in the spring.
Our second hypothesis, that children who displayed lower language and literacy skills in the fall would also display higher behavioral needs in the spring, was also partially supported. Children with lower listening comprehension scores at the beginning of the year were rated by their teachers as exhibiting higher levels of underactive behavior (but not overactive behavior) at the end of the preschool year. No significant associations were found between fall vocabulary and spring overactive or underactive behavior.
It is important to note that contrary to our hypothesis, the best fitting models did not include bidirectional associations. Instead, the significant associations observed in final models were unidirectional in nature as reported above.
Associations Between Overactive Behavior and Language and Literacy
As mentioned above, higher teacher ratings of overactive behavior (including aggressive, oppositional, and inattentive behavior) early in the preschool year were significantly associated with lower listening comprehension skills at the end of the year. Recent cross-lagged studies have shown inconsistent results examining the relationship between overactive behavior and language and literacy skills (Bichay-Awadalla et al., 2019; Girard et al., 2016). The findings from this study suggest that the relationship between overactive behavior and receptive language skills is unidirectional and are consistent with findings linking overactive behavior with language and literacy difficulties (e.g., Fantuzzo et al., 2003; Harden et al., 2000).
Research highlights an important link between self-regulatory, working memory, and attention skills and the development of important language and literacy skills, such as listening comprehension (Florit et al., 2009; Haak et al., 2012). Listening comprehension requires not only the ability to understand words and verbal statements, but also the ability to sustain attention and remember what is being communicated. A study by Stevens and colleagues (2011) highlights the critical role that selective attention plays in the development of early language and literacy skills. In a study examining the role of memory, Florit and colleagues (2009) found that both short-term and working memory accounted for unique variance in listening comprehension, after controlling for verbal ability. This suggests that overactive behavior may inhibit children’s ability to develop important receptive language skills that require sustained attention. However, research that further examines these important associations in early childhood is needed.
Associations Between Underactive Behavior and Language and Literacy
Two unidirectional associations were found between underactive behavior and language and literacy; however, the direction of these associations differed based on the specific language and literacy skill. First, underactive behavior early in the preschool year was associated with lower vocabulary skills at the end of the year. Previous research has also found that children with underactive behavior exhibit lower vocabulary skills (Fantuzzo et al., 2003; Kaiser et al., 2000; J. M. McDermott et al., 2013; Reyes et al., 2020; Strand et al., 2011). Vocabulary is often fostered throughout the preschool day in a variety of settings or activities. In fact, some researchers believe that unlike print awareness, which often requires intentional, teacher-directed interaction, vocabulary development results from conversations or opportunities that are “more informal and less exclusively mediated by teachers” (Dobbs-Oates et al., 2011, p. 8). A theoretical model proposed by Chow and Wehby (2018) in their systematic review suggests that children who display internalizing behavior, such as shyness and social withdrawal, may be less likely to participate in these informal conversations or interactions that support the development of foundational language skills, such as vocabulary.
Second, listening comprehension scores at the beginning of the year were negatively associated with underactive behavior at the end of the year. This finding is consistent with previous findings showing that children who begin the preschool year with difficulties with receptive language are more likely to display underactive behavior later on (Bichay-Awadalla et al., 2019; Chow et al., 2018). In a study conducted in Australia, researchers found that children who entered preschool with a language impairment were more likely to exhibit internalizing behavior in preschool than children with typically developing language (Prior et al., 2011). In a study conducted with older children in the United Kingdom, Lindsay et al. (2007) also found that children who struggled in a listening comprehension task, requiring children to understand a series of narratives, were more likely to exhibit behavior needs. More recent studies examining the bidirectional relationship between behavior and language and literacy skills in low-income preschool children in the United States also concluded that the relationship between receptive language and underactive behavior is unidirectional in nature, with early difficulties with receptive language leading to underactive or internalizing behavior (Bichay-Awadalla et al., 2019; Chow et al., 2018). In these studies, the authors postulated that early difficulties understanding language might serve as a barrier to social interaction, leading children to withdraw from teachers and peers.
Whereas no bidirectional relationship was found between underactive behavior and language and literacy skills, the findings that fall underactive behavior was associated with lower spring vocabulary skills and that fall listening comprehension skills were associated with spring underactive behavior suggest that a cycle may occur between underactive behavior and language and literacy skills. For example, children with lower receptive language skills may be more likely to be socially withdrawn, which in turn may hinder the development of vocabulary skills. Taken together, these findings highlight the importance of the development of skills that foster social interactions in the preschool classroom. Future research is needed to unpack these complex associations over longer periods of time.
Limitations and Future Directions for Research
Although this study extends our understanding of the complex relations that may exist between underactive and overactive behavior and language and literacy skills early in childhood, we must acknowledge several limitations that should be addressed in future research. First, the study’s sample consisted of urban-residing and predominantly African American and Latinx children. Results indicate that Latinx children in this study received lower teacher ratings of externalizing behavior. Early childhood studies have found that teachers’ perceptions of social and academic ratings and gains are significantly associated with the racial/ethnic match (of teachers and children) for African American children (Downer et al., 2016). Studies that examine potential teacher bias in ratings of children’s behavior in measures like the one used in this study are needed to guide interpretations of our data. Future studies can also examine differences in cultural norms that may explain differences in children’s behavior and teacher ratings. Additional studies examining the generalizability of these findings using samples that include children from other ethnic groups and/or rural areas, who also represent the population of children served in public early childhood programs like Head Start, are also needed.
Future studies could also examine potential moderators. A recent Head Start study found that children’s sex moderated associations between receptive and expressive language skills and internalizing behavior for girls but not for boys (Bichay-Awadalla et al., 2019). Given our study’s sample size (and the number of parameters being estimated in our multilevel models), we were unable to examine whether the findings in this study were moderated by sex. Multiple group analyses that examine potential sex differences in the associations examined in this study could provide a better understanding of potential sex differences. Importantly, future studies could also examine the role of contextual variables that were not collected as part of this study. Family characteristics and classroom variables have been identified as important influences on children’s classroom behavior and academic readiness (e.g., Hamre & Pianta, 2005; Howes et al., 2008; Mashburn et al., 2008). For example, classroom quality dimensions, such as emotional support, have been found to buffer the negative effects of classroom behavior on school readiness outcomes (Domínguez et al., 2011). Similarly, other dimensions, such as instructional support, have been found to buffer the negative effects of family risk factors on school readiness outcomes (Hamre & Pianta, 2005). Examining the potential moderating influence of these variables would allow a more comprehensive understanding of the relations between behavior and language and literacy skills over time.
Finally, to examine potential bidirectional associations, this study collected data at two points in time, using teacher reports of children and direct assessments of language and literacy skills. However, studies that collect longitudinal data and examine how behavior and language and literacy skills influence each other over longer periods of time would provide a richer understanding of how these relations unfold—whether the patterns identified hold or change over time (Selig & Little, 2012). Early childhood studies also bring into question the use of teacher reports for behavior and direct assessment for language. Studies that apply other methodology and use different sources of measurement, such as behavioral observation, are warranted to replicate our findings (Chow & Hollo, 2018; Waterman et al., 2012).
Conclusion and Implications for Practice
In a nationally representative survey, kindergarten teachers reported that a substantial proportion of their students exhibited academic difficulties and lacked the social-emotional and regulatory skills needed to successfully engage in critical classroom learning opportunities (Rimm-Kaufman et al., 2000). Early childhood researchers and practitioners have increasingly highlighted this as a significant cause of concern (Thompson & Raikes, 2007), in particular in very recent times, with concerns about racial equity and the disproportionate patterns of suspension and expulsion of ethnic minority boys from early childhood classrooms (U.S. Department of Education Office for Civil Rights, 2016). To develop identification and intervention efforts that consider the individual needs of all children, we need to better understand how children’s skills in different domains develop and influence each other over time (Snow, 2007). In addition, researchers should develop strength-based models that inform ways of measuring, studying, and intervening to support children’s social-emotional and behavioral skills that are aligned with family culture, home language, race, and ethnic identity (Bulotsky-Shearer et al., 2020; Cabrera & The Society for Research in Child Development, Ethnic and Racial Issues Committee, 2013; García Coll et al., 2002). Importantly, research should acknowledge and address systemic racism and implicit bias embedded in programmatic assessment practices within schools (e.g., teacher-report bias, Chen et al., 2018; NAACP Legal Defense and Educational Fund, Inc. [LDF], 2017).
We examined how early behavioral adjustment and academic skills were related to each other during preschool. Our findings suggested that, during preschool, overactive and underactive behavior are differentially associated with language and literacy difficulties and that language and literacy difficulties are differentially associated with behavioral outcomes. These findings contribute to the broader early childhood field by underscoring the importance of developing early identification and intervention models that address both children’s social-emotional and academic needs. Mental health specialists and educators should be encouraged to share data about children’s behavior and academic achievement to better understand the potential mechanisms that underlie children’s difficulties. Chow and Hollo (2018) and Bornstein et al. (2013) describe several potential mechanisms that underlie language and behavior difficulties, such as language processing, working memory, attention or self-regulatory, and social skills that can be addressed through early intervention services as well as classroom-based instruction. Often, underlying language delays are subtle social-emotional needs that present themselves as behavior needs within early childhood classrooms (Chow et al., 2020).
Preschool classroom instructional models can integrate play-based experiences where educators can support children as they practice both language and social skills simultaneously. Other structured preschool activities, such as interactive storybook readings, can also be designed to incorporate social-emotional skills (e.g., Little talks, Manz et al., 2017) and link language and literacy instruction with social skill modeling (Head Start REDI program, Nix et al., 2016). Chow and colleagues (2020) also discuss several evidence-based strategies that can be used throughout the day in preschool classrooms to support students with language and behavioral difficulties. This requires providing clarity and consistency to children, so that they understand and feel supported as they respond to the behavioral expectations established in the classroom. Teachers can support children in meeting these expectations by providing specific feedback, redirection, and praise. In addition, Chow and colleagues (2020) suggest several language strategies that can be used consistently to help children communicate effectively; these include teachers modeling effective communication strategies and providing wait time to allow students to process information. Finally, consistent scaffolding of appropriate behavior and language use is critical to support all children, particularly those who exhibit both language deficits and behavioral difficulties. Continued research efforts that shed light on these complex associations and how to support children who display language and behavioral difficulties are crucial and will help us ensure all children successfully transition into kindergarten ready to learn.
Footnotes
Acknowledgements
The authors would like to acknowledge the Institute of Education Sciences for funding this research as well as our collaborators in the Miami-Dade Head Start Program.
Authors’ Note
The opinions expressed are those of the authors and do not represent views of the Head Start Program or the U.S. Department of Education.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Training Grant R305C050052 to the University of Miami, Dr. Greenfield, P.I. and development grant R305K060036 to the Miami Museum of Science, Dr. Brown, P.I. and a subcontract from the Museum to the University of Miami, Dr. Greenfield, Co-P.I. This funding source had no role in the study design, the collection, analysis and interpretation of data, or in the writing of the report and the submission for publication.
