Abstract
In the present study, we explored the factor structure as well as validity and reliability of the Spanish version of the Child Adjustment and Parent Efficacy Scale (CAPES) suitable for assessing child behavioural and emotional difficulties (Intensity Scale) and parental self-efficacy (Self-Efficacy Scale) among Spanish-speaking parents from the US, Latin America and Spain. This instrument was designed to be brief and easy to read in order to reach parents with low-literacy levels and from under-resourced backgrounds. Psychometrics for the English version of the CAPES indicates good internal consistency, as well as satisfactory construct and predictive validity of the measure (Morawska et al., 2014). A sample of 174 parents of children (91 boys and 78 girls) from Panama participated in this study. They completed the instrument alongside the Strengths and Difficulties Questionnaire (SDQ) for measuring child psychological problems and the Parenting Task Checklist (PTC) for measuring parental self-efficacy. In addition, a group of 49 parents completed the CAPES at time 1 (T1) and 2 weeks after (T2). Psychometric evaluation of the Spanish version of the CAPES revealed that it has adequate internal consistency and test–retest reliability, as well as satisfactory convergent and discriminant validity. In conclusion, this instrument shows promise as a brief outcome measure to be used in clinical settings and to assess the effects of parenting interventions among Spanish-speaking parents. More research into psychometric properties of the Spanish version of the CAPES is needed, before it can be widely applied in practice.
It is estimated that one in five children (20%) suffer from some sort of emotional or behavioural difficulty (Brauner & Stephens, 2006). Even though almost 90% of the world’s child and adolescent population lives in under-resourced countries where English is not the language spoken (UNICEF, 2012), instruments for assessing emotional and behavioural difficulties have usually been validated only with English-speaking samples. This hinders the growth of mental health research and services in under-resourced countries as well as among linguistically and culturally diverse populations. There is an urgent need for validation of assessment tools in languages other than English to facilitate mental health needs of non-English speaking children worldwide (Collins et al., 2011).
An additional difficulty is that most instruments developed in English-speaking countries tend to be long and complex, and therefore are problematic for parents with low literacy levels (Al-Tayyib, Rogers, Gribble, Villarroel, & Turner, 2002). This is especially important in the context of under-resourced countries or linguistically and culturally diverse populations where literacy levels tend to be low in the general population (Cree, Kay, & Steward, 2012). In order to reach these parents, there is a need for instruments that are brief, easy to respond to, and written in simplified language (Doak, Doak, & Root, 1996).
The Child Adjustment and Parent Efficacy Scale (CAPES; Morawska, Sanders, Haslam, Filus, & Fletcher, 2014) is a brief 27-item questionnaire with two scales; the Intensity Scale measures child behavioural and emotional problems and the Self-Efficacy Scale measures parental confidence in dealing with these behaviours. The English version of the instrument shows good psychometric properties. Between 2010 and 2014, the government of Panama, an under-resourced country in Central America, funded a research project for the dissemination of existing evidence-based parenting programmes. The need for a measure of child psychological problems and parental self-efficacy that was culturally validated in this context and that was designed specifically for parents with low literacy became apparent. Given that the CAPES was a recently developed, brief and easy to read instrument, we decided to translate it into Spanish and determine its psychometric properties with a Spanish-speaking population of parents in Panama. The readability of the Spanish version is likely to be similar to that of the English version (grade 9) and we therefore wanted to test the psychometric properties and the appropriateness of the CAPES with a sample of parents with diverse educational levels (including some with very low educational levels). The aims of the study were: (i) to apply standard principles of cultural adaptation of the instrument through translation-back translation to prepare a Spanish version of the CAPES; (ii) to assess its internal consistency as well as its test–retest reliability; and (iii) to evaluate the construct, discriminant, convergent and concurrent validity of the instrument.
In our evaluation of construct validity, we hypothesized a 2-factor model for the CAPES Intensity in which the scale would measure two domains referring to the child’s (i) emotional and (ii) behavioural difficulties. On the other hand, we hypothesized a 1-factor structure for the CAPES Self-Efficacy.
Materials and methods
Participants
The sample was composed of 174 Panamanian parents. The only inclusion criterion was that parents had a child between 2 and 12 years old. Table 1 shows socio-demographic characteristics of the sample.
Table 1. Socio-demographic characteristics of the sample.
M = mean; SD = standard deviation.
Measures
Child adjustment and parent efficacy scale (CAPES; Morawska et al., 2014)
The CAPES is a 27-item measure that assesses child emotional and behavioural problems and parental self-efficacy. The Behaviour Scale is composed of 24 items that assess behaviour problems (e.g. My child rudely answers back to me) and the Emotional Maladjustment Scale is composed of three items that assess emotional adjustment (e.g. My child worries). Nineteen items are about problematic behaviours (e.g. Loses their temper) while 8 items are about positive behaviours (e.g. Cooperates at bedtime). Parents rate each item from 0 (not true at all) to 3 (true most of the time) depending on how true the statement was for their child in the past 4 weeks. Items are summed to yield a total intensity score (CAPES Intensity Scale) composed of a Behaviour Score and an Emotional Maladjustment Score. Higher scores indicate higher levels of problems. An additional Self-Efficacy Scale asks parents to rate their confidence in handling the 19 problematic behaviour items (Morawska et al., 2014). Parents rate each item from 1 (certain I can’t do it) to 10 (certain I can do it) depending on how confident they are in successfully dealing with their child’s behaviour. Items are summed to yield a total self-efficacy score with high scores indicating higher levels of parental self-efficacy. The internal consistency of the English version of the CAPES was excellent for Intensity (α = .90), Behaviour (α = .90) and Self-Efficacy (α = .96), and adequate for Emotional Maladjustment (α = .74) (Morawska et al., 2014). The English and the final Spanish versions of the CAPES are provided in Appendices A and B.
Strengths and difficulties questionnaire (SDQ; Goodman, 1997)
The SDQ is a screening measure used to identify children’s psychological problems. It consists of 25 items that assess: emotional symptoms, conduct problems, inattention/hyperactivity, peer problems and prosocial behaviour. The English version of the instrument has good test–retest reliability (r = .85; Goodman, 1997). The Spanish version was provided by the authors of the instrument. However, no information on the psychometric properties was available. Thus, we validated the factor structure of the Spanish SDQ in the current sample via Confirmatory Factor Analysis (CFA) in Mplus v. 7.2 (Muthén & Muthén, 1998–2012). The summarized results are outlined in Appendix C. The analysis revealed a 2-factor structure in which emotional and behavioural problems were grouped together as one factor (items referring to child’s emotional symptoms, hyperactivity, peer and conduct problems) whereas the items referring to prosocial behaviours formed a second factor. The factors were named Intensity (12 items) and Prosocial Behaviour (4 items). The internal consistency was adequate for both subscales (α = .71 and α = .60, respectively).
Parenting task checklist (PTC; Sanders & Woolley, 2005)
The PTC is a 28-item tool used to assess task-specific parental self-efficacy. For each item, parents are asked to indicate on a scale of 0 (Certain I can’t do it) to 100 (Certain I can do it) how confident they feel in managing their children’s difficult behaviours across a variety of settings and tasks. The PTC consists of two subscales, Behavioural and Setting. The instrument was translated and back-translated into Spanish for the purpose of this study. Given that it has not been validated in a Spanish-speaking sample previously, we evaluated its factor structure in the current sample via CFA. The summarized results are outlined in Appendix C. The analysis revealed a 1-factor structure assessing behavioural and setting aspects of parental self-efficacy with excellent internal consistency (α = .95).
Procedure
Parents were recruited from four State-owned schools selected by convenience in four different low-income neighbourhoods of Panama City. With the consent of the head teacher, letters of invitation to participate in the study were sent to all parents of the schools (to approximately 2,100 parents). Some parents were also recruited face-to-face, when dropping off and picking up their children at school. Interested parents were asked to firstly attend an induction meeting at their child’s school in which the primary investigator (PI) explained the study and the assessment procedure. Those interested in the study made an appointment to complete assessments a week after the meeting. Written informed consent was collected before the start of the assessment. Assessments took place in groups of 10 parents at their child’s school. As all the measures were self-reported, parents were instructed to complete the questionnaires by themselves one after the other, but the PI was present to answer any questions. Each assessment session lasted approximately 2 hours.
In order to determine test–retest reliability, a sub-sample of 50 parents selected by convenience from two schools were asked to attend a second assessment appointment and complete the CAPES for a second time 2 weeks after initial completion. Contact details of these parents were collected during the first assessment. Two days before the second assessment, parents received a reminder call from the principal researcher. A total of 49 parents attended this second assessment.
Translation of the instrument
The Spanish version of the CAPES was prepared using guidelines for translation of psychological instruments (e.g. Borsa, Damasio, & Bandeira, 2012) including the standard translation-back translation method by Brislin (1970). Three official and authorized Panamanian translators were involved. Firstly, one person translated the instrument from English to Spanish. Then, a second independent person translated the new Spanish version back into English. Finally, a third independent person compared the original and new English translations and made any necessary modifications. For additional quality, expert translators were used and all the authors of this study revised the third synthesized version. The instrument was piloted with a group of 10 parents from similar communities and no changes were suggested.
Data analysis
Sample size
Recent simulation studies indicate that the recommended sample sizes for CFA are N ≥ 200 for theoretical models and N ≥ 300 for population models (Myers, Ahn, & Jin, 2011). On the other hand, empirical research by MacCallum, Widaman, Zhang, and Hong (1999) suggests that the adequacy of factor analysis results depends more on data characteristics than on the sample size employed. With communalities in the range of .5 and a small number of factors, it is not difficult to achieve good recovery of population factors with samples in the range of 100 to 200 cases (MacCallum et al., 1999). Under this guideline the available sample of N = 174 was acceptable for testing the models presented in Figure 1 and Figure 2. However, Structural Equation Modelling (SEM) is a large-sample technique and a larger sample (> 500 cases) would be preferable for obtaining smaller bias in parameter estimates; the presented results should therefore be treated with caution.

Figure 1. Factor structure of the Spanish version of the CAPES Intensity. Standardized estimates. Note. Model fit: WLSMV χ2 (298) = 477.98, p < 0.001; CFI = .919; RMSEA = .059; all models based on N = 174; all factor loadings significant at p < .001; correlation between two factors significant at p < .01; in brackets 95% Confidence Intervals.

Figure 2. Factor structure of the Spanish version of the CAPES Self-Efficacy. Standardized estimates. Note. Model fit: χ2 (149) = 234.83, p < 0.001; CFI = .904; SRMR = .064; RMSEA = .058 (90% CI .043 - .071); all models based on N = 174; all factor loadings significant at p < .001; correlations between error terms significant at p < .001; in brackets 95% Confidence Intervals.
Construct validity
The factor structures of the CAPES were examined via CFA in Mplus v. 7.2. For the CAPES Intensity Scale, we hypothesized a 2-factor model in which one factor would represent behavioural problems and the other emotional problems. For CAPES Self-Efficacy, we hypothesized a 1-factor model. The factor structure of CAPES Intensity was estimated using the Mean- and Variance-adjusted Weighted Least Square estimator (WLSMV) given the ordinal nature of the items (4-point Likert scale) (Muthén, du Toit, & Spisic, 1997). The CAPES Self-Efficacy items were both continuous (10-point Likert type scale) and not normally distributed (see Preliminary Analysis section in the Results); we therefore applied the Robust Maximum Likelihood (MLR) estimator for this scale. The chi-square (χ2) goodness-of-fit statistic, the comparative fit index (CFI), the Root Mean Square Error of Approximation (RMSEA) with 90% CI, and the Standardized Root Mean Square Residual (SRMR; only available for the MLR estimator) were used to evaluate model fit. For the model to be considered to have acceptable fit, RMSEA and SRMR should be < .08 with CFI > .90 (Hu & Bentler, 1999). Models were re-specified based on Modification Indices (MIs), inspection of standardized residuals and theoretical considerations (Kline, 2011). Furthermore, consistent with Stevens (1992), only the items with loadings > .40 were considered as a part of the factor. To assess the extent to which the newly specified model exhibits an improvement over its predecessor, we used two approaches suitable for the two estimators. For the WLSMV, we compared the fit indices of the two models. For the MLR, the chi-square difference test was applied for nested models (using the scaled chi-square and formulas developed by Satorra and Bentler, 1994), and the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) values for non-nested models.
A significant difference in the chi-square value associated with the difference in degrees of freedom suggests that the model with the fewer degrees of freedom fits the data significantly better (Kline, 2011). Smaller values of AIC and BIC indicate better model fit (Schreiber, Stage, King, Nora, & Barlow, 2006).
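The scaled chi-square difference test for nested models can be sketched as follows. This is a minimal illustration, assuming the widely used scaling-correction form of the Satorra-Bentler difference test; the function name and all numeric inputs are hypothetical and are not taken from Table 3.

```python
def scaled_chi_square_difference(t0, df0, c0, t1, df1, c1):
    """Scaled chi-square difference statistic for two nested models.

    t0, df0, c0: scaled chi-square, degrees of freedom and scaling
                 correction factor of the more restrictive (nested) model.
    t1, df1, c1: the same quantities for the less restrictive model.
    Returns (difference statistic, difference in degrees of freedom).
    """
    if df0 <= df1:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction that applies to the difference test itself.
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    if cd <= 0:
        raise ValueError("non-positive scaling correction; test is not defined")
    # t * c recovers each unscaled statistic before the difference is taken.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df0 - df1


# Hypothetical inputs, for illustration only.
statistic, ddf = scaled_chi_square_difference(120.0, 50, 1.2, 100.0, 49, 1.2)
print(statistic, ddf)
```

The resulting statistic is referred to a chi-square distribution with `ddf` degrees of freedom. When both models share the same scaling correction factor, the statistic reduces to the simple difference of the two scaled chi-squares.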
Convergent validity
Three standard approaches were applied to assess convergent validity of the factor structures: (i) we evaluated the statistical significance and magnitude of factor loadings; (ii) checked that the estimate of the Average Variance Extracted (AVE) which is shared between the construct and its measures is above .50; and (iii) tested that estimates of Composite Reliability (CR) were above .70 (Fornell & Larcker, 1981; Morawska et al., 2014).
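As an illustration of criteria (ii) and (iii), AVE and CR can be computed from standardized factor loadings. The sketch below assumes standardized loadings and uncorrelated error terms (so that each error variance equals one minus the squared loading); the function names and example loadings are hypothetical and are not the values reported in the figures.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: squared sum of loadings over itself plus summed error variances.

    Assumes standardized loadings and uncorrelated errors, so each
    indicator's error variance is 1 - loading**2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical standardized loadings, for illustration only.
example_loadings = [0.70, 0.70, 0.70]
print(average_variance_extracted(example_loadings))  # compared with the .50 cut-off
print(composite_reliability(example_loadings))       # compared with the .70 cut-off
```

A construct can meet the CR threshold while falling short of the AVE threshold, because CR grows with the number of indicators whereas AVE does not.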
Discriminant validity
Two techniques were employed to assess discriminant validity of the factor structures: (i) we tested whether the correlations between the latent constructs were significantly different from 1.00 using the Model test command in Mplus (the command allows testing a series of parameter constraints using the Wald test); and (ii) we examined whether the AVE estimates for each construct were higher than the shared variance (Squared Interconstruct Correlation; SIC) between them (Fornell & Larcker, 1981; Morawska et al., 2014).
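The second technique amounts to a simple comparison, sketched below. The function name is hypothetical; the example values are the factor correlation (r = −.20) and AVE estimates (.41 and .43) reported for the two CAPES Intensity factors in the Results.

```python
def fornell_larcker_check(ave_a, ave_b, correlation):
    """Compare each construct's AVE with the squared interconstruct correlation.

    Returns (criterion_met, sic), where sic is the shared variance
    between the two latent constructs.
    """
    sic = correlation ** 2
    return (sic < ave_a and sic < ave_b), sic


# Values reported for the two CAPES Intensity factors.
criterion_met, sic = fornell_larcker_check(0.41, 0.43, -0.20)
print(criterion_met, round(sic, 2))
```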
Concurrent validity
To determine the concurrent validity of the instrument, we examined the correlations at the latent level in Mplus v. 7.2 between the CAPES and other measures of child adjustment and parental self-efficacy: SDQ and PTC. The advantage of using latent constructs instead of composite scores is that the latent approach can decompose true score variance from error variance, allowing for estimation of effect sizes that are not attenuated by measurement error (Kline, 2011). The WLSMV estimator was used to evaluate correlations between CAPES Intensity and SDQ (ordinal nature of the items for both scales). The MLR estimator was used to evaluate associations between CAPES Self-Efficacy and the PTC (continuous items for both scales). Given that the SDQ and the PTC were not previously validated in Spanish, we simultaneously validated their factor structure in the current sample via CFA. Results are presented in Appendix C.
Reliability
Due to the limitations associated with Cronbach’s alpha coefficient when the assumptions of tau-equivalence and uncorrelated errors are violated (Cheng, Yuan, & Liu, 2012), we assessed the internal consistency of the measure by calculating the H coefficient (Hancock & Mueller, 2001). Its advantage over the traditional construct reliability measures is that it draws information from all indicators in a manner that corresponds to their own ability to reflect the construct. The range and interpretation of H coefficient values are the same as for the popular Cronbach’s alpha.
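A minimal sketch of the H coefficient, assuming it is computed from standardized factor loadings as described by Hancock and Mueller (2001); the function name and example loadings are hypothetical.

```python
def coefficient_h(loadings):
    """Coefficient H from standardized loadings.

    Each indicator contributes loading**2 / (1 - loading**2), so stronger
    indicators carry more weight and weak ones cannot lower the result.
    """
    if any(abs(l) >= 1.0 for l in loadings):
        raise ValueError("standardized loadings must lie strictly between -1 and 1")
    weighted_sum = sum(l * l / (1.0 - l * l) for l in loadings)
    if weighted_sum == 0:
        return 0.0
    return 1.0 / (1.0 + 1.0 / weighted_sum)


# Hypothetical standardized loadings, for illustration only.
print(coefficient_h([0.90, 0.30]))
```

Under this formula H is never smaller than the squared loading of the best indicator, which reflects the property described above: each item contributes according to its own ability to reflect the construct.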
The test–retest reliability was assessed using Intraclass Correlation Coefficient (ICC) with 95% confidence interval (Weir, 2005). The coefficient is an improvement over the traditional Pearson’s r or Spearman’s ρ, as it takes into account both consistency of performances from test to retest (within-subject change) as well as change in average performance of participants as a group over time (i.e., systematic change in mean). The ICC values range from 0 to 1 with values > .60 indicating good and values > .75 indicating excellent test–retest reliability (Fleiss, 1986). The ICCs were computed in SPSS v. 21 using a two-factor mixed effects model and type consistency (McGraw & Wong, 1996).
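For illustration, a consistency-type ICC from a two-way model can be computed from the ANOVA mean squares. The sketch below assumes the single-measures form (the text does not state whether single or average measures were used), and the function name and data are hypothetical.

```python
def icc_consistency_single(scores):
    """Single-measures, consistency-type ICC from a two-way model.

    scores: list of rows, one per participant, each holding that
            participant's score at every occasion (e.g. [T1, T2]).
    """
    n = len(scores)      # participants
    k = len(scores[0])   # occasions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # Occasion (column) variance is excluded, which is what makes
    # this the consistency rather than the absolute-agreement form.
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical T1/T2 totals for four parents, for illustration only.
print(icc_consistency_single([[10, 12], [20, 19], [15, 18], [30, 29]]))
```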
Results
Preliminary analysis
The missingness was negligible (1.96% and 1.48% in the CAPES Intensity and Self-Efficacy, respectively). Nevertheless, we decided to use the most advanced approaches to handle missing data available in Mplus v. 7.2. For the WLSMV, a Multiple Imputations procedure with 50 imputations was applied (Asparouhov & Muthén, 2010). The parameter estimates and standard errors were pooled across the imputed data sets using Rubin’s method (1987). The fit indices were averaged across the imputed data sets (arithmetic mean), while the chi-square test statistics from the imputed data sets were combined to yield an F statistic and a p value for the chi-square test using a method developed by Li, Meng, Raghunathan, and Rubin (1991), as implemented in the Combchi macro developed by Allison (2007) for SAS software. For the MLR estimator, the Full Information Maximum Likelihood (FIML) procedure was used. Both Multiple Imputations and FIML have been shown to outperform traditional approaches for handling missing data when the data are at least missing at random (MAR) (Enders, 2001).
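The pooling of a parameter estimate and its standard error across imputed data sets can be sketched as below. This is a minimal illustration of Rubin’s rules, not the Mplus implementation; the function name and the example numbers are hypothetical.

```python
import math
import statistics


def pool_with_rubins_rules(estimates, standard_errors):
    """Pool one parameter across m imputed data sets.

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")
    pooled_estimate = statistics.mean(estimates)
    # Within-imputation variance: average squared standard error.
    within = statistics.mean(se ** 2 for se in standard_errors)
    # Between-imputation variance: sample variance of the estimates.
    between = statistics.variance(estimates)
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical estimates from three imputations, for illustration only.
print(pool_with_rubins_rules([0.50, 0.60, 0.70], [0.10, 0.10, 0.10]))
```

The pooled standard error exceeds the average within-imputation standard error whenever the estimates differ across imputations, which is how the extra uncertainty due to missing data enters the result.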
Out of 27 CAPES Intensity Scale items, 12 showed significant skew, and 17 showed significant kurtosis (the average skewness and kurtosis were −.51 and −.57, respectively). In terms of the CAPES Self-Efficacy, all the items showed significant skew and 17 items showed significant kurtosis (the average skewness and kurtosis were −.29 and 10.60, respectively). The normalized estimates of Mardia’s coefficient of multivariate skewness and kurtosis were also high for both scales: skewness of 147.04 (C.R. = 204.50) and 153.47 (C.R. = 162.28); kurtosis of 604.99 (C.R. = 161.63) and 816.27 (C.R. = 162.28) for the CAPES Intensity and Self-efficacy, respectively. In addition, 235 (.07%) univariate outliers were detected. Squared Mahalanobis distances (D2) showed no evidence of serious multivariate outliers.
Construct validity
Factor structure of the CAPES Intensity
The analysis started with testing a single-factor model (Model A) to serve as a comparison to our hypothesized 2-factor model separating behavioural from emotional difficulties (Model A1). As Table 2 indicates, both models showed poor fit to the data. However, the 2-factor model indicated that the correlation between the two constructs was very close to the value of 1 (r = .94, p < .001). According to Kline (2011), if the correlation between two factors is > .90, it is unlikely that they measure distinct constructs. Therefore, the 1-factor model was chosen as a better representation of the data. Inspection of the factor loadings indicated that eight items needed to be removed from this model due to non-significant loadings on the designated factor (items 20–27, see Appendix A). The revised model showed good fit to the data (see Table 2, Model A2). However, inspection of the factor structure indicated that item 3 (My child worries) had a very low loading (.25) on the designated factor and thus we decided to remove it from the model. The final 1-factor model showed good fit to the data (Model A3) and the factor was called Behavioural and Emotional Problems.
Table 2. Confirmatory factor analysis of the factor structures of the Spanish version of the CAPES Intensity.
Note. WLSMV χ2 = chi-square statistic from the robust weighted least squares estimator; df = degrees of freedom; CFI = comparative fit index; RMSEA = root mean square error of approximation; SD = Standard deviation across 50 imputations. WLSMV chi-square statistics were combined across 50 imputations using the procedure by Li et al. (1991) to estimate a p value for the chi-square test of exact fit. The fit indices (CFI and RMSEA) were averaged across 50 imputations. The option of calculating Confidence Intervals for Chi-square and other fit indices when Multiple Imputations are used is not available in Mplus (the rules to calculate CIs have not been established under MIs).
All models based on N = 174; ***p < .001.
In the next step, we inspected the eight items that had to be removed from the previous model. Most of these referred to child’s competencies in various aspects of everyday life and thus we investigated if they would create a separate factor. The 1-factor model showed very good fit to the data (See Model A4) and was called Child’s Competencies. In the final step, we tested the two factors together in one model, which showed good fit to the data (See Table 2, Model A5 and Figure 1). Therefore, the final model was a 2-factor model (one merging behavioural and emotional problems together and the other one being of child’s competencies) rather than the originally hypothesized 2-factor model separating behavioural versus emotional problems.
Factor structure of the CAPES Self-Efficacy
The analysis started with the hypothesized single factor model (Model B), which did not show good fit to the data (see Table 3). Inspection of MIs revealed that the model fit could be improved by allowing three correlations between error terms. Items 5 and 7 both referred to parental self-efficacy in managing child’s misbehaviour during mealtime; items 1 and 4 both referred to parental self-efficacy in managing the child when he/she is being angry and upset; and items 18 and 19 referred to parental self-efficacy in managing child’s depressive symptoms which makes these correlations theoretically sensible. Changes were made one at a time (Models B1–B3). The final model showed acceptable fit to the data and is presented in Figure 2.
Table 3. Confirmatory factor analysis of the factor structures of the Spanish version of the CAPES Self-Efficacy.
Note. χ 2 = Satorra-Bentler scaled chi-square; df = degrees of freedom; CFI = comparative fit index; SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation; CI = confidence interval; AIC = Akaike’s information criterion; BIC = Bayesian information criterion.
All models based on N = 174; ***p < .001; **p < 0.01.
Convergent and discriminant validity
For both scales all the indicators had significant loadings on designated factors (See Figure 1 and Figure 2). The AVE estimates for both scales were below the cut-off value of .50, but the CR estimates were satisfactory (AVE: .41, .43 and .39; CR: .95, .90 and .92 for Behavioural and Emotional Problems, Child’s Competencies and Self-Efficacy, respectively).
In terms of discriminant validity of the CAPES Intensity constructs, the correlation between two factors was low (r = −.20; p < .01). The Wald test of parameter constraints further indicated that the correlation between the two factors is not equal to 1 (Wald χ2(1) = 239.43, p < .05). Finally, the SIC estimate (.04) was lower than the AVE estimates for the two subscales (.41 and .43 for Behavioural and Emotional Problems and Child’s Competencies, respectively).
To assess the discriminant validity of the CAPES Self-Efficacy, we compared this scale with the CAPES Intensity subscales. Due to the categorical nature of the CAPES Intensity items, WLSMV estimator was applied with Multiple Imputations to handle missing data. The correlations between the Self-Efficacy and the CAPES Intensity subscales were low (r = .16, p = .01 and r = .08, p = .01 for the association with Behavioural and Emotional Problems and Child’s Competencies, respectively). The Wald test of parameter constraints further indicated that the correlation between the factors is not equal to 1 (Wald χ2(1) = 48.89, p < .001 and Wald χ2(1) = 71.50, p < .001 for correlation with Behavioural and Emotional Problems and Child Competencies, respectively). Finally, the SIC estimates (.03 and .01 for relations with Behavioural and Emotional Problems and Child’s Competencies, respectively) were lower than AVE estimates for all three factors (.41, .43 and .39 for Behavioural and Emotional Problems, Child’s Competencies and Self-Efficacy, respectively).
Concurrent validity
As Table 4 shows, both the CAPES Behavioural and Emotional Problems and the Child’s Competencies subscales correlated significantly and positively with SDQ Intensity and significantly and negatively with SDQ Prosocial Behaviour. As expected, CAPES Intensity showed a stronger relationship with SDQ Intensity than with SDQ Prosocial Behaviour. In contrast, CAPES Self-Efficacy had a stronger relationship with SDQ Prosocial Behaviour than with SDQ Intensity. Finally, CAPES Self-Efficacy correlated significantly and positively with the PTC.
Table 4. Reliability, means, standard deviations, polychoric correlations between the CAPES Intensity and SDQ, and Pearson correlations between the CAPES Self-Efficacy and PTC.
Note. M = mean; SD = standard deviation; ICC = Intraclass Correlation Coefficient; CI = Confidence Interval. To evaluate correlations between the CAPES Intensity and SDQ we used the WLSMV estimator (polychoric correlations) and the Multiple Imputations approach to handle missing values. Correlation coefficients were pooled across 50 imputed samples using Rubin’s rules (1987). To evaluate correlations between the CAPES Self-Efficacy and PTC we used the MLR estimator (Pearson correlations) and the FIML procedure to handle missing data.
All calculated on N = 174; *p < .05; **p < .01; ***p < .001.
Reliability
The H coefficients indicated adequate internal consistency of the Child’s Competencies subscale and excellent internal consistency of the Behavioural and Emotional Problems and the Self-Efficacy subscales (see Table 4). The CAPES also showed satisfactory test–retest reliability (ICC values > .60).
Discussion
The present study examined the psychometric properties of the Spanish version of the CAPES. The CFA supported the hypothesized 1-factor structure of the Self-Efficacy Scale. For the CAPES Intensity, we hypothesized that the scale would measure two domains; one referring to child’s emotional and the other one to child’s behavioural difficulties. However, the CFA supported a 2-factor structure with items referring to child’s emotional and behavioural problems merged together in one factor (named Behavioural and Emotional Problems) and items referring to child’s competencies merged in the second factor (named Child’s Competencies). We are not aware of any previous study with a Spanish-speaking sample in which the factor structure of an existing child adjustment scale showed both externalizing and internalizing problems as a single factor (rather than as separate factors). For example, one scale whose factor structure has been extensively compared across cultures is the Caregiver-Teacher Report Form from ASEBA (C-TRF; Achenbach & Rescorla, 2000). Even though the scale in Chile showed a slightly different factor structure than in the normative American sample, its structure still suggested a 2-factor model separating internalizing and externalizing symptoms (Ivanova et al., 2010). However, the educational level in this Chilean sample was higher than in our Panamanian sample.
A potential explanation of such high correlation of items assessing children’s emotional and behavioural problems in the current sample is that children’s internalizing and behavioural externalizing problems might be occurring and/or expressed together in this culture (Panama and potentially other Latin American settings), and therefore not perceived by parents as two distinct aspects of child maladjustment. Externalizing and internalizing problem behaviours often co-occur in children and adolescents (Angold, Costello, & Erkanli, 1999) and this comorbidity may be especially high in Latino cultures. For example, in a study looking at ethnic differences in internalizing and externalizing symptoms, Latino children showed the highest level of comorbidity (McLaughlin, Hilt, & Nolen Hoeksema, 2007). This result is in line with the outcomes of the CFA conducted on the Spanish version of the SDQ in the current sample (See Appendix C); a 2-factor model was supported as the best structure, in which items assessing child’s emotional and behavioural problems were grouped together as one factor and the items referring to prosocial behaviours formed a second factor.
At this point, it is also important to recognize a potential psychometric explanation for the 1-factor model found in the Spanish version of the CAPES: that emotional and behavioural problems are grouped together because the emotional subscale only has 3 items and low internal consistency. However, if the length of the scale was the cause of this structure, the same solution (1-factor model encompassing emotional and behavioural problems) would have been obtained in the Australian sample of parents (Morawska et al., 2014), and this was not the case.
It is also worth mentioning that the discrepancy in factor structure between the two language versions of the CAPES might be due to differences in the statistical approaches used in each study. In the current study, we used individual items as indicators of latent constructs, whereas a parcelling approach was used in the Australian validation (Morawska et al., 2014). As noted by Bandalos (2002), parcelling can mask a multidimensional structure of the measure in a way that satisfactory model fit is found for misspecified models. Our 1-factor model grouping behavioural and emotional difficulties requires further investigation with larger and more diverse samples of parents from different countries in Latin America.
Both CAPES Intensity and Self-Efficacy showed very good convergent validity as indicated by factor loadings and composite reliability estimates. However, the AVE estimates indicated that there is, on average, more error in the items intended to measure the constructs than there is variance explained by these constructs. Nevertheless, both measures showed good discriminant validity. Finally, CAPES Intensity and Self-Efficacy revealed good concurrent validity by showing significant correlations with the SDQ and PTC scales, respectively.
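To make the contrast between composite reliability and AVE concrete, the following is a minimal sketch of the commonly used Fornell and Larcker formulas, assuming standardized factor loadings and uncorrelated error terms. The function names and the loading values are hypothetical and chosen only for illustration; they are not the estimates obtained in this study.

```python
from math import fsum


def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings of one factor."""
    return fsum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """(sum of loadings)^2 divided by itself plus the summed error variances."""
    total = fsum(loadings) ** 2
    # With standardized loadings, each item's error variance is 1 - loading^2.
    error = fsum(1 - l ** 2 for l in loadings)
    return total / (total + error)


# Hypothetical standardized loadings for a 5-item factor (illustration only).
example_loadings = [0.62, 0.58, 0.70, 0.55, 0.66]

ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE = {ave:.3f}, composite reliability = {cr:.3f}")
```

With moderate loadings such as these, composite reliability can exceed the conventional .70 threshold while AVE remains below .50, which is the pattern described above: the items hang together as a scale, yet each item on average carries more error variance than construct variance.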
The present study has several limitations that should be noted. First, the sample was predominantly composed of mothers from low-income families. Furthermore, the sample size was fairly small for SEM analysis, which may have biased the parameter estimates. Future studies should aim to test the psychometric properties of the Spanish version of the scale with larger (> 500 cases) and more diverse samples (in terms of gender and socio-economic variables). It is also important to establish the psychometric properties of the CAPES with Spanish-speaking parents from other countries in Latin America, Latino parents in the USA and parents from Spain. In addition, further validation using clinical and normative samples of parents is required in order to establish norms and determine whether the measure can differentiate between clinical and non-clinical populations. Finally, it is important to mention that the instruments used to establish criterion validity (SDQ and PTC) were not previously validated in Spanish-speaking samples. However, we evaluated the factor structure of the Spanish versions of the SDQ and PTC in the current sample, and the results are outlined in this article (see Appendix C).
In conclusion, the results are promising with regard to the use of the CAPES with Spanish-speaking parents, but further studies with larger and more diverse samples are needed before the CAPES can be recommended for routine use in research and clinical practice. An international agenda for developing and validating good quality instruments that can be used with non-English speaking parents, especially those from under-resourced backgrounds and with a low literacy level, should be pushed forward.
Appendix A
Appendix B
Appendix C
Confirmatory factor analysis of the factor structures of the Spanish version of the PTC.
| Model | χ² | df | Δχ² | Δdf | CFI | SRMR | RMSEA | RMSEA 90% CI | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|---|
| B PTC 1-factor model | 977.19*** | 350 | | | .739 | .094 | .101 | .094–.109 | 43134.09 | 43399.45 |
| B1 PTC 2-factor model | 968.72*** | 349 | | | .742 | .093 | .101 | .093–.109 | 43123.13 | 43391.65 |
| B2 PTCᵃ 1-factor model, items 3, 4, 5, 6, 9, 13, 14 removed due to nonsignificant factor loadings | 543.97*** | 189 | | | .817 | .073 | .104 | .094–.114 | 32671.70 | 32870.72 |
| B3 PTC 1-factor model, items 26, 15, 27, 28, 2, 10 removed based on standardized residuals | 206.17*** | 90 | | | .908 | .063 | .086 | .071–.102 | 23316.79 | 23458.95 |
| B4 PTC 1-factor model B3 with added correlation between error terms of items | 193.71*** | 89 | 27.30*** | 1 | .917 | .063 | .082 | .066–.098 | 23297.48 | 23442.79 |
| B5 PTCᵇ 1-factor model on the removed 13 items (2, 3, 4, 5, 6, 9, 10, 13, 14, 15, 26, 27, 28) | 259.79*** | 65 | | | .634 | .111 | .131 | .115–.148 | 19953.62 | 20076.82 |
Note. χ² = Satorra-Bentler scaled chi-square; df = degrees of freedom; CFI = comparative fit index; SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation; CI = confidence interval; AIC = Akaike’s information criterion; BIC = Bayesian information criterion.
ᵃ The 1-factor model was chosen due to correlations close to the value of 1 between the 2 factors of the PTC.
ᵇ Subsequent analyses indicated that the items did not form a separate theoretically meaningful factor.
All models based on N = 174; ***p < .001; **p < .01.
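As an aid to reading the table, the RMSEA point estimate can be approximated from the reported chi-square, degrees of freedom and sample size. The sketch below assumes the standard formula with the N − 1 convention; the function name is hypothetical. Because the chi-square values are Satorra-Bentler scaled, and software may apply additional robust corrections, small rounding differences from the tabled values are to be expected.

```python
from math import sqrt


def rmsea_point_estimate(chi_square, df, n):
    """Approximate RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square and df taken from rows B3 and B4 of the table; N = 174.
rmsea_b3 = rmsea_point_estimate(206.17, 90, 174)
rmsea_b4 = rmsea_point_estimate(193.71, 89, 174)
print(f"B3: {rmsea_b3:.3f}  B4: {rmsea_b4:.3f}")
```

Both values round to those reported for models B3 and B4, which suggests the tabled RMSEA estimates are consistent with the reported chi-square statistics under these assumptions.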
