Abstract
The parent-report Mood and Feelings Questionnaire (MFQ-P) is one of the few well-established available measures specifically designed to assess childhood depression from the parent’s perspective. However, to date, few studies have analyzed the factorial structure of the MFQ-P. The aim of this study was to examine for the first time the psychometric properties of the scores and factorial structure of the Spanish-adapted version of the MFQ-P in a community sample of Spanish-speaking children. Parents of 181 children (54.1% boys) aged 6–8 years participated in this study. The MFQ-P was translated into Spanish and administered along with the Strengths and Difficulties Questionnaire-parent version (SDQ-P) and the Spence Children’s Anxiety Scale-parent version (SCAS-P). The scale showed high internal consistency (α = .92) and acceptable test–retest reliability, and factor analysis confirmed the original single-factor structure after removing one item. Convergent and divergent validity was supported. The findings provide initial support for the use of a 33-item version of the MFQ-P in the Spanish population, adding further international evidence for this promising scale.
Keywords
Depression is common among children and adolescents, with prevalence ranging from 2.7% to 12% (Bronsard et al., 2016; Merikangas et al., 2010). Although depression appears to be more common from the age of 13 years onwards (Costello et al., 2006), it is estimated that 5% of children aged 5–12.9 years suffer a first depressive episode, and more than half of these episodes occur at ages 5–10 years, lasting longer than at later ages (see Rohde et al., 2013). Strong comorbidity has been reported between depression, anxiety, substance use, and behavioral disorders in youths, and experiencing a depressive episode during childhood increases the risk of depression later in life (Avenevoli et al., 2015; Rohde et al., 2013). Moreover, the presence of depressive symptoms can predict the onset of the disorder. Therefore, accurate screening is important for detecting cases and providing treatment, and it may also help to identify children at risk and prevent future problems (van Lang et al., 2007; Williams et al., 2009).
Although parent–child rating discrepancies have been found in the assessment of internalizing problems, valuable data can be obtained from both informants and their discrepancies for diagnostic and intervention decisions (Achenbach, 2011; Grills & Ollendick, 2003). Despite the modest correlations reported between parent and child reports of child depressive symptoms, parents can offer a valuable and unique perspective on their child (Kim et al., 2016). In this regard, previous research has found that parents’ ratings of their children’s depressive symptoms predicted the onset of childhood mood disorders to a degree comparable to the child’s report, and were better predictors than self-reports in younger children (Lewis et al., 2012). Thus, in the assessment of child and adolescent depression, parents can be a useful source of information, especially for younger children, who may have more difficulty expressing their thoughts, feelings, or behaviors accurately and may show a greater tendency toward somatization (Bhatia & Bhatia, 2007; Dougherty et al., 2008).
There are a number of depression-specific screening tools available, but only a few of them provide a version for parents and have been developed specifically for children and adolescents, rather than being adapted from adults’ tools (see Williams et al., 2009). Another limitation that has been raised is the scarcity of open-access measures, which restricts their use (Jeffreys et al., 2016). One of the most widely used and well-established instruments that overcomes these limitations is the Mood and Feelings Questionnaire (MFQ; Angold et al., 1995), specifically designed for the screening of a broad range of depression symptoms (i.e., cognitive, vegetative, suicidal, and affective), based on criteria of the Diagnostic and Statistical Manual of Mental Disorders (DSM), in children and adolescents aged 6–17 years. Both the 33-item self-report and the 34-item parent-report versions are available open access. Although the MFQ was originally developed for epidemiological studies, it has been extensively used to screen for depression in clinical and non-clinical settings (e.g., Daviss et al., 2006) and has been highlighted as being among the most useful tests for screening childhood depression (Bernaras et al., 2019). Indeed, the National Institute for Health and Clinical Excellence (NICE) has recommended this measure for screening for depression and monitoring treatment response (Lawton & Moghraby, 2016).
Similar to the self-report version of the MFQ, the parallel parent version has shown high internal consistency (α = .94–.96), test–retest reliability, and good validity, discriminating between depressed and non-depressed children and adolescents both from clinical and non-clinical settings (Daviss et al., 2006; Kent et al., 1997; Tavitian et al., 2014). It has also shown acceptable accuracy in predicting suicidal ideation (Hammerton et al., 2014). Although the reliability and validity of the scale have been well established, few studies have analyzed the factorial structure of the MFQ. The unidimensional structure of the scale was supported for the parent and child forms in Angold et al.’s (1995) original study for a subset of items (28 and 30 items, respectively). The 33-item self-report version also received support (Banh et al., 2012). In contrast, one study using a sample from a clinical setting found a five-factor solution for the 34-item parent-report version of the MFQ; however, several items loaded similarly on two factors, and three items were deleted (Jeffreys et al., 2016). These findings show the need for further research to verify whether the original unidimensional structure is supported for the 34-item parent-report version in other samples.
In recent years, the original English version of the parent-report Mood and Feelings Questionnaire (MFQ-P) has been translated into Arabic (Tavitian et al., 2014), Danish (Frøkjær et al., 2017), and Brazilian Portuguese (Rosa et al., 2018). However, despite the prevalence of depression among Spanish children and adolescents, ranging from 5% to 13% (Del Barrio, 2015), the importance of parents’ involvement in its evaluation (e.g., Lewis et al., 2012), and the potential value of the MFQ-P, it is not yet available for the Spanish-speaking population. Therefore, the current study extends the literature by examining for the first time the properties of the scores and factorial structure of the MFQ-P in a community sample of Spanish-speaking children aged 6–8 years. Specifically, the goals were to translate the MFQ-P to European Spanish and to examine the internal consistency, test–retest reliability, convergent and divergent validity, and unidimensional structure of the scale (Angold et al., 1995) in a Spanish sample.
Method
Participants
One hundred eighty-one children (boys = 98, 54.1%, and girls = 83, 45.9%) aged 6–8 years with a mean age of 6.87 years (standard deviation (SD) = 0.79) and their parents (81.8% mothers and 18.2% fathers) participated in this study. The age distribution was 38.1% 6-year-olds, 37% 7-year-olds, and 24.9% 8-year-olds. The sample was recruited from 10 primary schools, both public and private, in urban areas. Of all the children, 44.8% attended first grade of primary education, 35.4% second grade, and 19.8% third grade. All the children were Spanish speakers; most were born in Spain (97.8%), and the rest (2.2%) were foreign born. Most parents were married (86.2%), and the rest were divorced (12.2%) or single parents (1.6%). More than half of the parents had a high level of education (52.5%), while the rest had secondary (30.9%) or primary education (16.6%). Middle socioeconomic status was predominant in the sample.
Measures
MFQ-P
The MFQ-P consists of a 34-item measure to assess depressive symptoms of children and adolescents from the parents’ perspective (Angold et al., 1995). Parents are asked to rate how their child has acted or felt in the past 2 weeks on a 3-point scale with the following alternatives: 0 (not true), 1 (sometimes), and 2 (true). The total MFQ-P score is calculated by adding all the responses (score range: 0–68), with higher scores indicating greater severity of depressive symptoms.
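The scoring rule described above can be illustrated with a minimal sketch. The function name `mfq_p_total` and the strict handling of incomplete or out-of-range responses are assumptions made for illustration; the source does not state how missing responses were treated.

```python
from typing import Sequence

VALID_RESPONSES = {0, 1, 2}  # 0 = not true, 1 = sometimes, 2 = true
N_ITEMS = 34                 # items in the parent-report form


def mfq_p_total(responses: Sequence[int]) -> int:
    """Sum the 34 item responses into a total score (possible range 0-68).

    Hypothetical helper: it rejects incomplete or out-of-range input
    rather than imputing anything, which is an assumption of this sketch.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(r not in VALID_RESPONSES for r in responses):
        raise ValueError("each response must be 0, 1, or 2")
    return sum(responses)
```

For example, a parent answering "sometimes" to every item would obtain a total of 34, the midpoint of the possible range.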
The European Spanish version of the MFQ-P was translated from English into Spanish following the back-translation method (Hambleton, 2005). Permission from the authors was obtained previously. The 34 items of the MFQ-P English version were translated into European Spanish by a bilingual psychologist and, subsequently, another bilingual psychologist who did not know the English version translated it back into English. Finally, the bilingual translators compared the back-translated version with the original scale and resolved any small differences by discussion.
The Spence Children’s Anxiety Scale-parent version
The Spence Children’s Anxiety Scale-parent version (SCAS-P) is a 38-item measure that assesses the severity of anxiety symptoms in children aged 6–18 years (Nauta et al., 2004). It comprises six dimensions, measuring symptoms of subtypes of anxiety disorders such as panic and agoraphobia, obsessive–compulsive disorder, generalized anxiety/overanxious symptoms, separation anxiety, social phobia, and fears of physical injury. Parents are asked to rate the items on a 4-point scale ranging from 0 (never) to 3 (always). The total score is calculated by adding all the item scores (score range: 0–114). Internal consistency for the current sample was also satisfactory (α = .86). The Spanish version of the SCAS-P was used for this study (Orgilés et al., 2019).
Considering the frequent co-occurrence of anxiety and depression (Garber & Weersing, 2010), the participants of the study completed the SCAS-P to assess the convergent validity of the MFQ-P.
The Strengths and Difficulties Questionnaire-parent version
The Strengths and Difficulties Questionnaire-parent version (SDQ-P) presents 25 items assessing emotional and behavioral difficulties and positive behaviors in children aged 4–17 years (Goodman, 1997). It comprises five dimensions: emotional symptoms (i.e., anxiety and depression), conduct problems, hyperactivity/inattention, peer relationship problems, and prosocial behavior. All the items are rated from 0 (not true) to 2 (certainly true). Higher scores indicate more negative aspects, except for the prosocial subscale. Cronbach’s alpha in the current sample was .75. In this study, the SDQ-P emotional symptoms and prosocial behavior subscales were employed to assess the convergent and divergent validity of MFQ-P, respectively.
Procedure
Of the 12 urban schools that were invited to participate, given their ability to represent the socioeconomic structure of the Spanish population, 10 accepted. All are located in the southeast region of Spain. The school principals’ approval was obtained. The parents of children aged 6–8 years received all the information about the purposes and procedure of the study from the schools on paper and by e-mail. Parents who were interested and who agreed to participate voluntarily completed a short online form consisting of a set of three questionnaires along with the MFQ-P. Informed consent of all the parents was obtained before they completed the form. A randomly selected subsample of parents was asked to fill in the form again 8 weeks after the initial assessment to examine test–retest reliability of the MFQ-P. The ethics boards of the Miguel Hernández University reviewed and approved this study (reference DPS.MO.02.14).
Statistical analysis
We examined the factor structure of the Spanish version of the MFQ-P to determine whether the original unidimensional (single-factor) latent structure (Angold et al., 1995) fit the Spanish data. Confirmatory factor analysis (CFA) was conducted in the RStudio environment (R Studio Team, 2016), using the diagonally weighted least squares (DWLS) method. This estimator is recommended because of its robustness with small samples, with ordinal data, or when the assumption of normality is not met (Forero et al., 2009; Li, 2016). The goodness of fit of the model was assessed with the following indices and cutoff values: χ2/degrees of freedom (df) (⩽3), comparative fit index (CFI ⩾ .90), Tucker–Lewis index (TLI ⩾ .90), and root mean square error of approximation (RMSEA ⩽ .08) (Hu & Bentler, 1999).
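As a small illustration of how these cutoffs can be applied, the sketch below checks a set of fit indices against the thresholds listed above. The function name `evaluate_fit` and the shape of the returned dictionary are hypothetical; the example input uses the values reported for the 33-item model in the Results section. It only compares already-estimated indices with the cutoffs and does not fit a CFA model.

```python
def evaluate_fit(chi2: float, df: int, cfi: float, tli: float, rmsea: float) -> dict:
    """Compare fit indices with the cutoffs stated in the text.

    Cutoffs: chi2/df <= 3, CFI >= .90, TLI >= .90, RMSEA <= .08.
    """
    if df <= 0:
        raise ValueError("df must be positive")
    ratio = chi2 / df
    return {
        "chi2_df": round(ratio, 2),
        "chi2_df_ok": ratio <= 3.0,
        "cfi_ok": cfi >= 0.90,
        "tli_ok": tli >= 0.90,
        "rmsea_ok": rmsea <= 0.08,
    }


# Values reported in the Results for the 33-item single-factor model.
result = evaluate_fit(chi2=620.35, df=495, cfi=0.99, tli=0.98, rmsea=0.03)
print(result)
```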
The psychometric properties of the items were analyzed. Internal consistency was examined using Cronbach’s alpha. Composite reliability, based on standardized factor loadings and error variances, was calculated using an online calculator at: http://www.thestatisticalmind.com/calculators/comprel/composite_reliability.htm. To test the temporal stability of the MFQ-P, 23.2% of the initial sample was retained (n = 42; mean age = 6.88 years, SD = 0.80; 40.5% were females). The intraclass correlation coefficient (ICC) for test–retest reliability was calculated using the baseline assessment and the assessment conducted 8 weeks later. An acceptable ICC value was set at .60 or above, following Anastasi (1998). Equivalence in MFQ-P, SCAS-P, and SDQ-P scores between the test–retest subsample and those who were not involved in the second evaluation was tested. The SCAS-P total score and two SDQ-P subscales (emotional symptoms and prosocial behavior) were included to evaluate the convergent and divergent validity of the MFQ-P by calculating Pearson correlations. MFQ-P scores were compared by gender and age using a t test and one-way analysis of variance (ANOVA), respectively. Data were analyzed using SPSS v25.
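To make the composite reliability computation concrete, the following sketch uses a commonly cited formula based on standardized loadings, in which each error variance is taken as one minus the squared loading. Whether the online calculator cited above uses exactly this formula is an assumption, and the loadings in the example are hypothetical, not values from this study.

```python
from typing import Sequence


def composite_reliability(loadings: Sequence[float]) -> float:
    """Composite reliability from standardized factor loadings.

    CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    with each error variance assumed to equal 1 - loading^2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    if any(not -1.0 <= l <= 1.0 for l in loadings):
        raise ValueError("standardized loadings must lie in [-1, 1]")
    sum_loadings_sq = sum(loadings) ** 2
    error_variance = sum(1.0 - l ** 2 for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_variance)


# Hypothetical loadings, for illustration only.
print(composite_reliability([0.5, 0.5, 0.5, 0.5]))
```

With four loadings of .50, the squared sum of loadings is 4 and the summed error variance is 3, so the sketch returns 4/7.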
Results
Descriptive statistics for the MFQ-P
Table 1 shows the means and SDs for each item and for the total MFQ-P score, as well as the corrected item-total correlation. For the total sample, the mean score on the MFQ-P was 9.54 (SD = 9.05). Girls and boys obtained a mean score of 9.96 (SD = 9.88) and 9.18 (SD = 8.33), respectively. No differences were found in MFQ-P scores by children’s gender (t (0.95; 179) = −.57; p = .56) or age (F (2; 179) = .41; p = .66). The corrected item-total correlations (
Table 1. Scale properties of the MFQ-P.
MFQ-P: parent-report Mood and Feelings Questionnaire; M: mean; SD: standard deviation;
Reliability
Regarding the internal consistency of the MFQ-P, Cronbach’s alpha if each item was removed was computed, along with Cronbach’s alpha for the total score and the ICC for test–retest reliability (Table 1). Initially, Cronbach’s alpha for the scale was .91. However, the analyses showed that the reliability of the scale improved (α = .92) if item 1 (“She or he felt miserable or unhappy”) was removed. Removal of this item was therefore considered, also because it had the lowest item-total correlation, suggesting low relevance of the item within the scale. Subsequent analyses were conducted excluding this item. Composite reliability for the original version of the MFQ-P without item 1 was .92.
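The item-level statistics used in this decision can be sketched as follows. This is a minimal illustration using the standard formulas for Cronbach's alpha, alpha if an item is deleted, and the corrected item-total correlation; it is not the SPSS procedure used in the study, and all function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(data):
    """data: one row per respondent, one column per item."""
    k = len(data[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_vars = [variance([row[j] for row in data]) for j in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(data, item):
    """Cronbach's alpha recomputed without the item at index `item`."""
    reduced = [[x for j, x in enumerate(row) if j != item] for row in data]
    return cronbach_alpha(reduced)


def corrected_item_total(data, item):
    """Correlation of an item with the sum of all remaining items."""
    item_scores = [row[item] for row in data]
    rest_scores = [sum(row) - row[item] for row in data]
    return pearson_r(item_scores, rest_scores)


# Toy data (hypothetical): the third item runs against the other two.
toy = [[0, 0, 2], [1, 1, 0], [2, 2, 1], [1, 1, 1]]
print(cronbach_alpha(toy), alpha_if_item_deleted(toy, 2), corrected_item_total(toy, 2))
```

In the toy data, the third item correlates negatively with the rest of the scale, and alpha rises when it is dropped, which mirrors the pattern of reasoning applied to item 1 above.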
Test–retest reliability was analyzed using a subsample of 42 participants. There were no statistically significant differences as a function of age (t (0.95; 179) = −.55; p = .58), gender (χ2(1) = 2.03; p = .15), MFQ-P scores (t (0.95; 179) = −1.73; p = .08), SCAS-P (t (0.95; 179) = −.75; p = .45), or the emotional symptoms (t (0.95; 179) = −1.70; p = .09) and prosocial behavior (t (0.95; 179) = 1.54; p = .12) subscales of the SDQ-P between the test–retest subsample and those who were not involved in the second evaluation. Test–retest stability of the MFQ-P was adequate (ICC = .76, 95% confidence interval (CI) = [.55, .87]).
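For readers unfamiliar with the ICC, the sketch below computes one common variant for two measurement occasions: the two-way, consistency, single-measurement coefficient, often written ICC(3,1). The source does not state which ICC form was used, so this choice is an assumption; the function name and the example scores are hypothetical.

```python
def icc_consistency(time1, time2):
    """ICC(3,1) for two occasions: two-way model, consistency, single measure.

    Assumption of this sketch: the ICC variant is not specified in the text.
    """
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need two equally long score lists with n >= 2")
    n, k = len(time1), 2
    rows = list(zip(time1, time2))
    grand = (sum(time1) + sum(time2)) / (n * k)
    row_means = [(a + b) / k for a, b in rows]
    col_means = [sum(time1) / n, sum(time2) / n]
    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


# Hypothetical scores: every child's retest score is one point higher.
print(icc_consistency([1, 2, 3, 4], [2, 3, 4, 5]))
```

Because the consistency form ignores a uniform shift between occasions, the example yields a coefficient of essentially 1; an absolute-agreement form would penalize that shift.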
Convergent and divergent validity
Pearson correlations were computed to examine convergent and divergent validity (Table 2). Evidence for convergent validity was found in the moderate, significant correlations of the MFQ-P with the SDQ-P emotional symptoms subscale and with the total SCAS-P scale. The correlation with the SDQ-P emotional symptoms subscale was slightly higher than the correlation with a specific measure of anxiety such as the SCAS-P, providing further support for convergent validity. The analysis also showed a significant but small negative correlation between the MFQ-P and the SDQ-P prosocial behavior subscale, supporting the divergent validity of the MFQ-P.
Table 2. Pearson correlations among the MFQ-P and the total score of SCAS-P and the emotional symptoms and the prosocial behavior subscales of the SDQ-P.
MFQ-P: parent-report Mood and Feelings Questionnaire; SCAS-P: Spence Children’s Anxiety Scale-parent version; SDQ-P: Strengths and Difficulties Questionnaire-parent version.
Correlation is significant at the .01 level (two tailed).
CFA
CFA was performed to determine whether the unidimensional structure fit the Spanish data. As described above, considering both the low item-total correlation and the increase of the reliability if the item was deleted, item 1 was dropped before running the CFA. The results suggested a good fit for the 33-item single-factor structure: χ2 = 620.35, df = 495, χ2/df = 1.25, CFI = .99, TLI = .98, and RMSEA = .03; 95% CI = [.027, .047]. Table 3 provides the factor loadings, all statistically significant, with standardized values exceeding .30.
Table 3. Factor loadings for the single-factor model.
The original unifactorial version of the MFQ-P with 34 items (including item 1) was tested, but the model did not converge. The model proposed by Jeffreys et al. (2016) with 31 items was also estimated, and its fit was excellent: χ2 = 463.51, df = 424, χ2/df = 1.09, CFI = .99, TLI = .99, and RMSEA = .02; 95% CI = [.010, .031]. From a statistical point of view, the fit of the model proposed by Jeffreys et al. (2016) was slightly better than that of the original version of the MFQ-P (without item 1). However, Jeffreys et al.’s (2016) model removed three items (items 4, 20, and 26 from the 34-item version) that did not work properly in a sample of families with youth aged 5–18 years seeking treatment at an outpatient community mental health clinic for a range of emotional and behavioral problems in the United States. These three items were relevant for the Spanish version of the MFQ-P and were retained based on statistical and clinical criteria. Therefore, the original MFQ-P model (without item 1) was selected in the current study.
Discussion
The main purpose of the present research was to examine the psychometric properties of the scores and factorial structure of the Spanish-adapted version of the MFQ-P. The overall mean score obtained (9.54) suggests that the level of depressive symptoms in Spanish children aged 6–8 years is not elevated if we use the cut-point of 27 proposed by Daviss et al. (2006). Although we found no research reporting data of the MFQ-P in this specific age range, this overall score is similar to or lower than that of studies including children of other origins who were not depressed (9.5–16.70) (Daviss et al., 2006; Kent et al., 1997) or who were seeking treatment in a clinical setting (18.14) (Jeffreys et al., 2016). No gender or age differences were found for the total score. This finding is in line with prior research showing that depressive symptoms remain stable during childhood, and that differences are manifest toward adolescence (e.g., Twenge & Nolen-Hoeksema, 2002).
The reliability and validity of the Spanish version of the MFQ-P were supported in this study. Although the full 34-item version reached a high internal consistency, a higher coefficient (α = .92) was found after removing one item. Internal consistency was in keeping with the high coefficients reported by previous validation studies with Cronbach’s alphas ranging from .94 to .96 (Daviss et al., 2006; Jeffreys et al., 2016; Tavitian et al., 2014). The study by Jeffreys et al. (2016) also analyzed the reliability in several age groups, including ages 5–8 years, with all alphas above .90. Other epidemiological and intervention studies using the MFQ-P also reported high reliability (α = .93–.97) (Orchard et al., 2017; Smith et al., 2015). The MFQ-P also showed good test–retest stability in this study (ICC = .76) over an 8-week period, consistent with that reported by other authors (ICC = .80) (Daviss et al., 2006).
Convergent validity of the MFQ-P was examined through its correlations with two measures, the SDQ-P emotional symptoms subscale (r = .63) and the SCAS-P total score (r = .55). These positive, moderate correlations indicate acceptable convergent validity. The findings are consistent with previous research reporting similar moderate correlations between the MFQ-P and the SDQ-P emotional symptoms subscale (Tavitian et al., 2014) and other measures of anxiety (Jeffreys et al., 2016). Moreover, the higher correlation with the SDQ-P emotional symptoms subscale (i.e., anxiety and depression) provides further evidence for convergent validity. In contrast, the correlation between the MFQ-P and the SDQ-P prosocial behavior subscale was low and negative (r = −.36), supporting the divergent validity of the measure and consistent with the results of studies using other measures of depression and anxiety (Du et al., 2008; Muris et al., 2003).
CFA supported the hypothesized single-factor structure of the MFQ-P in the Spanish sample, showing good fit indices, with item loadings of .31–.90. Thus, this study suggests that the MFQ-P is a unidimensional measure tapping general depression and including a broad range of symptoms based on DSM criteria. Our findings are in line with studies analyzing the internal structure of the MFQ-P (Angold et al., 1995) and the self-report version (Banh et al., 2012), which found a single-factor structure and strong loadings for the items. However, a five-factor solution that excluded three items from the 34-item version of the MFQ-P was supported by Jeffreys et al. (2016). Unfortunately, to date, no further studies are available with which to compare these findings. In the current study, item 1 (“She or he felt miserable or unhappy”) was removed because of its low item-total correlation and negative effect on reliability. Although there is no clear explanation for this finding, it may indicate that the translation of the item should be revised. In addition, parents may find it difficult to interpret their children’s feelings and behaviors (Rey et al., 2015), and this might have occurred when this item was rated. Further research with Spanish children should be performed to better examine this issue. On the one hand, the good fit of the single-factor structure seems to justify the removal of item 1 and the use of the MFQ-P in the assessment of depression in Spanish children. On the other hand, it would be worthwhile to maintain the original 34-item structure for clinical and research purposes and to facilitate cross-cultural comparisons of the MFQ-P, at least until there is more evidence on the psychometric properties of the items.
This research has some limitations that should be considered. The main limitation concerns the small sample size and the narrow age range (6–8 years), which reflect the scope of the project underlying this research and could limit the generalizability of the findings. More research with larger sample sizes and several age groups is needed. Second, the data were collected from a community sample, and thus, it remains unclear whether the results can be generalized to clinical samples. This should be addressed in future studies. Third, in this study, we could not use other depression-specific measures to examine convergent validity. Future research should examine validity by including depression-specific measures and other standards (e.g., a DSM-based diagnostic interview) as criteria (see Angold et al., 1995).
Conclusion
To conclude, despite the above limitations, this research extends the literature by providing additional evidence of the properties of the scores and factor structure of the MFQ-P. In this study, the internal consistency, test–retest reliability, validity, and single-factor structure of the Spanish-adapted version of the MFQ-P were supported after removing one item. Thus, for the first time, initial evidence is provided of the usefulness of a version of the MFQ-P for the assessment of depression in Spanish children from an early age.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Ministry of Economy and Competitiveness (MINECO) of Spain (grant number PSI2014-56446-P) and the Ministry of Education, Culture and Sport of Spain (grant number FPU14/03900).
