Abstract
The aim of this study was to determine the diagnostic accuracy of statutory health assessments in identifying existing mental health disorders in pre-school foster children. The study also examined whether a screening instrument completed by foster carers could enhance accuracy. A representative sample of 43 pre-schoolers under the care of one inner-city local authority underwent comprehensive multidimensional mental health assessments as the reference standard. Statutory health assessments gave false negative results for 65% (95% confidence interval (CI) 44–82%) of children diagnosed with at least one mental health disorder according to the reference standard and 18% (95% CI 3–52%) of children with developmental delay. The Ages and Stages Questionnaire: Socio-Emotional completed by the foster carers failed to identify 65% (95% CI 44–82%) of the children with diagnosed mental health disorders. There was no evidence of selective underreporting by foster carers in relation to specific diagnostic categories. In conclusion, statutory health assessments in their current form may fail to identify the majority of pre-school foster children with mental health disorders. Adding a screening instrument to the assessment process may not be adequate to improve diagnostic accuracy.
Introduction
High prevalence and low detection rates of mental health disorders among children in care
Poor long-term outcomes have been documented for individuals who were placed in out-of-home care as children (National Institute for Health and Clinical Excellence (NICE), 2010). These have included suicide, mental health hospitalisation, substance abuse, unemployment, homelessness and imprisonment. These outcomes have been linked to the high prevalence of mental health disorders among children in care (Blower, Addo, Hodgson, Lamington, & Towlson, 2004; McMillen et al., 2005; Meltzer, Corbin, Gatward, Goodman, & Ford, 2003; Mount, Lister, & Bennun, 2004; Sempik, Ward, & Darker, 2008). Although children in care constitute a high-risk population, levels of undetected and untreated psychiatric morbidity have remained high (NICE, 2010; McMillen et al., 2005; Mount et al., 2004).
Importance of screening pre-school children in care for mental health disorders
It has become increasingly apparent that pre-school aged children in care show rates of mental health disorders which are comparable to those of school-aged children in care, with up to 65% meeting criteria for a diagnosis (Chernoff, Combs-Orme, Risley-Curtiss, & Heisler, 1994; Hillen, Gafson, Drage, & Conlan, 2012; Klee, Kronstadt, & Zlotnick, 1997; Milburn, Lynch, & Jackson, 2008; Reams, 1999; Sempik et al., 2008; Urquiza, Wirtz, Peterson, & Singer, 1994). The chances for effective intervention are enhanced if mental health difficulties are addressed at an early stage of a child’s development, because developmental plasticity is high during the pre-school years (Anderson et al., 2003; Durlak & Wells, 1997). However, only a small proportion (7–16%) of pre-school children in care accessed mental health services in population-based studies (Hillen et al., 2012; Stahmer et al., 2005). For this reason, current guidelines put a particular emphasis on improving the screening and treatment of mental health disorders among pre-school children in care (NICE, 2010).
Assessment of children in care in medical clinics
In England each year approximately 15,000 pre-school children in care participate in statutory health assessments (Department of Children, Schools and Families, 2009). The purpose of these assessments is to determine whether the children show physical, developmental or mental health disorders which warrant treatment or further specialist assessment. The accuracy of these statutory health assessments as screening investigations for mental health and developmental disorders has not yet been systematically established. The most commonly used approach for the assessment of children in care involves medical clinics that tend to focus on physical health and developmental disorders (Clyman, Jones Harden, & Little, 2002). Research set within the general population has found that 46–86% of mental health disorders went undetected when children were assessed in medical clinics (Sheldrick, Merchant, & Perrin, 2011; Sayal & Taylor, 2004; Weitzman & Leventhal, 2006).
Challenges of screening pre-school children for mental health disorders
Research has shown that identifying mental health disorders among very young children in care presents particular challenges for clinicians (Carter, Briggs-Gowan, Irwin, & Davis, 2004; Clyman et al., 2002; Horwitz, Owens, & Simms, 2000; Horwitz, Gary, Briggs-Gowan, & Carter, 2003; Kerker & Dore, 2006). These challenges often involve difficulties with gaining accurate information. For instance, coherent medical and developmental histories may be unavailable due to placement moves and a dearth of adults who know the children well and can advocate on their behalf. In addition, there may be under-recognition or reluctance to report mental health difficulties from carers due to insufficient training, fear of stigma, cultural and language barriers, and the carers’ apprehension about being held responsible for a child’s lack of progress. Services may aggravate this situation through being inaccessible and fragmented. The inability of young children to express their feelings verbally, and pathology at a young age being less overt, further complicate the assessment process.
Use of screening instruments to improve detection of mental health disorders
Routinely administering standardised screening instruments is one way of addressing the under-recognition of mental health disorders in children (Stancin & Palermo, 1997; Weitzman & Leventhal, 2006). Among the available validated instruments, only one measure, the Ages and Stages Questionnaire: Socio-Emotional (ASQ: SE), covers the full age range from birth to five years old and is designed to take the rapid pace of socio-emotional development during the early years into account (Squires, Bricker, & Twombly, 2003). The accuracy of this instrument in detecting mental health disorders among pre-school children in care has not been established.
Rationale of the present study
The present study used comprehensive multidimensional mental health assessments as a reference standard in order to evaluate the diagnostic accuracy of statutory health assessments among pre-school children in care. The study examined whether use of the ASQ: SE could improve diagnostic accuracy. The study also investigated evidence of response bias in foster carers’ reports of difficulties relating to specific mental health disorders.
Method
Design, setting and participants
The present study drew on data from a cross-sectional study investigating mental health disorders and unmet needs in a representative sample of pre-school children in care (Hillen et al., 2012). All children under the age of six years who were in the care of a single inner city local authority on 12 May 2008 were included in the study unless a set of pre-defined exclusion criteria applied. Children placed outside a 15 mile radius of the clinic were excluded for practical reasons. Children who were about to be adopted were excluded in order to minimise disruption of the adoption process. The researchers did not have permission to include a proportion of children because they had restricted case files. This applied to children who required additional measures to preserve both their anonymity and safety, or those with high profile cases involved in court proceedings. Ethical approval was obtained from the local research ethics committee on 6 May 2008 (reference number 08/H0723/29). Full details about the sampling method, assessment procedure, data collection and coding were described in our previous paper.
There were 77 pre-school children under the care of the local authority on 12 May 2008. The flow diagram in Figure 1 shows the proportion of children who completed the ASQ: SE as index test and the multidimensional clinical assessment as reference standard. Table 1 provides socio-demographic data and allows comparisons between participants and non-participants of the study. There was no significant difference between the total ASQ: SE decimal scores of the 43 children who participated in the multidimensional clinical assessment (median 1.35; 25% centile, 0.58; 75% centile, 2.24) and those of the nine children who had exited care before it could be completed (median 1.35; 25% centile, 0.00; 75% centile, 2.93; Mann-Whitney U = 185.5; p = 0.85).

Figure 1. Flow diagram showing the number of children satisfying criteria for inclusion who completed the index test and the reference standard.
Table 1. Sample characteristics.
Note to Table 1: age at time of enrolment into study.
Multidimensional clinical assessments as reference standard
The study employed a comprehensive multidimensional assessment protocol as the reference standard, which was in line with guidelines for the assessment of pre-schoolers (Thomas et al., 1997). It drew on a developmental and relational perspective and included structured carer interviews, clinical observations, carer-completed instruments and cognitive testing. The following instruments were administered.
The Pre-school Age Psychiatric Assessment (PAPA) is an interviewer-based diagnostic measure, which covers the full range of mental health problems. The clinician takes the child’s caregiver through a highly structured interview with the aim of obtaining information about symptoms and impairment. The PAPA is a validated instrument that can be used in both research and clinical settings. Its reliability is comparable to that of interview schedules commonly used with older children or adults (Egger & Angold, 2006).
The Parent-Infant Relationship Global Assessment Scale (PIR-GAS) was used to measure the quality of the infant-carer relationship (Zero to Three, 2005). It is based on three components of an infant-parent/carer relationship: behavioural quality of the interaction, affective tone and psychological involvement. The scale provides a continuous distribution from grossly impaired (0) to well adapted (100).
The Mullen Scales of Early Learning (AGS edition) is a measure of cognitive functioning for infants and pre-school children, aged up to 68 months. The Mullen Scales provide normative scores for a child’s abilities in five specific cognitive domains. The Mullen Scales are based on extensive diagnostic observation and research and are founded upon information processing and neuro-developmental theories (Mullen, 1995).
The so-far unpublished Placement Stability Rating Scale (PSRS) is a self-report questionnaire, which was completed by foster carers. It is aimed at the identification of foster placements at risk of disruption or breakdown. The PSRS was developed by the mental health service for children in care in the study area and is being used for the identification of placements needing additional support. The PSRS consists of 30 items covering the following domains: external support, carer’s relationship with social worker, coping in carer, problems with contact arrangements, impact on carer’s family, foster child’s behaviour, denial of problems, and relationship between foster child and carer.
Data obtained during the multidimensional clinical assessments were reviewed by two postgraduate mental health professionals based in a dedicated service for children in care. They independently diagnosed mental health and developmental disorders according to the International Classification of Diseases, ICD-10 (World Health Organisation, 1992). Children showing depression, posttraumatic stress or anxiety disorders were assigned to the category of emotional disorders. Those identified with oppositional defiant, conduct or hyperkinetic disorders were reported under the category of behavioural disorders. Disinhibited and reactive attachment difficulties were combined into the category of attachment disorders. The category of adaptive disorders included feeding disorders, sleep disorders, elimination disorders and habit disorders. Children categorised as having an emotional, behavioural, attachment or adaptive disorder were reported as showing ‘at least one mental health disorder’.
The category of developmental disorders included global developmental delay, language disorders and pervasive developmental disorders.
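The grouping rules above can be sketched as a simple lookup. This is only an illustration of the categorisation described in the text: the diagnosis labels used as dictionary keys and the function names are shorthand chosen for this sketch, not codes taken from ICD-10 or from the study's coding manual.

```python
# Reporting categories as described in the text.
# The keys are shorthand labels invented for this sketch.
CATEGORY_BY_DIAGNOSIS = {
    "depression": "emotional",
    "posttraumatic stress": "emotional",
    "anxiety": "emotional",
    "oppositional defiant": "behavioural",
    "conduct": "behavioural",
    "hyperkinetic": "behavioural",
    "disinhibited attachment": "attachment",
    "reactive attachment": "attachment",
    "feeding": "adaptive",
    "sleep": "adaptive",
    "elimination": "adaptive",
    "habit": "adaptive",
    "global developmental delay": "developmental",
    "language": "developmental",
    "pervasive developmental": "developmental",
}

# Developmental disorders are reported separately and do not count
# towards 'at least one mental health disorder'.
MENTAL_HEALTH_CATEGORIES = {"emotional", "behavioural", "attachment", "adaptive"}


def categories_for(diagnoses):
    """Return the set of reporting categories for one child's diagnoses."""
    return {CATEGORY_BY_DIAGNOSIS[d] for d in diagnoses}


def has_mental_health_disorder(diagnoses):
    """True if the child falls into at least one mental health category."""
    return bool(categories_for(diagnoses) & MENTAL_HEALTH_CATEGORIES)
```

Under this sketch, a hypothetical child with a sleep disorder and a language disorder would count as showing at least one mental health disorder, whereas a child with a language disorder alone would not.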
Statutory health assessments as index test
The statutory health assessments, which were completed for all children independently of the study, served as the first index test. They took place at annual intervals or more frequently, and the one closest to the multidimensional mental health assessment served as the index test. Consequently children could undergo their statutory health assessments before or after the reference standard.
The statutory health assessments were undertaken by community physicians blind to the outcomes of the reference standard. The community physicians gave a narrative recording of their findings on either the form ‘Initial Health Assessment – Child’ (IHA-C) or the form ‘Review Health Assessment – Child’ (RHA-C), both published by the British Association for Adoption and Fostering (2004). The forms included open-response sections on ‘Emotional and behavioural development’, ‘Developmental/functional assessment’, and ‘Special educational needs/additional support needs for learning’. The community physicians also summarised their assessments under the following headings: ‘Developmental and educational history’, ‘Emotional and behavioural development’, and ‘Parenting issues in current placement’.
Two postgraduate study clinicians, who were independent of the community physicians and also blind to the outcomes of the reference standard, retrospectively analysed the forms pertaining to the children’s most recent statutory health assessment. They coded whether psychopathology had been recorded, without specifying diagnostic categories, since the information on the forms did not provide this level of detail. The two study clinicians also coded whether a developmental disorder had been documented.
The ASQ: SE as index test
The ASQ: SE was utilised as the second index test. The ASQ: SE was designed to screen for mental health difficulties but not developmental disorder. Parallel versions for eight different age-bands were available to evaluate functioning across seven domains: self-regulation, compliance, communication, adaptive functioning, autonomy, affect and interaction with people (Squires et al., 2003). These data were collected prospectively and the children’s carers were asked to complete the ASQ: SE before the multidimensional clinical assessment. Two sets of age-specific cut-offs were obtained from the scoring manual for the parallel versions of the ASQ: SE. One was based on receiver operating characteristic (ROC) curve analyses and the other on semi-interquartile ranges.
Analysis of evidence for selective underreporting of symptoms by foster carers
Six ASQ: SE domains could be assigned to corresponding mental health disorders diagnosed according to the reference standard. The ASQ: SE domain ‘communication’ was not included because its items did not map onto any of the mental health disorders under investigation. Behavioural and attachment disorders were each assigned two ASQ: SE domains.
Normative data have not been published for the individual ASQ: SE domains and the number of items per domain varied between the parallel versions for the eight age-bands. In order to analyse whether foster carers tended to underreport symptoms relating to specific ASQ: SE domains, decimal scores were calculated in the following way. The scores for the relevant items were added up and the resulting sum was subsequently divided by the number of items. This allowed the comparison of domain specific decimal scores between children with and without the corresponding mental health disorder across age-bands.
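As a minimal sketch of this calculation, the decimal score is the mean item score within a domain. The function name and the item scores below are hypothetical and serve only to show why dividing by the number of items makes versions with different item counts comparable.

```python
def decimal_score(item_scores):
    """Decimal score for one domain: sum of item scores / number of items."""
    if not item_scores:
        raise ValueError("a domain needs at least one scored item")
    return sum(item_scores) / len(item_scores)


# Hypothetical item scores for the same domain in two age-band versions
# that contain a different number of items.
younger_version_items = [0, 5, 10]        # 3 items, raw sum 15
older_version_items = [0, 5, 10, 5, 5]    # 5 items, raw sum 25

younger_decimal = decimal_score(younger_version_items)
older_decimal = decimal_score(older_version_items)
```

In this example the raw sums differ (15 versus 25) only because the older version has more items, while the two decimal scores are identical.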
Statistical analyses
The sensitivity and specificity of the statutory health assessments and the ASQ: SE in detecting existing mental health disorders among the children were examined through chi-square 2×2 tables. The corresponding confidence intervals were calculated according to the efficient score method (Newcombe, 1998).
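The following sketch illustrates how sensitivity, specificity and a score-based confidence interval can be derived from a 2×2 table. The efficient score method discussed by Newcombe (1998) is generally known as the Wilson score interval; whether a continuity correction was applied is not stated in the text, so the sketch offers both variants as an assumption. The function names and the counts in the usage note are hypothetical and are not taken from the study's tables.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity and specificity from the four cells of a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.959964, continuity=False):
    """Wilson score interval for a proportion (default z gives 95%).

    With continuity=True the continuity-corrected variant is returned.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    q = 1.0 - p
    z2 = z * z
    denom = 2.0 * (n + z2)
    if not continuity:
        half = z * math.sqrt(z2 + 4.0 * n * p * q)
        return (2.0 * n * p + z2 - half) / denom, (2.0 * n * p + z2 + half) / denom
    if successes == 0:
        lower = 0.0
    else:
        root = math.sqrt(z2 - 2.0 - 1.0 / n + 4.0 * p * (n * q + 1.0))
        lower = (2.0 * n * p + z2 - 1.0 - z * root) / denom
    if successes == n:
        upper = 1.0
    else:
        root = math.sqrt(z2 + 2.0 - 1.0 / n + 4.0 * p * (n * q - 1.0))
        upper = (2.0 * n * p + z2 + 1.0 + z * root) / denom
    return max(0.0, lower), min(1.0, upper)
```

For example, with hypothetical counts of 9 true positives and 17 false negatives, `sensitivity_specificity(9, 17, 15, 2)` gives a sensitivity of about 35%, and `wilson_interval(9, 26, continuity=True)` gives an interval of roughly 18% to 56%.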
The Shapiro-Wilk test of normality was used to determine whether ASQ: SE decimal scores were normally distributed. Since they were not, the Mann-Whitney U test was used to assess whether ASQ: SE decimal scores were larger in children with a disorder than in those without.
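The group comparison can be illustrated with a plain-Python sketch of the Mann-Whitney U statistic. The normality check is omitted here. The one-sided p-value uses a normal approximation with tie correction and no continuity correction, which is an assumption of this sketch; the text does not state which variant the statistical software applied, and an exact test would be preferable for very small groups. All names and data are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(with_disorder, without_disorder):
    """U for the first group: pairs where its score is larger (ties count 0.5)."""
    u = 0.0
    for a in with_disorder:
        for b in without_disorder:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def one_sided_p_normal_approx(u, with_disorder, without_disorder):
    """Approximate P(U >= observed) under the null hypothesis."""
    n1, n2 = len(with_disorder), len(without_disorder)
    n = n1 + n2
    # Tie correction: t is the number of observations sharing each value.
    tie_counts = Counter(list(with_disorder) + list(without_disorder)).values()
    tie_term = sum(t ** 3 - t for t in tie_counts) / (n * (n - 1))
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term)
    if variance == 0:
        return 1.0  # every score identical, no evidence of a difference
    z = (u - n1 * n2 / 2.0) / math.sqrt(variance)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical decimal scores for one domain.
scores_with = [3.0, 4.0, 5.0]
scores_without = [1.0, 2.0, 3.0]
u_stat = mann_whitney_u(scores_with, scores_without)
p_value = one_sided_p_normal_approx(u_stat, scores_with, scores_without)
```

Dividing U by the product of the two group sizes gives the proportion of pairs in which the child with the disorder has the higher score, which is the same quantity as the area under the ROC curve.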
The study did not include a sufficient number of children to undertake ROC analyses for the eight parallel versions of the ASQ: SE in order to determine optimal cut-offs for the children in care population. Hence, a ROC analysis based on ASQ: SE decimal scores was performed.
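A ROC analysis on a single continuous score can be sketched as follows. The criterion used here to pick the cut-off (maximising sensitivity + specificity − 1, the Youden index) is an assumption of this sketch, as the text does not state how the optimal cut-off was defined. The convention that a score at or above the cut-off counts as a positive screen, the function names and the data in the usage note are likewise illustrative.

```python
def split_by_status(scores, has_disorder):
    """Separate scores of children with and without a reference diagnosis."""
    positives = [s for s, d in zip(scores, has_disorder) if d]
    negatives = [s for s, d in zip(scores, has_disorder) if not d]
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one child")
    return positives, negatives


def roc_points(scores, has_disorder):
    """(cut-off, sensitivity, specificity) for each observed score.

    A score at or above the cut-off counts as a positive screen.
    """
    positives, negatives = split_by_status(scores, has_disorder)
    points = []
    for cutoff in sorted(set(scores)):
        sens = sum(s >= cutoff for s in positives) / len(positives)
        spec = sum(s < cutoff for s in negatives) / len(negatives)
        points.append((cutoff, sens, spec))
    return points


def area_under_curve(scores, has_disorder):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives, negatives = split_by_status(scores, has_disorder)
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def best_cutoff_youden(points):
    """Point maximising sensitivity + specificity - 1 (first one on ties)."""
    return max(points, key=lambda pt: pt[1] + pt[2] - 1.0)
```

With hypothetical decimal scores `[0.5, 1.0, 1.5, 2.0, 2.5, 3.0]` and disorder status `[False, True, False, False, True, True]`, `area_under_curve` returns 7/9 and `best_cutoff_youden(roc_points(...))` selects the cut-off 2.5.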
Results
Prevalence of mental health disorders and developmental disorders according to reference standard
The multidimensional clinical assessments were completed by 31 March 2009. Inter-rater reliability regarding the diagnostic categories was high. The Kappa score for emotional disorder was κ=0.79, for behavioural disorder κ=0.92, for attachment disorder κ=0.87 and for adaptive disorder κ=0.78. Administration of the reference standard did not lead to adverse effects and the majority of children and foster carers engaged well in the assessment procedure. Table 2 shows the number of children who fulfilled diagnostic criteria for mental health disorders or developmental disorders.
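For readers unfamiliar with the statistic, the sketch below shows how a kappa coefficient corrects raw agreement between two raters for agreement expected by chance. It assumes Cohen's kappa for two raters, which the text does not name explicitly, and the ratings in the usage note are invented for illustration.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same cases."""
    rater_a, rater_b = list(rater_a), list(rater_b)
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1.0 - expected)
```

For instance, if two hypothetical raters agree on 9 of 10 children, with one rater diagnosing 4 and the other 3 of them, the observed agreement is 0.90, the chance agreement is 0.54, and kappa is about 0.78.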
Table 2. Sensitivity and specificity of statutory health assessments in detecting mental health and developmental disorders: comparison with reference standard.
CI: confidence interval.
Diagnostic accuracy of statutory health assessments
The median time interval between the statutory health assessments and the multidimensional clinical assessments was one day (25% centile, –120 days; 75% centile, +122 days). Table 2 shows the accuracy of the statutory health assessments when compared to the reference standard. Whereas over 80% of developmental disorders were identified, the sensitivity for mental health disorders was low.
Due to the high rates of co-morbid mental health disorders, many children not fulfilling criteria for a specific diagnostic category fulfilled criteria for another category. For this reason, specificity data were not broken down by diagnostic category.
Diagnostic accuracy of the ASQ: SE
The median time interval between the return of the ASQ: SE and the multidimensional clinical assessments was 30 days (25% centile, 0 days; 75% centile, 122 days). No indeterminate test results were obtained. Table 3 shows the sensitivity and specificity of the ASQ: SE in detecting mental health disorders among pre-school children in care (CIC). The rates of false negative screening results were unsatisfactory with either of the two published cut-offs.
Table 3. Sensitivity and specificity of the Ages and Stages Questionnaire: Socio-Emotional (ASQ: SE) in detecting mental health disorders in pre-school children in care (CIC) using two cut-offs: comparison with reference standard.
CI: confidence interval; ROC: receiver operating characteristic.
The ROC analysis based on ASQ: SE decimal scores gave an area under the curve of 0.64 (95% confidence interval (CI) 0.46–0.81; p = 0.14). The optimal cut-off for the ASQ: SE decimal score based on the ROC analysis was 1.25, with a resulting sensitivity of 65.4% (95% CI 45.9–81.6%) and specificity of 64.7% (95% CI 40.5–84.3%). Hence, in the current study the screening utility of the ASQ: SE proved unsatisfactory among children in care.
Examination of evidence for selective underreporting of symptoms by foster carers
The results in Table 4 showed that the decimal scores for those ASQ: SE domains that were based on a higher number of items tended to differ significantly between children with and without the corresponding mental health disorder. Overall, there was no evidence for selective response bias in the reporting of symptoms by foster carers.
Table 4. Analysis of evidence for selective under-reporting of symptoms by foster carers: Ages and Stages Questionnaire: Socio-Emotional (ASQ: SE) domain scores contrasted between children who did and did not have disorders according to reference standard.
Discussion
Detection of mental health disorders by statutory health assessments
This study examined the accuracy of statutory health assessments in identifying mental health disorders among pre-school aged children in care. Children in foster care are a high risk group for mental health disorders and there is an increasing emphasis on early detection and intervention to improve outcomes (Clyman et al., 2002; NICE, 2010). Statutory health assessments examined in the present study failed to reliably identify psychopathology among the children and the overall sensitivity of the assessments was low at 35%. Physicians undertaking the statutory health assessments were more likely to record evidence of psychopathology among children who had been diagnosed with two or more mental disorders according to the reference standard.
Detection of developmental disorders by statutory health assessments
The statutory health assessments were better at detecting developmental disorders than mental health disorders, with a sensitivity of 82%. This was in keeping with previous studies which found that community physicians were better at recognising developmental as compared to mental health disorders among children in care (Clyman et al., 2002).
The ASQ: SE as a screening instrument for mental health disorders
This study was among the first to evaluate a mental health screening tool, the ASQ: SE, among pre-school foster children. Data for other widely used instruments in this age group are presently lacking. The results showed that the ASQ: SE, when completed by foster carers, failed to identify children at risk of displaying a disorder. Normative data were only available for the total ASQ: SE summary score. As a consequence, children could have total ASQ: SE scores below the cut-off when they were functioning well in most domains but had difficulties in specific areas severe enough to warrant a mental health diagnosis according to the reference standard. The present study did not provide evidence for response bias in foster carers’ reports about specific disorders.
The current study evaluated the screening utility of the ASQ: SE only; other pre-school screening measures such as the Infant-Toddler Social and Emotional Assessment (ITSEA) or the Brief Infant-Toddler Social and Emotional Assessment (BITSEA) were not examined (Briggs-Gowan & Carter, 2007). One advantage of the ITSEA is that it includes a competence scale. This allows assessment not only for the presence of undesirable behaviours, but also for the absence of desirable behaviours.
The disappointing performance of the ASQ: SE highlighted that the validity of screening instruments needed to be established for the specific population under investigation. The use of the ASQ: SE or any of the other published screening tools without further validation studies in the care population should be discouraged.
Explanation of the under-recognition of mental health disorders
The obstacles for the early and reliable identification of mental health disorders among pre-schoolers in foster care have been detailed in the introduction. One of the main difficulties was the lack of direct clinical data and resultant reliance on the report of carers who had limited knowledge about the children. Community physicians could test in the clinic whether children had reached relevant developmental milestones. However, they had to ask carers about mental health difficulties. Previous studies highlighted difficulties with relying on carers for the identification of mental health needs (Carter et al., 2004; Horwitz et al., 2000; Mount et al., 2004). The current study demonstrated that even when carers completed a detailed screening tool such as the ASQ: SE, the majority of disorders went undetected.
Methodological strengths and limitations of the present study
The present study included a representative sample of pre-school foster children ranging from recent entrants into care to children who had been in care for over one year. The children presented with an appropriate range of psychopathology, which was in keeping with other studies examining the mental health of pre-school children in foster care (Chernoff et al., 1994; Klee et al., 1997; Milburn et al., 2008; Reams, 1999; Sempik et al., 2008; Urquiza et al., 1994). The methodological sophistication of the reference standard ensured that it was possible to distinguish reliably between children with and without mental health disorders (Thomas et al., 1997). The inter-rater reliability for diagnosing mental health disorders in the current study was high. Unfortunately, data on the inter-rater reliability of the Parent-Infant Relationship Global Assessment Scale were not collected.
The potential for expectation bias was minimised because the community physicians conducting the statutory health assessments and the clinicians coding their outcomes were unaware of the results of the reference standard. However, clinicians were not blinded to the ASQ: SE and had access to the children’s scores when determining the consensus diagnoses of the multidimensional assessments. This could have led to an overestimation of the level of agreement between the ASQ: SE and the reference standard. Since one of the main findings of the present study was that levels of agreement were low, it appeared unlikely that the lack of blinding substantially affected the results.
Verification bias is possible when participants with a positive index test are more likely to be verified by the reference standard than those with a negative index test. In the current study children were invited for their multidimensional clinical assessments independently of the outcomes of their statutory health assessments and their ASQ: SE scores.
It was possible that the study results were influenced by the time interval between the index tests and the reference standard. Thus, if the children’s presentations changed significantly during the time interval, this would have resulted in an underestimation of the accuracy of the index tests.
Unfortunately, no information was available about the level of training and supervision arrangements for the community physicians, who conducted the statutory health assessments of the children.
The main limitation of the present study was its small sample size and the resultant wide CIs around the estimates for the sensitivity and specificity of the two index tests under investigation. Nevertheless, even when the upper limits of the CIs for the sensitivity of statutory health assessments (56%) and the ASQ: SE (56%) were considered, it remained the case that a substantial proportion of children presenting with at least one mental health disorder were overlooked. The study was set in a single inner city borough and it would be important to examine the sensitivity and specificity of statutory health assessments in other boroughs in England.
Conclusions
The current study demonstrated that current statutory health assessments may represent a missed opportunity for the early screening of mental health disorders in pre-school foster children. The administration of a carer-completed questionnaire may not suffice to improve diagnostic accuracy. Multidimensional clinical assessment programmes bringing physical health, mental health, developmental and child welfare perspectives together resulted in the improved recognition of mental health disorders in pre-school foster children in the US, Canada and Australia (Horwitz et al., 2000; Milburn et al., 2008; Urquiza et al., 1994; Wotherspoon, O’Neill-Laberge, & Pirie, 2008; Zeanah et al., 2001).
In order to improve the diagnostic accuracy of statutory health assessments in England a more coordinated and systematic approach will be required. Physicians undertaking assessments may need to undergo further training and collaborate more closely with other disciplines including mental health professionals. Guidelines and protocols for the assessment of pre-school children in care should be developed locally and nationally, and should specify how multi-disciplinary and multi-dimensional assessments can at least be offered to those pre-schoolers who are at high risk of presenting with mental health disorders.
Footnotes
Acknowledgements
The authors are indebted to the children participating in the study as well as to their families and carers.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
