Abstract
Objective:
The current intense period of drug development for fragile X syndrome (FXS) and other neurodevelopmental disorder (NDD) indications has highlighted the importance of behavioral outcome measures with strong psychometric properties and, specifically, content validity. The Aberrant Behavior Checklist—Community Edition (ABC-C), which has successfully been applied to autism spectrum disorder drug trials, has been revised for FXS (ABCFX) and is widely used for both clinical and research purposes. Despite its strong psychometric validation, the ABCFX and its parent measure have not been subjected to qualitative content validity evaluations. The present study aimed to fill this gap.
Methods:
Using two surveys administered sequentially and developed with guidance and review from the Food and Drug Administration (FDA), we asked 10 clinicians experienced in FXS and related NDDs to determine the adequacy of the ABCFX for assessing its behavioral constructs, its relevance to FXS, and its potential for detecting response to interventions. Descriptive statistics and ad hoc metrics were used to analyze categorical and Likert-like scale responses.
Results:
Experts considered that most items and all six ABCFX subscales indeed evaluated their explicit or implicit behavioral constructs. However, item and subscale specificity were relatively low (∼25%–30%). Relevance was relatively high for items of the Hyperactivity subscale but low for many items of the Socially Unresponsive/Lethargic subscale; the latter items were also considered to have low responsiveness potential. Irritability, Hyperactivity, Stereotypy, and Social Avoidance were the subscales with the strongest profiles, although the experts estimated that Stereotypy items may not be very responsive to treatment. A novel Anxiety construct, representing mainly recently reported observable behaviors and composed mainly of Irritability items, emerged as a potential measure.
Conclusions:
The present study demonstrated the overall adequacy of the ABCFX for its behavioral constructs, its relevance to FXS, and its potential for detecting response to treatment. It also showed that anxiety, a distinctive feature of FXS and other genetic NDDs, can also be measured by the ABCFX. These findings can help with the implementation and interpretation of the ABCFX, as well as with potential improvements to the measure in FXS and other NDDs.
Introduction
Fragile X syndrome (FXS) is the most common inherited neurodevelopmental disorder (NDD), affecting 1 in 7,000 males and 1 in 11,000 females in the United States (Hunter et al., 2014). The disorder is associated with the expansion of a CGG repeat (≥200, termed full mutation) in the 5′ untranslated region of the X chromosome-located fragile X messenger ribonucleoprotein 1 (FMR1) gene, which leads to epigenetic silencing and reduction to the absence of the FMR1 gene product (Hagerman et al., 2009; Tassone et al., 2012). Characteristic of an X-linked condition, males are more affected than females (Hagerman et al., 2009). Among the clinical manifestations in FXS, there is a variable range of physical abnormalities (e.g., joint laxity) and medical problems (e.g., recurrent ear infections) (Kidd et al., 2014). Nonetheless, neurological and behavioral features (Hagerman et al., 2009) have the greatest impact on functioning and quality of life (Fitzpatrick et al., 2020; Weber et al., 2019). They include intellectual disability (ID), which is found in >90% of males (predominantly moderate ID) and ∼50% of females (mainly borderline-mild ID), language impairment and other cognitive dysfunctions (e.g., deficit in executive function), sleep problems, and seizures (Berry-Kravis et al., 2021; Budimirovic et al., 2022; Sherman et al., 2017).
Behavioral phenotype of FXS
The most common behavioral abnormalities in FXS include attention-deficit/hyperactivity disorder (ADHD) symptoms, namely hyperactivity, impulsivity, and attentional difficulties; anxiety; perseverative and stereotypic behavior; disruptive behavior, also termed irritability/agitation, aggression, and self-injury (IAAS); increased sensory reactivity (i.e., hypersensitivity); and autistic behaviors (Boyle and Kaufmann, 2010; Hagerman et al., 2009). The latter could be severe, leading to the diagnosis of autism spectrum disorder (ASD) in a high proportion of individuals (approximately 50% of males, 17% of females) (Kaufmann et al., 2017; Sherman et al., 2017). Atypical behaviors have received particular attention in FXS because of their potential modification by psychoactive drugs and nonpharmacological interventions (Dominick et al., 2021; Hagerman et al., 2009). This, in conjunction with multiple drug development programs over the last two decades (Berry-Kravis et al., 2018; Erickson et al., 2017), has led to the continuous implementation, evaluation, and refinement of behavioral outcome measures (Budimirovic et al., 2017).
The Aberrant Behavior Checklist—revisions for FXS
The Aberrant Behavior Checklist—Community Edition (ABC-C) is the most widely used instrument for evaluating behavioral problems in individuals with NDDs, particularly those with ID (Aman et al., 1985a; Farmer and Aman, 2021), and served as the primary outcome measure in the trials that led to the approval of atypical neuroleptics for the treatment of IAAS in ASD (McCracken et al., 2002; Owen et al., 2009). The ABC-C is an observer-reported outcome, most commonly a caregiver-completed questionnaire, which was specifically developed for assessing an individual’s response to different types of interventions in ID (Aman and Singh, 2017). The ABC-C is composed of five subscales, which were empirically derived by principal component analysis (Aman et al., 1985a). Each subscale evaluates a different behavioral construct relevant to NDDs. Because they are not weighted to account for their different number of items, each ABC-C subscale should be considered a separate measure (Aman et al., 1985b). Considering the constellation of distinctive features in the behavioral phenotype of FXS, investigators sought to evaluate the adequacy of the ABC-C for FXS. Studies have examined its content validity (Budimirovic et al., 2006) and factor structure (Sansone et al., 2012), leading to a revised version of the ABC-C with scoring specific to FXS (ABCFX). Factor analyses using a large mixed research and clinical FXS dataset (n = 630) resulted in the removal of 3 of the 58 original ABC-C items (2 from the original Lethargy/Social Withdrawal subscale and 1 from the original Stereotypy subscale) and the emergence of a new and sixth subscale, termed Social Avoidance (Sansone et al., 2012). The latter includes four items of the original Lethargy/Social Withdrawal subscale, previously identified by a qualitative clinician assessment of behavioral rating scales as representing social avoidance (Budimirovic et al., 2006). 
These factor analyses also led to four items from the original Hyperactivity subscale shifting into the revised Irritability subscale in ABCFX, which expanded the range of disruptive/IAAS behaviors covered by the subscale (Sansone et al., 2012). Subsequent studies analyzing different research and clinical FXS samples have supported the refactoring in the ABCFX (Aman et al., 2020; Wheeler et al., 2014). The most recent and comprehensive evaluation of the ABCFX (Aman et al., 2020) used data from the clinic-based FXS natural history (FORWARD) project (n = 797) (Sherman et al., 2017). It concluded from its confirmatory factor analyses that the “…FXS-specific algorithm produced the most consistent factor structure for the sample… but model fit was only marginally better than that derived by the original ABC scoring algorithm…” Furthermore, the study concluded that the FXS-specific algorithm “…may be appropriate when anxiety and/or social avoidance constructs are the central and unequivocal domains of interest…”
Content validity of the ABC with scoring specific to FXS
During the last decade, the FXS field has experienced a period of intense drug development research (Berry-Kravis et al., 2018; Kaufmann et al., 2024). This has highlighted the importance of deploying behavioral outcome measures with strong psychometric properties. Therefore, the ABCFX has become the main instrument for evaluating atypical behaviors in FXS and the effects of clinical and experimental interventions on these features of the disorder (Berry-Kravis et al., 2018; Budimirovic et al., 2017) having shown greater sensitivity than the original ABC-C in some drug development programs (Berry-Kravis et al., 2012, 2017). Despite statistical confirmation of structural adequacy (Aman et al., 2020) and the inclusion of many caregiver-targeted symptoms (Weber et al., 2019), current emphasis by the Food and Drug Administration (FDA) on qualitative assessments of content validity (FDA Guidance for Industry 2009; Berry-Kravis et al., 2022) has made a formal and comprehensive evaluation of ABCFX’s content necessary. In this regard, recent agency recommendations for FXS drug development programs have included not only caregiver assessments but also expert-led content evaluations. Only the ABC-C’s Lethargy/Social Withdrawal subscale has been subjected to qualitative item assessment by clinicians with expertise in the disorder (Budimirovic et al., 2006). While the latter study concluded that the four items in the ABCFX Social Avoidance subscale (Sansone et al., 2012) represented social avoidance behaviors, two of them were considered to also reflect social indifference (Budimirovic et al., 2006). Furthermore, despite the ABCFX lacking an Anxiety component, a recent comprehensive survey of behaviors consistent with anxiety and observable by caregivers demonstrated that many of them are covered by items in the ABCFX (Lozano et al., 2022). 
Moreover, the restructured Irritability subscale in the ABCFX appears to be an adequate measure of frequent behaviors leading to poor outcome in FXS (Eckert et al., 2019; Lachiewicz et al., 2024).
Clinician evaluation of the ABCFX content validity
Data reviewed in the preceding sections underscore the need for a systematic assessment of the content validity of the original ABC-C and its revised version for FXS, the ABCFX. Acknowledging that expanding the published qualitative caregiver evaluations beyond meaningful change thresholds for ABCFX subscales is also necessary (FDA Guidance for Industry 2009), with guidance and review from an FDA committee, we conducted a comprehensive qualitative evaluation of the instrument by clinicians. The group of experts assessed the content validity of all ABC-C items and of the ABCFX subscales. The open approach design of the survey included not only evaluations of explicit or implicit behavioral constructs of the original ABC-C (Aman and Singh, 2017) and the ABCFX but also other potential constructs of relevance to FXS (e.g., nonsocial anxiety). The latter were derived from the implementation of the original ABC-C and the ABCFX in multiple studies in FXS (Boyle and Kaufmann, 2010; Lozano et al., 2022; Wheeler et al., 2014). In addition, after we presented the initial survey (main questionnaire) to the FDA, the committee suggested also asking whether the experts thought that items, subscales, or behavioral constructs were indeed relevant to FXS or amenable to change by treatment (supplementary questionnaire). We intended for the results of the present study to be, ultimately, integrated with emerging content validity caregiver data on the ABCFX (Arpone et al., 2022; Merikle et al., 2021). We expect that, altogether, these data will serve for better implementation and interpretation of the ABCFX as well as for future improvements of the instrument or the development of new measures.
Methods
Survey
To our knowledge, no published study has qualitatively assessed the content validity of an already psychometrically validated behavioral measure, or of adaptations of a behavioral instrument to a particular population. Therefore, with input from the FDA, we developed two complementary clinician surveys, and their analytical plans, on the basis of relevant literature (Bevans et al., 2019; Carminati et al., 2024; Donadon et al., 2020; Knight et al., 2008; Sandler et al., 2017; Zamanzadeh et al., 2015).
In the first questionnaire, termed “main,” we addressed the first concern from FDA regulators: whether the items in the ABCFX evaluated the behavioral constructs represented by each subscale. For this purpose, we asked experts to evaluate each of the 55 items of the ABCFX in terms of their adequacy for assessing their subscale explicit or implicit behavioral construct, based on the literature on either the original ABC-C or the ABCFX (Aman et al., 1985b, 2020; Aman and Singh, 2017; Sansone et al., 2012). In addition, clinicians were asked to categorize the items in terms of two or three additional constructs considered relevant to the item’s subscale and identified in studies applying either the ABC-C or the ABCFX (Boyle and Kaufmann, 2010; Lozano et al., 2022; Wheeler et al., 2014). The behavioral construct questions had two answer options: Yes or No (as to whether the item evaluated represented one or more of the proposed behavioral constructs) (Bevans et al., 2019). In addition, there were open text options for suggesting “Another Construct.” Items were then assessed in terms of their general adequacy (e.g., unclear description of a behavior) by recording “Problems with item” as an open text. Each of the six subscales of the ABCFX was also considered in terms of its completeness. Specifically, as open text, raters were asked to identify any behaviors not covered by the items that are relevant to the construct represented by the subscale. Other general comments regarding each ABCFX subscale were collected via open text. The main questionnaire and its instructions are included as Supplementary Appendix SA1.
After reviewing the main questionnaire and the responses from the 10 experts, the FDA reviewers suggested additional questions that were incorporated into a second questionnaire, termed “supplementary.” The latter expanded the scope of the assessment by asking whether each item or each subscale/construct was relevant to FXS. Examples of this type of evaluation were also found in the literature and used to structure the questionnaire (Carminati et al., 2024; Zamanzadeh et al., 2015). Responses were in the form of a 1–5 Likert scale, with 1 being “Not relevant at all” and 5 being “Extremely relevant.” Items and subscales/constructs were also assessed in terms of their potential responsiveness to interventions (i.e., pharmacological or nonpharmacological treatments) by asking the experts to rate them from 1 to 5, with 1 being “Extremely unlikely” and 5 being “Extremely likely.” Throughout the article, potential responsiveness to intervention is referred to as simply “responsiveness.” As for the main questionnaire, as suggested by FDA reviewers, all 58 original ABC-C items (Aman et al., 1985a,b) were included in this supplementary questionnaire. This avoided potential bias, in both questionnaires, due to the exclusion of three ABC-C items in the ABCFX (Sansone et al., 2012). The supplementary questionnaire and its instructions are included as Supplementary Appendix SA2. In the tables, the verbatim item wording has been replaced by item summaries used in previous publications (Kaat et al., 2014), after obtaining permission to reproduce these item summaries from Michael Aman, ABC-C’s copyright holder.
The protocol was reviewed and considered to be exempt, and classified as a quality improvement project, by the WCG central IRB. All participants provided written consent to participate in both components of the study.
Behavioral constructs
The following are descriptions of the ABCFX subscales and their associated behavioral constructs.
Irritability subscale, representing the IAAS construct. It refers to the complex of disruptive, externalizing behaviors commonly observed in individuals with NDDs (Eckert et al., 2019).
Socially Unresponsive/Lethargic subscale, representing the Social Indifference/autistic behavior construct. Behaviors in this subscale are characterized by reduced social interaction or reactivity or social aloofness, as observed in ASD (Budimirovic et al., 2006).
Stereotypy subscale, representing the stereotypic/restricted and repetitive behaviors (RRBs) construct. It refers to either simple or complex RRBs as defined in The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) (American Psychiatric Association, 2013; Reisinger et al., 2020).
Hyperactivity subscale, representing the hyperactivity/impulsivity construct. It refers to behaviors resembling their counterpart in idiopathic ADHD, as defined in DSM-5 (Boyle and Kaufmann, 2010; American Psychiatric Association, 2013).
Inappropriate Speech subscale, representing a speech dysregulation construct. It refers to a speech-specific neurobehavioral impairment, more closely related to obsessive-compulsive disorder/anxiety than to speech sound disorder as these disorders are specified in DSM-5 (American Psychiatric Association, 2013; Hoffmann, 2022).
Social Avoidance subscale and construct. It refers to reduced social interaction characterized by passive or active avoidance, more closely linked to social anxiety than to autistic behavior (Aman et al., 2020; Budimirovic et al., 2006; Roberts et al., 2018, 2019; Sansone et al., 2012).
ADHD construct (not in ABC-C or ABCFX). It refers to the broad range of behaviors observed in idiopathic ADHD, including both attentional deficit and hyperactivity/impulsivity as described in DSM-5 (American Psychiatric Association, 2013).
Mood Abnormality construct (not in ABC-C or ABCFX). It refers to any manifestation of mood instability, depression, and related behaviors, as described in DSM-5 (American Psychiatric Association, 2013).
Anxiety construct (not in ABC-C or ABCFX). It refers to any manifestation of anxiety disorder, including social anxiety, as defined in DSM-5 and in the FXS literature, including the recent report on observable symptoms of anxiety (American Psychiatric Association, 2013; Boyle and Kaufmann, 2010; Lozano et al., 2022).
Instructions in Supplementary Appendix SA1 include these definitions provided as guidance to clinicians.
Types of constructs and related terminology
The following are definitions of terms used in the study.
Original explicit construct. It corresponds to the name of the subscale or to an obvious extension of it. Original explicit constructs include IAAS, Hyperactivity/Impulsivity, Stereotypic/RRB, and Social Avoidance.
Original implicit construct. It corresponds to types of behaviors aligned with the items but not obvious in the name of the subscale. They include Social Indifference/ASD and Speech Dysregulation.
Novel constructs. Anxiety (broad definition, not necessarily social), Mood Abnormality, and ADHD (broad, including attention deficit).
Primary construct. Original explicit or implicit construct aligned to an ABCFX subscale.
Secondary constructs. Additional constructs aligned to an ABCFX subscale, as suggested by the literature.
Alternate constructs. Constructs suggested for a subscale by the surveyed clinician in open text responses.
The two questionnaires were implemented sequentially.
Experts
Experts were invited to participate because of their membership in the National Fragile X Foundation’s Clinical Trials Committee, their role as site principal investigators of the FXS natural history study (FORWARD), and/or their research track record in observational and interventional studies using the ABC-C or the ABCFX (i.e., at least 10 publications on FXS or related NDDs). We attempted to recruit experts representing the range of disciplines involved in the assessment and management of individuals with FXS. The professional background of the 10 selected clinicians was: 3 developmental/behavioral pediatricians, 2 child psychiatrists, 2 child neurologists, 1 geneticist, and 2 psychologists.
Analyses
Parameters: In order to analyze the data quantitatively, we delineated parameters on the basis of the literature mentioned above and our experience in related studies.
Main questionnaire
Representative construct. It corresponds to any construct, whether original or novel, primary or secondary, endorsed for an item by at least five clinicians. It excludes alternate constructs since these represented occasional and variable suggestions from the experts.
Top item. An item endorsed for a particular construct by at least nine clinicians.
Sensitive item. An item considered as representative of the primary construct by at least five clinicians. Mean number of positive responses was calculated for each primary construct.
Specific item. An item endorsed as representing mainly the primary construct, with fewer than five positive responses for any other construct. Mean number of positive responses was calculated for all the secondary constructs.
Sensitive subscale. A subscale was considered as sensitive for its primary construct when each item was endorsed by at least five clinicians. Mean number of positive responses was calculated for each primary construct.
Specific subscale. A subscale was considered as specific for its primary construct when each item had fewer than five positive responses for any other construct. Mean number of positive responses was calculated for all the secondary constructs.
Number of comments. This parameter was calculated for potential future revisions of the ABCFX or the development of new behavioral instruments. Comments could cover a wide range of issues, from problematic features of an item, to behaviors not usually thought to be evaluated by the item, to triggers of the behaviors. In general terms, the larger the number of comments, the more concerns were raised by the wording of the item.
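As an illustration only, the item-level parameters defined above can be expressed as threshold rules over the number of “Yes” responses each item received per construct. The following Python sketch is not the procedure used in the study; the `ItemResponses` structure, the function names, and the example counts are hypothetical, and requiring a specific item to also be sensitive is one reading of the definition rather than a rule stated in the text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

ENDORSE_MIN = 5  # "at least five clinicians"
TOP_MIN = 9      # "at least nine clinicians"


@dataclass
class ItemResponses:
    """Hypothetical container for one item's main-questionnaire tallies."""
    primary: str                # primary construct of the item's subscale
    yes_counts: Dict[str, int]  # construct -> number of "Yes" responses (0-10)


def representative_constructs(item: ItemResponses) -> List[str]:
    """Constructs endorsed for the item by at least five clinicians."""
    return [c for c, n in item.yes_counts.items() if n >= ENDORSE_MIN]


def is_top(item: ItemResponses, construct: str) -> bool:
    """Top item: endorsed for the construct by at least nine clinicians."""
    return item.yes_counts.get(construct, 0) >= TOP_MIN


def is_sensitive(item: ItemResponses) -> bool:
    """Sensitive item: primary construct endorsed by at least five clinicians."""
    return item.yes_counts.get(item.primary, 0) >= ENDORSE_MIN


def is_specific(item: ItemResponses) -> bool:
    """Specific item (assumed reading): sensitive, and fewer than five
    positive responses for every construct other than the primary one."""
    others = (n for c, n in item.yes_counts.items() if c != item.primary)
    return is_sensitive(item) and all(n < ENDORSE_MIN for n in others)


def mean_primary_yes(items: List[ItemResponses]) -> float:
    """Mean number of positive responses for the primary construct."""
    return mean(item.yes_counts.get(item.primary, 0) for item in items)


# Invented counts, for illustration only (not study data).
item_a = ItemResponses("IAAS", {"IAAS": 10, "Anxiety": 6, "Mood Abnormality": 2})
item_b = ItemResponses("Stereotypic/RRB", {"Stereotypic/RRB": 9, "Anxiety": 3})
```

With these invented counts, `item_a` would be sensitive but not specific (Anxiety also reaches the five-response threshold), whereas `item_b` would be both sensitive and specific.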
Supplementary questionnaire
Relevance/Responsiveness item or subscale/construct. Any item or subscale/construct with a mean score above 4 (very relevant/likely) was considered of high relevance or high potential for detecting response to intervention; conversely items or subscales/constructs with a mean below 3 (moderately relevant/unsure) were deemed low relevance/low potential.
Agreement among experts on items or subscale/constructs. Any item or subscale/construct with a variance-to-mean ratio (VMR), also termed index of dispersion (Sandler et al., 2017), <0.1 was considered to represent high agreement. Conversely, a VMR >0.4 was deemed low agreement since it reflected a large score dispersion or diversity of opinions.
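To make these thresholds concrete, the sketch below computes the mean and VMR (variance divided by mean) for one set of 1–5 ratings and applies the cut-offs described above. It is a minimal illustration under stated assumptions: the text does not specify whether sample or population variance was used, so the sample variance (n − 1 denominator) is assumed here, and the function name, the labels for in-between values ("intermediate", "moderate"), and the example ratings are hypothetical.

```python
from statistics import mean, variance


def summarize_ratings(ratings):
    """Summarize one item's (or subscale's) 1-5 Likert ratings.

    Assumption: sample variance (n - 1 denominator); the source does not
    state which variance estimator was used.
    """
    m = mean(ratings)
    var = variance(ratings)  # sample variance
    vmr = var / m            # ratings are >= 1, so the mean is never zero

    if m > 4:
        level = "high"          # high relevance / high responsiveness
    elif m < 3:
        level = "low"           # low relevance / low responsiveness
    else:
        level = "intermediate"  # hypothetical label for means between 3 and 4

    if vmr < 0.1:
        agreement = "high"
    elif vmr > 0.4:
        agreement = "low"
    else:
        agreement = "moderate"  # hypothetical label for VMR between 0.1 and 0.4

    return {"mean": m, "variance": var, "vmr": vmr,
            "level": level, "agreement": agreement}


# Invented ratings from 10 raters, for illustration only (not study data).
consistent = summarize_ratings([5, 5, 5, 5, 5, 4, 5, 5, 5, 5])
divided = summarize_ratings([1, 5, 1, 5, 2, 4, 1, 5, 3, 3])
```

In this example the first set of ratings has a high mean and a VMR far below 0.1, whereas the second has a mean of exactly 3 (neither high nor low under the stated cut-offs) and a VMR well above 0.4.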
Statistical analyses
We used descriptive statistics. For nominal variables, we calculated frequencies/percentages. For ordinal variables, we calculated means, standard deviations, variances, and VMRs. Tables were also employed to analyze, interpret, and display the questionnaires’ responses. Analyses were conducted using Excel and the Calculator.net website.
Results
Item sensitivity and specificity to behavioral constructs
Overall, most items of the ABCFX (50/55 or 90.9%) were considered sensitive. In other words, they were endorsed by at least five experts as representative of the primary construct. Among them, three items were considered by all clinicians as representative of their primary construct, but also of another construct by nine experts. Approximately one-fourth of the items (14/55 or 25.5%) were specific, with fewer than five positive responses for constructs other than the primary one. When including the three novel constructs proposed in the survey (Broad ADHD, Mood Abnormality, Anxiety), most of the items (32/55 or 58.2%) were associated with two constructs, but only 7/55 or 12.7% when including only the six original ABCFX subscales/constructs. Eight items (14.5%) were associated with more than two constructs but only when novel constructs were included. One item represented a single construct but not its primary one.
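The proportions in this paragraph follow directly from the item counts over the 55 ABCFX items. As a simple arithmetic check, and not part of the original analysis, the short Python sketch below recomputes each percentage; the dictionary keys are descriptive labels chosen here for illustration.

```python
TOTAL_ITEMS = 55  # number of items in the ABCFX

# label -> (item count, percentage as reported in the text)
reported = {
    "sensitive": (50, 90.9),
    "specific": (14, 25.5),
    "two constructs, novel constructs included": (32, 58.2),
    "two constructs, original constructs only": (7, 12.7),
    "more than two constructs": (8, 14.5),
}


def percent(count, total=TOTAL_ITEMS):
    """Percentage of items, rounded to one decimal place."""
    return round(100 * count / total, 1)


recomputed = {label: percent(count) for label, (count, _) in reported.items()}

for label, (count, stated) in reported.items():
    print(f"{label}: {count}/{TOTAL_ITEMS} = {recomputed[label]}% (reported {stated}%)")
```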
Sensitivity and specificity to behavioral constructs of the ABCFX subscales
In line with the high proportion of sensitive items, all six ABCFX subscales met criteria for sensitivity with at least five experts responding that they aligned with their primary construct. The two subscales with the largest number of items, namely Irritability (18 items) and Socially Unresponsive/Lethargic (13 items), had the lowest mean number of positive responses when including all the subscale items (8.89 and 8.54, respectively). They were also the only two subscales with <80% top score items (i.e., those endorsed by 9 or 10 clinicians). Only three of the six ABCFX subscales met criteria for specificity (Irritability, Socially Unresponsive/Lethargic, and Stereotypy), with fewer than five positive responses for any construct other than the primary one. The other three subscales also represented one (Hyperactivity, Social Avoidance) or two (Inappropriate Speech) secondary constructs. Table 1 depicts the mean number of positive responses for the primary construct (sensitivity) and for all the secondary constructs (specificity).
Results from Main Survey by Subscale
The main survey included only the 55 items in ABCFX. Thus, data on items 3. Sluggish and 26. Resists contact were not collected.
The main survey included only the 55 items in ABCFX. Thus, data on item 27. Rolls head was not collected.
ABC-C, Aberrant Behavior Checklist—Community Edition; ADHD, attention-deficit/hyperactivity disorder; ASD, autism spectrum disorder; IAAS, irritability/agitation, aggression, and self-injury; RRB, restricted and repetitive behaviors; SD, standard deviation; SIB, self-injurious behavior; VMR, variance-to-mean ratio.
Novel constructs and ABCFX items
Experts endorsed a substantial number of items (34/55 or 61.8%) as representing constructs that were not identified or covered explicitly by the ABC-C: Broad ADHD (8/55), Mood Abnormality (9/55), and Anxiety (17/55). One item (#25) was considered to represent only Mood Abnormality, another item (#10) was considered to represent both Mood Abnormality and Anxiety, and a third (#23) was considered to represent all three of Social Indifference/ASD, Social Avoidance, and Anxiety. Most items (8/10) of the Hyperactivity subscale were considered to also represent the novel Broad ADHD construct.
ABCFX subscale profiles
Integrating the data presented above with other findings, the following are the profiles of each ABCFX subscale.
Irritability subscale (18 items): This subscale met criteria for both sensitivity and specificity. Two out of 18 items (#47, #57) represented only the IAAS construct; of the 16 items (89%) representing at least one additional construct, three represented more than two constructs. Irritability subscale items constituted the majority of those in the novel Anxiety (9 out of 17) and Mood Abnormality (5 out of 9) constructs. None of the Irritability subscale items represented the Broad ADHD construct. Thus, items in the Irritability subscale appear to represent predominantly the IAAS construct but also emotional domain behaviors (Anxiety and Mood Abnormality constructs).
Experts had comments about all 18 items, with 9 of them receiving three or more comments each. According to the experts, items that could be added to the subscale included those describing Destructiveness (e.g., breaking windows), Types of Aggressive Behavior (e.g., hits, kicks), or Resistance (e.g., sitting or lying down in a heavy or careless way). Raters did not identify any other construct that better defines the subscale.
Socially Unresponsive/Lethargic (13 items): This subscale also met criteria for both sensitivity and specificity, although the mean number of positive responses for its primary construct (8.54) and the percentage of top score items (69%) were the lowest among the six subscales. Seven out of 13 items (54%) represented only the Social Indifference/ASD construct. One item (#25) did not represent the primary construct, but only the novel Mood Abnormality one. A total of five items (38%) represented at least one additional construct, two of them also Social Avoidance. Among the novel constructs, four items represented Mood Abnormality and one represented Anxiety. None of the items in this subscale represented the Broad ADHD construct. Thus, this subscale was considered to have relatively high specificity in terms of its own and related social constructs.
Experts had comments about 12 of the 13 items, with 4 of them receiving three comments each. The items they suggested adding to the subscale were Lack of Eye Contact, Lack of Use of Facial Expression, Insensibility to Social Cues, Lack of Demonstration of Affection to Others, Lack of Assistance to Others When in Need, Lack of Recognition of Emotions, and Poor Tactile Sensitivity.
Stereotypy (six items): This subscale also met criteria for both sensitivity and specificity, with a very high number of positive responses for its primary construct (mean 9.67) and the lowest number of positive responses for secondary constructs (means 0.17–3.50). Three out of six items represented only the Stereotypic/RRB construct. Three items also represented Anxiety, but none represented either Mood Abnormality or Broad ADHD. No problematic item was identified.
Experts had comments about all six items, with three of them receiving three comments each. They proposed as items that could be added to the subscale those representing Fixation on Routines, Perseverative Speech, Perseverative Behaviors, and Ritualistic Behaviors.
Hyperactivity (10 items): This subscale met criteria for sensitivity (primary construct mean 9.40) but not for specificity, due to a large number of positive responses for the Broad ADHD construct (mean 5.60). Only 2 out of 10 items represented only Hyperactivity/Impulsivity, which contrasted with 8 out of 10 items being considered to represent the novel Broad ADHD construct. On the other hand, none of the items represented either Mood Abnormality or Anxiety. Therefore, this subscale is both relatively sensitive and specific for the Broad ADHD construct that includes hyperactivity, impulsivity, and attention deficits.
Experts had one to two comments for 3 of the 10 items. They considered that items that could be added to the subscale included those representing Aggressive Impulsive Behavior.
Inappropriate Speech (four items): This subscale is one of the two most sensitive in terms of its primary construct (mean 10.00 or unanimous positive responses for each item). However, its specificity was the lowest among the six ABCFX subscales. None of its items represented only the Speech Dysregulation construct, with mean positive responses of 5.75 and 6.00 for Stereotypic/RRBs and Anxiety, respectively.
Experts had one to two comments about each of the four items. They proposed as items that could be added to the subscale those representing Anticipation Related to Anxiety, Immediate or Delayed Echolalia, and Interruption of Others’ Speech. Clinicians concluded that all items of this subscale could be “absorbed” by other constructs.
Social Avoidance (four items): This subscale is the other most sensitive in terms of its primary construct (mean 10.00 or unanimous positive responses for each item). Nevertheless, none of its items represented only the Social Avoidance construct. Its low specificity is related to the high mean number of positive responses for the Social Indifference/ASD construct (6.75).
Experts had one comment on three of the four items, and they proposed as items that could be added descriptions of Eye Contact Avoidance, Worried Appearance/Repetitive Questions about Upcoming Social Event, Feeling Uncomfortable Around Others, Upset Stomach/Vomiting Before Social Event, Facial Flushing, Verbal or Behavioral Refusal to Participate in Social Event, Anger or Aggressive Behavior in Social Situations, Difficulty Initiating Communication with Others, and Selective Mutism. It was also suggested that Sensory Problems, specifically Lack of Inhibition or Habituation to Sensory Stimuli, could better define the subscale.
Novel constructs
Broad ADHD: According to the experts, eight items aligned with this novel construct, all of them from the 10-item Hyperactivity subscale. The exceptions were Hyperactivity items #48 and #54, which were thought to depict only Hyperactivity/Impulsivity.
Mood Abnormality: Nine items aligned with this novel construct, five of them from the Irritability subscale and four from the Socially Unresponsive/Lethargic subscale.
Anxiety: Seventeen items (∼31% of the ABCFX) aligned with this novel construct. Although its definition in the main survey was broad, the items identified by the experts represented mainly anxiety in nonsocial contexts. The majority (9/17) corresponded to items from the Irritability subscale, while 4/17 were from the Inappropriate Speech subscale, 3/17 from the Stereotypy subscale, and 1/17 from the Socially Unresponsive/Lethargic subscale.
Table 1 presents the findings from the main questionnaire.
Relevance to FXS
Ten items of the ABCFX (18.2%) met the mean score criterion for high relevance, while 13 items of the ABCFX (23.6%) and 2 of the 3 deleted ABC-C items met the criterion for low relevance. As shown in Tables 2 and 3, seven high-relevance items came from the Hyperactivity subscale, two from the Inappropriate Speech subscale, and one from the Irritability subscale. Low-relevance items were predominantly from the Socially Unresponsive/Lethargic subscale (n = 9), with the rest from the Irritability subscale (n = 3) and the Stereotypy subscale (n = 1). Thus, most high-relevance items described ADHD-like behaviors, while lethargy and social indifference types of behaviors constituted the majority of those of low relevance. These profiles were reflected in the novel constructs, with Broad ADHD having a large proportion of high-relevance items and Mood Abnormality a large proportion of low-relevance items. The Irritability, Stereotypy, Hyperactivity, and Social Avoidance subscales, and the Broad ADHD and Anxiety constructs, met the high-relevance criterion, while no subscale/construct was found to be of low relevance.
Table caption: Results of Supplementary Survey by Subscale/Construct
Abbreviations: ASD, autism spectrum disorder; IAAS, irritability/agitation, aggression, and self-injury; RRB, restricted and repetitive behaviors; SD, standard deviation; VMR, variance-to-mean ratio.
Table caption: Results of Supplementary Survey by Item
Note: The supplementary survey included all 58 items of the ABC-C.
Abbreviations: ABC-C, Aberrant Behavior Checklist—Community Edition; ADHD, attention-deficit/hyperactivity disorder; ASD, autism spectrum disorder; IAAS, irritability/agitation, aggression, and self-injury; RRB, restricted and repetitive behaviors; SD, standard deviation; VMR, variance-to-mean ratio.
Three ABCFX items met the dispersion score criterion for high agreement on relevance, while five met the criterion for low agreement. The latter items were distributed among the Irritability, Socially Unresponsive/Lethargic, Stereotypy, and Inappropriate Speech subscales. Item #26, deleted in the ABCFX, also had a VMR >0.4. The Hyperactivity, Social Avoidance, and Anxiety subscales/constructs met the criterion for high agreement on relevance; none met the criterion for low agreement.
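As an illustration of the dispersion score, the following minimal Python sketch computes the variance-to-mean ratio (VMR) for one item's ratings. It rests on several assumptions that are not confirmed by the text: the use of sample variance, a 1 to 5 rating scale, and the function and variable names, all of which are hypothetical. The 0.4 cut-off is the value implied by the statement about item #26, and the ratings are invented for illustration, not study data.

```python
from statistics import mean, variance

# Cut-off implied by the text ("VMR >0.4"); treated here as the low-agreement criterion.
LOW_AGREEMENT_VMR = 0.4


def variance_to_mean_ratio(ratings):
    """Return the variance-to-mean ratio (VMR) of one item's expert ratings."""
    m = mean(ratings)
    if m == 0:
        raise ValueError("VMR is undefined when the mean rating is zero")
    # Assumption: sample variance (n - 1 denominator); the source does not specify.
    return variance(ratings) / m


def has_low_agreement(ratings, threshold=LOW_AGREEMENT_VMR):
    """Flag an item whose ratings are widely dispersed relative to their mean."""
    return variance_to_mean_ratio(ratings) > threshold


# Hypothetical ratings from 10 experts on an assumed 1-5 scale (not study data).
consistent = [4, 4, 5, 4, 4, 5, 4, 4, 4, 5]
scattered = [1, 5, 2, 4, 1, 5, 3, 2, 5, 1]

print(round(variance_to_mean_ratio(consistent), 3), has_low_agreement(consistent))
print(round(variance_to_mean_ratio(scattered), 3), has_low_agreement(scattered))
```

Under these assumptions, tightly clustered ratings yield a VMR well below the cut-off, whereas ratings spread across the scale exceed it.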
Potential responsiveness to interventions
The profile of item responsiveness was different from that of relevance (Tables 2 and 3). Only three ABCFX items met the criterion for high responsiveness, while 16 were considered low responsiveness. As with low relevance, most low-responsiveness items (n = 9) were from the Socially Unresponsive/Lethargic subscale; five were from Stereotypy, and two from Social Avoidance. In line with their subscale of origin, the seven low-responsiveness items were included in the novel Mood Abnormality and Anxiety constructs. All three ABC-C items excluded from the ABCFX were also considered to have low potential for responsiveness.
The Irritability and Hyperactivity subscales and the Broad ADHD and Anxiety constructs were scored as high responsiveness. None of the subscales/constructs met criteria for low potential for responsiveness, although the mean score for Stereotypy was borderline at 3.0.
Eleven ABCFX items, mainly from the Irritability and Hyperactivity subscales, met the dispersion score criterion for high agreement on responsiveness. No items met the low agreement criterion. The Irritability, Socially Unresponsive/Lethargic, Hyperactivity, Social Avoidance, Broad ADHD, and Anxiety subscales/constructs met the criterion for high agreement on potential for responsiveness, and none met the criterion for low agreement.
Tables 2 and 3 depict the findings from the supplementary questionnaire.
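The item-level proportions reported in this section can be checked with a short Python sketch. The counts and the total of 55 ABCFX items are taken from the text; the variable and function names are hypothetical and used only for illustration.

```python
TOTAL_ABCFX_ITEMS = 55  # total number of ABCFX items, as stated in the text

# Item counts reported in the text for each supplementary-survey category.
counts = {
    "high relevance": 10,
    "low relevance": 13,
    "high responsiveness": 3,
    "low responsiveness": 16,
}

# Subscale breakdowns reported in the text for the two "low" categories.
low_relevance_by_subscale = {
    "Socially Unresponsive/Lethargic": 9,
    "Irritability": 3,
    "Stereotypy": 1,
}
low_responsiveness_by_subscale = {
    "Socially Unresponsive/Lethargic": 9,
    "Stereotypy": 5,
    "Social Avoidance": 2,
}


def percent_of_items(n, total=TOTAL_ABCFX_ITEMS):
    """Express an item count as a percentage of all ABCFX items, to one decimal."""
    return round(100 * n / total, 1)


shares = {label: percent_of_items(n) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]}/{TOTAL_ABCFX_ITEMS} = {share}%")
```

The subscale breakdowns sum to the reported totals of 13 low-relevance and 16 low-responsiveness items.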
Discussion
The last two decades have been characterized by intense drug development in the field of NDDs, including FXS. The ABC-C revised for FXS or ABCFX (Sansone et al., 2012) has become one of the most widely used instruments to evaluate response to treatment in FXS (Berry-Kravis et al., 2018; Budimirovic et al., 2017). While the structural adequacy of the ABCFX has been replicated in independent samples (Aman et al., 2020; Wheeler et al., 2014), its content validity in terms of types of rated behaviors, relevance to FXS, and potential response to interventions (i.e., sensitivity) has not been qualitatively assessed. The present study intended to begin filling this gap by conducting a survey of clinicians experienced in FXS. The 10 experts we surveyed indicated that most of the ABCFX items and its six subscales actually assessed their explicit or implicit primary behavioral constructs. However, item and subscale specificity was lower (approximately one-quarter and one-third, respectively). The ABCFX Irritability, Socially Unresponsive/Lethargic, and Stereotypy subscales appear to be those measuring their constructs more selectively, while a large number of items from different subscales appear to also assess other FXS-relevant constructs, in particular Anxiety. Clinician evaluations of relevance to FXS revealed that mainly Hyperactivity items were of high relevance and that most Socially Unresponsive/Lethargic items were of low relevance. The latter were also deemed to have low potential for responsiveness, as were items in the Stereotypy subscale (only three items were thought to have high responsiveness). Overall, the Irritability, Hyperactivity, Social Avoidance, and Anxiety subscales/constructs appear to have the best profiles of relevance and potential responsiveness to interventions.
Evaluation of behavioral constructs of the ABCFX
The ABC-C was developed on empirical bases without a priori behavioral constructs (Aman and Singh, 2017). Although this makes it difficult to determine whether the ABC-C measures the atypical behaviors that users intend to evaluate in FXS and other NDDs, years of the instrument’s implementation have led to the recognition of implicit constructs. One example is the common use of the ABC-C or ABCFX Irritability subscale as a measure of IAAS behaviors (Eckert et al., 2019; McCracken et al., 2002; Owen et al., 2009). Identification of these implicit constructs in the literature (Aman et al., 2020; Wheeler et al., 2014), in conjunction with the emergence of Social Avoidance as a construct during the development of the ABCFX (Budimirovic et al., 2006; Sansone et al., 2012), served as the foundation for designing a survey for assessing content validity. The structure of the main questionnaire was simple, attempting to obtain unequivocal Yes/No answers on the relatedness of an item to the construct primarily linked to a subscale as well as to other constructs potentially associated with the subscale. Constructs suggested by the literature, and termed novel, were also independently evaluated. We also included open questions in order to obtain suggestions about items and subscales that could help to improve the ABCFX and other behavioral measures for FXS. It was then necessary to develop metrics for the analysis and interpretation of the survey’s responses. These were loosely based on the literature, since no content validity study of an already developed rating scale for an NDD has been published. In line with the concept of content validity, parameters of item and subscale sensitivity and specificity to the primary behavioral construct were developed. 
Although the assembled group of 10 clinical experts was small, we considered it adequate because its members, most of them experienced members of the National Fragile X Foundation’s Clinical Trials Committee, represented the range of medical and behavioral disciplines involved in the diagnosis and management of FXS.
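To make the item-level sensitivity and specificity parameters concrete, the following Python sketch shows one way Yes/No tallies could be converted into these classifications. It is an assumption-laden illustration rather than the study's actual scoring procedure: the cut-off of at least 5 of 10 experts, the rule that an item is specific only when no construct other than the primary one reaches that cut-off, the function names, and the example tallies are all hypothetical.

```python
# Assumed cut-off: an item "represents" a construct when at least 5 of 10 experts say Yes.
THRESHOLD = 5


def classify_item(yes_counts, primary):
    """Classify one item from its Yes tallies.

    yes_counts maps construct name -> number of experts (0-10) answering Yes.
    Returns (sensitive, specific) under the assumed rules described above.
    """
    endorsed = {c for c, n in yes_counts.items() if n >= THRESHOLD}
    sensitive = primary in endorsed
    # Assumed rule: specific means the primary construct is the only one endorsed.
    specific = sensitive and endorsed == {primary}
    return sensitive, specific


def subscale_mean(items, construct):
    """Mean number of Yes responses for a construct across a subscale's items."""
    return sum(item[construct] for item in items) / len(items)


# Hypothetical tallies for two items of a Hyperactivity-like subscale (not study data).
PRIMARY = "Hyperactivity/Impulsivity"
items = [
    {"Hyperactivity/Impulsivity": 10, "Broad ADHD": 3, "Anxiety": 0},
    {"Hyperactivity/Impulsivity": 9, "Broad ADHD": 7, "Anxiety": 1},
]

for item in items:
    print(classify_item(item, PRIMARY))
print(subscale_mean(items, PRIMARY))
```

In this sketch the first item is both sensitive and specific, while the second is sensitive but not specific because a second construct also reaches the assumed cut-off.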
Adequacy of the ABCFX for measuring multiple behaviors
The experts supported the notion that all the ABCFX subscales do indeed measure what they “intended to measure.” In other words, most items show sensitivity to their primary behavioral constructs. However, specificity was rather limited according to our “50% metrics.” Three subscales, namely Irritability, Socially Unresponsive/Lethargic, and Stereotypy, characterized mainly their primary construct, while Hyperactivity, Social Avoidance, and Inappropriate Speech assessed one or two additional behavioral constructs. Although not an original construct, nonsocial anxiety was represented by a large proportion of ABCFX items (17 out of 55 total, 9 from the Irritability subscale), in line with a recent caregiver qualitative assessment of observable anxiety-related behaviors (Lozano et al., 2022). The survey also showed that, despite the well-established statistical foundation of the Social Avoidance subscale (Aman et al., 2020; Sansone et al., 2012; Wheeler et al., 2014), distinguishing Social Indifference/ASD from Social Avoidance is difficult. This was to some extent anticipated in the original qualitative assessment of items of ABC-C’s Lethargy/Social withdrawal subscale. In that earlier study, experts identified the four items in the Social Avoidance subscale as characterizing such behaviors, but two items (#16, #42) as representing both Social Avoidance and Social Indifference (Budimirovic et al., 2006). The distinction between social anxiety and autism-type social deficits remains a challenge in FXS and other NDDs (Roberts et al., 2018, 2019). In the present study, experts also thought that most items of the Hyperactivity subscale reflect, in fact, the broad range of behaviors under the ADHD label, including attention deficits. This could be interpreted as a labeling issue without negative consequences for assessments. 
The multiple suggestions for subscale improvement collected in the survey provide the basis for instrument enhancement, even for those already strong aspects of the ABCFX. In sum, the sensitivity and specificity profile of the ABCFX subscales makes the instrument adequate for its original purpose, allowing also the evaluation of additional behavioral constructs.
Relevance to FXS and to interventions
The main survey attempted to determine the validity of an instrument with more than a decade of application to FXS (Budimirovic et al., 2017; Kaufmann et al., 2024). As the ABC-C was developed for evaluating response to treatment in NDDs, it is assumed that any adaptation developed for individuals with FXS would be appropriate for measuring change of atypical behaviors following interventions in this disorder. Nonetheless, the lack of formal assessment of content validity in FXS is a gap in the field. Therefore, following feedback on the main questionnaire from FDA reviewers, we added a supplementary survey that asked about the relevance of items and subscales to FXS and their potential for detecting response to treatment. Employing Likert-like scoring and statistics adapted from previous publications (Carminati et al., 2024; Sandler et al., 2017; Zamanzadeh et al., 2015), we found that about one-fifth of items, predominantly assessing ADHD features, were deemed of high relevance and, surprisingly, almost one-quarter, mainly items describing lethargy and social indifference, were of low relevance. These findings emphasize the importance of ADHD features in FXS and, as discussed in the previous section, the complexity of measuring social deficits in the disorder (Kidd et al., 2020; Roberts et al., 2018, 2019). Nonetheless, the fact that four subscales of the ABCFX, specifically Irritability, Stereotypy, Hyperactivity, and Social Avoidance, were considered of high relevance and none of low relevance supports the continued use of the instrument. A different picture emerged for potential responsiveness to interventions. Only 3 items met criteria for high responsiveness, but 16 out of 55 items (29.1%) were deemed low responsiveness. Items representing lethargy and social indifference or stereotypic behaviors were thought to have low potential for change with treatment. 
Again supporting the use of the ABCFX, no subscale (as a whole) was considered low responsiveness, and the Irritability, Hyperactivity, Broad ADHD, and Anxiety subscales/constructs were deemed high responsiveness. These profiles most likely reflect the experts’ accumulated experience in psychopharmacology of FXS and idiopathic ASD, which has shown the treatment resistance of RRBs (Thom et al., 2021), in contrast with the drug response of IAAS, ADHD, and anxiety behaviors (Dominick et al., 2021; Eckert et al., 2019). It is important to note that the three ABC-C items deleted during the ABCFX refactoring (Sansone et al., 2012) demonstrated low scores for both relevance and responsiveness, supporting the outcome of the statistical process. The supplementary questionnaire also evaluated the level of agreement between responders, which was in general high, particularly for the questions on response to interventions. Overall, the supplementary questionnaire results indicate that experts thought that ABCFX subscales are relevant and useful treatment outcome measures but that a substantial number of items are potentially deficient in terms of sensitivity to treatment.
Limitations
The methodology for qualitative evaluations of behavioral measures by clinicians or experts is evolving. Although during the development of behavioral rating scales some steps, such as concept elicitation, can be applied to evaluations by both clinicians and caregivers/affected individuals (Bevans et al., 2019; Merikle et al., 2021), comprehensive assessments of a large number of items are only feasible when surveying clinicians. This, and the fact that the ABCFX has already been psychometrically evaluated and implemented in a particular population (not an adaptation to a different language; e.g., Donadon et al., 2020), created a unique situation that required ad hoc solutions for evaluating content validity. While we incorporated as much as we could from relevant published studies (Bevans et al., 2019; Carminati et al., 2024; Donadon et al., 2020; Knight et al., 2008; Sandler et al., 2017; Zamanzadeh et al., 2015), and from input from FDA reviewers, some elements of the questionnaires and the analyses could be questioned (e.g., 5/10 expert threshold; potential overlap of Social Avoidance and Anxiety constructs). We attempted to mitigate these potential shortcomings by characterizing the data in multiple ways, with the goal of providing convergent evidence. We cannot exclude the possibility of biased responses, particularly on the potential of items to change in response to interventions, due to prior experience with the instrument or clinical trials for FXS. Expert content validity qualitative assessments are by definition opinions and not empirical data. Thus, potential for response to interventions (i.e., responsiveness) and other predictive statements should be considered with caution. The relatively small number of experts is also a limitation of the study. However, in the field of rare diseases, it is difficult to identify a large number of clinicians with such expertise and experience. 
While we attempted to avoid potentially biased responses in the design and implementation of the survey, experience in the use of the ABC-C or the ABCFX could lead to bias when compared with similar assessments of instruments under development. Despite these concerns, the overall coherence and consistency of the data, and their alignment with the FXS literature, suggest that solid conclusions can be derived from this study. Follow-up investigations with a larger number of experts could ultimately address these potential limitations. This and subsequent expert surveys will eventually need to be integrated with corresponding caregiver evaluations, since the latter are essential for determining the content validity of an instrument (FDA Guidance for Industry 2009).
Conclusions
In conclusion, the six subscales of the ABCFX are adequate (sensitive) for evaluating their target behaviors (primary behavioral constructs). However, the Hyperactivity and Social Avoidance subscales are not specific and seem to also measure, respectively, attention deficits and other related social behaviors. The experts concluded that the Inappropriate Speech subscale can be “absorbed” by other subscales/constructs. Almost one-third of ABCFX items assess observable anxiety behaviors, in particular nonsocial anxiety behaviors, with the Irritability subscale being the most comprehensive in this regard. This indicates that use of the ABCFX provides the opportunity to obtain information about this prominent feature of the FXS behavioral phenotype. The Stereotypy, Irritability, Hyperactivity, Social Avoidance, and Anxiety subscales/constructs appear to be highly relevant to FXS, and the last four also have high potential for detecting changes in response to interventions. Altogether, these data support the continued use of the ABCFX. In conjunction with other collected comments and suggestions, the study provides the foundation for revisions of the instrument and the development of new measures of atypical behavior in FXS.
Footnotes
Disclaimer
L.M.O. is supported by the NIMH Intramural Research Program (ZIAMH002955). The opinions expressed in this article are the authors’ own and do not reflect the views of the National Institutes of Health, the Department of Health and Human Services, or the United States government.
Disclosures
W.E.K. was the Chief Scientific Officer of
Supplementary Material
Supplementary Appendix SA1
Supplementary Appendix SA2
References
