Abstract
There is an increasing number of shared decision-making (SDM) interventions in paediatrics. However, there is little consensus as to the best instruments to assess the feasibility and impact of these interventions. This narrative review aims to answer: (1) what feasibility, knowledge and decision-making instruments have been used to assess paediatric SDM interventions and (2) what are the psychometric properties of the decision-making instruments used, guided by the ‘consensus-based standards for the selection of health measurement instrument’ criteria. We conducted a review of the peer-reviewed literature. We identified 23 studies that evaluated a paediatric intervention to facilitate SDM for a specific health decision. Eighteen studies assessed intervention feasibility, with wide variability in assessment between studies. Twelve studies assessed objective knowledge, with all but one study aggregating correct responses, and four studies assessed subjective knowledge. We identified nine decision-making instruments that had been assessed psychometrically, although few had been thoroughly evaluated. The Decisional Conflict Scale was the most commonly used instrument and the only instrument evaluated in paediatrics. Our study revealed a lack of consistency in the instruments used to evaluate decision-making interventions in paediatrics, making it difficult to compare interventions. We provide several recommendations for researchers to improve the assessment of SDM interventions in paediatrics.
Background
Quality decision-making is paramount to effective healthcare. Across paediatrics, shared decision-making (SDM) is increasingly used as the preferred approach to care (Wyatt et al., 2015), with the potential to improve decision satisfaction, health outcomes and increase knowledge (an often unmet prerequisite for decisional involvement) (Wyatt et al., 2015). SDM has been conceptualized as a continuum of varying involvement of the patient/parent and physician in reaching an agreement on the treatment to implement (Makoul and Clayman, 2006). Makoul and Clayman (2006) identified nine essential elements required for SDM. These elements include defining and explaining the problem, presenting options, discussing pros and cons, incorporating patient values and preferences, discussing patient ability and self-efficacy, involvement of doctor knowledge and recommendations, checking and clarifying understanding, making or explicitly deferring decisions and arranging follow-up. Interventions that aim to facilitate SDM can, therefore, involve any of these elements, or a combination of elements, and to varying degrees.
In the paediatric setting, SDM usually occurs between healthcare professionals and parents. SDM in paediatrics may also extend to include the young person, depending on their cognitive and emotional maturity, health status and preference for involvement. Involving the young person can bring additional complexities to the decision process due to difficulties in determining when and how to include the young person, as well as in balancing preferences between parents and the young person, and ensuring the provision of developmentally appropriate information (Lipstein et al., 2015). While preferences for decision-making involvement can vary between/within families, and across decisions, SDM appears to be valued by most parents and health professionals (Lipstein et al., 2012). Young people also desire to be involved in their healthcare decisions, often with support from their parents (Wyatt et al., 2015).
By definition, SDM interventions should aim to target any aspect, or combination of aspects, of Makoul and Clayman’s (2006) nine essential elements for SDM. Despite a growing number of SDM interventions being developed in paediatrics (Wyatt et al., 2015), there are no clear guidelines as to the primary outcomes that should be used to assess and compare SDM intervention efficacy. This uncertainty also results in a lack of agreement as to which instruments should be used to evaluate SDM interventions. This lack of consistency in primary outcomes and instruments makes it difficult to pool data and compare intervention efficacy across studies.
Several reviews have been conducted focusing on SDM instruments. One systematic review has been published on the instruments used to assess the process of SDM (Gärtner et al., 2018). This review identified 40 SDM instruments, revealing minimal evidence regarding the quality of each of these. A second review on SDM instruments (Scholl et al., 2011), building upon Simon et al.’s (2007) review, also highlighted that while there are many SDM instruments available, only a few have been extensively used, with a need for further psychometric evaluations. Another systematic review has also been conducted on instruments used to evaluate clinical interactions between patients and clinicians (Elwyn et al., 2001). This review found eight instruments, concluding that none allowed for the construct of patient involvement to be measured accurately. In addition, all of the aforementioned reviews focus on instruments for use with adults. Our current review expands upon these by addressing the measurement of both the feasibility and the efficacy (in regard to knowledge acquisition) of decision-making interventions, specifically within paediatrics. We examined feasibility, given that it is a necessary component of an effective intervention (Sekhon et al., 2017).
Aim
Specifically, we aimed to determine the following: (1) What instruments have been used to assess paediatric SDM interventions with regard to feasibility, knowledge and decision-making? (2) What are the psychometric properties of the decision-making instruments used in this context?
Literature search
We conducted a narrative review and synthesis following the approach outlined by Ferrari (2015). Review findings and critical analysis of findings are, therefore, presented under relevant thematic sections. Narrative reviews are useful for obtaining a broad perspective of a topic. See Figure 1 for an overview of methods.

Figure 1. Methods overview.
Step 1: Literature search
We reviewed the literature to identify studies that evaluated a paediatric (regarding a child patient aged less than 18 years) SDM intervention for a specific health decision. Two authors (EGR and CS) searched MEDLINE, EMBASE and PSYCInfo, from January 2000 to May 2019 (see Supplement A for search terms and inclusion criteria). A period of approximately 20 years was selected due to the prolificacy of SDM in the paediatric literature over this time. We identified 1187 abstracts after removal of duplicates in the initial search, from which two authors (EGR and CS) identified 18 eligible articles. Five additional studies were included after searching reference lists of included articles and Google Scholar™.
Step 2: Extraction of feasibility, knowledge and decision-making measures
From each article, we extracted data about the sample, intervention and instruments used to assess the interventions. If applicable, we extracted the methods used to assess intervention feasibility, objective knowledge and/or subjective knowledge. We defined feasibility assessment as any measurement of the acceptability of the intervention (e.g. satisfaction), demand (e.g. intended use) or practicality (e.g. ability to complete the intervention), as these are the most common focus of feasibility studies (Bowen et al., 2009).
Step 3: Identification of evaluated decision-making measures
We identified all decision-making instruments that had had their psychometric properties evaluated in at least one study. Based on Scholl and colleagues’ (2011) framework, we classified decision-making instruments as those assessing decision antecedents (i.e. elements that surround the task of decision-making, such as role preference), the decision process (i.e. the observed and perceived process of arriving at a decision, such as decision involvement) or decision outcomes (i.e. the evaluation of the decision process, such as decisional conflict). We classified instruments using Scholl’s framework to provide a clear overview of which aspects of the decision-making process had been assessed.
Step 4: Identification of evaluation studies
For each instrument, two authors (EGR and CS) identified the psychometric evaluation study that (1) aimed to validate or psychometrically test the instrument, (2) reported at least one aspect of reliability or validity, as per the ‘consensus-based standards for the selection of health measurement instrument’ (COSMIN) criteria for good measurement properties, (3) was the most comprehensive (in regard to sample size and extensiveness of evaluation) and (4) was the most relevant to paediatric health decisions. We excluded studies that evaluated a non-English translation of an instrument. The analysis of the decision-making instruments, therefore, relies on the data presented in the selected psychometric evaluation study.
Step 5: Assessment of psychometric properties of decision-making instruments
We summarized the psychometric data of the decision-making instruments via the COSMIN criteria for good measurement properties. The COSMIN criteria allow researchers to evaluate whether measurement properties are ‘sufficient’, ‘insufficient’ or ‘indeterminate’. COSMIN provides a detailed list of requirements for each of these categories, for each measurement property (Prinsen et al., 2018). This assessment has been successfully used in a previous review (Gärtner et al., 2018). Two authors (EGR and CS) scored the evaluation studies.
Discussion (for the sake of brevity, references are not listed for ≥5 citations)
Section 1: Overview of articles
We included 23 studies in our narrative synthesis (see Supplement B). Studies covered a variety of diseases/treatment decisions such as immunizations (n = 4) (Jackson et al., 2010, 2011; Shourie et al., 2013; Wroe et al., 2005) and attention-deficit hyperactivity disorder treatment (n = 3) (Ahmed et al., 2017; Brinkman et al., 2013; Ossebaard et al., 2010). Many of the studies were before–after pilots (n = 9). This is not surprising, given the growing importance of intervention acceptability measurement: pilot studies are integral in health research for identifying the feasibility of an intervention and justifying further evaluation of efficacy in a randomized controlled trial (RCT) (Bowen et al., 2009; Lancaster et al., 2004), and the acceptability of an intervention affects the likelihood of effectiveness and successful implementation (Sekhon et al., 2017). Eight studies were RCTs.
We identified 17 studies evaluating an intervention targeted towards parents, 5 targeted towards both parents and the young patient and 1 targeted towards the patient only (Parker et al., 2017). No studies described an intervention targeted towards healthcare professionals. Most of the SDM interventions were decision aids (n = 18). Overall, our review revealed a lack of consistency in acceptability and knowledge assessment and shortage of validated decision-making measures.
Section 2: Feasibility/acceptability assessment
Eight studies evaluated the feasibility of the SDM intervention from the perspective of parents, seven evaluated from the perspective of the parents and healthcare professionals, one evaluated from the perspective of healthcare professionals (Delany et al., 2017) and two evaluated from the perspective of parents and the child patient (Feenstra et al., 2015; Robertson et al., 2019). All studies used purpose-designed questions to evaluate the feasibility of their intervention. This resulted in significant variability in how the interventions were assessed, limiting the ability to compare and contrast intervention feasibility. The number of questionnaire items parents were asked to assess feasibility ranged from 3 to 28, with one using an interview schedule (Robertson et al., 2018). For parents, response options for Likert-type scales varied from 2 to 5 options, with one study using a 0–100 rating scale (e.g. how useful was the information, where 0 = not at all and 100 = extremely) (Wroe et al., 2005). Several studies did not report the number of items parents were asked (Carlon et al., 2017; Johnston et al., 2009; Ossebaard et al., 2010; Westermann et al., 2013) or their response options (Carlon et al., 2017; Ossebaard et al., 2010). Healthcare professionals were asked between 2 and 34 items. Most response scales for healthcare professionals were 5-point Likert-type scales. The two studies that evaluated feasibility from the perspective of the child patient (Feenstra et al., 2015; Robertson et al., 2019) used 9 items using a 2- or 3-point Likert scale and 22 items using a range of 2- to 10-point response options, respectively.
A potential reason for the lack of consistency in measuring intervention feasibility is poor conceptualization of ‘feasibility’ and/or ‘acceptability’. Bowen et al. (2009) describe eight general areas of focus addressed by feasibility studies. Future studies need to be explicit as to the purpose of their study and which aspect of feasibility is being assessed, and then determine measurement accordingly (Bowen et al., 2009). Given that most studies reported on acceptability, we recommend that future SDM intervention studies consider the use of the Theoretical Framework of Acceptability to first conceptualize which aspect of acceptability is relevant for measurement. The Theoretical Framework of Acceptability provides a multifaceted definition of acceptability, which involves affective attitude, burden, perceived effectiveness, ethicality, intervention coherence, opportunity costs and self-efficacy, and highlights the differentiation between prospective, concurrent and retrospective acceptability (Sekhon et al., 2017). Better conceptualizing ‘feasibility’ and/or ‘acceptability’ may lead to more accurate assessment of interventions and allow for comparisons.
Section 3: Knowledge assessment
As an indication of intervention efficacy, studies assessed knowledge using purpose-designed instruments, with wide variability in assessment between studies. Twelve studies assessed objective knowledge, and four studies assessed subjective knowledge (Allingham et al., 2018; Ossebaard et al., 2010; Robertson et al., 2019; Sajeev et al., 2017). One study assessed objective knowledge using an odds ratio (Robertson et al., 2019). Previous reviews have also highlighted the common use of knowledge as an outcome measure across adult SDM interventions (Stacey et al., 2017). This may be because knowledge acquisition is more modifiable than more complex constructs such as decisional conflict. Similar to feasibility and acceptability assessment, assessment of knowledge varied across studies. Content differed, given that items were tailored to the specific health decision.
The number of items used to determine objective knowledge in our studies ranged from 5 to 17, with a range of response options. Several studies did not report the number of items (Jackson et al., 2010, 2011) or response options (AlFaleh et al., 2011; Jackson et al., 2010). In all, 11 of the 12 studies that evaluated objective knowledge reported scores as a summation of correct responses, and, where appropriate, conducted comparisons of means between groups or time points. Measuring knowledge in this way provides a descriptive overview of the potential improvements in knowledge. However, comparing mean scores may not provide a thorough representation of the efficacy of an intervention. Aggregating correct items within studies treats each item equally, regardless of the difficulty level (relative to the baseline knowledge of the population under study) or response options. This limits the ability to compare and contrast intervention efficacy due to varying difficulty of questions and response options across studies.
The use of alternative statistical effect analyses, such as adjusted odds ratios obtained from a logistic regression model, may be more appropriate for interventions where knowledge is the primary outcome. This was done in one study (Robertson et al., 2019). Rather than summarizing knowledge by aggregation at each time point, an odds ratio represents the increase in the odds of getting each particular question correct. By expressing changes in knowledge on an odds ratio scale, boundary effects are avoided (i.e. if one study has lower baseline knowledge scores, there is more ‘room for improvement’ and you could see bigger changes in knowledge, even if the effect of the intervention is the same) and knowledge acquisition may be more comparable between studies, regardless of the questions asked and response options used. Another advantage of using a logistic regression model is that it can provide more statistical power by considering changes in responses to individual questions. When measuring knowledge, studies also need to justify how ‘unsure’ or ‘not sure’ responses are handled. Of the studies that included ‘unsure’ or ‘not sure’ in their response options, three treated these responses as incorrect (Allingham et al., 2018; Johnston et al., 2009; Robertson et al., 2019) and two did not specify (Brinkman et al., 2013; Hess et al., 2018). Previous research has suggested the scoring of half points for ‘unsure’ responses as uncertainty is preferred over certainty of false beliefs (Joffe et al., 2001). Regardless of response classification, we recommend that this should be decided a priori and clearly justified.
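The boundary effect described above can be illustrated with a small worked example. The sketch below is not the adjusted odds ratio from a logistic regression model used by Robertson et al. (2019); it is a simpler, unadjusted calculation on proportions of correct answers, and the baseline values and the odds ratio of 3 are hypothetical numbers chosen only for illustration.

```python
def odds(p):
    """Convert a proportion of correct answers (0 < p < 1) into odds."""
    return p / (1 - p)


def odds_ratio(p_before, p_after):
    """Unadjusted odds ratio of answering correctly after vs before."""
    return odds(p_after) / odds(p_before)


def apply_odds_ratio(p_before, ratio):
    """Proportion correct once the baseline odds are multiplied by `ratio`."""
    new_odds = odds(p_before) * ratio
    return new_odds / (1 + new_odds)


# Two hypothetical studies with an identical intervention effect
# (odds ratio of 3) but different baseline knowledge.
effect = 3.0
for label, baseline in [("low baseline", 0.30), ("high baseline", 0.80)]:
    after = apply_odds_ratio(baseline, effect)
    print(
        f"{label}: {baseline:.2f} -> {after:.2f} "
        f"(raw change {after - baseline:+.2f}, "
        f"odds ratio {odds_ratio(baseline, after):.1f})"
    )
```

In this sketch the study with the lower baseline shows the larger raw change in the proportion correct, even though the effect on the odds scale is the same in both studies, which is why a comparison of aggregated scores alone can be misleading across studies.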
Four studies assessed subjective knowledge (Allingham et al., 2018; Ossebaard et al., 2010; Robertson et al., 2019; Sajeev et al., 2017), varying in number of items and response options. Subjective knowledge or the ‘feeling of knowing’ plays an important role in memory and problem-solving and may facilitate decisional involvement (De Frias et al., 2003). Conversely, wrongly perceived high subjective knowledge may be detrimental to the decision process, limiting information seeking and continuing misconceptions. Future studies evaluating knowledge should consider evaluating both objective and subjective knowledge.
Section 4: Decision-making instruments
Through our review of the 23 original studies, we identified 9 decision-making instruments (see Figure 2 and Supplement B). Seven instruments had been evaluated in at least one evaluation study. Two instruments were minor adaptations (Control Preferences Scale–Pediatrics (CPS-P) and Dyadic OPTION scale) of instruments that had themselves been evaluated (CPS and OPTION scale). Of the nine instruments, two were decision antecedent instruments, one was a decision process instrument and six were decision outcome instruments. Nineteen studies used at least one of these decision-making instruments. Given the complexity of SDM, it may be necessary for interventions to use multiple outcome instruments (e.g. Decision Regret Scale (DRS) and Satisfaction With Decision Scale (SWDS)), as found in seven of our studies. However, the studies’ rationale behind the use of specific instruments was not clear. As outlined below, there is a lack of evidence to support any of the instruments currently used to evaluate paediatric SDM interventions. We therefore strongly encourage validity studies of instruments within the paediatric context. Until then, instruments should be selected based on the purpose of the intervention, not on whether they have been ‘validated’ in adults. Given this, we recommend that researchers first identify which aspects of SDM their intervention is likely to change (i.e. decision antecedents, decision process or decision outcome) and what is of clinical relevance. Defining the smallest clinically meaningful difference in change scores for before–after pilots may also be necessary (Gärtner et al., 2018). Until further validations are conducted, we highly recommend that researchers also test the psychometric properties of instruments used within their sample to ensure that they are performing appropriately. If the instruments are not performing well, we urge researchers to present their findings with caution, acknowledging this significant limitation.
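As one example of testing an instrument within a study sample, internal consistency is commonly summarized with Cronbach's α. The sketch below is a minimal illustration using only the Python standard library; the function name and the response data are hypothetical, and a full evaluation would also need to consider structural validity and the other COSMIN measurement properties.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)),
    with sample variances throughout.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(r) != k for r in responses):
        raise ValueError("each respondent needs the same number (>= 2) of items")

    items = list(zip(*responses))  # transpose: one tuple of scores per item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical 5-point Likert responses: 5 respondents x 4 items.
sample = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(sample), 2))
```

The resulting value would then be compared against whichever criterion the researchers specified a priori, and reported alongside the sample it was estimated in.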

Figure 2. Overview of findings of the COSMIN criteria for good measurement properties. COSMIN: consensus-based standards for the selection of health measurement instrument.
Six of the 23 interventions were developed for use by the young patient, with 4 collecting data from child/adolescent patients (Bejarano et al., 2015; Shirley et al., 2015; Parker et al., 2017; Robertson et al., 2019). Of the instruments identified, none appear to have been psychometrically evaluated for use with child/adolescent patients, with only the Decisional Conflict Scale (DCS) having been evaluated with parents making a child proxy decision. Validations of decision-making instruments in the paediatric setting are clearly warranted. We also found that none of the decision-making instruments addressed the potential triadic nature of decision-making in the paediatric setting. However, Robertson et al. (2019) used a purpose-designed tool to identify adolescents’ and parents’ preference for decision-making role within the triadic interaction. Gärtner et al. (2018) have reported a trend towards dyadic measures (i.e. assessing both the patient’s and the clinician’s perspective), but there is clearly a need for triadic measures, either adapted or newly developed, within the paediatric setting.
Section 5: Psychometric properties of decision-making instruments (see Table 1 for descriptions of instruments)
Decision antecedents (DSES and CPS-P)
The ‘Decision Self-Efficacy Scale’ (DSES) (O’Connor, 1995b) was used in one study (Coxeter et al., 2017). The DSES has been evaluated in a study by Bunn and O’Connor (1996), with 94 adult patients with schizophrenia. This study assessed internal consistency (i.e. the degree of interrelatedness among the items), scoring ‘sufficient’ (α = .84). They also hypothesized that individuals who were unsure or delayed their decision regarding continuing medication would have lower decision self-efficacy than those who decided to continue, which was confirmed.
Table 1. Description of decision-making instruments.
Note: DSES: Decision Self-Efficacy Scale; CPS-P: Control Preferences Scale–Paediatrics; DCS: Decisional Conflict Scale; DCS-LL: Decisional Conflict Scale–Low Literacy; DRS: Decision Regret Scale; PDPAI: Provider Decision Process Assessment Instrument; SWDS: Satisfaction With Decision Scale.
The CPS-P (Pyke-Grimm et al., 1999) was used in one study (Ahmed et al., 2017). The CPS-P does not appear to have undergone any psychometric testing. The CPS-P is a minor adaptation of the CPS (Degner et al., 1996). The CPS has been evaluated by Kremer and Ironson (2008), with 79 HIV-positive individuals considering antiretroviral treatment. The CPS in this study scored ‘sufficient’ (Kendall’s tau-b coefficients all >.6) on both inter- and intra-rater reliability (i.e. the degree to which different raters give consistent estimates of the same behavior and the degree to which the same rater gives consistent estimates on different occasions, respectively).
Decision process (OPTION scale)
The ‘OPTION scale’ (Elwyn et al., 2005), which measures the extent to which clinicians involve patients in the decision-making process, was used in two studies (Brinkman et al., 2013; Hess et al., 2018), with one using the updated scoring system (Brinkman et al., 2013). The OPTION scale was developed and evaluated by Elwyn et al. (2005), using 186 recorded consultations with 21 general practitioners. In this study, the OPTION scale showed ‘sufficient’ structural validity (i.e. the degree to which the scores of the instrument are a reflection of the dimensionality of the construct being measured) (via an exploratory factor analysis) and inter- and intra-rater reliability (mean Cohen’s κ = .66). One study used the Dyadic OPTION scale (Feenstra et al., 2015), which Melbourne et al. (2011) developed as an extension of the OPTION scale, but which has not itself been psychometrically evaluated.
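For readers less familiar with the agreement statistic reported above, the following is a minimal sketch of unweighted Cohen's κ for two raters scoring the same set of consultations. The function name and ratings are hypothetical, and the sketch treats scores as nominal categories; it is not a reproduction of the analysis in Elwyn et al. (2005), and a weighted κ may be preferred for ordinal item scores.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' scores of the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: proportion of cases given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical item scores (0 to 4) from two raters for eight consultations.
rater_1 = [0, 1, 2, 2, 3, 4, 1, 0]
rater_2 = [0, 1, 2, 3, 3, 4, 2, 0]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

The same function could be applied to one rater's scores on two occasions to describe intra-rater agreement.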
Decision outcomes (DCS, DCS-LL, DRS, PDPAI, SURE test and SWDS)
The DCS (O’Connor, 1995a) was the most commonly used decision outcome instrument, used in 11 of the 23 studies. The DCS is also a commonly used primary instrument in adult patient decision aids (Kryworuchko et al., 2008). The DCS has been evaluated in numerous studies across a variety of diseases and decisions in adults and in numerous languages. This is the only instrument in our review that has been evaluated with a paediatric population, specifically with 266 parents of a child living with a life-limiting illness (Knapp et al., 2009). The DCS in this study scored ‘sufficient’ on structural validity (via a confirmatory factor analysis and root mean square error of approximation) and internal consistency (all subscales α > .85). No data were provided for the DCS total score, even though total score calculations are included in the development study (O’Connor, 1995a).
The ‘Decisional Conflict Scale–Low Literacy’ (DCS-LL) (Linder et al., 2011) is the low literacy version of the DCS and was used in two studies (Coxeter et al., 2017; Feenstra et al., 2015). The DCS-LL was developed and evaluated by Linder et al. (2011), with 149 men eligible for prostate cancer screening. In this study, the DCS-LL met requirements of ‘sufficient’ for structural validity (via a confirmatory factor analysis and mean- and variance-adjusted weighted least squares estimator), with subscales showing good internal consistency (α > .80), except the ‘supported’ subscale (α < .60). Linder et al. (2011) tested the hypothesis that the three DCS-LL subscales that contribute to uncertainty (Informed, Values Clarity and Supported) are strongly and positively associated with the DCS-LL Uncertainty subscale (hypothesis partially met, with the Informed and Values Clarity subscales strongly correlated with the Uncertainty subscale). They also hypothesized that men who made a decision would have lower DCS-LL scores than men who were not sure, which was confirmed.
The DRS (Brehaut et al., 2003) was used in four studies (Allingham et al., 2018; Bejarano et al., 2015; Robertson et al., 2019; Sajeev et al., 2017). The DRS was evaluated with 177 menopausal women deciding on hormone replacement therapy, 395 women with breast cancer deciding on adjuvant therapy, 200 women deciding between a lumpectomy and mastectomy and 56 men considering prostate cancer treatment (Brehaut et al., 2003). The DRS in this study scored ‘sufficient’ on internal consistency (α > .8). Brehaut et al. (2003) hypothesized that higher DRS scores would be associated with (1) more negative psychological and physical health outcomes (confirmed), (2) lower decision satisfaction (confirmed), (3) lower information provision satisfaction (confirmed with three of the four participant groups), (4) lower satisfaction with doctor’s visit (confirmed) and (5) higher DCS scores (confirmed). They also hypothesized that patients with higher DRS scores would take greater roles in the decisions rather than rely on their physician (not confirmed), have more negative attitudes towards the decision made (confirmed) and be more likely to initially make a treatment decision but then change their mind (confirmed).
The Provider Decision Process Assessment Instrument (PDPAI) (Dolan, 1999), which assesses a provider’s degree of comfort with a medical decision, was used in one study (Westermann et al., 2013). The PDPAI is the only evaluated instrument identified in our review that is from the perspective of clinicians. The PDPAI was developed and evaluated by Dolan (1999) with 14 residents, 1 physician, 6 teaching faculty members and 1 fellow across 2 hospitals. This study assessed internal consistency, scoring ‘sufficient’ (α = .90). The study also tested the hypothesis that there would be a negative correlation between decision conflict and the decision maker’s satisfaction, which was confirmed.
The ‘SURE test’ (Légaré et al., 2010), a screening test for decisional conflict, was used in two studies (Bejarano et al., 2015; Shirley et al., 2015). The SURE test was developed and evaluated by Légaré et al. (2010). This study tested the SURE test with 123 French-speaking pregnant women and 1474 English-speaking patients considering treatment options for a variety of conditions. The study had unclear reporting for structural validity (via a principal components analysis) and scored ‘sufficient’ on criterion validity (i.e. the degree to which the scores of the instrument are related to the ‘gold standard’ instrument) (comparing with the DCS), but ‘insufficient’ for internal consistency (α ≤ .65). The study also tested the hypothesis that SURE scores would discriminate between patients who made choices of treatment and those who did not, which was confirmed.
The SWDS (Holmes-Rovner et al., 1996) was used in two studies (Sajeev et al., 2017; Wroe et al., 2005). Holmes-Rovner et al. (1996) evaluated the SWDS with 252 women deciding about management of menopause with hormone replacement therapy. Their study scored ‘sufficient’ for internal consistency (α = .86). They also tested the hypothesis that the SWDS scores would be (1) negatively correlated with the DCS score (confirmed), (2) positively correlated with the decision confidence (confirmed), (3) independent of scores for the satisfaction with provider (not confirmed) and desire to participate (not confirmed) and (4) confounded by perceived inability to make a decision for health status reasons (confirmed).
Limitations
Our review should be considered in the light of several limitations. The heterogeneity between validation studies made it difficult to compare and contrast the appropriateness of instruments. We also limited our search to English-only studies and validation of English instruments. The validation studies were also typically done in Caucasian populations. Given the aim of our research in evaluating the instruments used in published SDM interventions, it is possible that we missed relevant validated decision-making instruments or more relevant validation studies.
Conclusion
The role of SDM in paediatric healthcare is becoming increasingly valued to ensure quality of informed consent and improve decision-making satisfaction. We identified 23 studies which evaluated an SDM intervention for a specific treatment decision within the paediatric population. We identified the instruments used to assess intervention feasibility and efficacy and examined the psychometric properties of the decision-making antecedent, process and outcome instruments used within these interventions.
We acknowledge that conceptualization of ‘validity’ is a contentious issue. COSMIN (2010) defines validity as ‘the degree to which an outcome measure measures the construct it purports to measure’ (p. 28) and includes content validity (including face validity), construct validity (including structural validity, hypotheses testing and cross-cultural validity) and criterion validity (including concurrent and predictive validity). However, this is slightly inconsistent with the 2014 revised ‘Standards for Educational and Psychological Testing’, which refer to validity as the degree to which accumulated evidence and theory support a specific interpretation of test scores for a given use of a test (AERA, APA, and NCME, 2014: 225). As such, disagreements over the term should be acknowledged when considering any evidence for ‘validity’ (Newton and Shaw, 2016).
The heterogeneity between psychometric evaluation studies made it difficult to compare and contrast the appropriateness of instruments. While most evaluation studies assessed only a few COSMIN criteria within a specific sample and setting, they mostly scored ‘sufficient’. However, the extent of the evaluations of psychometric properties across all studies was minimal, with most studies evaluating two psychometric properties. Responsiveness was also not assessed in any evaluation study, limiting the validity of instruments in before–after studies (which accounted for nine studies in our review). Future psychometric evaluations should focus on the quality of reporting and/or increase the breadth of evaluation. This will ultimately improve researchers’ instrument selection and thereby the validity of their intervention evaluation. Similar findings have been reported by Scholl et al. (2011) with regard to adult patient decision-making, with the recommendation to improve the quality of reporting and consistency of psychometric evaluations of instruments.
We found a lack of standardization in assessing the feasibility of decision-making interventions. Several studies assessed knowledge as an indicator of intervention efficacy, calculating total knowledge scores as a summation of correct responses. We identified nine evaluated decision-making instruments, with the DCS (a decision outcome measure) being the most frequently used and the only measure evaluated in a paediatric setting. Decision-making instruments also typically focused on the decision outcome, rather than on decision antecedents or the decision process. Further exploration of which types of interventions are likely to impact specific aspects of decision-making may be worthwhile. To do so, the development and appropriate validation of decision-making antecedent and decision process instruments in particular are warranted.
Greater consistency in measurement will allow for better opportunities to compare and contrast interventions. It will also ensure that interventions are truly acceptable and worthy of further evaluation (Bowen et al., 2009), and ultimately implementation. Based on the findings of our review, we recommend that researchers consider (1) using the Theoretical Framework of Acceptability when assessing feasibility, (2) using statistical methods such as odds ratios when assessing knowledge before and after an intervention and (3) choosing instruments based on the construct that their intervention is likely to impact, and then testing the psychometric properties of those instruments within their study to ensure that they are performing appropriately. More extensive validations within paediatrics are clearly warranted, as is the development of instruments acknowledging the triadic interaction of decision-making in paediatrics.
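As an illustration of recommendation (2), the following is a minimal sketch of one way an odds ratio could be calculated for a single knowledge item answered by the same participants before and after an intervention. It assumes a matched-pairs (conditional) odds ratio with a Wald-type 95% confidence interval, which is only one of several possible approaches and is not a method taken from the reviewed studies. The function name `paired_odds_ratio` and the response data are hypothetical.

```python
import math


def paired_odds_ratio(pre, post, z=1.96):
    """Matched-pairs odds ratio for one knowledge item (hypothetical helper).

    pre, post: equal-length sequences of booleans (True = correct answer),
    one entry per participant, listed in the same participant order.
    Returns (odds_ratio, ci_lower, ci_upper).
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")

    # Only discordant pairs contribute to the estimate in a paired design.
    gained = sum(1 for a, b in zip(pre, post) if not a and b)  # incorrect -> correct
    lost = sum(1 for a, b in zip(pre, post) if a and not b)    # correct -> incorrect

    if gained == 0 or lost == 0:
        raise ValueError("odds ratio is undefined when a discordant count is zero")

    odds_ratio = gained / lost
    # Wald standard error on the log scale.
    se_log = math.sqrt(1 / gained + 1 / lost)
    ci_lower = math.exp(math.log(odds_ratio) - z * se_log)
    ci_upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, ci_lower, ci_upper


# Hypothetical responses from 10 participants to a single knowledge item.
pre = [False, False, False, False, False, False, True, True, True, False]
post = [True, True, True, True, True, True, True, False, True, False]

odds_ratio, ci_lower, ci_upper = paired_odds_ratio(pre, post)
print(f"OR = {odds_ratio:.2f}, 95% CI {ci_lower:.2f} to {ci_upper:.2f}")
```

In this hypothetical example, six participants moved from an incorrect to a correct answer and one moved in the opposite direction, giving an odds ratio of 6.0. With so few discordant pairs the confidence interval is wide, and an exact method would be preferable to the Wald approximation in samples of this size.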
Supplemental Material
supplement_material for What instruments should we use to assess paediatric decision-making interventions? A narrative review by Eden G Robertson, Jennifer Cohen, Christina Signorelli, David M Grant, Joanna E Fardell and Claire E Wakefield in Journal of Child Health Care
Footnotes
Acknowledgements
The authors would like to thank Mark Donogohoe, Pirathat Techakesari and Mark Gabriel for their support on this project.
Author Contributions
EGR and CS undertook the database searching, screening, full-text reading and data extraction. EGR, CS and DMG assisted with conceptualization of the review. EGR prepared the manuscript. JC, CS, DMG, JEF and CEW reviewed the manuscript and assisted in writing. All authors read and approved the final manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: ER is supported by a Cancer Institute NSW Translational Programme Grant (Experimental therapeutics for Myc-driven childhood cancer, 10/TPG/1-13). CW is supported by a Career Development Fellowship from the National Health and Medical Research Council of Australia (APP1143767). CS and JF are supported by The Kids’ Cancer Project. The Behavioural Sciences Unit is supported by the Kids with Cancer Foundation and the Kids Cancer Alliance.
Supplemental Material
Supplemental material for this article is available online.
References
