Abstract
Purpose:
Laparoscopic common bile duct exploration (LCBDE) decreases overall costs and length of stay in patients with choledocholithiasis. However, utilization of LCBDE remains low. We sought to evaluate a previously developed general surgery LCBDE simulator among a cohort of pediatric surgical trainees. The purpose of the study was to evaluate the content validity of an LCBDE simulator to support or refute its use in pediatric surgery education.
Materials and Methods:
After IRB exempt determination, 30 participants performed a transcystic LCBDE using a previously developed simulator and evaluated the simulator using a self-reported 28-item instrument. The instrument consisted of two primary domains (Quality and Ability to Perform) that were rated using twenty-five 4-point rating scales and one 4-point global rating scale. Validity evidence relevant to test content was evaluated using a many-facet Rasch model. Interitem consistency was estimated using Cronbach's alpha. P < .05 was considered statistically significant.
Results:
The highest combined observed averages were for the Value subdomain (OA = 3.79), whereas the lowest ratings were for the Physical/visual attributes subdomain (OA = 3.19). The averaged global rating was 3.14, consistent with the rating "this simulator can be considered for use in pediatric LCBDE training, but could be improved slightly." Rasch indices were favorable and supported evidence relevant to test content. Interitem consistency estimates were also favorable, with α values of 0.94 and 0.56 for Qualities and Ability, respectively.
Conclusions:
Overall, participants rated the LCBDE simulator highly valuable for pediatric surgical education and felt that it could be used as an educational tool with minor modifications.
Introduction
However, LCBDE is a procedure that requires advanced laparoscopic skills, which may explain the low utilization of the procedure. 7 To address this educational gap, an LCBDE simulator was developed, evaluated, and incorporated into a mastery learning curriculum used for training senior surgical residents at our institution. 8 Universal attainment of the mastery standard among residents was achieved with a comprehensive curriculum and deliberate practice on the simulator. 9
In this study, we sought to evaluate the clinical utility of the previously developed LCBDE simulator for use in training of pediatric surgery trainees, with an emphasis on assessment of the simulator's content validity, and to determine whether or not physical modifications to the pre-existing simulator would be necessary before it could be used in pediatric surgical education.
Materials and Methods
Study participants
Following review and exempt determination from Northwestern University's Institutional Review Board, 31 second-year pediatric surgery trainees participated in a 2-day pediatric surgery course hosted by Northwestern Simulation (Chicago, IL) in September of 2015. During the course, 30 participants (96.8%), representing more than half of all pediatric surgery programs in the United States and Canada, evaluated the LCBDE simulator.
LCBDE simulator
The previously developed LCBDE simulator is a self-contained model of the liver, gallbladder, extrahepatic biliary system, and duodenum (created entirely from synthetic materials) that is placed inside a standard Fundamentals of Laparoscopic Surgery (FLS) box trainer (VTI Medical, Waltham, MA). 8 This arrangement allows the FLS camera to simulate the laparoscopic view obtained from a periumbilical camera port. A second video camera system provides a simulated real-time fluoroscopic view that is displayed on a second monitor and is controlled by the user with a foot pedal. A fiber-optic or video choledochoscope (Karl Storz, Tuttlingen, Germany) provides an endoscopic view that is displayed in a picture-in-picture manner in conjunction with the laparoscopic view on a single monitor (Fig. 1).

Fig. 1. LCBDE simulator views.
The laparoscopic instruments and endoscopic equipment (Cook Medical, Bloomington, IN) necessary to perform a complete transcystic LCBDE were provided. Participants were then asked to retrieve a 6-mm multifaceted bead, simulating an impacted gallstone in the common bile duct.
Assessment of simulator
After completing the simulated LCBDE, participants evaluated the simulator using a paper instrument. The 26-item instrument consisted of two targeted domains: simulator qualities (Qualities) and ability to perform LCBDE-relevant tasks (Ability). Additionally, there were two demographic items targeting participants' familiarity with LCBDE-relevant equipment and setup and current comfort level with performing an LCBDE for choledocholithiasis (both scored 1 = Not comfortable to 3 = Very comfortable).
Participants evaluated the characteristics and qualities of the simulator (Qualities) across three subdomains (Physical/visual attributes, Realism of experience, and Value). The four items targeting Physical/visual attributes and the 11 items targeting Realism of experience were rated using 4-point rating scales ranging from 1 (Not realistic) to 4 (Highly realistic), while the four Value items, including relevance to practice and value of the simulator as a training and a testing tool, were rated on 4-point scales (with an added "do not know" option) and a 4-point global rating scale. Participants also rated their personal ability to perform seven individual tasks during the LCBDE procedure (Ability) using a 4-point rating scale, ranging from 1 (Very difficult to perform) to 4 (Very easy to perform).
Statistical analysis
Validity evidence relevant to test content and internal structure was evaluated using indices from a modern measurement model, while additional evidence of internal structure (interitem consistency) was estimated using Cronbach's alpha. These types of evidence are described below.
Evidence relevant to test content
To evaluate validity evidence relevant to test content, we employed an application from modern test theory: a Rasch model. 10 Analysis was performed using the Facets software v. 3.68.2 (Linacre, 2011). For this study, we applied a many-facet Rasch model consisting of four facets (participants × comfort with equipment × self-efficacy × items) to acquire three indices used to evaluate content validity: observed averages (OAs), item outfit statistics, and point–measure correlations. These indices, described in greater detail in previous work, 11 were adapted from Wolfe and Smith 12 and are summarized as follows:
Observed averages
The OA for each of the 15 items indicates the participants' averaged ratings. Higher OAs suggest that the perceived representativeness and the perceived realism of the simulator's features are high, while lower OAs suggest lower representativeness.
Item outfit statistics
As described by Linacre, item mean square outfit (Outfit MS) statistics show the size of the randomness or variability in items' ratings. 13 With expected value of 1.0, values <1.0 suggest that ratings are predictable (in high agreement), while values greater than 1.0 indicate unpredictability (highly variable). In this study, we considered the existence of items with Outfit MS values higher than 2.0 a threat to content validity.
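For illustration only, the outfit mean square for a single item can be sketched as the unweighted mean of squared standardized residuals, which is the general definition used in Rasch measurement. The Python sketch below is not the Facets implementation. The function names and all example ratings, model-expected ratings, and model variances are hypothetical; in practice the expected values and variances come from the fitted Rasch model.

```python
def outfit_mean_square(observed, expected, variance):
    """Unweighted mean of squared standardized residuals for one item.

    observed: ratings given by each participant (hypothetical data here)
    expected: model-expected rating for each participant (assumed to come
              from a fitted Rasch model; not computed in this sketch)
    variance: model variance of each rating (same assumption)
    """
    if not (len(observed) == len(expected) == len(variance)):
        raise ValueError("inputs must have equal length")
    if not observed:
        raise ValueError("at least one rating is required")
    z_squared = [
        (x - e) ** 2 / v
        for x, e, v in zip(observed, expected, variance)
    ]
    return sum(z_squared) / len(z_squared)


def exceeds_outfit_threshold(outfit_ms, threshold=2.0):
    """True when an item's Outfit MS is above the study's 2.0 cutoff."""
    return outfit_ms > threshold


# Hypothetical values for one item rated by five participants.
observed = [3, 4, 3, 2, 4]
expected = [3.2, 3.6, 3.1, 2.9, 3.5]
variance = [0.5, 0.4, 0.5, 0.6, 0.4]

outfit = outfit_mean_square(observed, expected, variance)
flagged = exceeds_outfit_threshold(outfit)
```

With these made-up numbers the ratings sit closer to the model expectations than the model predicts, so the value falls below 1.0 and the item would not be flagged.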
Point–measure correlations
The point–measure correlation, also called the item–measure correlation, identifies the degree to which the scores on an item are consistent with the average scores of the remaining items. Positive point–measure correlations are ideal and indicate that items contribute useful information to the construct measured by the test as a whole. For this application, a negative value for a particular item may suggest that the item is measuring a different construct than the other items and fails to offer evidence of content validity.
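As a rough illustration of the idea described above, the Python sketch below correlates each participant's rating on one item with that participant's mean rating on the remaining items. This is a classical item-rest approximation chosen for simplicity; it is an assumption of this sketch and not the computation performed by Facets, which correlates ratings with Rasch measures. The function names and the ratings matrix are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(ss_x * ss_y)


def item_rest_correlation(ratings, item_index):
    """Correlate one item's ratings with the mean of the remaining items.

    ratings: list of rows, one row per participant, one column per item.
    """
    item_scores = [row[item_index] for row in ratings]
    rest_means = [
        (sum(row) - row[item_index]) / (len(row) - 1) for row in ratings
    ]
    return pearson(item_scores, rest_means)


# Hypothetical ratings: four participants by three items (1-4 scale).
ratings = [
    [4, 4, 3],
    [3, 3, 3],
    [2, 3, 2],
    [4, 3, 4],
]
correlations = [item_rest_correlation(ratings, i) for i in range(3)]
```

An item whose value is negative under this approximation would be a candidate for closer review, in line with the interpretation given above.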
Evidence relevant to internal structure
Using a traditional method based on classical test theory, we evaluated interitem consistency estimated by Cronbach's alpha for the two primary domains: simulator quality and participants' ability to perform tasks. With a possible range of 0.0–1.0, an acceptable internal consistency estimate (≥0.70) would suggest that the combined items adequately measure the single intended construct. This, considered in combination with nonextreme (<2.0) item Outfit MS statistics, would support evidence of internal structure.
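For readers unfamiliar with the statistic, Cronbach's alpha can be sketched from its standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The Python sketch below uses only the standard library. The function name and the ratings matrix are hypothetical, and complete data (no missing or "do not know" responses) are assumed.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a participants-by-items matrix of ratings.

    ratings: list of rows, one row per participant, one column per item.
    Assumes complete data and at least two items and two participants.
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item's ratings across participants.
    item_variances = [
        variance([row[i] for row in ratings]) for i in range(k)
    ]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: four participants by three items (1-4 scale).
ratings = [
    [4, 4, 3],
    [3, 3, 3],
    [2, 3, 2],
    [4, 3, 4],
]
alpha = cronbach_alpha(ratings)
meets_threshold = alpha >= 0.70
```

The 0.70 comparison mirrors the acceptability criterion stated above; the example data are made up and do not reproduce the study's estimates.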
Results
The majority of participants self-reported a high familiarity with, and comfort toward, the use of LCBDE in the pediatric population. Twenty participants (66.7%) self-reported their familiarity with LCBDE-relevant equipment as at least somewhat comfortable, while 22 participants (77.3%) self-reported their ability to successfully complete an LCBDE on a pediatric patient as at least somewhat comfortable. There were no statistical differences at item or domain levels when comparing ratings of participants with low familiarity and/or comfort with those with high familiarity/comfort (P = .13, .92). Given these data, findings are reported as combined OAs.
Evidence relevant to test content
Findings indicated that the highest combined OAs were for the Qualities–Value subdomain (OA = 3.79), whereas the lowest ratings were for the Qualities–Physical/visual attributes subdomain (OA = 3.19) (Table 1). The lowest rated items were Performing intraoperative cholangiogram and Realism of balloon dilation (OA = 2.96 and 3.00, respectively). The averaged global rating was 3.14, consistent with the rating "this simulator can be considered for use in pediatric LCBDE training, but could be improved slightly." Participants' self-reported ability to complete the seven LCBDE tasks was high, as indicated by high OAs (range, 3.25–3.38), aligning with "somewhat easy to perform" (Table 2).
For both the Qualities and Ability domains, all outfit MS values fell below the acceptable threshold of 2.0 (range, 0.41–1.84). The lowest outfit MS value was associated with the Qualities item Expected overall experience of LCBDE in 15-year-old child, indicating a high degree of agreement with participants' OA of 3.30, which aligned with "Adequate realism, but could be improved." The highest outfit MS value was associated with the Qualities item Relevance of simulator to practice, indicating a high degree of variability in participants' perceived relevance of the simulator to their own practice, in spite of the high observed average (OA = 3.84), which aligned with "Has a great deal of relevance." A review of the Ability domain's outfit MS indices indicated that six of the seven items had values over 1.36, suggesting relatively high variability in participants' self-reported ability to perform each of the required tasks. In spite of this finding, all indices were under the threshold of 2.0, indicating reasonable variability.
Evidence relevant to internal structure
Point–measure correlations: Analysis of all 26 items of Qualities and Ability domains indicated that all items had positive point–measure correlations (Tables 1 and 2). For the 19 items of the Qualities domain, items ranged from 0.19 to 0.76. The point–measure correlations for the seven items of the Ability domain were lower, ranging from 0.17 to 0.48. Positive point–measure correlations for the 26 items suggest that these items contribute to a single construct and offer evidence of content validity. Interitem consistency for the 19 items used to measure simulator quality was high (α = 0.94), while interitem consistency for the seven items used to measure participants' ability to perform tasks was lower (α = 0.56), but adequate for this preliminary study.
Discussion
In this study, we sought to evaluate a previously developed adult LCBDE simulator in regard to its applicability for training of pediatric surgery fellows to perform the task in adolescent patients. After performing an LCBDE procedure on the simulator, participants rated it as highly realistic and relevant to the needs of pediatric surgery trainees.
Childhood obesity continues to be a major health concern in the United States; current estimates of the prevalence of overweight (body–mass index [BMI] >85th percentile) and obesity (BMI >95th percentile) in children are 31% and 17%, respectively. 14 While a significant proportion of gallstone disease in the pediatric population is attributed to children with hemolytic disorders, multiple studies have identified obesity as a major risk factor in the development of symptomatic biliary lithiasis, leading to an increase in the number of cholecystectomies being performed in children.15,16
The incidence of cholelithiasis in the adult population has also been steadily increasing, mirroring the troubling upward trend in adult obesity rates. As a result of these trends, an estimated 403,000 cholecystectomies are performed in the United States annually. 17 Contemporary data suggest that 5%–17% of patients who undergo a laparoscopic cholecystectomy (LC) will be found to have choledocholithiasis intraoperatively.18,19 Numerous randomized prospective trials have confirmed that the use of a single-stage procedure (LC+LCBDE) results in equivalent common bile duct (CBD) stone clearance rates with the added benefit of a shorter hospital stay and improved cost-effectiveness when compared with the two-stage approach of LC plus endoscopic retrograde cholangiopancreatography (LC+ERCP).1–4 Similar studies have confirmed these findings in pediatric populations.5,6
Despite these data, the clinical utilization of ERCP far exceeds that of LCBDE or open CBDE, with recent estimates suggesting that ERCP is chosen 93% of the time compared with 7% of cases managed surgically. 7 One explanation for this large disparity is increasing surgeon unfamiliarity with operative management of the biliary tract. 20
To address the gap in surgical training, a low-cost LCBDE simulator incorporating the laparoscopic, endoscopic, and fluoroscopic views was developed and evaluated. 8 Using the simulator, a curriculum based on achievement of the mastery standard was developed and tested with the general surgery residents at Northwestern University. That work demonstrated universal achievement of the mastery standard among senior surgical trainees after implementation of a didactic curriculum and deliberate practice on the LCBDE simulator. 9 Use of the mastery learning standard in simulation-based education has been shown to have implications for clinical practice, including improved patient outcomes. 21
For the purpose of this study, we focused on the evaluation of validity evidence relevant to test content and internal structure. The relatively high Outfit MS values associated with the Ability domain suggest that participants' ability was variable, but not extreme. Taking the OAs into account (all aligning with somewhat easy to perform), variability in self-reported ability levels seems to reflect an authentic skill variability among participants that parallels studies performed in the past with participants with varying degrees of experience.22–24
There are a number of limitations in the interpretation and application of the findings in this study. The first limitation is associated with the small sample size. Although a sample of 30 is considered adequate for low-stakes settings such as this, 25 a larger sample size would have increased the precision of the measures. The second limitation is associated with homogeneity and composition of the sample. Although the participants were from a number of institutions, they consisted primarily of trainees. With the exception of one participant who had self-reported performing 10 LCBDE pediatric cases, the majority of participant raters (27, 90%) had not performed more than two pediatric LCBDE cases before evaluating the simulator. The narrow range of experience may have decreased the variability of ratings or may not reflect authentic ratings from expert raters.
In spite of these limitations, we highlight that there were no statistically significant differences when comparing the Qualities ratings from participants with high and low self-efficacy, suggesting that ratings of the simulator's qualities did not depend on participants' comfort with the equipment or on their self-efficacy toward performing LCBDE on pediatric patients. We speculate that these findings would be consistent if the study were expanded to include a broad sample of experienced pediatric surgeons.
In conclusion, participants rated the LCBDE simulator highly valuable for pediatric surgical education. Initial validity evidence relevant to test content and internal structure suggests that the LCBDE simulator could be used as an educational tool with minor alterations. Based on these findings, pediatric modifications to the existing simulator are ongoing.
Footnotes
Disclosure Statement
No competing financial interests exist.
