Master and Apprentice or a Slave to Technology? A Randomized Controlled Trial of Minimal Access Surgery Simulation-Based Training Techniques

Abstract

Introduction:

This study set out to assess the efficacy of three different approaches to simulation-based minimal access surgery (MAS) training using a three-dimensional printed neonatal thoracoscopic simulator and a virtual simulator.

Materials and Methods:

This was a randomized controlled trial of medical students (N = 32) who were novices to MAS. The participants performed two construct-validated tasks on a thoracoscopic simulator and were then randomly allocated into four intervention groups: (1) three consultant-led sessions on a thoracoscopic simulator; (2) three self-directed learning sessions on the same simulator; (3) self-directed “virtual training” on the “SimuSurg” application; and (4) control. Postintervention, participants repeated both tasks. Videos of all task attempts were de-identified and marked by a blinded consultant pediatric surgeon.

Results:

There were no statistically significant differences in baseline objective structured assessment of technical skills (OSATS) scores or demographics in any group. For the “ring transfer” task, Groups 1 and 2 showed significant improvement after intervention, with no significant change in Groups 3 or 4. There was no significant difference between Groups 1 and 2 in postintervention scores. For the “needle pass” task, no group demonstrated a statistically significant improvement after intervention.

Conclusion:

Practice on a physical simulator, whether consultant-led or self-directed, led to improved scores for MAS novices compared with a virtual simulator or no intervention for a simple “ring transfer” task. This suggests that time on the physical simulator was the most important factor and implies that trainees could usefully practice simple tasks at their convenience rather than require consultant supervision. This improvement was not seen in the more challenging “needle pass” task.

Introduction

The rapid technological advances in three-dimensional (3D) printing and computing are enabling surgical simulators, both physical and virtual, to be developed. These include simulators of pediatric surgical procedures, particularly for minimal access surgery (MAS). A simulator offers obvious advantages over the established apprentice model of operative training, including greater patient safety, more time dedicated to learning (in particular, the complex visuospatial and psychomotor skills required for MAS), and a lower stress level for the novice while still retaining a relatively realistic learning environment. It is critically important to ensure that simulators are fit for purpose and attain validity. For example, construct validity ensures that the task can successfully distinguish between novices and experts and provides confirmation that it is testing a clinically relevant skill set. Establishing construct validity for these simulators helps to define their role in upskilling surgical trainees and preparing them for surgical training programs.

The primary simulator used in this study was a 3D printed thoracoscopic simulator developed to simulate repair of esophageal atresia/tracheoesophageal fistula (EA/TEF) in a neonate.¹ This study builds on previous work on this 3D printed thoracoscopy model that established two construct-validated tasks (presented at BAPS Congress 2019). “SimuSurg,” a smartphone application developed by Cmee4 Productions and accredited to the Royal Australasian College of Surgeons (RACS), was also included for comparison; this simple virtual simulator for MAS skills was chosen for its ease of access and ease of use.² This study looks at the efficacy of these two simulators in improving the MAS skills of novices by using three different approaches. Two focus on the physical simulator: a consultant-led approach (more typical of the classical apprenticeship model) and a self-directed learning model. The other assesses the efficacy of virtual simulation on the “SimuSurg” application. The potential advantages and disadvantages of each training approach are summarized in Table 1.

Table 1.

Comparison of Different Simulation Training Approaches

Approach	Advantages	Disadvantages
Physical simulation with consultant-led (apprenticeship) model	• Direct transfer of expert skills from expert to trainee • Personalized real-time feedback • Physical model is more similar to real-life MAS	• Resource intensive—the expert has to invest considerable time • Potentially stressful for the trainee
Physical simulation with self-directed learning model	• More freedom for individualized learning • Less stressful for the trainee • Physical model is more similar to real-life MAS	• No direct passage of expert skills • No direct expert feedback • Resource intensive
Virtual simulation with smartphone application	• Fewer resources required • Portable, can be used anytime, anywhere • Less stressful for the trainee • Application developed using the skills of many experts, which the trainee can learn from	• Skills practice is less like real-life MAS: no haptic feedback and only simple visuospatial feedback (i.e., low fidelity) • No direct expert feedback • Amount of practice time is hard to measure and relies on trainee motivation

MAS, minimal access surgery.

The hypothesis was that the consultant-led approach would lead to superior improvement in trainees' skills, but the magnitude of its superiority would influence which of the approaches would be more suitable for incorporation into surgical training.

Materials and Methods

This was a randomized controlled trial (RCT) conducted between May and June 2019. Initial ethical approval was granted by the University of Otago (D18/008) as a flow-on of the previous affiliated project.¹ Novices to MAS, defined as having never participated in any formal surgical training outside of interest workshops attended during medical school, were recruited from the medical student body of the Christchurch School of Medicine of the University of Otago. Before entering the study, students were given a participant information sheet and signed consent forms. Thirty-two students participated in the testing and training (twelve 4th-year students, eleven 5th-year students, and nine 6th-year students).

The locally developed 3D printed thoracoscopic simulator was used, along with 3 mm MAS instruments (Maryland forceps, needle holder). The simulator consisted of a synthetic model of a neonate's thorax, including 3D printed plastic ribcage, silicone skin, and two 3D printed inserts for MAS tasks.¹ “SimuSurg” has been developed by Cmee4 Productions for the RACS as an engaging, interactive way to introduce MAS to novices, in the form of a smartphone application. Trainees advance through the four levels of the training application, familiarizing themselves with MAS instruments and learning about the different axes of movement and movement scaling, with increasingly challenging MAS tasks.²

Two tasks that have previously achieved construct validity on the EA/TEF simulator were defined in writing, and consultant-performed video exemplars of the tasks were obtained for participants to view. The first task, called the “ring transfer,” involved using the Marylands to move a ring between two pegs inside the model, first from peg to peg, then passing the ring from one Marylands to the other before placing it on the next peg. The second task, and the more difficult of the two, called the “needle pass,” had a different setup with two adjacent loops inside the model. It involved mounting a needle on the needle holder, passing it under one loop to the Marylands, mounting it a second time, then passing it under a second loop to the Marylands¹ (Fig. 1).

FIG. 1.

Camera images from both tasks: ring transfer (A) and needle pass (B).

Both tasks have previously attained construct validity in distinguishing novice MAS surgeons from intermediate and expert surgeons. In addition, both tasks achieved good inter-rater reliability, meaning that a single assessor could be entrusted to mark the tasks using a modified objective structured assessment of technical skills (OSATS).¹

Each of the participants attended a 20-minute baseline session where they had 5 minutes to familiarize themselves with the MAS instruments and the thoracoscopic simulator. Then, for each task, the participants were shown the same short, written description of the task followed by watching a consultant-performed exemplar video of the task. They then had one videoed attempt for each task, with the recording finishing at completion of the task or at the participant's request.

The participants then underwent random allocation. Each of the 32 participants was assigned a random number using an online randomizer.³ These anonymized numbers were then used to randomly allocate eight participants to each of the four designated intervention groups.
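The allocation step described above can be sketched in code. This is a minimal illustration only, assuming Python's built-in pseudorandom generator as a stand-in for the online randomizer used in the study; the function name `allocate_participants`, the seed value, and the participant numbering from 1 to 32 are hypothetical.

```python
import random


def allocate_participants(participant_ids, n_groups=4, seed=None):
    """Shuffle participants and split them into equally sized groups.

    Hypothetical helper: the study used an online randomizer, not this code.
    """
    ids = list(participant_ids)
    if len(ids) % n_groups != 0:
        raise ValueError("participants must divide evenly into groups")
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    rng.shuffle(ids)
    size = len(ids) // n_groups
    # Group numbers start at 1 to match the paper's Group 1 to Group 4.
    return {g + 1: sorted(ids[g * size:(g + 1) * size]) for g in range(n_groups)}


# Assumed anonymized participant numbers 1..32 and an arbitrary seed.
groups = allocate_participants(range(1, 33), seed=2019)
for number, members in groups.items():
    print(f"Group {number}: {members}")
```

Shuffling once and slicing guarantees eight participants per group, which simple per-participant random assignment would not.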

Interventions

Group 1: Training led by a consultant pediatric surgeon. A pediatric surgeon (senior author) provided three standardized one-on-one 20-minute training sessions on the EA/TEF simulator for each participant over the following 4 weeks. The first session involved orientation to the instruments, tasks, and most ergonomic way to complete the tasks. Subsequent sessions were tailored to the individual needs for skill development.

Group 2: Self-directed learning with the EA/TEF simulator. Participants had three 20-minute sessions with the EA/TEF simulator across the following 4 weeks, observed by a researcher, including access to all equipment needed for both task set-ups. They were free to use this time to practice in any manner they chose.

Group 3: Virtual training through the “SimuSurg” application. Participants were instructed to download this app to their smartphones and advance through the four levels of the application (each containing six lessons) in a gradual manner over the following weeks until their comparison session. This was self-driven with one email to check in on progress and no observation of their learning.

Group 4: No training (control group). Participants had no further training on any type of MAS simulator, physical or virtual.

All participants were instructed not to discuss the details of their interventions with the other participants to avoid introducing bias. Thirty of 32 participants completed their interventions and agreed to proceed to comparison testing. These participants read the same task descriptions and watched the same exemplar videos before performing their one comparison attempt for each task on the EA/TEF simulator.

All videos collected from both baseline and comparison testing were de-identified using alphanumeric codes and their timestamps were digitally removed to prevent bias based on knowledge of baseline versus comparison status. The de-identified videos (including the baseline videos of the 2 participants who did not complete comparison testing) were marked together by a blinded consultant pediatric surgeon.

A modified OSATS score was used, giving each video a numeric score (1, minimum, to 5, maximum) for each of time, dexterity, flow of task, spatial orientation, and overall performance of the task.⁴ The five domain scores were then added to give a total score ranging from a minimum of 5 to a maximum of 25. Time was not included in scoring during the previous validation study¹ but was included in this study because it was felt to be a potentially important discriminator for the novice surgeons in their skill development. Pre- and postintervention scores were matched to their original participants and groups to produce the raw data for analysis (Fig. 2). Ethics were reviewed again retrospectively after intervention and no concerns were raised.
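The scoring arithmetic for the modified OSATS can be illustrated with a short sketch. The domain names follow the five domains listed above; the function name `total_osats` and the example marks are hypothetical and are not taken from the study data.

```python
OSATS_DOMAINS = (
    "time",
    "dexterity",
    "flow_of_task",
    "spatial_orientation",
    "overall_performance",
)


def total_osats(scores):
    """Sum the five domain scores (each 1 to 5) into a total of 5 to 25."""
    if set(scores) != set(OSATS_DOMAINS):
        raise ValueError("scores must cover exactly the five OSATS domains")
    for domain, value in scores.items():
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{domain} score must be an integer from 1 to 5")
    return sum(scores.values())


# Hypothetical marks for one de-identified video (illustration only).
example_video = {
    "time": 2,
    "dexterity": 1,
    "flow_of_task": 2,
    "spatial_orientation": 2,
    "overall_performance": 2,
}
print(total_osats(example_video))
```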

FIG. 2.

Flowchart of study design. Population, preintervention data collection, intervention groups, postintervention data collection and analysis.

Statistical analysis

Medians (interquartile ranges) were used as the primary measure of improvement because the data were skewed; means and ranges were also calculated. P values were derived using the Kruskal–Wallis test for continuous variables because the data were not normally distributed. If the overall P value was less than .05, post hoc multiple testing was carried out to ascertain which pairwise differences were significant. Each pairwise P value was then compared against .0083 (i.e., .05/6, using Bonferroni correction for the six pairwise comparisons among four groups) to confirm significance. When normality assumptions did not hold (such as when comparing pre- and postintervention scores), P values were derived using either paired t-test for means or Wilcoxon signed rank test for medians. A P value of less than .05 indicated that the pre–post change differed significantly from zero.
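Two pieces of this analysis can be sketched in code: the Bonferroni-corrected threshold for the six pairwise comparisons, and the Kruskal–Wallis H statistic. The source does not name the statistical software used, so the pure-Python functions below (`average_ranks`, `kruskal_wallis_h`) are hypothetical illustrations rather than the authors' implementation. The P value, which would be read from a chi-square distribution with (number of groups − 1) degrees of freedom, is deliberately left out.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of score lists."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Four groups give 6 pairwise comparisons, so the threshold is .05 / 6.
pairs = list(combinations(range(1, 5), 2))
n_pairs = len(pairs)
threshold = 0.05 / n_pairs
print(n_pairs, round(threshold, 4))
```

With four groups there are six unordered pairs, which is where the divisor of 6 in the Bonferroni threshold comes from.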

Results

Thirty of 32 participants completed their interventions and comparison testing (6.25% lost to follow-up). Twenty participants were women and 12 were men. There was no significant difference in the measured demographics, gender, or year of study, across the four groups.

Task 1: Ring transfer

There was no statistically significant difference in the baseline OSATS scores, in any domain or overall, for the four groups. There was a statistically significant improvement between pre- and postintervention scores across all domains in Groups 1 and 2 (Table 2). Group 3 had no significant change in pre- and postintervention scores in any domain. Group 4 had a significant decrease in the spatial orientation and time scores in the postintervention assessment but no difference in other domains (Table 2 and Fig. 3).

FIG. 3.

Task 1: box plots of OSATS scores (group medians and interquartile ranges) pre- and postintervention by group (left to right: consultant-led, self-directed, virtual, and control). OSATS, objective structured assessment of technical skills.

Table 2.

Task 1 Objective Structured Assessment of Technical Skills Scores Preintervention and Postintervention

	Group 1		Group 2		Group 3		Group 4
	Pre	Post	Pre	Post	Pre	Post	Pre	Post
Time, median (IQR)	2 (1–3)	3 (3–4)	2 (1–2)	4 (2–4)	2 (1–3)	1 (1–2)	3 (2–4)	2 (1–2)
Dexterity, median (IQR)	1 (1–2)	3 (3–4)	1 (1–2)	3 (2–3)	2 (2–3)	1 (1–2)	2 (2–2)	2 (1–2)
Flow of task, median (IQR)	2 (1–3)	3 (3–4)	2 (1–2)	3 (3–3)	2 (1–3)	1 (1–2)	3 (2–3)	2 (1–3)
Spatial orientation, median (IQR)	2 (1–2)	3 (3–3)	1 (1–2)	3 (2–4)	2 (1–3)	1 (1–1)	2 (2–3)	1 (1–2)
Overall performance, median (IQR)	2 (1–2)	3 (3–3)	1 (1–2)	3 (2–3)	2 (1–3)	1 (1–2)	2 (2–3)	1 (1–2)
Total score, median (IQR)	7 (5–12)	15 (14–17)	6 (5–10)	16 (11–17)	9 (6–13)	6 (5–7)	12 (9–13)	7 (6–11)

IQR, interquartile range.
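A "median (IQR)" summary of the kind reported in the table can be computed as in the sketch below. The source does not state which quartile convention was used, so the `method="inclusive"` choice in `statistics.quantiles` is an assumption, and the sample scores are hypothetical rather than study data.

```python
import statistics


def median_iqr(scores):
    """Return (median, lower quartile, upper quartile) for a list of scores."""
    # Assumed quartile convention; other methods can give slightly different cut points.
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return statistics.median(scores), q1, q3


# Hypothetical total scores for one group of eight participants.
sample_totals = [5, 5, 6, 7, 7, 9, 12, 13]
median, q1, q3 = median_iqr(sample_totals)
print(f"{median} ({q1}–{q3})")
```

With only eight participants per group, the reported quartiles are sensitive to the interpolation method, which is one reason to state the convention explicitly.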

Pairwise comparison of the groups' postintervention scores showed significantly higher scores in Group 1 for all domains and total score compared with Groups 3 and 4 (Table 3). Likewise, Group 2 had statistically significantly higher postintervention scores than Group 3 for all domains except flow of task and higher scores than Group 4 for all domains (Table 3 and Fig. 3). There was no difference in the improved scores in any domain between Groups 1 and 2 and no significant change in scores in any domain for Groups 3 and 4 (Table 3).

Table 3.

Task 1 Scores Change—Pairwise Comparison of Pre- and Postchanges Between Groups

Pairwise, P	1 versus 2	1 versus 3	1 versus 4	2 versus 3	2 versus 4	3 versus 4
Time	0.4534	0.0115*	0.0021*	0.0217*	0.0064*	0.5506
Dexterity	0.7342	0.0022*	0.0058*	0.0037*	0.0091*	0.3094
Flow of task	0.3215	0.0259*	0.0030*	0.1044	0.0108*	0.8289
Spatial orientation	1.0000	0.0007*	0.0007*	0.0042*	0.0035*	0.9549
Overall performance	0.8874	0.0066*	0.0031*	0.0166*	0.0083*	0.9129
Total	1.0000	0.0038*	0.0019*	0.0097*	0.0045*	0.9580

*P < .05.

Task 2: Needle pass

There was no statistically significant difference in the OSATS scores at baseline. Pairwise comparison of the groups' postintervention scores showed no significant difference for any of the measures of performance including total score, although there was a trend toward higher total scores for Groups 1 and 2 compared with Groups 3 and 4; median total score 9 in Groups 1 and 2 versus median total score 7 in Groups 3 and 4 (Fig. 4). As for the improvement between pre- and postintervention scores, none of the groups demonstrated a statistically significant improvement after intervention (Table 4).

FIG. 4.

Task 2: box plots of OSATS scores (group medians and interquartile ranges) pre- and postintervention by group (left to right: consultant-led, self-directed, virtual, and control).

Table 4.

Task 2 Objective Structured Assessment of Technical Skills Scores Preintervention and Postintervention

	Group 1		Group 2		Group 3		Group 4
	Pre	Post	Pre	Post	Pre	Post	Pre	Post
Time, median (IQR)	1 (1–2)	1 (1–2)	1 (1–2)	1 (1–2)	1 (1–2)	1 (1–2)	1 (1–3)	1 (1–2)
Dexterity, median (IQR)	1 (1–2)	2 (1–2)	1 (1–2)	2 (1–2)	1 (1–1)	1 (1–2)	1 (1–2)	2 (1–2)
Flow of task, median (IQR)	2 (1–2)	2 (1–3)	1 (1–2)	2 (1–3)	1 (1–2)	2 (1–2)	2 (1–2)	1 (1–2)
Spatial orientation, median (IQR)	1 (1–2)	2 (1–3)	1 (1–2)	2 (1–2)	1 (1–2)	1 (1–2)	2 (1–3)	2 (1–2)
Overall performance, median (IQR)	1 (1–2)	2 (1–3)	1 (1–2)	1 (1–2)	1 (1–1)	1 (1–1)	1 (1–2)	1 (1–2)
Total score, median (IQR)	7 (5–8)	8 (6–12)	5 (5–8)	7 (6–11)	6 (5–7)	6 (5–8)	6 (5–11)	6 (5–9)

Discussion

This RCT set out to investigate whether simulation improves the technical skills of novices relevant to MAS and to determine which approach to simulated training is most effective in improving MAS skills, using validated tasks on a physical simulator.

Analysis of the physical simulator

For the “ring transfer” task, a task suited to novices, both consultant-led teaching and self-directed learning using the physical EA/TEF simulator led to statistically significant improvements in performance postintervention. There was no significant difference in the improvement seen in Groups 1 and 2, which suggests that “time on the tools” practicing the task was an effective method to improve skills and a key factor in the improvement seen.

For the second “needle pass” task, a more challenging task, none of the groups demonstrated statistically significant improvement in performance after intervention, although there was a trend to higher postintervention scores in Groups 1 and 2. This task requires needle manipulation in a small space and perhaps is too challenging for MAS novices, but it could be more appropriate for more advanced surgical trainees who already have more experience in MAS. The trend to improved scores in Groups 1 and 2 is promising in this regard, although the lack of significant improvement in the scores may reflect that the study was underpowered to detect the degree of difference.

As predicted, the consultant-led physical simulator approach showed a significant improvement in skills; however, the self-directed physical simulator approach enabled a comparable improvement. This has interesting implications in that the self-directed approach is much more cost- and time-efficient, but according to these results, it may be just as effective at improving the skills of novices as the more resource intensive consultant-led alternative.

This suggests that a training approach based mainly around self-directed learning, with occasional opportunities for consultant-led training, could be sufficient to improve the MAS skills of novice surgical trainees.

Analysis of the virtual simulator

Intuitively, practicing the tasks on a physical simulator that closely resembles the true surgical environment (high fidelity) should improve performance on this simulator, and it did so in this study. However, against our predictions, the smartphone application “SimuSurg” conferred no detectable improvement in MAS skills. This may indicate that although the simple application can familiarize a novice with the ideas and concepts of MAS, it may not be able to confer MAS skills that can be measured on a physical simulator using validated tasks. The role of such simple, low-fidelity (less similar to the true operating room environment) smartphone applications needs to be studied further to ensure that there is value for the surgical trainees who use them.

Of note, it has been suggested that haptic feedback is more important in virtual simulators than visual feedback.⁵ A 2018 systematic review of simulators teaching laparoscopic skills indicated that haptic feedback is important in complex tasks because these require more precision, and this precision is influenced by haptic feedback. Conversely, this review also found that haptic feedback seems to result in only minor improvements for novice surgical trainees.⁶ The simple smartphone application tested cannot provide tactile information back to players during tasks.

The unsupervised nature of a smartphone application is also likely a contributor to the lack of improvement; perhaps a slightly more structured intervention, such as having a log of time spent on the app and opportunities to reflect on learning, might facilitate better skill acquisition. There have been studies of structured approaches on high-fidelity virtual reality simulators that have shown promise for acquisition of skills, for example, where construct-validated tasks were used to devise a stepwise, goal-directed virtual reality curriculum specific to learning the skills required for a laparoscopic appendicectomy.⁷ The carefully designed curriculum was key to promoting learning and involved a stepwise increase in task complexity, distributed rather than massed practice, and a specified period of human instruction and feedback.

These methods could be applied to create a more effective approach to using the smartphone application for training, taking advantage of its ease of access, ease of use, and low-resource intensiveness compared with the more expensive higher fidelity, virtual simulators.

Limitations of the study

The strengths of this study included its novice population (allowing a more standard baseline skill level with more potential to see a substantial degree of improvement postintervention), the RCT approach with minimal loss to follow-up, use of objective scoring, and blinded marking. Although there is a growing number of pediatric surgery-specific simulators, few have achieved a sufficient level of evidence of their validity to be endorsed in training programs.⁸ The relatively small sample size (making it susceptible to influence by unpredictable variations in individual participant performance at the time of testing), the short duration of intervention with limited training opportunities for each of the three groups, and the difficulty of standardizing the interventions across participants for each group—especially the nonobserved app group that depended on individual participant motivation—represent potential limitations of the study.

Conclusion

Both consultant-led and self-directed learning using a physical simulator of MAS (EA/TEF thoracoscopic simulator) achieved a statistically significant improvement in the skills of novices in a novice-suited MAS task (“ring transfer”). As a consequence, both approaches could be used in structured MAS training. Application-based virtual learning was not shown to improve skills; the utility of these virtual applications needs to be studied further. No improvement was demonstrated for the more challenging task (“needle pass”), but it may be that this type of task is more applicable to surgical trainees with previous MAS training: a significant improvement in skills might be seen if this population were tested.

Footnotes

Acknowledgments

The authors thank Canterbury District Health Board for the resources associated with and the use of the 3D printed neonatal thoracoscopic simulator.

Disclaimer

This organization had no role in the collection, analysis, or interpretation of data, in the writing of the report or in the decision to submit the article for publication.

Disclosure Statement

D.V.K.N., N.J.C., S.W.B., R.J., and J.M.W. are directors of the company Symulus.net. This company has designed a model for simulation of EA/TEF repair. However, there was no financial input into this project by Symulus.net.

Funding Information

No funding was received.

References

1. Nair, Cook, Yi, Scott, Beasley, Wells. Construct validation of a three-dimensional printed neonatal thoracoscopic simulator. Can it distinguish expertise? Oral Presentation, Nottingham, UK: BAPS, 2019.

2. Royal Australasian College of Surgeons [Internet]. Melbourne Vic: RACS, 2018. SimuSurg app wins industry award; July 4, 2018. https://www.surgeons.org/News/media-releases/2018-07-04-simusurg-app-wins-industry-award (accessed July 20, 2020).

3. Random.Org [Internet]. Dublin Ireland: Random.Org; 2019. Random Integer Set Generator; 2019. https://www.random.org/integer-sets (accessed July 22, 2020).

4. Martin, Regehr, Reznick, MacRae, Murnaghan, Hutchison, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg, 1997; 84:273–278.

5. Agha, Fowler. The role and validity of surgical simulation. Int Surg, 2015; 100:350–357.

6. Overtoom, Horeman, Jansen, Dankelman, Schreuder HWR. Haptic feedback, force feedback, and force-sensing in simulation training for laparoscopy: A systematic overview. J Surg Educ, 2019; 76:242–261.

7. Sinitsky, Fernando, Potts, Lykoudis, Hamilton, Berlingieri. Development of a structured virtual reality curriculum for laparoscopic appendicectomy. Am J Surg, 2020; 219:613–621.

8. Patel, Aydın, Desai, Dasgupta, Ahmed. Current status of simulation-based training in pediatric surgery: A systematic review. J Pediatr Surg, 2019; 54:1884–1893.