Video-Based Skill Assessment of Endoscopic Suturing in a Pediatric Chest Model and a Box Trainer

Abstract

Purpose:

Pediatric endoscopic surgery requires special surgical skills because of the small working space and tissue fragility. This article presents a video-based skill assessment method for endoscopic suturing using a pediatric chest model.

Materials and Methods:

A commercial suture pad was placed in a rapid-prototyped pediatric chest model of a 1-year-old patient to simulate the thoracoscopic repair of esophageal atresia type C. Twenty-eight pediatric surgeons (9 experts, 9 intermediates, and 10 trainees) performed an endoscopic intracorporeal suturing and knot-tying task both in the pediatric chest model and in a box trainer. The tasks were video-recorded and rated by two blinded observers using the 29-point checklist method and a suturing errors score sheet method. The task completion time and the number of needle manipulations were measured.

Results:

The expert group showed better performance than the intermediate and trainee groups in the pediatric chest model, and the differences were larger than those in the box trainer. Significant differences between the expert and the trainee groups were observed in the items related to safety such as the skills for keeping the needle in view at all times. Significant differences between the expert and intermediate groups were observed in the items related to task quality and efficiency such as the smoothness of knot tying and the number of needle manipulations.

Conclusions:

Video-based skill assessment of endoscopic suturing using the pediatric chest model and a box trainer distinguished pediatric endoscopic surgeons according to their clinical experience, and pediatric-specific skills were identified.

Introduction

Pediatric endoscopic surgery requires special surgical skills because pediatric patients are smaller and their tissues are more fragile compared with adult patients. Operations on neonates and infants are particularly difficult and often cause complications,¹ but expert pediatric surgeons with abundant clinical experience can perform surgical tasks precisely and gently even in such a small body cavity. Japan has a pediatric endoscopic surgical skill qualification system to certify pediatric surgical experts by evaluating the number of such surgeries performed (the surgical volume) and by inspecting videoed operations.² In the latter, nominated expert pediatric endoscopic surgeons evaluate videos submitted by the applicant in order to judge whether sufficient skills are apparent. Because applicants need to have a surgical volume of at least 20 advanced pediatric endoscopic operations prior to review, this system is not available for inexperienced surgeons who want to assess their own surgical skills.

Another problem in the pediatric domain is related to the training of minimally invasive surgical skills. Inexperienced pediatric surgeons have to improve those skills, but acquiring sufficient experience is difficult in the pediatric endoscopic surgery field. The wide variety of pediatric diseases, often accompanied by congenital anomalies, limits opportunities to experience many specific operations, and pediatric-specific training tools are not available. Commercially available box trainers are designed for the training of adult laparoscopy, and child-size trainers of this type are not generally available. Ieiri et al.³ reported that pediatric surgeons who underwent endoscopic surgical training together with general surgeons demonstrated better economy and faster speed in forceps manipulation, but the number of errors increased. This suggests that the development of pediatric-specific training methods is essential.

Few studies on pediatric-specific skill assessment and training have been reported. Some studies showed that small box trainers or virtual reality trainers simulating a pediatric small body cavity are useful.⁴,⁵ Training using animal models is also useful,⁶ but it is costly and time-consuming and has ethical problems. Recently, several realistic training models have been reported. Ieiri et al.⁷ developed a suture ligature model of the crura of the diaphragm to assess the skills required for laparoscopic fundoplication. Davis et al.⁸ and Barsness et al.⁹ developed three-dimensional rapid-prototyped models simulating esophageal atresia and congenital diaphragmatic hernia. Unfortunately, these models are not commercially available, probably because of the small market for pediatric applications. Furthermore, companies and engineers do not recognize the need for pediatric-specific training tools because pediatric-specific skills have not been properly delineated in the field.

For these reasons, we have developed a three-dimensional rapid-prototyped pediatric chest model equipped with numerous sensors for quantitative assessment of pediatric surgical skills. This model has been used to evaluate intracorporeal suturing tasks by analyzing the sensor data.¹⁰ Here, we present an analysis of video-based assessment of the suturing task in the pediatric chest model, compared with performance in a conventional box trainer. We demonstrate the usefulness of the pediatric chest model for assessing the specific skills required for pediatric minimally invasive surgery.

Materials and Methods

The protocol of this study was approved (protocol number 10033) by the Ethical Committee of the Graduate School of Medicine and Faculty of Medicine, The University of Tokyo. Examinees were well briefed on the experiment, and written informed consent was obtained from all.

Development of the pediatric chest model and assessment method

This section summarizes previously reported work¹⁰; the data were collected in that study but are analyzed here. A pediatric chest model comprising a three-dimensional rapid-prototyped pediatric rib cage with accurate anatomical dimensions was developed using computed tomography volume data of a 1-year-old male patient. A commercially available suture pad (Suture Evaluation Simulator M57; Kyoto Kagaku Co., Kyoto, Japan) was placed inside the rib cage in front of the third thoracic vertebra to simulate the thoracoscopic repair of esophageal atresia type C (Fig. 1a). The pediatric chest model was arranged in the left hemidecubitus position. A 5-mm port was inserted as a camera port in the fifth intercostal space at the lower edge of the scapula, and a 5-mm 30° endoscope was used. Two 3-mm ports were inserted into the right thoracic cavity in the third intercostal space of the midaxillary line and in the fifth intercostal space of the dorsal area. This port placement is commonly used for surgery of esophageal atresia type C. For comparison with this pediatric model, a commercial box trainer with a flexible camera (K-ZWEI; B. Braun Aesculap, Tuttlingen, Germany) with the same suture pad was prepared (Fig. 2a).

FIG. 1.

Pediatric chest model setup. (a) The suture pad is inserted through the head side of the pediatric chest model. (b) Experimental view.

FIG. 2.

Box trainer setup. (a) The suture pad is placed on the center of the box trainer. (b) Experimental view.

A skills assessment experiment was conducted at a domestic pediatric conference by recruiting 28 pediatric surgeons whose characteristics are shown in Table 1. Examinees were stratified into three groups according to their laparoscopic fundoplication experience: an expert group of 9 surgeons with a caseload of 20 or more; an intermediate group, also of 9 surgeons, with 1–19 cases; and a trainee group of 10 surgeons with no laparoscopic fundoplication experience.
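The stratification rule above can be written out as a short sketch. The function name and the example caseloads are hypothetical and are not study data; only the thresholds (20 or more, 1–19, and 0 cases) come from the text.

```python
def experience_group(fundoplication_cases: int) -> str:
    """Map a laparoscopic fundoplication caseload to a study group.

    Thresholds follow the text: >=20 expert, 1-19 intermediate, 0 trainee.
    """
    if fundoplication_cases < 0:
        raise ValueError("case count cannot be negative")
    if fundoplication_cases >= 20:
        return "expert"
    if fundoplication_cases >= 1:
        return "intermediate"
    return "trainee"


# Hypothetical caseloads, for illustration only (not study data).
example_caseloads = [0, 1, 19, 20, 75]
example_groups = [experience_group(n) for n in example_caseloads]
print(example_groups)
```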

Table 1.

Surgeons' Characteristics

	Expert	Intermediate	Trainee
Number	9	9	10
Certified by JSES	5	0	0
Clinical experience (years) (mean±SD)	23.4±7.4	9.4±2.9	6.7±2.4
Fundoplications (cases) (mean±SD)	75.6±62.7	8.2±5.5	0±0

JSES, Japan Society for Endoscopic Surgery; SD, standard deviation.

Each surgeon performed an endoscopic intracorporeal suturing and three-throw knot-tying task in the box trainer setup (Fig. 2b) and then in the pediatric chest model setup (Fig. 1b). Before each measurement, they practiced suturing for 5 minutes. The examinees were instructed to perform the task accurately, safely, and quickly and to close the open cut of the suturing model using the intracorporeal slip knot technique if possible. A 5-0 PDS II suture with a 13-mm, 3/8-inch circle needle (Z126H; Ethicon Endo-Surgery, Cincinnati, OH), whose thread was cut at 100 mm, was used. The polyurethane rubber disposable suturing skin model was replaced with a new one for each trial. In the pediatric chest model setup, the thoracoscope was manipulated by the same pediatric surgeon. When a subject failed to finish the task, he or she restarted the task from the beginning. The tasks were video-recorded.

Video-based skills assessment

The videos were rated by two blinded pediatric surgeons using two evaluation methods: the 29-point checklist method and the suturing errors score sheet method.¹¹⁻¹⁴ The rubrics we used are shown in Tables 2 and 3. The observers learned each scoring method using sample videos until the scoring results between them agreed. The 29-point checklist method was first reported by Moorthy et al.¹¹ in 2004. This checklist consists of six categories of 29 items, which are scored as 1 or 0. The total score and the subtotal score of each category were compared among the three test subject groups for each setup. We modified the suturing errors score sheet method originally proposed by Van Sickle et al.¹⁴ for the assessment of suturing and knot-tying in the laparoscopic Nissen fundoplication. The original definition of the errors was used, and a new item was added as shown in Table 4. Each recorded video was divided into nine steps as defined in Table 5, and the observers checked for errors at each step. Multiple errors in a single step were counted as one error. The total error count and the subtotal score of each error item were compared among the three groups for each setup. When a task had to be repeated, the video of the successful attempt was used for analysis, but a penalty was given by decreasing the checklist score by 20% and increasing the error score by 20%. Additionally, the task completion time and the number of needle manipulations¹⁴ were measured from the videos by one observer and compared among the three groups for each setup.
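The repeat-trial penalty described above can be sketched as follows. This is a minimal illustration, assuming the 20% adjustment is applied once and multiplicatively to the total scores, regardless of how many restarts occurred; the text does not specify this detail. The function name and the example values are hypothetical.

```python
from typing import Tuple

PENALTY = 0.20  # 20% adjustment stated in the text


def apply_repeat_penalty(
    checklist_score: float, error_score: float, repeated: bool
) -> Tuple[float, float]:
    """Return (checklist, error) totals after the repeat-trial penalty.

    Assumption: the penalty is applied once, however many restarts occurred.
    """
    if not repeated:
        return checklist_score, error_score
    # A repeated task lowers the checklist score and raises the error score.
    return checklist_score * (1 - PENALTY), error_score * (1 + PENALTY)


# Hypothetical totals, for illustration only (not study data).
penalized = apply_repeat_penalty(20.0, 10.0, repeated=True)
unchanged = apply_repeat_penalty(20.0, 10.0, repeated=False)
print(penalized, unchanged)
```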

Table 2.

The 29-Point Checklist

Category, number	Explanation	Yes=1	No=0
Needle position-1 (entry to incision)
1	Held at one-half to two-thirds from the tip
2	Angle=90°±20°
3	Uses tissue or other instrument for stability
4	Attempts at positioning (3 or <3)
Needle driving through tissue-1 (entry to incision)
5	Entry at 60°–90° to the tissue plane
6	Driving with one movement
7	Single point of entry through tissue
8	Removing the needle along its curve
Needle position-2 (incision to exit)
9	Held at one-half to two-thirds from the tip
10	Angle=90°±20°
11	Uses tissue or other instrument for stability
12	Attempts (3 or <3)
Needle driving-2 (incision to exit)
13	Driving with one movement
14	Removing the needle along its curve
Pulling the suture through
15	Needle on needle holder in view at all times
16	Using pulley concept or walking along the suture
Technique of knots
17	Two-handed overwrap/underwrap followed by same or if one-handed, one followed by the other
18	Correct C loop (no S or O loops)
19	Smoothly executed throw, no fumbles
20	Correct inverse C loop (no S or O loops)
21	Smoothly executed throw, no fumbles
22	Knot squared (capsized reef/surgical)
23	Correct third C loop (no S or O loops)
24	Smoothly executed throw, no fumbles
Knot slippage
25	Knot left loose to slip
26	Knot slippage attempts 3 or <3
Knot quality
27	All throws squared
28	Not too tight or too loose
29	All knots laid on the side (not over the incision)
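As an illustration of how the total and subtotal scores could be derived from Table 2, the sketch below sums the 0/1 item scores under the headings as they appear in the table. The data structure and function name are hypothetical, and the item ranges are read directly from the table rows.

```python
from typing import Dict, Tuple

# Item numbers under each heading, as listed in Table 2.
CATEGORIES = {
    "Needle position-1": range(1, 5),
    "Needle driving through tissue-1": range(5, 9),
    "Needle position-2": range(9, 13),
    "Needle driving-2": range(13, 15),
    "Pulling the suture through": range(15, 17),
    "Technique of knots": range(17, 25),
    "Knot slippage": range(25, 27),
    "Knot quality": range(27, 30),
}


def score_checklist(items: Dict[int, int]) -> Tuple[int, Dict[str, int]]:
    """Return (total, subtotals per heading) for one rated video.

    `items` maps item number (1-29) to 1 (yes) or 0 (no).
    """
    if set(items) != set(range(1, 30)):
        raise ValueError("all 29 items must be scored exactly once")
    if any(value not in (0, 1) for value in items.values()):
        raise ValueError("each item is scored as 1 or 0")
    subtotals = {
        name: sum(items[i] for i in numbers)
        for name, numbers in CATEGORIES.items()
    }
    return sum(subtotals.values()), subtotals


# Hypothetical rating: every item met except item 15 (needle in view).
example_items = {i: 1 for i in range(1, 30)}
example_items[15] = 0
example_total, example_subtotals = score_checklist(example_items)
print(example_total, example_subtotals["Pulling the suture through"])
```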

Table 3.

The Suturing Errors Score Sheet

	Missed grasp	Tear/injure tissue	Instrument not assisting	Excess manipulation	Incomplete/repeat bite	Needle out of view	Missed loop	Tail looped	Fail to slip knot	Fail to square knot
Insert/orient
First bite
Second bite
First throw
Second throw
Slip knot
Cinching
Squaring
Third throw
Total

Table 4.

Definitions of the Errors

Error	Explanation
Missed grasp	The jaws of the instrument are opened and closed without retaining the desired target (either suture/tissue/needle).
Tear/injure tissue	Tearing tissue with either manipulation or retraction, a placed suture tearing through the tissue, or tissue injury by contact with the needle
Instrument not assisting	The instrument not holding the suture is not actively engaged in assisting in the performance of the step or is out of view while not actively participating in the procedure (i.e., holding exposure, holding the suture).
Excessive manipulation	Either the needle or the suture is grasped more than two times during a step. The contact of the grasper with the suture to slide the knot down does not count.
Incomplete/repeated bite	Once the tip of the suture needle engages the target tissue, it either is disengaged or fails to completely traverse (i.e., the tip is not seen, or once seen does not remain visible) the tissue without additional manipulation.
Needle out of view	A grasped needle is completely out of view. A grasped suture with a hanging needle does not count. If the needle is out of view because of a primary scope problem, no error is scored.
Missed loop	Once an attempt to loop the suture around the instrument is initiated, it is not completed.
Tail looped	When the suture tail is pulled through to make a knot, it loops and requires a release and additional manipulation to free the loop.
Failure to slip knot	The second knot does not close the slit of the suture pad enough, meaning the slip knot failed or was not carried out.
Failure to square knot	Once the slip knot is in place, it is not squared.

Table 5.

Definition of the Suturing Steps

Step	Explanation
Insertion/orientation	From the start of the exam until the tip touches the suture pad
First bite	From the time the tip touches the suture pad until it completely traverses the pad, is completely withdrawn, and touches the opposite side
Second bite	From the time the tip touches the opposite side until the suture proximal to the needle is grasped.
First throw	From the time the needle side suture is grasped until the non-needle tail is released after completion of the throw
Second throw	From the release of the first tail to the release of the second tail after completion of the throw
Creation of slip knot	From the release of the second tail until the knot is moved
Cinching of knot	From the beginning of knot movement until the short tail is grasped
Squaring of knot	From when the short tail is grasped until the long tail is grasped just proximal to the needle
Third throw	From proximal grasping of the needle side tail until the short tail is released after completion of the throw
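The error tally over the score sheet (Table 3) can be sketched as below. One assumption needs stating: the rule that multiple errors in a single step count as one error is interpreted here as meaning that repeated occurrences of the same error item within the same step fill a single cell of the sheet. The step and error labels are taken from Table 3; the function name and the example observations are hypothetical.

```python
from typing import Dict, Iterable, Tuple

STEPS = [
    "Insert/orient", "First bite", "Second bite", "First throw",
    "Second throw", "Slip knot", "Cinching", "Squaring", "Third throw",
]
ERRORS = [
    "Missed grasp", "Tear/injure tissue", "Instrument not assisting",
    "Excess manipulation", "Incomplete/repeat bite", "Needle out of view",
    "Missed loop", "Tail looped", "Fail to slip knot", "Fail to square knot",
]


def tally_errors(
    observations: Iterable[Tuple[str, str]]
) -> Tuple[int, Dict[str, int]]:
    """Return (total error count, count per error item) for one video.

    Each observation is a (step, error) pair noted by an observer.
    Assumption: a repeated (step, error) pair is counted only once.
    """
    cells = set()
    for step, error in observations:
        if step not in STEPS or error not in ERRORS:
            raise ValueError(f"unknown step or error: {step!r}, {error!r}")
        cells.add((step, error))  # duplicates collapse into one cell
    per_error = {error: 0 for error in ERRORS}
    for _, error in cells:
        per_error[error] += 1
    return len(cells), per_error


# Hypothetical observations, for illustration only (not study data).
example_observations = [
    ("First bite", "Missed grasp"),
    ("First bite", "Missed grasp"),  # same cell, counted once
    ("First bite", "Needle out of view"),
    ("Slip knot", "Fail to slip knot"),
]
error_total, error_subtotals = tally_errors(example_observations)
print(error_total, error_subtotals["Missed grasp"])
```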

Statistical analysis

The parameters were compared across the three groups using the Kruskal–Wallis test. The Steel–Dwass test was used to analyze the differences among the groups. To determine the interrater reliability of the checklist score method and the error score method, Cronbach's alpha coefficient was used. After confirmation of the interrater reliability of the methods, the average of the two observers' scores was used for assessment. All analyses were performed using JMP statistical software (SAS Institute, Inc., Cary, NC), and a P value of <.05 was deemed statistically significant.
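For readers unfamiliar with the Kruskal–Wallis test, the following sketch computes the tie-corrected H statistic for three groups in plain Python. It is not the JMP implementation used in the study. The P value uses the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(−H/2); results for small samples may differ from those of other software. The function name and the example data are hypothetical, and the Steel–Dwass post hoc test is not reproduced here.

```python
import math
from typing import Sequence, Tuple


def kruskal_wallis_three_groups(
    a: Sequence[float], b: Sequence[float], c: Sequence[float]
) -> Tuple[float, float]:
    """Return (H, approximate P) for three independent groups."""
    groups = [list(a), list(b), list(c)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("each group needs at least one observation")
    data = sorted(x for g in groups for x in g)
    n = len(data)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and data[j] == data[i]:
            j += 1
        t = j - i
        rank_of[data[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction

    # Chi-square survival function with 2 degrees of freedom.
    p = math.exp(-h / 2)
    return h, p


# Hypothetical scores, for illustration only (not study data).
h_stat, p_value = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h_stat, 3), round(p_value, 4))
```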

Results

There was a significant difference across the three groups for the total score of the 29-point checklist in the box trainer (P=.024) and in the pediatric model (P=.016) (Fig. 3). The score of the expert group was significantly higher than that of the trainee group in the box trainer (P=.047) as well as in the pediatric model (P=.027). The score of the intermediate group was close to that of the expert group and higher than that of the trainee group in the box trainer (P=.101), but lower than that of the expert group in the pediatric model (P=.094).

FIG. 3.

Comparison of the total score of the 29-point checklist. The central black line is the median, the data within the box are the interquartile range, and the ends of the vertical line denote the whole range. *P<.05 by Steel–Dwass test. EX, expert group; IN, intermediate group; TR, trainee group.

There was a significant difference across the three groups for the total error score in the box trainer (P=.010) and in the pediatric model (P=.032) (Fig. 4). The expert group showed significantly lower scores than the trainee group in the box trainer (P=.030) and in the pediatric model (P=.024). The score of the intermediate group was close to that of the expert group and lower than that of the trainee group in the box trainer (P=.051), but close to that of the trainee group in the pediatric model.

FIG. 4.

Comparison of the total score of the suturing errors score sheet. The central black line is the median, the data within the box are the interquartile range, and the ends of the vertical line denote the whole range. *P<.05 by Steel–Dwass test. EX, expert group; IN, intermediate group; TR, trainee group.

There was also a significant difference in the completion time across the three groups in the box trainer (P=.025) and in the pediatric model (P=.004) (Fig. 5). The expert group completed the task significantly faster than the trainee group both in the box trainer (P=.034) and in the pediatric model (P=.005). The task completion time of the intermediate group was close to that of the expert group in the box trainer, but longer than that of the expert group in the pediatric model. There was a significant difference across the three groups for the number of needle manipulations in the pediatric model (P=.011) but not in the box trainer (Fig. 6). The number of needle manipulations of the expert group was significantly lower than that in the intermediate group (P=.045) and the trainee group (P=.019) in the pediatric model.

FIG. 5.

Comparison of the completion time. The central black line is the median, the data within the box are the interquartile range, and the ends of the vertical line denote the whole range. *P<.05 by Steel–Dwass test. EX, expert group; IN, intermediate group; TR, trainee group.

FIG. 6.

Comparison of the number of needle manipulations. The central black line is the median, the data within the box are the interquartile range, and the ends of the vertical line denote the whole range. The points denote outliers. *P<.05 by Steel–Dwass test. EX, expert group; IN, intermediate group; TR, trainee group.

Comparison of the subtotal score of the checklist categories and error items showed significant differences across the three groups in the pediatric chest model but not in the box trainer for the items “Pulling the suture through” (P=.008), “Technique of knotting” (P=.003) (Fig. 7), and “Needle out of view” (P=.004) (Fig. 8). The score of the expert group was significantly higher than that of the trainee group for “Pulling the suture through” (P=.011); also, the score of the expert group for “Technique of knotting” was significantly higher than that of both the intermediate and the trainee groups (P=.034 and P=.003, respectively). Finally, the error score of the expert group for “Needle out of view” was significantly lower than that of the trainee group (P=.010).

FIG. 7.

Comparison of the subtotal score of categories in the 29-point checklist. The central black line is the median, the data within the box are the interquartile range, and the ends of the vertical line denote the whole range. *P<.05 by Steel–Dwass test. EX, expert group; IN, intermediate group; TR, trainee group.

FIG. 8.

Comparison of the subtotal score of “Needle out of view” in the suturing errors score sheet. The central black line is the median, the data within the box are the interquartile range, and the ends of the vertical line denote the whole range. The points denote outliers. *P<.05 by Steel–Dwass test. EX, expert group; IN, intermediate group; TR, trainee group.

The interrater reliability for the 29-point checklist score method was 0.86, and that for the suturing error score method was 0.90. Both values were considered to be sufficiently high.
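Cronbach's alpha for two observers can be computed as in the sketch below. It assumes that each observer's score per video is treated as one "item" in the standard alpha formula; the text does not state exactly how the coefficient was set up in JMP. The function name and the example ratings are hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(ratings_by_rater: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha, treating each rater as one item.

    `ratings_by_rater[r][v]` is rater r's score for video v.
    """
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("at least two raters are required")
    n = len(ratings_by_rater[0])
    if n < 2 or any(len(r) != n for r in ratings_by_rater):
        raise ValueError("every rater must score the same two or more videos")

    # Sum of each rater's sample variance across videos.
    rater_variance_sum = sum(variance(r) for r in ratings_by_rater)
    # Sample variance of the per-video totals across raters.
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - rater_variance_sum / total_variance)


# Hypothetical ratings of four videos, for illustration only.
alpha_identical = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
alpha_partial = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
print(alpha_identical, alpha_partial)
```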

Discussion

The results show that the expert group performed endoscopic suturing significantly better than the trainee group in both setups for almost all metrics. This suggests that these video-based skill assessment methods are able to distinguish differences in relation to the clinical experience among the examinees. More important is that assessment in the pediatric chest model was able to detect relevant differences between the expert group and the intermediate group. The overall performance of the intermediate group was equivalent to or slightly worse than that of the expert group in the box trainer, whereas it was clearly worse in the pediatric chest model. This suggests that surgeons in the intermediate group lacked the surgical skills required for optimal achievement of the task in a small workspace, whereas more experienced surgeons had acquired these skills. Thus, the items that distinguish the experts from the intermediates must be the pediatric-specific skills which the expert surgeons acquired by performing many clinical procedures. Hence, video-based skills assessment using the pediatric chest model setup is better for the assessment of pediatric-specific expert skills than using the box trainer setup.

In this study, the video-based skill assessment items were more useful than completion time for detecting differences among the examinees. Faster surgery is always preferable, and thus the task completion time is often used as a valid metric for surgical skill assessment.¹⁵ However, other metrics such as quality and safety are more important for clinical outcomes than speed. Because expert pediatric surgeons perform surgery precisely and gently, the 29-point checklist method designed for quality assessment and the suturing errors score sheet method designed for safety assessment are better for pediatric skill assessment. Another advantage of these methods is their high interrater reliability, probably in part because every checklist item was anchored by explicit criteria for scoring that item as a 1 or a 0.¹¹

The analysis of the subtotal scores of all checklist categories and individual errors revealed significant differences between the expert and trainee groups in the pediatric chest model setup regarding items related to the ability to keep the needle in view at all times, techniques for avoiding possible tissue damage, and the knot-tying technique. These skills are related to safety and can be acquired in an initial phase of pediatric endoscopic surgical training. In the pediatric chest model setup, significant differences were observed between the expert group and the intermediate group in those items related to the number of needle manipulations and to the knotting technique. These skills are related to task quality and efficiency¹⁴ and are required to become an expert pediatric endoscopic surgeon. In other words, the pediatric-specific expert skills can be related to task quality, efficiency, and safety.

Although intracorporeal suturing, especially requiring the slip knot technique, is rather unfamiliar to pediatric surgeons, it is nonetheless preferable for pediatric skill assessment compared with other tasks such as peg transfer, pattern cutting, ligation loop, or extracorporeal suturing.⁴ Additionally, the PDS suture used in the experiment was not sufficiently flexible, and its manipulation took some time for inexperienced surgeons. As fumbles in the knot-tying task negatively influence both the 29-point checklist score and the suturing errors score, the difficulty of suture manipulation resulted in large differences in the scores between the expert group and the trainee group.

Limitations of the present study include the difficulty of grouping the examinees. They were divided into three groups on the basis of their laparoscopic fundoplication experience because this operation is the most common among all advanced endoscopic applications in the pediatric surgical field. We considered surgeons who had performed 20 or more fundoplication operations as experts equivalent to certified surgeons, although fundoplication is rarely performed in neonates and infants. The Japanese qualification system for pediatric endoscopic specialists requires applicants to have experience of at least 20 advanced endoscopic operations such as fundoplication.² Thus, this definition of expert was applied in this study, although it does not refer specifically to pediatrics. Another limitation relates to the fidelity of the pediatric chest model, which reproduced only some pediatric-specific features, including a small body cavity, interference by instruments and arms, a narrow endoscopic view, and the port placement. Simulation of hemorrhage, tissue fragility, and the tense atmosphere would improve the realism of the model in the future.

Future development of techniques for training for the identified pediatric-specific skills would make the pediatric chest model a good endoscopic surgical training and assessment platform for pediatric surgeons. The experimental results of the present study suggest that advanced learners such as the intermediate surgeons need training in a realistic setup, for which the pediatric chest model can be a good option. In the future, task performance improvement by training using the pediatric chest model will be investigated in both experimental and clinical settings.

In conclusion, video-based skill assessment of endoscopic suturing in a pediatric chest model and a box trainer was applied to evaluate pediatric endoscopic surgeons with different degrees of clinical experience. The expert group demonstrated a better performance than the intermediate and trainee groups in the pediatric chest model; these differences were larger than seen for the box trainer. The results suggest that this method can better assess pediatric-specific expert skills acquired by performing many clinical procedures. Safety was assessed by the ability of the operator to keep the needle in view at all times and by the techniques employed for avoiding possible tissue damage. Quality and efficiency, as a measure of advanced pediatric-specific skills, were assessed by smooth knot tying and efficient needle manipulation. The pediatric chest model with a training program for the identified pediatric-specific skills for safety, quality, and efficiency is a superior endoscopic surgical training and assessment platform for pediatric surgeons.

Footnotes

Acknowledgments

The authors thank Prof. Yuji Nirasawa of Kyorin University for kindly providing the opportunity for conducting the experiments. This study was partially supported by a Grant-in-Aid for Scientific Research (B) (number 26293378), Grant-in-Aid for Scientific Research (S) (number 23226006) from the Ministry of Education, Culture, Sports, Science and Technology, and the project “Assessment methodology for innovative minimally invasive therapeutic devices, materials, and nano-bio diagnostic devices” from the Accelerating Regulatory Science Initiative, Ministry of Health, Labour and Welfare, Japan.

Disclosure Statement

No competing financial interests exist.

References

1. Iwanaka, Uchida, Kawashima, et al. Complications of laparoscopic surgery in neonates and small infants. J Pediatr Surg, 2004; 39:1838–1841.

2. Iwanaka, Morikawa, Yamataka, et al. Skill qualifications in pediatric minimally invasive surgery. Pediatr Surg Int, 2011; 27:727–731.

3. Ieiri, Nakatsuji, Higashi, et al. Effectiveness of basic endoscopic surgical skill training for pediatric surgeons. Pediatr Surg Int, 2010; 26:947–954.

4. Azzie, Gerstle, Nasr, et al. Development and validation of a pediatric laparoscopic surgery simulator. J Pediatr Surg, 2011; 46:897–903.

5. Hamilton, Kahol, Vankipuram, Ashby, Notrica, Ferrara. Toward effective pediatric minimally invasive surgical simulation. J Pediatr Surg, 2011; 46:138–144.

6. Narayanan, Cohen, Shun. Technical tips and advancements in pediatric minimally invasive surgical training on porcine based simulations. Pediatr Surg Int, 2014; 30:655–661.

7. Ieiri, Ishii, Souzaki, et al. Development of an objective endoscopic surgical skill assessment system for pediatric surgeons: Suture ligature model of the crura of the diaphragm in infant fundoplication. Pediatr Surg Int, 2013; 29:501–504.

8. Davis, Barsness, Rooney. Design and development of a novel thoracoscopic tracheoesophageal fistula repair simulator. Stud Health Technol Inform, 2013; 184:114–116.

9. Barsness, Rooney, Davis. The development and evaluation of a novel thoracoscopic diaphragmatic hernia repair simulator. J Laparoendosc Adv Surg Tech A, 2013; 23:714–718.

10. Harada, Takazawa, Tsukuda, et al. A construct validity study of a sensorized pediatric chest model. Int J CARS, 2014; 9(Suppl 1):S139.

11. Moorthy, Munz, Dosis, Bello, Chang, Darzi. Bimodal assessment of laparoscopic suturing skills: Construct and concurrent validity. Surg Endosc, 2004; 18:1608–1612.

12. Aggarwal, Hance, Undre, et al. Training junior operative residents in laparoscopic suturing skills is feasible and efficacious. Surgery, 2006; 139:729–734.

13. Kroeze, Mayer, Chopra, Aggarwal, Darzi, Patel. Assessment of laparoscopic suturing skills of urology residents: A pan-European study. Eur Urol, 2009; 56:865–872.

14. Van Sickle, Baghai, Huang, et al. Construct validity of an objective assessment method for laparoscopic intracorporeal suturing and knot tying. Am J Surg, 2008; 196:74–80.

15. Mason, Ansell, Warren, Torkington. Is motion analysis a valid tool for assessing laparoscopic skill? Surg Endosc, 2013; 27:1468–1477.