Abstract
Purpose:
We developed a system to objectively verify the endoscopic surgical skills of pediatric surgeons.
Materials and Methods:
We developed a thoracoscopic model of congenital diaphragmatic hernia mimicking a newborn's size. The examinees were divided into Experts (n=10) and Trainees (n=19), and each group performed two tasks (Task 1, reduction of a herniated intestine from the thoracic space to the abdomen; Task 2, perform three suture ligatures of a diaphragm defect using intracorporeal knot-tying). The end points were the time required to complete Task 1, time score calculated using the residual time from the time limit for Task 2, number of complete full-thickness sutures, maximum air-pressure tolerance, degree of diaphragm deformation, and the residual defect areas after suturing. We also evaluated the total path length and velocity of the forceps tips using a three-dimensional position measurement instrument.
Results:
The Experts had significantly superior results for the time for Task 1, time score, number of complete full-thickness sutures, maximum air-pressure tolerance, and degree of diaphragm deformation in Task 2 (all P<.05). In the Trainees, the total path length of the left forceps was significantly shorter and its average velocity significantly slower than those of the right forceps in both tasks (both P<.05), whereas the Expert group showed no significant laterality in these tasks.
Conclusions:
Our model could validate the quality of endoscopic surgical skills and could differentiate between Expert and Trainee pediatric surgeons. The Experts could use their forceps equally well to perform tasks even in a small working space.
Introduction
The Japanese Society of Endoscopic Surgeons developed an endoscopic surgical skill qualification (ESSQ) system in 2008 for all surgical fields, including pediatric surgery, to ensure the ability of endoscopic surgeons. 6 There is currently a video-based evaluation system used by expert pediatric surgeons, but it is not an absolutely objective evaluation system. An economical and reproducible evaluation system equivalent to the ESSQ system is needed for safe and precise pediatric endosurgery.
Our group investigated and reported the effectiveness of basic endoscopic surgical skill training for pediatric surgeons and compared the findings with those for general surgeons that were reported previously. 5 We then developed and validated a pediatric surgery–specific disease model with an objective evaluation system for pediatric surgeons. 7 We also established a new full procedure disease model to evaluate the advanced endoscopic surgical skills of pediatric surgeons.
The aim of this study was to validate our new objective evaluation system for the endoscopic surgical skills of pediatric surgeons using this disease model.
Materials and Methods
A new endoscopic surgical skill validation system
We developed a thoracic model of congenital diaphragmatic hernia (CDH) repair mimicking a newborn case (body weight, 3 kg; diaphragm defect size, 1.5×1.0 cm) as shown in Figure 1. We established this model to replicate the full procedure of thoracoscopic repair of a CDH. Thoracoscopic repair of CDH in a neonate is a simple procedure but requires highly advanced endoscopic surgical skills. This model assesses several procedures, including the reduction of the intestine and suture ligature. We developed this evaluation model in collaboration with Kyoto Kagaku Co., Ltd. 8 The thoracic cavity was made based on computed tomography data from neonates and was covered by a soft skin sheet (Fig. 1a). The left and right thoracic cavities were divided by a mediastinum sheet. The model was placed in the right lateral position. Three trocars (5 mm in diameter, 5 cm in length; Karl Storz, Tuttlingen, Germany) were placed and fixed in the fifth intercostal space at the left posterior axillary line (for the left forceps), the fourth intercostal space at the left midaxillary line (for the camera port), and the fifth intercostal space at the left anterior axillary line (for the right forceps), the same as in the clinical situation (Fig. 1a). 9 The diaphragm unit is detachable and made in two layers. A sutured diaphragm unit was used for the image analysis and pressure tolerance test (Fig. 1c and d). The scope was 5 mm in diameter and was a 30° type fixed by an arm (Fig. 1b).

An overview of the thoracoscopic model of congenital diaphragmatic hernia repair with a three-dimensional position measurement instrument:
The AURORA® (Northern Digital Inc., Waterloo, ON, Canada) was used as the three-dimensional position measurement instrument and was placed at the abdominal side of the model to record the tracing of the tips of the forceps (Fig. 1b). The right and left forceps had sensors mounted on the tips, and their paths were traced on a computer with an electromagnetic tracking system reported previously, 5 which consists of sensor coils, a field generator, a system control unit, and a sensor interface unit that uses an electromagnetic measurement technology designed for applications requiring precise, real-time, spatial measurements.
Study participants
The participants were divided into two groups. The Expert group included 10 expert pediatric endoscopic surgeons certified using the ESSQ system or corresponding to that skill level. The Trainee group included 19 trainees who were specializing in pediatric surgery but had not experienced more than 10 cases of laparoscopic fundoplication. All of the participants were right-handed.
Tasks for participants
The participants had to perform two tasks as follows. Task 1 was a reduction of a herniated small intestine (5 mm in diameter, 30 cm in length) from the thoracic space into the abdominal cavity (Fig. 2a), and Task 2 was to perform three suture ligatures on the diaphragm defect using intracorporeal knot tying (Fig. 2b). The right forceps was a 3-mm needle driver, and the left forceps was a 3-mm Maryland type forceps (Karl Storz). The suture material was 3-0 poly(ethylene terephthalate) (Ethibond® BB; Ethicon Endosurgery, Cincinnati, OH), which measured 10 cm in length for each suture ligature. After these tasks, the participants' skills were evaluated using eight objective assessment points.

A scope view of each task:
The assessment points
The eight assessment points, which improved upon the methods previously reported by Uemura et al., 10 were as follows:
1. The time required to complete Task 1 (Task 1 time), which was defined as the performance time from the start to completion of Task 1, measured in seconds.
2. The time score for Task 2, which was calculated as the time remaining from 900 seconds (time limit, 15 minutes), measured in seconds.
3. The number of full-thickness sutures, which was counted from the reverse side (Fig. 3b). Participants had to close the hiatus with three sutures. A pair of stitches recognized by the image analysis system was counted as one complete full-thickness suture for Task 2. This value was measured using an image analysis system connected to a personal computer (Fig. 3a).
4. The maximum air-pressure tolerance, as determined by the internal air pressure inside the sutured artificial diaphragmatic defect. The pressure was considered to be the maximum pressure resistance of the sutured diaphragmatic defect model and was measured in kilopascals (kPa). This pressure value was monitored by the image analysis system and the air-pressure measurement unit connected to a personal computer (Fig. 3a).
5. The degree of diaphragm deformation, which was measured from the reverse side (Fig. 3b) and was defined by the deformation ratio. This point was chosen as a parameter of the suture tension. This ratio was calculated by dividing the deformed size (in mm2) after suturing by the original size (320 mm2) and expressing the result as a percentage. This evaluation was performed using the image analysis system connected to a personal computer (Fig. 3a). If the deformation ratio is high, then the suture tension is considered to be high.
6. The residual defect area after suturing, which was defined as the size of the opening space measured from the reverse side (Fig. 3b) and was measured in square millimeters. This value was measured using the image analysis software program connected to a personal computer (Fig. 3a).
7. The total path length of the forceps, which was considered to be the total spatial movement measured in millimeters.
8. The average velocity of the tip of each forceps, which was defined from the velocities over each 0.05-second interval and measured in millimeters per second.
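The time and motion metrics (items 1, 2, 7, and 8) can be illustrated with a minimal Python sketch. It assumes that the tracking system yields tip positions as (x, y, z) coordinates in millimeters sampled every 0.05 seconds, that the time score is 900 seconds minus the Task 2 completion time (set to zero once the limit is reached), and that the average velocity is the mean of the per-interval speeds. The function names, the zero floor, and the example trace are illustrative assumptions, not the software used in this study.

```python
import math

TIME_LIMIT_S = 900.0       # Task 2 time limit (15 minutes)
SAMPLE_INTERVAL_S = 0.05   # interval named in the velocity definition


def time_score(task2_time_s, limit_s=TIME_LIMIT_S):
    """Seconds remaining from the time limit; 0 once the limit is reached (assumed)."""
    return max(0.0, limit_s - task2_time_s)


def segment_lengths(positions_mm):
    """Euclidean distance between consecutive tip positions, in mm."""
    return [math.dist(p, q) for p, q in zip(positions_mm, positions_mm[1:])]


def total_path_length(positions_mm):
    """Total spatial movement of one forceps tip, in mm."""
    return sum(segment_lengths(positions_mm))


def average_velocity(positions_mm, dt_s=SAMPLE_INTERVAL_S):
    """Mean of the speeds over each sampling interval, in mm/s."""
    segments = segment_lengths(positions_mm)
    if not segments:
        return 0.0
    return sum(d / dt_s for d in segments) / len(segments)


# Hypothetical three-sample trace of one forceps tip (mm).
trace = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(total_path_length(trace), average_velocity(trace))
```

Under these assumptions, a shorter total path length for the same task indicates more economical movement, and comparing the two forceps of one participant shows how evenly both hands were used.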

The suture evaluation system for this model:
To evaluate items 3–6 above, we used the Suture Evaluation Simulator system (Fig. 3a), and the evaluation results are shown in Figure 3c.
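Of these items, the deformation ratio (item 5) is a simple percentage. The following is a minimal sketch, assuming that the image analysis supplies the deformed size in mm2; the function name and the example value are hypothetical and do not describe the simulator's actual software.

```python
ORIGINAL_SIZE_MM2 = 320.0  # original size stated in the assessment points


def deformation_ratio_percent(deformed_size_mm2, original_size_mm2=ORIGINAL_SIZE_MM2):
    """Deformed size after suturing as a percentage of the original size."""
    if original_size_mm2 <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * deformed_size_mm2 / original_size_mm2


# Hypothetical measurement: a deformed size of 160 mm2.
print(deformation_ratio_percent(160.0))
```

A higher value is read as higher suture tension, so a lower ratio is the better result.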
Questionnaire survey for participants
A simple questionnaire survey was conducted for each trainee after he or she performed the tasks to investigate how the trainees felt about this model. This questionnaire contained three questions, each with five answer options, as follows:
1. How did you feel about the tasks? (Too easy/easy/typical/difficult/very difficult)
2. What did you think about the evaluation points? (Very good/good/proper/needs to be improved/poor)
3. How did you feel about the realism of this disease model? (Very good/good/proper/poor/very poor)
Statistical analysis
All data, except the responses to the questionnaire survey, were expressed as mean±standard deviation values. The statistical analysis was performed using two-tailed paired and unpaired Student's t tests, and P values of <.05 were considered to be statistically significant.
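The two test statistics can be sketched in Python using only the standard library. The text does not state which comparison used which test; the sketch assumes that the unpaired test applies to Expert versus Trainee comparisons and the paired test to right versus left forceps within the same participants. The function names are hypothetical, and the sketch returns only the t statistic and degrees of freedom.

```python
import math
from statistics import mean, stdev, variance


def unpaired_t(a, b):
    """Student's t statistic (pooled variance) and degrees of freedom
    for two independent groups, e.g. Expert versus Trainee (assumed use)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched measurements,
    e.g. right versus left forceps of the same participants (assumed use)."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


# Hypothetical values for illustration only; not data from this study.
print(unpaired_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(paired_t([3.0, 5.0, 7.0], [1.0, 2.0, 3.0]))
```

Converting the statistic to a two-tailed P value requires the t distribution with the returned degrees of freedom, which a statistics package can supply (for example, `scipy.stats.ttest_ind` and `scipy.stats.ttest_rel` in SciPy perform both steps).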
Results
The assessment points
1. Task 1 time. The time required to complete Task 1 in the Expert and Trainee groups was 30.9±11.77 seconds and 48.37±20.69 seconds, respectively (Table 1). The time for the Expert group was significantly shorter than that in the Trainee group (P<.01).
2. The time score for Task 2. The time scores for Task 2 in the Expert and Trainee groups were 252.3±220.39 and 32.11±80.60 seconds, respectively (Table 1). The score for the Expert group was significantly better than that for the Trainee group (P<.05).
3. The number of complete full-thickness sutures. The number of complete full-thickness sutures in the Expert and Trainee groups was 2.90±0.32 and 2.11±1.05, respectively (Table 1). The number of full-thickness sutures in the Expert group was significantly higher than that in the Trainee group (P<.01).
4. Maximum air-pressure tolerance. The maximum air-pressure tolerance in the Expert group was 7.58±4.58 kPa, whereas that in the Trainee group was 2.84±3.43 kPa (Table 1). The maximum air-pressure tolerance in the Expert group was significantly better than that in the Trainee group (P<.05).
5. Degree of diaphragm deformation. The degree of diaphragm deformation in the Expert and Trainee groups was 46.80±8.24% and 56.04±8.67%, respectively (Table 1), with the value in the Expert group being significantly lower than that in the Trainee group (P<.05).
6. Residual defect areas. The residual defect areas in the Expert and Trainee groups were 11.07±14.87 mm2 and 22.19±26.26 mm2, respectively (Table 1). The areas of the residual defects were not significantly different between the two groups (P=.1573).
7. Total path length of the tip of the forceps:
• Comparison between the right and left forceps in the Trainee group. In Task 1, the total path lengths of the tips of the right and left forceps were 1069.31±475.15 mm and 704.21±247.53 mm, respectively. In Task 2, the total path lengths of the tips of the right and left forceps were 15,781.64±5591.44 mm and 12,010.73±5203.17 mm, respectively (Table 2). In the Trainee group, the total path length of the left forceps was significantly shorter than that of the right forceps in both Task 1 (P<.01) and Task 2 (P<.05).
• Comparison of the right and left forceps in the Expert group. In Task 1, the total path lengths of the tips of the right and left forceps were 833.38±222.29 mm and 646.33±202.46 mm, respectively. In Task 2, the total path lengths of the tips of the right and left forceps were 13,232.25±6193.67 mm and 11,597.42±6042.48 mm, respectively (Table 3). There were no significant differences between the total path lengths of the tips of the right and left forceps in the Expert group for either Task 1 (P=.2098) or Task 2 (P=.5576).
• Comparison of the Expert and Trainee groups. There were no significant differences between the groups for either hand in the total path length of the tip of the forceps for either task (Table 4).
8. Average velocities of the tips of the forceps:
• Comparison between the right forceps and left forceps in the Trainee group. In Task 1, the average velocities of the tips of the right and left forceps were 26.29±11.05 mm/s and 17.84±7.14 mm/s, respectively. In Task 2, the average velocities of the tips of the right and left forceps were 18.17±4.36 mm/s and 13.20±3.33 mm/s, respectively (Table 2). In the Trainee group, the average velocities of the tips of the right forceps were significantly faster than those of the left forceps in both Task 1 (P<.01) and Task 2 (P<.001).
• Comparison of the right forceps and left forceps in the Expert group. In Task 1, the average velocities of the tip of the right forceps and those of the left forceps were 22.79±5.40 mm/s and 18.80±6.67 mm/s, respectively. In Task 2, the average velocities were 19.28±3.29 mm/s and 16.78±4.21 mm/s, respectively (Table 3). In the Expert group, there were no significant differences in the average velocities between the right and left forceps in Task 1 (P=.2098) or Task 2 (P=.1553).
• Comparison between the Expert and Trainee groups. In the comparison of the Expert and Trainee groups, there was a significant difference in the average velocities of the tips of the forceps in the left hand in Task 2 (P<.05) (Table 4).
Questionnaire survey
Eighteen of the 19 trainees answered Question 1 of the questionnaire survey. Seven (38.9%), 10 (55.5%), and 1 (5.6%) responded that the procedure was “typical,” “difficult,” and “very difficult,” respectively. None of the trainees answered that it was “too easy” or “easy.” Seventeen of the 19 trainees answered Question 2. Four (23.5%), 10 (58.8%), and 3 (17.7%) responded that the model was “good,” “proper,” and “needed to be improved,” respectively. None of the trainees answered that it was “very good” or “poor.” Eighteen of the 19 trainees answered Question 3. One (5.6%), 8 (44.4%), 8 (44.4%), and 1 (5.6%) responded “very good,” “good,” “proper,” and “poor,” respectively. None of the trainees answered “very poor.”
Discussion
The purpose of this study was to validate our new objective evaluation system for the endoscopic surgical skills of pediatric surgeons by comparing an Expert group and a Trainee group using a thoracoscopic model of CDH repair. The major findings of the present study are as follows:
1. We developed a model of CDH repair as an endoscopic surgical skill evaluation system for pediatric surgeons.
2. According to the data obtained from our study participants, the time required to complete Task 1, the time score for Task 2, the number of complete full-thickness sutures, the maximum air-pressure tolerance, and the degree of diaphragm deformation for the Expert group were significantly better than those for the Trainee group.
3. The value of the residual defect areas was not significantly different between the Expert and Trainee groups.
4. In the Trainee group, the total path length and average velocities of the right forceps were significantly longer and faster, respectively, than those of the left forceps.
5. In the Expert group, there were no significant differences in the total path length or average velocities between the right and left forceps.
6. A comparison of the Trainee and Expert groups showed that the average velocity of the left forceps during Task 2 in the Expert group was significantly faster than that of the Trainee group.
7. The tasks performed for this system are not easy but represent a realistic situation for the Trainee group.
The results shown in Table 1 demonstrated that the Expert group had faster and more accurate surgical skills than the Trainee group, as was reported in our previous study. 7 The value of the residual defect areas was not significantly different between the Expert and the Trainee groups. The likely reason for this is that the use of three sutures for the small defect tended to be effective for both groups. Repairing a large defect of the diaphragm may lead to a larger (and possibly significant) difference between the Expert and Trainee groups.
Tables 2 and 3 showed that the Expert group could manipulate both forceps equally well, whereas the Trainee group tended to use the dominant (right) hand. In fact, the Expert group could not only perform a faster and more precise operation, but also had excellent coordination for both hands.
As shown in Table 4, the only significant difference between the Expert and Trainee groups was in the average velocity of the tip of the left forceps during Task 2; the differences for the right hand were not significant. Because Task 2 required the operators to use sutures, the Trainee group, who were not familiar with suturing, tended to use only the dominant (right) hand instead of the nondominant (left) hand.
There were no significant differences between the groups in the total path length of the tips of the forceps in Tasks 1 and 2 or in the average velocities of the tips of the forceps during Task 1. However, the mean total path length of the tips of both forceps in the Expert group was shorter than that in the Trainee group. This suggests that the experts manipulate the forceps more economically than the trainees, as would be expected.
To evaluate an individual's endoscopic surgical skills, an objective system that has appropriate feedback for the surgeons would be ideal. 11,12 In the United States, the Fundamentals of Laparoscopic Surgery™ (FLS) program was introduced in 2004 by the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) as an endoscopic surgical training program unique to laparoscopic surgery. Since then, this program has been available through the SAGES and the American College of Surgeons and has become the most well-known training program for laparoscopic surgery in the United States. 13,14 The FLS program is very simple and can adequately evaluate the surgeon's skills but does not have a feedback system.
We herein established a disease-specific skill evaluation system including all components of the surgical procedures. The thoracoscopic model of CDH repair would be suitable for skill evaluation of all procedures associated with the reduction of a herniated intestine using bilateral forceps (two-hand coordination) and suture ligation of the diaphragmatic defect. Our established system provides participants with transparent and inspiring feedback that is visually presented on a personal computer screen, as shown in Figure 3c. We set up the eight assessment points to evaluate the endoscopic surgical skills of the participants by improving upon our previous report. 10 By measuring the time required to complete the tasks, the total path length, and the average of the velocities of the tip of the forceps, the efficiency of the movement and speed in the forceps manipulations could be evaluated. Additionally, the quality and courtesy (in terms of the behavior) of the suture performance could be evaluated by measuring the number of the full-thickness sutures, air-pressure tolerance, the degree of the deformation, and the residual defect area after suturing by using the suture evaluation simulator. The questionnaire survey was administered to evaluate the degree of difficulty and the realism of this CDH repair model by investigating the trainee's subjective impressions about the tasks, evaluation points, and the model itself. The results of the questionnaire survey showed that the trainees felt that the tasks were difficult. However, this kind of difficulty promotes the surgeon's motivation for training and task achievement. In addition, the evaluation points and reality of the model itself were considered to be reasonable by the Trainee group.
In conclusion, this study revealed that experts possessed speedy, economical, and accurate skills using our thoracoscopic model of CDH repair. The Expert group had excellent two-hand coordination and could use both hands equally, whereas this was not the case in the Trainee group. Our model validated the quality of the endoscopic surgical skills and showed that there were differences between the Expert group and the Trainee group of pediatric surgeons. Our next step is to investigate the effectiveness of this CDH repair model as a training model for pediatric surgeons.
Footnotes
Acknowledgments
We thank Brian Quinn for his comments and help with the manuscript. This study was supported by a Grant-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JSPS KAKENHI grant 25293360) and a grant from The Japanese Foundation for Research and Promotion of Endoscopy.
Disclosure Statement
T.K. is an employee of Kyoto Kagaku Co., Ltd. S.O., S.I., M.U., T.J., R.S., N.M., M.H., and T.T. declare that no competing financial interests exist.
