Abstract
BACKGROUND:
Mental workload is one of the important variables in understanding human performance in drone operation.
OBJECTIVE:
To test the effects of gender, age group, flight route, and altitude on the flight performance and mental workload of novice drone operators.
METHODS:
Ten male and ten female participants without prior drone operating experience took part and were split into two age groups. After a training session, the participants operated a drone to perform photo-taking missions under different flight route and altitude conditions. The weighted NASA Task Load Index (TLX), Modified Cooper-Harper (MCH) scale, heart rate, and interbeat interval were measured to assess the mental workload of the participants. The flight time to complete the mission was adopted to indicate flight performance.
RESULTS:
The effect of age group was significant (p < 0.05) on flight time, weighted TLX score, and MCH score. The effects of flight route and altitude on the two subjective ratings and the two cardiac measures were not significant.
CONCLUSION:
The flight performance of younger participants was significantly better than that of their older counterparts. The effects of both flight route and altitude on the perceived mental workload of the drone operators were insignificant. Both the weighted NASA TLX and MCH scales were appropriate for measuring the mental workload of novice drone operators.
Introduction
Unmanned aerial vehicles (UAVs) have developed rapidly in recent years and have been applied in private sectors including cargo delivery, inspections of infrastructure and facilities, ground target monitoring, surveying, area mapping, and others [1–4]. In addition, UAVs are becoming popular among recreational hobbyists. The four-propeller UAV, or quadcopter, is the most common type due to its stability during flight and ease of operation [5]. Even though UAV operation is not complicated, accidents are common due to human errors, machine failure, and poor weather. Among these causes, human errors have been blamed as the leading cause of UAV accidents [6]. UAV accidents could jeopardize property and people on the ground and have become one of the major issues of public safety [7].
UAV flight is becoming popular. Scientists endeavor to investigate the behaviors of UAV operators so as to reduce human errors and thus prevent UAV accidents. Dixon et al. [8] studied the effect of UAV automation on work efficiency and workload reduction for multiple UAV flights. Liu et al. [9] studied the effect of time pressure on UAV operator performance. Lin et al. [10] studied the psychological states of participants performing simulated UAV operations. Li et al. [5, 11] analyzed the distance of the line-of-sight (LOS) of operators when operating a small drone. Peng and Li [12] investigated the perceived difficulty and flight information access of novice drone operators in performing flight missions. Even with these studies, further scientific investigations are still required to enhance our understanding of human behaviors in UAV operations.
Mental workload is one of the most important terms in understanding human performance [13]. It is related to task demand, mental effort, and performance [14]. Appropriate mental workload enables an operator to perform effectively and efficiently. Mental underload may cause boredom and monotony while mental overload may lead to mental fatigue and even human errors, which could eventually lead to an accident [15, 16].
There are many approaches to assess mental workload. Subjective ratings are widely used because of their non-intrusiveness, ease of use, and low cost [17, 18]. Subjective measures have been adopted to evaluate the mental workload of human pilots [19, 20], drivers [14], train operators [21], power plant operators [22, 23], and electronic factory workers [24]. There are many subjective rating scales for measuring mental workload. The NASA Task Load Index (TLX) is probably the most commonly used one [18, 25]. It is a multi-dimensional subjective rating tool that encompasses six dimensions: mental demand (MD), physical demand (PD), temporal demand (TD), effort (EF), performance (PL), and frustration level (FL) [18–27]. Both the total raw score [27–29] and the weighted score [20, 30] of the TLX may be used to indicate the mental workload of the participants.
Unlike the TLX, the Cooper-Harper (CH) scale is a unidimensional mental workload assessment tool. It was originally developed to measure pilots’ subjective ratings of aircraft controllability [31]. A revised version of this scale (Modified Cooper-Harper scale or MCH) [28, 32] was proposed later to include a mental workload component in a decision tree to indicate the operator’s mental demand level. The MCH rating ranges from 1 to 10, where 1 corresponds to “minimal mental effort is required and desired performance is easily attainable”, 6 corresponds to “maximum mental effort is required to attain adequate system performance”, and 10 corresponds to “instructed task cannot be accomplished reliably.”
Mental workload may also be measured objectively by examining the electrocardiogram (ECG) data of the participants. A typical ECG includes a P-wave, a QRS complex, followed by a T-wave and a U-wave, each representing different de- and repolarization phases within the cells of the heart muscle. The interbeat interval (IBI), or R-R interval, is the time between two consecutive heartbeats measured on the QRS complex at peak R. This time-domain measure is one of the commonly used heart rate variability (HRV) measures. The IBI decreases when mental workload increases [28, 34]. In addition to IBI, heart rate (HR) has also been adopted to measure mental workload [34]. The literature indicates that HR increases when mental workload increases [32, 34].
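The relation between R-peak times, IBI, and HR described above can be illustrated with a minimal sketch. The function names and the R-peak timestamps are hypothetical and are not data from this study; HR is taken here as the mean of the instantaneous rates implied by each IBI, which is one common convention among several.

```python
from statistics import mean

def interbeat_intervals_ms(r_peak_times_s):
    """Return IBIs (ms) between consecutive R-peak timestamps given in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]

def mean_heart_rate_bpm(ibis_ms):
    """Mean of the instantaneous heart rates (beats/min) implied by each IBI."""
    return mean(60000.0 / ibi for ibi in ibis_ms)

# Hypothetical R-peak times (s), for illustration only.
r_peaks = [0.00, 0.80, 1.55, 2.30, 3.10]
ibis = interbeat_intervals_ms(r_peaks)

print([round(x) for x in ibis])            # IBIs in ms
print(round(mean_heart_rate_bpm(ibis), 1))  # mean HR in beats/min
```

Shorter IBIs yield a higher HR, which is why the two measures are expected to move in opposite directions as mental workload changes.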
A substantial proportion of UAV accidents (21% to 67%) has been attributed to human factor issues [35]. These numbers show that there is room for improvement in safety if human factor issues can be identified and tackled [7]. When operating a UAV, the operator needs to employ mental effort to manipulate the vehicle to meet mission demands. Operating performance deteriorates, potentially leading to an accident, if the mental demand exceeds a certain level. In order to improve the safety, health, comfort, and performance of the operator, it is essential to measure the mental workload of UAV operators under various flight scenarios. There is a limited amount of research studying the mental workload of UAV operators. Merrel [36] compared three control interface methods of a small drone. He found that the LOS control method was superior to the first-person-view and video-aided controls in terms of flight performance and mental workload. Cummings et al. [37] revised the original CH scale to emphasize the evaluation of unmanned vehicle displays (HVD). They claimed that their revision shifted the emphasis from physical control of the aircraft to the information display of the controller. There were also mental workload studies investigating multiple vehicle control [8], military pilot activities [38], and simulator training of UAV operations [39]. However, more research on mental workload in real UAV flight missions is still required.
This research aims to measure the mental workload and flight performance of novice UAV operators in performing real ground target photo-taking missions. Gender, age, flight route, and flight altitude were selected as the factors or independent variables. The former two were operator attributes while the latter two were mission attributes. The hypotheses of this study were that these independent variables have significant effects on both flight performance and mental workload of the novice operators. The objectives of this study were to test these hypotheses. In addition, we compared the results of subjective and objective measures of mental workload and discussed their applicability in differentiating the mental workload of drone operators performing photo-taking missions.
Methods
To accomplish the objectives of the study, a real flight experiment was performed to measure the mental workload and flight performance of UAV operators.
Participants
Twenty healthy adults, including 10 males and 10 females, participated in this study. All of them were students (14) or staff (6) of the university where the authors served. The mean (±std) ages of the female and male participants were 32.1 (±10.3) and 27.7 (±11.1) yrs, respectively. Their digital corrected visions of the left and right eyes were 0.9 (±0.3) and 0.8 (±0.4), respectively. All the participants reported having normal color vision and hearing functions. None of the participants had prior experience in flying a UAV. The participants were divided into two age groups of equal sample size (10). The ages of groups one and two were 21.0 (±1.8) and 38.8 (±7.5) yrs, respectively.
Unmanned aerial vehicle
A quadcopter (DJI®, Phantom 3 professional, Shenzhen, China) was adopted. The height and diagonal size (including propellers) of this quadcopter are 28 cm and 59 cm, respectively. The weight (including battery and propeller) is 1,368 grams. A remote controller was adopted to control the flight of this quadcopter. An iPhone® 7 Plus smartphone was adopted as the display of the remote control. DJI® GO was the application providing a live video transmission feed from the drone to the smartphone. This app showed the local image captured by the onboard camera. In addition, it also displayed real time flight information such as the horizontal distance of the drone to the take-off spot, altitude above ground level (AGL), horizontal and vertical speeds, status of the drone, intensities of the Global Positioning System (GPS) and radio signals of the remote controller, and remaining power (%).
Test-site
The test site was in the stadium of a university campus (see Fig. 1). Before each flight, we used the UAV Forecast app [40] on a smart phone to assess local weather conditions. This app showed that the local wind speed at 75 m AGL, visibility, humidity, and cloud coverage were 16.5 (±11.7) km/h, 16.0 (±0.0) km, 65.8 (±15.7) %, and 67.9 (±34.0) %, respectively. The local illuminance on the ground was 39,800 (±23,193) lx. This was measured using a light meter (Trans Instruments Pte. Ltd., Petro Centre, Singapore).

Test site and flight path of the photo-taking mission: The red arrow is the location of the drone, the circle on the bottom is the take-off and landing spot.
The take-off (also the landing) spot at the test site was in the middle of a circle marked on the ground (Φ1.98 m, in Fig. 1) in the stadium. Target 1 (T1) was a circular cooling machine on the top of building D (approximately 177 m from the take-off spot). Target 2 (T2) was a circular window on the top of building M (78 m from T1). Target 3 (T3) was also a cooling machine on the top of building L (approximately 152 m from T2 and 202 m from the take-off spot, respectively). The total flight distance from the take-off point to T1, T2, T3, and then back to the landing point was approximately 609 m. However, the actual flight distances varied and depended on the flight route of each operator in each trial.
The first flight route (route 1) ran from the take-off spot to T1, then to T2, next to T3, and finally back to the take-off spot for landing. The second flight route (route 2) was the reverse of route 1: from the take-off spot to T3, then to T2, next to T1, and finally back to the take-off spot for landing. In each flight, the participant started the rotors and had the UAV take off, rise to an assigned altitude (40 m or 80 m AGL), and then fly via route 1 or 2 to complete the trial.
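As a check on the nominal route length and the resulting experimental conditions, the following sketch sums the approximate leg distances reported above and enumerates the four route-by-altitude conditions. The waypoint label "H" for the take-off and landing spot, the function names, and the use of a random shuffle to order the trials are illustrative assumptions; the text states only that the order was randomly determined.

```python
from itertools import product
import random

# Approximate leg lengths (m) reported in the text; "H" is the take-off/landing spot.
LEG_M = {("H", "T1"): 177, ("T1", "T2"): 78, ("T2", "T3"): 152, ("T3", "H"): 202}

def leg_length(a, b):
    """Look up a leg length regardless of flight direction."""
    return LEG_M[(a, b)] if (a, b) in LEG_M else LEG_M[(b, a)]

def route_length(waypoints):
    """Sum the nominal leg lengths along a sequence of waypoints."""
    return sum(leg_length(a, b) for a, b in zip(waypoints, waypoints[1:]))

ROUTES = {1: ["H", "T1", "T2", "T3", "H"], 2: ["H", "T3", "T2", "T1", "H"]}
ALTITUDES_M = [40, 80]

for route_id, waypoints in ROUTES.items():
    print(route_id, route_length(waypoints))  # both routes: 609 m nominal

def trial_order(rng):
    """Return the four (route, altitude) conditions in a random order (assumed scheme)."""
    conditions = list(product(ROUTES, ALTITUDES_M))
    rng.shuffle(conditions)
    return conditions

print(trial_order(random.Random(0)))
```

Because route 2 is the reverse of route 1, both routes share the same nominal length of approximately 609 m, so any difference in flight time between them cannot be attributed to nominal distance.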
Flight performance and mental workload measurements
The flight time, or the time to complete the trial, was measured to indicate the flight performance. Both subjective and objective measurements of the mental workload of the participants were collected. The subjective measurements were collected using the NASA-TLX and MCH scales. Instead of the 0 to 100 scales, we adopted a rating between 0 and 10 for each of the TLX dimensions so that the ratings would be consistent with those of the MCH. The weighted total TLX score was calculated based on pairwise comparisons of the six dimensions by each participant [25]. Both HR and IBI were measured as the objective measurements of mental workload. These data were collected using a Polar V800 device (Polar Electro OY, Kempele, Finland). This device consists of a heart rate sensor (Polar H7) attached to a chest strap and a smart watch. The validity of the Polar H7 sensor in measuring HR and IBI has been confirmed in the literature [41]. To measure the HR and IBI, the sensor was attached to the chest, and the smart watch was worn on the wrist of the participant. The data acquisition rate was 1,000 Hz. The data collected by the sensor were transmitted to the smart watch via Bluetooth signals. The data in the smart watch were then uploaded onto a computer to be processed after the trial.
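The weighted TLX calculation can be sketched as follows, assuming the standard scheme in which each dimension's weight is the number of the 15 pairwise comparisons it wins, normalized by 15. The participant's preference ranking and the 0 to 10 ratings below are hypothetical values chosen for illustration, and the function names are not from the original study.

```python
from itertools import combinations

DIMENSIONS = ["MD", "PD", "TD", "PL", "EF", "FL"]

def tlx_weights(winners):
    """Normalized weights from the 15 pairwise comparisons.

    `winners` maps each unordered pair of dimensions to the one judged
    more important; a dimension's weight is its win count divided by 15.
    """
    pairs = list(combinations(DIMENSIONS, 2))
    counts = {d: 0 for d in DIMENSIONS}
    for pair in pairs:
        counts[winners[pair]] += 1
    return {d: counts[d] / len(pairs) for d in DIMENSIONS}

def weighted_tlx(ratings, weights):
    """Weighted total score: sum of rating x normalized weight over the six dimensions."""
    return sum(ratings[d] * weights[d] for d in DIMENSIONS)

# Hypothetical participant who always prefers the dimension earlier in this ranking.
ranking = ["PL", "TD", "MD", "EF", "FL", "PD"]
winners = {p: min(p, key=ranking.index) for p in combinations(DIMENSIONS, 2)}

# Hypothetical raw ratings on the 0-10 scale used in this study.
ratings = {"MD": 6, "PD": 3, "TD": 4, "PL": 4, "EF": 6, "FL": 3}

weights = tlx_weights(winners)
print(round(weighted_tlx(ratings, weights), 2))  # 4.6
```

Since the normalized weights sum to 1, the weighted score stays on the same 0 to 10 scale as the raw ratings, which keeps it directly comparable with the MCH rating.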
Procedure
Before the first trial of each participant, a mentor gave the participant a brief lecture on the use of the UAV. The participant had the opportunity to practice the flight operation under the guidance of the mentor. The practice was performed with an altitude of 20 m AGL and a distance within 30 m of the drone to the take-off spot. The participant was requested to get familiar with take-off and landing switches, the joystick controls, and the photo-taking function of the UAV.
After the practice, the participant put on the Polar chest strap and the smart watch. The participant then performed the first trial assigned. The participant piloted the UAV to take off, climb to the assigned altitude, fly to the first target, and take a photo of it. He or she then had the UAV fly to the second target and the third target, and took a photo of each of them. Before a photo was taken, the participant needed to make fine adjustments with the joysticks for target reach. Target reach occurred when the target coincided with the cross of the diagonal grid in the camera window (see Figs. 2, 3, and 4). This was confirmed by the mentor. After taking a photo of the third target, the participant had the UAV return and land inside the circle of the take-off spot. The flight time, heart rate, and IBI during the flight were recorded. The participants were required to complete the pairwise comparisons of the weighting procedure, the ratings of the individual TLX dimensions, and the MCH rating immediately after the flight. After completing the questionnaire, the participants performed the second trial. The third and fourth trials were conducted in another session at least one day later. There were four trials (2 flight routes and 2 altitudes) for each participant. The order of the trials for each participant was randomly determined. The weighting procedure was done only after the first trial for each participant. The weights for each TLX dimension were calculated later and were used for all four trials.

Target reach at T1 at 40 m AGL.

Target reach at T2 at 40 m AGL.

Target reach at T3 at 40 m AGL.
The heart rate and IBI were processed using the KUBIOS HRV standard software (version 3.4, Kubios Oy, Kuopio, Finland). The R-R interval series of this software provided artefact correction, sample selection and trend removal options. By using the artefact correction option, artefacts due to ectopic beats and missed beat detections could be corrected by choosing an appropriate correction level, which removed the artefacts but did not distort normal R-R intervals. When the corrections were applied, detected artefact beats were replaced using cubic spline interpolation [42].
In addition to HR and IBI, the flight time, weighted total TLX, raw scores of the individual dimensions of the TLX, and MCH score were also dependent variables. Descriptive statistics were calculated. Analyses of variance (ANOVA) were conducted to examine the significance of gender, age group, flight route, and altitude on the dependent variables. The least significant difference (LSD) test was adopted for post hoc multiple comparisons of treatment means. Pearson correlation coefficients were calculated to quantify the correlations among the variables. A significance level of α = 0.05 was adopted. Statistical analyses were performed using the IBM SPSS® 20 software (Armonk, NY, USA).
Results
Pairwise t-test results indicated that the difference in age between male and female participants was not statistically significant. The difference in age between the two age groups, on the other hand, was significant (p < 0.0001). The age of group two was significantly higher than that of group one.
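These age comparisons can be roughly reproduced from the summary statistics reported in the Participants section. The sketch below uses Welch's t statistic for two independent groups, which is an assumption about the test form rather than the procedure confirmed by the study; with equal group sizes (n = 10 each) it coincides with the pooled-variance statistic.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic for two independent groups from summary statistics."""
    return (mean2 - mean1) / sqrt(sd1**2 / n1 + sd2**2 / n2)

# Age group 1 vs. age group 2 (means and SDs in years, as reported in Participants).
t_age_group = welch_t(21.0, 1.8, 10, 38.8, 7.5, 10)

# Male vs. female participants.
t_gender = welch_t(27.7, 11.1, 10, 32.1, 10.3, 10)

print(round(t_age_group, 2), round(t_gender, 2))
```

The statistic is about 7.3 for the two age groups and about 0.92 for gender. Against the two-tailed critical value of roughly 2.10 for 18 degrees of freedom at α = 0.05, only the age-group difference is significant, which agrees with the results stated above.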
Flight time
The overall mean (±std) flight time was 548.7 (±184.6) s. The ANOVA results for this variable indicate that the effects of flight route, altitude, and gender were all insignificant. The effects of age group (p < 0.01) and the interaction of age group and gender (p < 0.05) were significant. The LSD test results indicated that the flight time of age group 2 (597.7±181.4 s) was significantly (p < 0.05) longer than that of group 1 (499.7±176.6 s). Fig. 5 shows that the difference between the two age groups was primarily attributable to the difference between the two age groups of female participants.
NASA TLX and MCH results

Flight time of age groups for male and female participants.
The weights of the individual dimensions of the TLX for the participants are shown in Table 1. These weights were adopted to calculate the weighted TLX total score, or simply weighted TLX, for each participant. Table 2 shows the scores of the individual dimensions of the TLX, weighted TLX, and MCH under the experimental conditions. The overall mean (±std) of the weighted TLX score was 4.71 (±1.29). The ANOVA results indicate that the effects of gender, flight route, and altitude on the weighted TLX score were all insignificant. The effects of age group (p < 0.05) and the interaction of age group and gender (p < 0.05) were significant. The LSD test results indicate that age group 2 (4.99±1.08) had a significantly higher weighted TLX score than group 1 (4.44±1.43). Figure 6 shows the means and standard deviations of the weighted TLX scores categorized by age group and gender.
Normalized weights of the individual dimension of the TLX for the participants
Note: Different super script letters of the average of the normalized weight of the individual dimensions indicate they are significantly different at α= 0.05 level.
Raw score of individual dimension of NASA TLX, weighted TLX, and MCH results under experimental conditions
1,2 are for age groups 1 and 2, respectively. MD, PD, TD, PL, EF, and FL are the raw scores of mental demand, physical demand, temporal demand, performance, effort, and frustration level, respectively. *p < 0.05 between age groups; **p < 0.01 between age groups. For the six individual dimensions, both MD and EF scores were significantly (p < 0.05) higher than the scores of all other dimensions.

Weighted TLX scores of male and female participants in the two groups.
Table 3 shows the means and standard deviations of the six dimensions. Pairwise t-test results indicate that the raw scores of both MD (6.40±2.29) and EF (6.35±2.08) were significantly (p < 0.05) higher than those of the other dimensions. The raw score of FL (3.07±2.29) was significantly (p < 0.05) lower than those of all other dimensions.
The ANOVA results of the individual dimensions of the TLX indicate that the effects of flight route, altitude, gender, and age group were all insignificant on the raw scores of MD, PD, and PL. On the other hand, raw score of TD was significantly affected by gender (p < 0.01) and age group (p < 0.0001). The LSD test results showed that the raw TD score of males (3.42±2.20) was significantly (p < 0.05) lower than that of females (4.75±2.37). The raw TD score of age group 2 (5.53±1.58) was significantly (p < 0.05) higher than that of group 1 (2.65±2.15). The raw score of EF was significantly (p < 0.001) affected by gender and interaction effects of gender and age group (p < 0.01). The LSD test results indicate that males showed significantly (p < 0.05) higher raw EF score (7.15±2.09) than that of their female counterparts (5.55±1.76). Figure 7 shows the interactions of age group and gender on the raw EF score. The raw FL score was also significantly (p < 0.0001) affected by age group. The LSD test results indicate that the raw FL score of age group 2 (4.25±2.12) was significantly (p < 0.05) higher than that (1.90±1.83) of group 1.
Raw scores of the individual dimensions of the TLX across all experimental conditions
Note: n = 80 (20 participants×4 trials); different super script letters indicate they are significantly different at α= 0.05.

Raw score of effort of male and female participants in the two age groups.
The mean and standard deviation of the MCH were 3.73 and 1.97, respectively. The ANOVA results indicate that gender, flight route, and altitude were all insignificant to MCH. The effect of age group was, however, significant (p < 0.01). The MCH of group 2 (4.17±2.14) was significantly higher than that (3.27±1.71) of group 1.
Table 4 shows the means and standard deviations of HR and IBI under the experimental conditions. The overall mean (±std) heart rate and IBI during the trial were 97.3 (±15.7) beats per minute and 643.1 (±198.4) ms, respectively. Neither of these two variables was significantly affected by gender, flight route, altitude, or age group.
Heart rate (beat/min) and inter-beat interval (ms) results under experimental conditions
1,2 are for age groups 1 and 2, respectively.
The Pearson correlation coefficient between the weighted TLX and MCH scores was 0.37 (p < 0.001). The correlation coefficients between flight time and weighted TLX score and between flight time and MCH score were 0.47 and 0.42 (both p < 0.0001), respectively. The correlation coefficient between age and raw TLX score was 0.45 (p < 0.0001), but the correlation coefficient between age and weighted TLX score was insignificant (r = 0.20, p = 0.07). The correlation coefficient between age and MCH score was also insignificant (r = 0.12, p = 0.27). The correlation coefficients between IBI and the scores of the individual dimensions of the TLX were low (–0.2 < r < 0.16) and all insignificant (p > 0.05). The correlation coefficients between HR and the TLX and MCH scores were both insignificant. The correlation coefficients between HR and the scores of the individual dimensions of the TLX were also low and insignificant, with one exception: TD was the only dimension whose correlation with HR reached the α = 0.05 significance level (r = 0.25, p < 0.05). Table 5 shows the correlation coefficients between the individual dimensions of the TLX and age, flight time, and MCH score. In this table, the column for the EF score was omitted because none of the correlation coefficients in that column was significant.
Correlation coefficients between the raw scores of the individual dimensions of the TLX and age, flight time, and MCH
– not significant; *significant at p < 0.05; **significant at p < 0.01; ***significant at p < 0.0001.
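The significance of the reported correlation coefficients can be checked with the usual t transformation of a Pearson r. The sketch below assumes n = 80 observations (20 participants × 4 trials, as in the note to Table 3) for every coefficient and uses an approximate two-tailed critical value for 78 degrees of freedom; both are assumptions for illustration, and the helper names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic (df = n - 2) for testing whether a Pearson r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

T_CRIT_DF78 = 1.99  # approximate two-tailed critical value, alpha = 0.05, df = 78

# Correlation coefficients reported in the text; n = 80 is assumed for each.
reported = [("flight time vs. weighted TLX", 0.47),
            ("age vs. weighted TLX", 0.20),
            ("age vs. MCH", 0.12)]

for label, r in reported:
    t = t_from_r(r, 80)
    print(label, round(t, 2), abs(t) > T_CRIT_DF78)
```

Under these assumptions, r = 0.47 gives a t value well above the critical value, whereas r = 0.20 and r = 0.12 fall below it, matching the pattern of significant and insignificant coefficients reported above. Repeated trials from the same participant are not independent, so this check is only indicative.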
Route and altitude of the flight
UAVs have been used to inspect facilities from the air. For this purpose, inspectors may operate a UAV to take high-resolution photos of critical parts and analyze the photos in the office. In this study, the participants were requested to complete photo-taking missions under two flight route and two altitude conditions. These missions were real even though the flight distance (approximately 600 m) and time (less than 10 min, on average) might be shorter than those of real photo-taking missions performed by a professional UAV operator. The literature indicates that flight route could have an impact on flight performance [43, 44]. We anticipated that the flight time could differ between the two flight routes tested. Such a difference could be attributed to the fact that the first target (T1) for route 1 was visible to the participant at the take-off spot. This made it easy for him or her to search for and reach the target. For route 2, the first target (T3) was not visible to the participant at the take-off spot, so there was no such advantage. The difference between the flights at the two altitudes was twofold. First, a higher altitude required more time to climb and descend than a lower one. Second, target reach at a higher altitude was more difficult than at a lower one because ground targets appeared smaller. More effort was required to make the fine adjustments needed for a smaller ground target to coincide with the cross in the camera window. Contrary to our expectations, neither flight route nor altitude significantly affected the flight time or any of the subjective and objective measures of mental workload. Both flight route and altitude played little role in the mental workload and flight performance of our novice operators in the photo-taking missions. The following discussion therefore focuses on the two variables related to operator attributes.
Age, temporal demand, and frustration
As successful photo-taking was required in each trial, flight time was used to represent the performance of the participants. The older participants in general were more cautious, and thus needed more time, in maneuvering the vehicle during the trial than their younger counterparts. They could suffer more time pressure and frustration especially when operating the joystick for fine adjustment for target reach. This was supported by the correlation coefficients between age and raw scores of TD (r = 0.60, p < 0.0001) and FL (r = 0.44, p < 0.0001) (see Table 5). It was also likely that the younger participants might get familiar with the joystick control faster than their older counterparts and thus could have better control of the vehicle during the tests. The participants in age group 2, therefore, had significantly higher flight time than group 1. The implication of the significance of age group on both the weighted TLX and MCH scores was that different UAV training programs should be developed for trainees of different ages so as to achieve better training effectiveness.
Weighting procedure of the TLX and weights of the individual dimensions
The weights of the individual dimension of the TLX were determined via a pairwise comparison procedure recommended in the literature [18]. A recent publication [45] has pointed out that there were problems with this procedure. The first one was that the participant could be forced to give different weights on two dimensions even if s/he felt the two dimensions were equally important. If the participant wanted to make sure two or more dimensions got the same weight, s/he must deliberately make inconsistent pairwise comparisons. This might not be reasonable. Another problem with this weighting procedure was that the range of the weight for any dimension was between 0 and 0.33, regardless of the results of the pairwise comparisons. If any dimension received a weight of 0, then the weighted TLX was no longer a six-dimension scale. This contradicts the original design of the NASA TLX. Even with these challenges, NASA still believes the weighting procedure is valid in providing the weighted TLX and did not alter its recommendations on this procedure. More studies are required to tackle the above-mentioned weighting problems.
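Both weighting problems can be demonstrated by enumeration. The sketch below assumes the standard 15 pairwise comparisons among the six dimensions; the function names and the example preference rule are hypothetical. It shows that every fully consistent (transitive) set of answers produces the same set of win counts, 0 through 5, so that the weights are always distinct, one dimension always receives a weight of 0, and the largest weight is 5/15 ≈ 0.33.

```python
from itertools import combinations, permutations

DIMS = ["MD", "PD", "TD", "PL", "EF", "FL"]

def win_counts(prefers):
    """Count pairwise wins; prefers(a, b) returns the dimension judged more important."""
    counts = dict.fromkeys(DIMS, 0)
    for a, b in combinations(DIMS, 2):
        counts[prefers(a, b)] += 1
    return counts

# Every fully consistent set of answers corresponds to a strict ranking of the six dimensions.
patterns = set()
for ranking in permutations(DIMS):
    counts = win_counts(lambda a, b: min(a, b, key=ranking.index))
    patterns.add(tuple(sorted(counts.values())))
print(patterns)  # {(0, 1, 2, 3, 4, 5)}

# Making MD, PD, and TD tie requires an intransitive cycle among them.
base = ["MD", "PD", "TD", "PL", "EF", "FL"]

def cyclic(a, b):
    if {a, b} == {"MD", "TD"}:
        return "TD"  # TD beats MD although MD beats PD and PD beats TD
    return min(a, b, key=base.index)

print(win_counts(cyclic))  # MD, PD, and TD each win 4 comparisons
```

In other words, equal weights can only arise from answers that contradict one another, which is the inconsistency described above.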
Table 1 shows that, on average, the participants felt that PL was the most important of the six dimensions of the NASA TLX, followed by TD and then MD. FL and PD were ranked fifth and last, respectively. All the participants could complete the photo-taking missions after a brief onsite training. This implies that the operation of the drone in the current study was relatively easy, so the participants felt frustration was relatively unimportant. The lowest weight of PD among all dimensions was attributed to the fact that only finger movement on the remote controller was required to navigate the drone. The contribution of this dimension to mental workload was, therefore, the lowest.
Weighted TLX and MCH
Flight time was positively correlated with both the weighted TLX (r = 0.47, p < 0.0001) and MCH (r = 0.42, p < 0.0001). This implies that the participants who had higher subjective mental workload scores spent more time completing the trial. This supports the argument that high mental workload could lead to poor flight performance. Based on the observation of the mentor, the participants generally felt that taking off and flying toward a target were easy, while the mental workload and effort required for target reach and landing inside the circle were moderate. The overall weighted TLX and MCH scores were 4.71 (±1.29) and 3.73 (±1.97), respectively. The MCH scale is unidimensional. An MCH score between 3 and 4 indicates a difficulty level between “fair, mild difficulty” and “minor but annoying difficulty” and an operator demand level between “acceptable” and “moderate” mental effort required to attain adequate performance [32]. The weighted TLX, on the other hand, is a multi-dimensional scale. Table 3 shows that the scores of both MD and EF were higher than the scores of the other dimensions and the scores of PL and FL were lower than those of the others. These scores indicate that the mental workload and effort level of the participants were moderate while their perceived performance was acceptable to good, and their frustration level was relatively low. The results of the weighted TLX and the MCH scores were consistent. This was also consistent with the finding in the literature [28] that both the weighted TLX and MCH could differentiate the mental workload of manned simulated flights at different flight performance levels.
The advantage of using the MCH is that this scale has only one dimension. The participant could complete this rating very quickly. The participants, on the other hand, had to spend a little more time completing the dimensions of the TLX. Nevertheless, the six dimensions of the TLX relate directly to the corresponding human mental resources and enable discrimination of the contributions of these dimensions to the overall score. This was consistent with the literature [46]. It should be noted that the low/high ratings of all the dimensions in the weighted TLX correspond to the low/high levels of the dimensions, except for PL. Selecting the lowest point on the scale for this dimension indicates that the participant perceived that he or she had a perfect performance. Most participants marked this rating incorrectly and made corrections after a reminder from the research personnel. Anyone who uses the TLX needs to pay special attention to this rating.
HR and IBI
In addition to the weighted TLX and MCH, both HR and IBI were collected to quantify the mental workload of the participants. However, neither of these variables was affected by any of our independent variables. This was inconsistent with our findings for the weighted TLX and MCH scores. It implies that HR and IBI might not be as sensitive as the subjective ratings we measured in differentiating the levels of mental workload between the two age groups of our participants. The flight time of our trials (less than 10 min on average) was much shorter than that of manned flights. Task duration is one of the critical components of mental workload. This might be the reason for the insignificance of our independent variables on HR and IBI.
Limitations of the study
The UAV Forecast app [40] was adopted to check the weather conditions before each trial. A flight was allowed only if the app showed a “good to fly” status. This made scheduling of the trials difficult and thus limited our sample size because of the difficulty in recruiting human participants. Even if the weather was “good to fly” for all the trials, the weather conditions, especially the gust speed, might differ somewhat from one trial to another. The variations in weather conditions could affect both our flight time and mental workload results. Unfortunately, how such variations affected our data was not clear. This is also a limitation of this study. Another limitation was that the task complexity of our missions might not be sufficient to distinguish levels of mental workload. In real ground target photography or monitoring tasks, the number of targets may be much larger than ours and the flight time may be much longer. The targets may not be easily detectable and visible from the air due to their size, shape, and color. The mental workload in real missions could, therefore, be much higher than that in this study. In future studies, more complicated missions should be planned and tested, and the stress of the operators should be investigated [47]. In addition, even if the participants had no prior experience in flying a UAV, other experience with joysticks, such as playing video games or controlling a radio car, could help some of the participants in manipulating the remote controller of the UAV. Future research may examine the effect of prior joystick experience on the mental workload and flight performance of UAV operations.
Conclusions
Our hypotheses were partially supported. The effects of flight route, altitude, and gender were insignificant on flight time, weighted TLX, and MCH scores. Age significantly affected flight time, weighted TLX, and MCH score. Neither heart rate nor IBI was suitable for assessing the mental workload of UAV flights in our tests, as they were not sensitive enough to differentiate the mental workload for any of our independent variables. Flight time, the variable used to indicate flight performance, was positively correlated with the weighted TLX and MCH scores, indicating that high mental workload could lead to long flight time and thus poor flight performance.
Ethical approval
Jen-Ai Hospital in Taiwan approved the study (JAH108-86).
Informed consent
All participants read and signed an informed consent form prior to joining the study.
Acknowledgments
The authors thank Wirapatcha Nanthapaphaporn, a former graduate student at the Department of Industrial Management of Chung Hua University, for her assistance.
Conflict of interest
None of the authors have any conflicts of interest to report.
Funding
The study was partially supported by the Ministry of Science & Technology of the ROC (Contract MOST 109-2221-E-216-003-MY3).
