Abstract
BACKGROUND:
Despite offering many benefits, the direct manual anthropometric measurement method can be problematic due to its vulnerability to measurement errors.
OBJECTIVE:
The purpose of this literature review was to determine whether or not the currently published anthropometric studies of school children, related to ergonomics, mentioned or evaluated the variables precision, reliability or accuracy in the direct manual measurement method.
METHODS:
Two bibliographic databases and the bibliographic references of all the selected papers were used to find relevant published papers in the fields considered in this study.
RESULTS:
Forty-six (46) studies met the criteria previously defined for this literature review. However, only ten (10) studies mentioned at least one of the analyzed variables, and none evaluated all of them. Reliability was the only variable evaluated, and only in three papers. Moreover, regarding the factors that affect precision, reliability and accuracy, the reviewed papers presented large differences. This was particularly clear in the instruments used for the measurements, which were not consistent throughout the studies. Additionally, there was a clear lack of information regarding the evaluators’ training and the procedures for anthropometric data collection, which are assumed to be the most important issues affecting precision, reliability and accuracy.
CONCLUSIONS:
Based on the review of the literature, it was possible to conclude that the anthropometric studies considered had not focused on the analysis of precision, reliability and accuracy of the manual measurement methods. Hence, with the aim of avoiding measurement errors and misleading data, anthropometric studies should devote more effort and care to testing measurement error and defining the procedures used to collect anthropometric data.
Introduction
Anthropometry is the branch of the human sciences that deals with body measurements: measurements of size, shape, strength and working capacity [1]. Anthropometric data are essential for applying ergonomic principles to the design and improvement of a wide range of products for different users [2–4]. In school environments, anthropometry has become an important discipline, as it provides the relevant anthropometric characteristics of students, which in turn provide critical information for school furniture design [5, 6]. When the correct anthropometric data and sample population are not considered, a mismatch between anthropometric dimensions and school furniture may occur, which could ultimately result in the development of musculoskeletal disorders among the students and in other problems related to the learning process [7–9]. Additionally, if school furniture is not locally designed, importers should ensure that the appropriate anthropometric data were considered, so that imported school furniture fits the intended use and users [10]. On the other hand, when employed correctly, anthropometric data yield very satisfactory results. As mentioned by Castellucci et al. [11], there is a consensus among the published studies that a change in school furniture dimensions (for better fit or match) resulted in postural improvements, less muscular effort and less reported discomfort/pain. Furthermore, children’s anthropometrics can also be used for safety and regulation purposes [12].
There are several methods of collecting anthropometric data (e.g., 1D direct manual measurement method, 2D photography methods and 3D scanning methods), each one with its inherent limitations. The most used method is the direct manual measurement method, where measurements are collected using a fairly wide range of equipment (e.g. anthropometers, calipers and measuring tapes). Despite offering many advantages (low cost, easy to perform, little equipment required), the direct manual measurement method can be problematic due to its vulnerability to measurement errors [13]. Issues that may lead to variability in the data and subsequent errors include the need for: (i) careful equipment calibration; (ii) trained measurers; (iii) multiple measurement acquisition (repetitions); and (iv) participants’ agreement [14, 15]. In addition, several factors can contribute to errors during the measurement process, such as: (i) changes in the participant’s posture throughout the process; (ii) variations in the pressure exerted by the measuring devices; and (iii) identification of the location of the body landmarks on the participant’s body by the measurer.
Regardless of the method used, it is crucial that the collected data are, as far as possible, free of errors, reliable and precise. Hence, measurer error should be evaluated and explicitly described. If the dimensions have high levels of error, all the subsequent findings of that particular study will be affected. There are many ways to assess the collected data to identify possible errors. The most common, in the field of anthropometry, are the determination of precision, reliability and accuracy, and their importance has already been frequently studied [16, 17]. However, reports on physical measurements in human populations frequently do not include estimates of measurement errors [18]. To limit the variability of the measures and reduce measurement error, the International Organization for Standardization (ISO) has developed standards [19, 20] that provide a description of anthropometric measurements, instruments, standard postures, clothing and measurer training. These standards can serve as a guide for ergonomists who are required to apply their knowledge to the geometric design of workplaces (including schools), and they make it possible to compare anthropometric data from different international populations. Furthermore, ISO 15535 [20] also states that “frequent and regular measurer training and quality control shall be carried out by persons experienced in anthropometry, in order to ensure acceptable standards of accuracy. Repeated measurement data should be recorded. Inter- and intra-measurer standard error of measurement, or mean absolute difference, shall be calculated and recorded for all anthropometric variables, in order that random checks can be carried out on the measuring teams during the survey” (p. 4).
Ulijaszek and Kerr [21] report that various terms are used to describe anthropometric measurement error, such as: unreliability, imprecision, undependability, inaccuracy, precision, accuracy, validity, reliability, repeatability, reproducibility and bias. The published scientific literature thus uses different terminology to define anthropometric measurement error. However, the effects of measurement error on the quality of data fall mainly into two categories: (i) the extent to which a measure departs from its true value and (ii) the extent to which repeated measures give the same value [21]. In this respect, the following definitions will be considered in the current paper:
The purpose of this paper was to determine whether the currently published anthropometric studies of school children, related to ergonomics, mentioned or evaluated the variables precision, reliability and/or accuracy in the direct manual measurement method.
Method
A literature review was conducted to achieve the outlined goals for this research. This methodology, besides being replicable and scientifically transparent, is also very useful to generate a basic framework for an in-depth analysis of the existing literature [27]. Prior to the literature review a scoping study (i.e. exploratory review) of child related anthropometry was conducted to clarify the basis of the topics and to define the key concepts for this review’s research question [28].
The research question formulated for this study was generated according to the PICO (Population, Intervention, Control, Outcomes) framework [29, 30], as follows: Have the currently existing studies that collect anthropometric data (I) of school children, related to ergonomics (P), mentioned and/or evaluated precision, reliability or accuracy of the direct manual measurement method (C) to ensure the quality of the results by avoiding measurement errors (O)?
Two bibliographic databases, Scopus and PubMed, were used to find relevant papers published in the field of anthropometric studies for ergonomics purposes involving school students. These databases were selected because they cover a wide range of research areas and the most relevant peer-reviewed journals in the area of ergonomics [31]. Furthermore, the bibliographic references of all the selected papers were also individually analyzed with the aim of finding further relevant papers that, for any reason, were not found when the initial search criteria were applied.
With regard to the search string, the search terms used were ‘anthropometric characteristics’, ‘anthropometric dimensions’ and ‘anthropometric measures’. To exclude papers outside the research topic, each term was combined with the search term ‘ergonomics’ using the Boolean operator “AND”. The following combinations were used: ‘anthropometric characteristics’ AND ‘ergonomics’; ‘anthropometric dimensions’ AND ‘ergonomics’; ‘anthropometric measures’ AND ‘ergonomics’.
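As an illustration only, the three search combinations listed above can be assembled programmatically. The following minimal sketch uses a generic Boolean syntax; the function name build_queries is hypothetical, and the field codes and quoting rules specific to Scopus or PubMed are not reproduced here.

```python
# Search terms quoted in the text.
ANTHROPOMETRIC_TERMS = [
    "anthropometric characteristics",
    "anthropometric dimensions",
    "anthropometric measures",
]
REQUIRED_TERM = "ergonomics"


def build_queries(terms, required):
    """Return one Boolean query per term, each combined with the required term via AND."""
    return [f'"{term}" AND "{required}"' for term in terms]


queries = build_queries(ANTHROPOMETRIC_TERMS, REQUIRED_TERM)
for query in queries:
    print(query)
```

Each resulting string would still need to be adapted to the query language of the database in which it is run.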
The inclusion criteria used were as follows:
Original articles written in English and published in peer-reviewed journals;
Published or in press between January 1990 and January 2016;
Papers that considered the evaluation of anthropometric measures using manual methods;
Papers with an ergonomics research/application purpose;
Papers with samples of school students aged between 5 and 19 years. Studies were also included if part of their sample fell within this age range.
All the studies that merely presented anthropometric measures with a focus on nutritional status, body composition or sports performance (e.g. stature, weight, body mass index, skinfolds, hip and waist circumference) were not considered, as they were not specifically related to ergonomics. Some examples of this exclusion are the papers by Bradshaw and Rossignol [32], De Paula et al. [33] and Ibrahim et al. [34]. Studies that used 3D or photography (2D) methods to collect data were also excluded [35, 36]. Several studies were not considered in this review because their samples comprised only university students [37] or only male workers [38], instead of younger school students. Papers that used secondary data analysis were not considered (García-Acosta & Lange-Morales [39]; Jayaratne & Fernando [40]; Jayaratne [41]; Molenbroek et al. [42]).
Titles and abstracts of papers were scanned independently by two of the authors to identify relevant papers to retrieve for full text analysis. In cases where a paper seemed potentially eligible but no abstract was available, the full text of the paper was retrieved. Disagreements between authors were referred to a third author, and a decision was then made regarding inclusion. Full texts were independently reviewed for inclusion by the two authors using a standardized data extraction form, and disagreements between them were referred to the other three authors. Primary studies meeting the inclusion criteria were identified and the corresponding data extracted.
Results and discussion
Figure 1 shows the results of the search strategy. The search of the databases resulted in an initial number of 747 papers (Scopus: 457 and PubMed: 290), which was then reduced to 499 after the removal of duplicate entries. After screening the title, abstract and keywords of each article, 97 papers were identified as being potentially relevant. After reviewing the corresponding full texts, 40 papers were selected on the basis of the inclusion criteria. Finally, six additional papers were added after the manual search of the reference lists of the 40 selected articles. The total number of articles reviewed was therefore 46.
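The counts reported above can be checked with simple arithmetic. The sketch below only restates the figures given in the text; the numbers removed or excluded at each stage are derived here by subtraction and are not stated explicitly in the text.

```python
# Counts reported in the text for each stage of the selection process.
scopus_hits = 457
pubmed_hits = 290
after_duplicates = 499
after_screening = 97
after_full_text = 40
added_from_references = 6

# Derived values (obtained by subtraction, not reported in the text).
initial_hits = scopus_hits + pubmed_hits
duplicates_removed = initial_hits - after_duplicates
excluded_at_screening = after_duplicates - after_screening
excluded_at_full_text = after_screening - after_full_text
total_reviewed = after_full_text + added_from_references

print("Initial hits:", initial_hits)
print("Duplicates removed:", duplicates_removed)
print("Excluded at title/abstract screening:", excluded_at_screening)
print("Excluded at full-text review:", excluded_at_full_text)
print("Total papers reviewed:", total_reviewed)
```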

Figure 1. Flow diagram of paper selection process.
Before presenting the results and discussion, and to avoid misunderstandings, it should be noted that the variables (accuracy, precision, reliability and their synonyms) were considered to be evaluated when an equation or formula was applied and the results were presented, or when there was a clear mention of the analysis of any of the considered variables. Conversely, the variables could be mentioned without evaluation, e.g. accuracy and repeatability of measurements that were achieved by practice prior to the data collection sessions [43].
The results from Table 1 show that six out of the 46 studies mention the word accuracy, but none of them evaluated it. Most of the authors mentioned that accuracy of measurements was achieved by practice prior to the data collection sessions. Furthermore, some authors state that the accuracy of the measurements was achieved by undergoing thorough training with a certified anthropometrics specialist [44] or that it was achieved through training and supervision [45]. To some extent, these claims of achieved accuracy are supported by ISO 15535 [20], which states that “frequent and regular measurer training and quality control shall be carried out by persons experienced in anthropometry, in order to ensure acceptable standards of accuracy”.
Table 1. Summary of the studies referring to accuracy, precision+ or reliability
M: mentioned; E: evaluated. +The results for precision are not described since none of the reviewed studies mentioned or evaluated it. *Accuracy related to the measurement instruments.
However, there are some issues that need to be addressed, considering that inaccuracy is a systematic bias, and may be due to instrument error, or to errors of measurement technique [21]:
Table 2. Summary of the measurement instruments used in each study
N/S: not specified; N/M: not mentioned; +Fabricated on the basis of a Martin-type anthropometer. *More studies used a metric tape, but only to measure school furniture dimensions.
The evaluation of the intra-measurer precision and reliability should be considered in all the reviewed papers with the aim of improving measurement reliability (as this is a direct indicator of data quality). Furthermore, measurer error is the most complex source of anthropometric error. This type of error can even be accentuated by the use of multiple measurers [89], a condition present in at least 11 of the 46 studies reviewed (Table 3), where the inter-measurer reliability and precision should have been calculated to avoid errors. This situation could also be important for the other 29 studies that do not mention (NM) or do not specify (NS) the number of measurers involved in the measurement process. Regarding the number of measurers, some studies were considered to be “NS, at least 2” (see Table 3) since they mentioned the use of more than one team to collect the measurements. An example of this is the study of Dianat et al. [90], where the measurements were carried out by two teams, each consisting of two technicians. However, it is not specified whether the two technicians of each team took different measurements, whether one was the recorder and the other the measurer, or whether they were able to switch roles. On the other hand, some studies were considered to be NS since it was not possible to determine the number of measurers. An example of this is the study of Motamedzade et al. [73], where the anthropometric dimensions of weavers’ hands were measured with the direct method using a digital caliper by trained field researchers.
Table 3. Characteristics of training and measurement procedure of each study included in this review
N/S: not specified; N/M: not mentioned; N/A: not applicable. *Refers to the standard sitting posture: knees and hips flexed at 90° (right angle), feet supported flat on the floor and head oriented in the Frankfurt plane. The standard standing posture was also considered.
Regarding precision, none of the studies reviewed mentioned or evaluated it (Table 1), despite precision being a basic indicator of an anthropometrist’s expertise [23]. The technical error of measurement (TEM) is the most commonly used measure of precision [18, 91] and is also referred to in ISO 7250-2 [92], as follows: “The number of measurers and information on the skill of each measurer, such as intra-observer mean absolute difference or technical error of measurement or repeated measurements, are shown when such data are available. When more than one measurer is involved, the methods used to control the quality of the measurement technique are documented(...)”
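For illustration, a commonly used form of the TEM for two repeated measurements per participant is TEM = sqrt(sum(d^2) / 2n), where d is the difference between the two measurements of a participant and n is the number of participants. The sketch below computes the TEM together with the relative TEM (TEM as a percentage of the mean) and the coefficient of reliability R = 1 - TEM^2 / SD^2. The function names and the popliteal height values are hypothetical, and pooling both measurement series to estimate the mean and SD is an assumption of this sketch rather than a procedure taken from the reviewed studies.

```python
import math
import statistics


def technical_error_of_measurement(first, second):
    """TEM for two repeated measurements per participant: sqrt(sum(d^2) / (2 * n))."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("Both series must be non-empty and of equal length")
    squared_differences = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_differences / (2 * len(first)))


def relative_tem(first, second):
    """TEM expressed as a percentage of the pooled mean of both series."""
    pooled_mean = statistics.mean(list(first) + list(second))
    return 100 * technical_error_of_measurement(first, second) / pooled_mean


def coefficient_of_reliability(first, second):
    """R = 1 - TEM^2 / SD^2, with SD estimated from the pooled measurements (assumption)."""
    pooled_sd = statistics.stdev(list(first) + list(second))
    return 1 - technical_error_of_measurement(first, second) ** 2 / pooled_sd ** 2


# Hypothetical popliteal heights (cm) measured twice by the same measurer.
first_series = [38.0, 40.0, 42.0, 44.0]
second_series = [38.2, 39.8, 42.2, 43.8]

tem = technical_error_of_measurement(first_series, second_series)
print(f"TEM = {tem:.3f} cm")
print(f"Relative TEM = {relative_tem(first_series, second_series):.2f} %")
print(f"R = {coefficient_of_reliability(first_series, second_series):.4f}")
```

The same functions could be applied to inter-measurer data by passing the series obtained by two different measurers on the same participants.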
It is important to highlight that eight of the studies reviewed mentioned reliability [10, 58] or synonymous terms, such as repeatability [43, 46] and consistency [48]. Furthermore, only three of the reviewed studies evaluated repeated measurements using reliability assessment methods (Table 1). The results of these studies show that the measurers had acceptable values of inter- and intra-measurer reliability. At first glance, it seems that only a small number of studies in this review considered the evaluation of reliability. Nonetheless, it is important to mention that only two (one from Germany and one from Japan) out of the nine databases presented in ISO 7250-2 [92] considered the reliability evaluation.
ISO 15535 [20] also states that “repeated measurement data should be recorded. Inter- and intra-measurer standard error of measurement, or mean absolute difference, shall be calculated and recorded for all anthropometric variables, in order that random checks can be carried out on the measuring teams during the survey” (p. 4). In the studies reviewed, paired samples t-tests were used to assess the inter- and intra-measurer reliability [57]. The use of this test is consistent with the procedure used by Steenbekkers [12] and reinforced by Goto and Mascie-Taylor [93], who indicated that inconsistency between two measurements can be assessed using a paired samples t-test, which determines whether the mean difference is significant or not. However, Bruton et al. [24] indicated that the paired samples t-test and analysis of variance techniques are statistical methods for detecting systematic bias between groups of data. These estimates, based upon hypothesis testing, are often used in reliability studies, but they give information only about systematic differences between the means of two sets of data, and not about individual differences.
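The limitation noted by Bruton et al. [24] can be illustrated with a small numerical sketch. In the hypothetical data below, two measurers disagree by 1 cm on every participant, but the disagreements cancel out on average, so the paired t statistic is zero while the mean absolute difference named in ISO 15535 is 1 cm. The data and function names are assumptions made for illustration only.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Paired samples t statistic: mean difference divided by its standard error."""
    differences = [a - b for a, b in zip(first, second)]
    sd = statistics.stdev(differences)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    return statistics.mean(differences) / (sd / math.sqrt(len(differences)))


def mean_absolute_difference(first, second):
    """Average size of the disagreement between paired measurements, ignoring its sign."""
    return statistics.mean(abs(a - b) for a, b in zip(first, second))


# Hypothetical sitting elbow heights (cm) taken by two measurers on the same participants.
measurer_a = [40.0, 41.0, 42.0, 43.0]
measurer_b = [41.0, 40.0, 43.0, 42.0]

t_value = paired_t_statistic(measurer_a, measurer_b)
mad = mean_absolute_difference(measurer_a, measurer_b)
print(f"t = {t_value:.2f}")   # no evidence of a systematic difference between the means
print(f"Mean absolute difference = {mad:.1f} cm")
```

A non-significant t-test would here suggest agreement even though every individual measurement differs by 1 cm, which is why a measure of individual differences is needed alongside it.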
The Pearson correlation coefficient was another method used in the studies reviewed, with the aim of testing the inter-measurer reliability as well as the intra-measurer reliability [10]. The Pearson correlation coefficient gives information about the degree of association between two sets of data, or the consistency of position within the two distributions. However, this coefficient does not detect systematic errors, so it is possible to have two sets of scores that are highly correlated but not highly repeatable [24].
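As a minimal illustration of this point, the sketch below uses hypothetical data in which a second measurer reads every dimension exactly 2 cm higher than the first. The correlation is perfect even though no pair of measurements agrees. The data and function name are assumptions made for illustration only.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient computed from deviations about the means."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical popliteal heights (cm); measurer B has a constant +2 cm systematic error.
measurer_a = [38.0, 40.0, 42.0, 44.0, 46.0]
measurer_b = [value + 2.0 for value in measurer_a]

r = pearson_r(measurer_a, measurer_b)
mean_difference = statistics.mean(b - a for a, b in zip(measurer_a, measurer_b))
print(f"r = {r:.3f}")
print(f"Mean difference = {mean_difference:.1f} cm")
```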
The intra-class correlation coefficient (ICC) is an attempt to overcome some of the limitations of the classic correlation coefficients, and it was used in one of the papers reviewed to test the inter- and intra-measurer reliability [58]. The ICC is a single index calculated using variance estimates obtained through the partitioning of total variance into between- and within-subject variance (known as analysis of variance or ANOVA). It thus reflects both the degree of consistency and the agreement among ratings [24]. Furthermore, the ICC applied in the paper reviewed used the “two-way mixed” model and the “absolute agreement” type. This type of ICC has the advantage of taking into account the systematic differences between measurers [94].
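To make the “absolute agreement” property concrete, the following sketch computes a two-way, absolute-agreement ICC from the ANOVA mean squares: ICC = (MSR - MSE) / (MSR + (k - 1) MSE + k (MSC - MSE) / n), with n participants and k measurers. The single-measure form is assumed here, since the text does not state whether single or average measures were used in the reviewed paper; the data and the function name are hypothetical.

```python
def icc_absolute_agreement(scores):
    """Single-measure, two-way, absolute-agreement ICC.

    scores: list of rows, one row per participant and one column per measurer.
    """
    n = len(scores)        # participants
    k = len(scores[0])     # measurers
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into participants, measurers and residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-participant mean square
    msc = ss_cols / (k - 1)                 # between-measurer mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square

    denominator = msr + (k - 1) * mse + k * (msc - mse) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (msr - mse) / denominator


# Hypothetical popliteal heights (cm); the second measurer reads 2 cm higher throughout.
scores = [[38.0, 40.0], [40.0, 42.0], [42.0, 44.0], [44.0, 46.0], [46.0, 48.0]]
icc = icc_absolute_agreement(scores)
print(f"ICC (absolute agreement) = {icc:.3f}")
```

For data with a constant offset between measurers, as in this example, the Pearson coefficient equals 1, whereas the absolute-agreement ICC falls below 1 because the between-measurer variance enters the denominator.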
Finally, the results of the present literature review show that, despite the fact that the importance of having accurate anthropometric measurements has been repeatedly stressed and that measurement reliability is a direct indicator of data quality [95], only ten of the papers reviewed mention at least one of the variables or synonymous terms (accuracy, precision, reliability) and only three evaluated one of them (reliability). During the last three decades a great effort has been made, through the ISO standards, to obtain more accurate and reliable anthropometric measurements. Still, the results in the area of anthropometric surveys of school students for ergonomics purposes do not differ from the idea presented more than three decades ago by Ulijaszek and Mascie-Taylor [96]. These authors explained that reports of growth and physique measurements in human populations rarely include estimates of measurement error, and that this could be due to a lack of standardized terminology to describe the reliability of measurement in a clear and understandable way.
The results show that only a few studies have evaluated or mentioned the level of accuracy, precision and reliability. Furthermore, a deeper analysis of the reviewed papers can be performed through the examination of three factors that may affect the measurement error, as described in the following sections.
Measurer training
Only 13 out of the 46 studies reviewed considered a training procedure before the data collection (Table 3). This is a very important aspect since consistent training can reduce differences between measurements taken by different people [97]. In the majority of the studies, training included a theoretical approach to anthropometrics, as well as practical instructions. One of the studies also included training by showing a video of the anthropometric measurements and by test-measuring the required dimensions [51].
The majority of the studies did not specify the timeframes used in training. Nevertheless, the available information shows that the timeframes differed considerably. For example, Brewer et al. [54] used a short training session, whilst other authors used a one-week training period [10, 98], and the longest timeframe observed was a two-week training period [57, 58].
Finally, the study of Cordovil et al. [60] did not consider training since all the anthropometric variables were obtained by an accredited level 3 ISAK anthropometrist.
Measurement instruments
The literature review has shown that a large number of measurement instruments were applied to collect the data (Table 2). The most frequently used measurement instrument was the anthropometer, used in 22 out of the 46 reviewed studies. Within these, the most used anthropometer was the Harpenden type (Holtain Ltd., Crymych, UK) (Fig. 2a). However, 12 out of the 46 studies reviewed did not mention the measurement instrument(s) used during the anthropometric survey.

Figure 2. (a) Harpenden type anthropometer (Holtain, Crymych, UK); (b) Lafayette’s large anthropometer (reproduced from Corchuelo et al. [100]).
Following the discussion presented in Section 3.1, it is important to mention that there are contradictions in the literature regarding instrument accuracy. One position is that the risk of inaccuracy is greater with a complex instrument than with a simple one. Thus, the inaccuracy of measurements performed using a simple measuring tape is likely to be smaller than that of measurements performed using sliding scales, such as anthropometers and stadiometers [21]. On the other hand, it is mentioned in [22] that accuracy is generally best approximated by the use of precisely calibrated, rigid instruments carefully positioned by trained investigators under controlled environmental conditions.
Considering the previous information, one should reflect on the following question: is it better to measure with a measuring tape than with an anthropometer? The answer to this is not a simple one. Firstly, it will depend on the specific measure to be collected. Secondly, it is important to mention that validity is the extent to which a measurement actually measures a characteristic; and is conceptually close to the variable accuracy, given that ‘true’ values of measurements are impossible to determine [21].
Another question that arises from this analysis is: what is the validity of using a measuring tape to collect linear distances, such as popliteal height or elbow height in a sitting posture? According to ISO 7250-1, measuring tapes are only recommended for body circumference measurements and not for linear distances. Nonetheless, as a measuring tape is not rigid, this recommendation can be accepted or not according to the characteristics of the measuring tape and of the body measurement to be collected. An example of this is the collection of popliteal height. It would be much more difficult to position one end of the measuring tape at the tendon of the relaxed biceps femoris muscle and the other end on the floor, since this equipment does not have blades or branches like the anthropometer (Fig. 2a) and it may not be very stable, compromising the results.
The positioning of the landmarks might also be an issue, as happens when using a 3D scanner or a skinfold measuring device. However, for anthropometric measurements used for ergonomics purposes, landmarking would ideally also be performed prior to any measurement. This process also has limitations of its own: especially when applied in work settings outside the armed forces, landmarking can raise issues related to privacy and cultural/religious beliefs that may reduce the sample size. Thus, just a few exposed areas are usually marked, and the rest of the landmarks are located by palpation over clothes, after which the measurement is performed by positioning the instrument’s blades or branches.
Considering the previous information, there are four studies that present instruments that may be inadequate to collect the measurements considered [45, 99], i.e., all of them used a measuring tape to measure linear distances, breadths and depths, instead of using an anthropometer and/or sliding/spreading calipers. Zanuncio et al. [88] also used small calipers and a measuring tape, which means that some linear distances (e.g. popliteal height, knee height and sitting height) were gathered with an inadequate instrument (measuring tape). Finally, Grozdanovic et al. [67] used a plastic measuring tape (tailor’s measuring tape type) to measure the thigh and arm circumference, which may be considered an unreliable instrument since it is made from a material that can stretch and become deformed over time [97].
Special attention should be given to the studies that used the Lafayette’s large or small anthropometer and evaluated the following linear distances: shoulder height sitting, popliteal height, elbow height sitting and knee height sitting [5, 78]. The C-shaped arm of the Lafayette’s anthropometer provides accurate measurements for the breadth dimensions. On the other hand, this shape may be a problem for the linear distances, since it is more difficult to position the instrument’s blades or branches on the floor and obtain a direct reading (see Fig. 2b).
Having a standardized procedure for data collection will certainly minimize the measurement error and is more likely to allow comparisons with anthropometric measurements from different populations. ISO 7250-1 [19] provides some information with the purpose of standardizing the data collection procedures: (i) description of anthropometric measurements, (ii) clothing of subject, (iii) body symmetry, (iv) posture, (v) instruments (previously discussed), and (vi) support surfaces (floor or sitting surfaces).
It is important to mention that none of the 46 papers reviewed were published before the first version of the ISO 7250, 1980. Despite that, only six of the reviewed studies mentioned that the measurements were performed following the definitions from the standard (Table 3). These results should be considered with caution since:
Twenty studies used measurements defined by other relevant authors, such as Pheasant [1], Chaffin and Anderson [101], Evans et al. [102] and Hertzberg [103]. It is important to highlight that the dimensions from these authors present similarities with the dimensions defined by the ISO 7250.
Other authors [74, 104] only gathered measurements that are not defined in the ISO 7250-1.
The ISO standard also mentions that the basic list is expected to be supplemented by specific additional measurements. Most of the studies reviewed presented additional measurements (Table 3). Furthermore, the ISO 15535 mentions that measurements that are different from those specified in ISO 7250-1 can also be taken according to the purpose of the investigation. In such cases, definitions, methods, instruments and measurement units should be clearly indicated in the report.
Considering the previous points, this critical situation needs to be addressed, since only 15 studies defined the measurements considered using both text and figures, 18 studies used only text or only figures, and 13 studies did not present any definition of the measurements gathered.
Regarding the clothing of the subjects, three studies need to be excluded from the analysis since they considered measurements that are not affected by clothes, such as hand dimensions [73, 75] and head/neck dimensions [74]. In 22 of the remaining 43 studies, the subjects were measured in t-shirts and shorts or were lightly clothed. On the other hand, in the work of Brewer et al. [54] the advice to measure with light clothes was not followed; instead, they accepted some additional measurement error when students wore excessively baggy clothes and thick clothing such as jeans and sweaters. Finally, 20 of the studies did not mention the clothing of subjects.
The posture adopted by the participants is recognized as a factor that affects errors in anthropometry [105]. To minimize this effect, the majority of the studies reviewed (27 out of 43; the same three studies were excluded) measured the participants when seated and/or in the standard standing posture. However, 14 studies did not mention the posture adopted and two of them [82, 83] considered different postures; the authors of those studies recognized that making measurements in this way may slightly over- or under-estimate ‘standardized posture’ measurements. Furthermore, the same authors evaluated popliteal height with participants wearing shoes. This represents another source of error since the participants may change shoes. This is the reason why it is recommended to always measure the participants barefoot, keeping in mind that shoes may naturally vary according to culture, fashion, and country. To get more representative values of the sample under study, an option is to measure the shoe heel of the students and, in the cases where this is not possible for the researchers, an alternative would be to consider a shoe correction of between 2 cm and 3 cm [55].
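The shoe correction described above amounts to a simple addition to the barefoot measurement. The sketch below is one possible way of applying it; the function name is hypothetical, and the default correction of 2.5 cm is an assumption of this sketch, chosen as the midpoint of the 2 cm to 3 cm range cited in the text.

```python
# Range of shoe correction cited in the text (cm).
SHOE_CORRECTION_RANGE_CM = (2.0, 3.0)


def popliteal_height_with_shoes(barefoot_cm, heel_cm=None, default_correction_cm=2.5):
    """Estimate popliteal height with shoes from a barefoot measurement.

    If the shoe heel was measured, it is used directly; otherwise a default
    correction is applied (2.5 cm here, an assumed midpoint of the cited range).
    """
    low, high = SHOE_CORRECTION_RANGE_CM
    if heel_cm is None and not low <= default_correction_cm <= high:
        raise ValueError("Default correction should lie within the cited 2-3 cm range")
    correction = heel_cm if heel_cm is not None else default_correction_cm
    return barefoot_cm + correction


print(popliteal_height_with_shoes(40.0))               # default correction applied
print(popliteal_height_with_shoes(40.0, heel_cm=1.5))  # measured shoe heel applied
```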
Finally, considering the information gathered from the 46 papers reviewed, the authors believe that anthropometric surveys published in the future should emphasize not only the data collection process (measurement instruments, training and data collection procedures) and the testing of measurement error, but also how the data are presented in a scientific paper or report, so that other authors can replicate the study and/or use it for comparisons between populations.
Limitations
There is a wide variety of terms that are used to refer to issues of precision, reliability and accuracy. Even though the search conducted in this study covered several relevant keywords, some papers might have been missed due to the use of different terms and wording. Hence, this may be regarded as a limitation of this study.
This work also has some inherent limitations, which researchers using this information should be aware of when interpreting the results presented in this paper. This literature review was based on peer-reviewed journals found in only two specific bibliographic databases (Scopus and PubMed). Although these databases are known to cover a very wide range of areas, searching other databases, such as Google Scholar, or considering conference articles could have yielded additional information relevant to this review.
Conclusion
The purpose of this study was to assess, by means of a literature review, whether or not anthropometric studies of school children related to ergonomics mentioned and/or evaluated the variables precision, reliability and/or accuracy. After reviewing 46 papers it can be concluded that this subject is poorly addressed in the literature, as only ten studies mention at least one of the variables and none of the studies evaluates all of the variables.
Of the three papers that assessed reliability, only one used an appropriate method (ICC), which allows for the identification of both individual differences and systematic errors.
It should also be acknowledged that, in regards to the factors that may affect precision, reliability and accuracy, the papers reviewed presented great differences in terms of the measurement instruments used. Furthermore, there is a clear lack of information regarding the training and procedures for anthropometric data collection.
Finally, more attention should be given to the procedures used to collect anthropometric data for ergonomics purposes. Studies should take into consideration the procedures defined in the relevant standards, test for measurement error and report complete information when presenting the collected data.
Conflict of interest
None to report.
