Abstract
The os coxa is commonly used for sex and age estimation with a high degree of accuracy. This study compared the accuracy of three methods of sex determination, including a deep learning approach intended to increase the accuracy of sex prediction. A total sample of 250 left os coxae from a Thai population was divided into a ‘training’ set of 200 samples and a ‘test’ set of 50 samples. The age of the samples ranged from 26 to 94 years. Three methods were assessed: a dry bone method, an image-based method, and a deep learning method. The intra- and inter-observer reliabilities were also assessed for the dry bone and image-based methods. The accuracies were 80.65%, 90.3%, and 91.95% for the dry bone, image-based, and deep learning methods, respectively. The greater sciatic notch was wide and symmetrical in females and narrow and asymmetrical in males. The intra- and inter-observer agreements ranged from moderate to almost perfect (Kappa = 0.67−0.93, ICC = 0.74−0.94). In conclusion, the image-based and deep learning methods were efficient in sex determination, and the deep learning technique performed best among the three methods owing to its high accuracy and rapid analysis. Deep learning technology was found to be a viable option for remote consultations regarding sex determination in the Thai population.
Introduction
The aim of forensic osteology is to identify sex, age, stature, and ancestry from human bones. When deceased individuals are found at crime scenes, law enforcement agencies and coroners need to know these biological traits to help identify the body. 1 Sex assessment is commonly carried out before age and stature estimation on unknown skeletons. Over the years, sex determination has been studied in various skeletal elements such as the sacrum, 2 skull, 3 scapula, 4 pelvis, 5 and vertebrae. 6 The most accurate sex indicator is the os coxa or pelvis, with an accuracy of around 97–99%.7,8 The os coxa has been widely used for sex determination in previous studies; for example, Patriquin et al. (2003) 9 studied sex determination in South African white and black people using os coxa morphology. Their results showed that the best parameters for sex prediction were the greater sciatic notch shape and the pubic bone shape, which was consistent with research on the Thai population by Wangdee et al. (2014), 10 who achieved accuracies of 98.7%, 98.6%, and 98.2% for the subpubic angle, greater sciatic notch shape, and pubic bone shape, respectively. In addition to its high accuracy, an advantage of the greater sciatic notch for sex estimation is that it tends to be a well-preserved bone feature in cases of burning and taphonomic destruction. 11 Both morphological and morphometric methods have been used for sex assessment. Many morphological studies (e.g. Klales et al. (2012) 12 ) reported high accuracy but also moderate to high intra- and inter-observer errors. Thus, the morphometric approach was developed to increase the accuracy as well as the intra- and inter-observer reliabilities. Singh and Potturi (1978) 13 studied the morphometry of the greater sciatic notch in an Indian population for sex estimation. They measured the os coxa using a sliding caliper and found that the greater sciatic notch was wider in females than in males, with an average classification accuracy of 86.75%. Mahakkanukrauh et al. (2017) 7 carried out similar research in Thais. In their study, a digital sliding caliper was used to measure the os coxa, and the accuracy of sex prediction was 97.5%. The intra- and inter-observer reliabilities of their measurements were tested, and the results indicated an acceptable level of reliability (R 0.83–0.98).7,13 Thus, their findings showed that the morphometric method achieved higher accuracy than the morphological method, and the inter-observer reliability was satisfactory. 7
The present study aimed to increase the accuracy and reliability of the sex estimation method used in the Thai population, and a deep convolutional neural network approach was therefore developed. “Deep learning” is a specialized subset of machine learning that uses layered artificial neural networks to simulate human decision-making: data are passed through successive layers of algorithms, which together form the network and streamline human-like learning. Three types of artificial neural networks used in deep learning are (1) multilayer perceptrons (MLPs); (2) convolutional neural networks (CNNs), such as LeNet-5 (1998), AlexNet (2012), ZFNet (2013), GoogLeNet/Inception (2014), VGGNet (2014), and ResNet (2015); and (3) recurrent neural networks (RNNs). These networks simulate neurons in the human brain. 14 CNNs were designed to analyze color images consisting of three 2D or 3D arrays. GoogLeNet is one of the CNNs that has achieved high performance in object recognition and classification. 15 Therefore, GoogLeNet was chosen for training, extracting features, and classifying sex from 2D os coxa images.
The purposes of this study were to train CNNs on a dataset of greater sciatic notch images and predict sex in the Thai test samples; to compare the results of this method with those of the morphological and morphometric methods using the same test samples; and to develop a tool for sex determination from the os coxa in the Thai population that is highly accurate, time-efficient, user-friendly, and suitable for remote consultations.
Materials and methods
The research protocol was approved by the Research Ethics Committee, Faculty of Medicine, Chiang Mai University, Thailand (Research ID: ANA-2563-07285). The os coxae were provided by the Forensic Osteology Research Center (FORC), Faculty of Medicine, Chiang Mai University. The total sample consisted of 250 bones, divided into a ‘training’ group of 200 samples (100 females, 100 males) and a ‘test’ group of 50 samples used for all three methods. The selected os coxae were all from the left side, complete, and from individuals of Thai descent. The age at death ranged from 26 to 94 years. Bones were excluded if they were fractured or incomplete, showed signs of skeletal pathology, came from a non-Thai individual, or came from an individual younger than 20 years or older than 100 years.
Dry bone method
The greater sciatic notch area of each dry bone sample was observed from the top view and scored for two characteristics: narrow versus wide, and symmetrical versus asymmetrical. A score of 0 was given for narrow and for symmetrical, and a score of 1 for wide and for asymmetrical (Figure 1). The training data were analyzed, and an equation for sex classification was derived. The dry bone samples in the ‘test’ group were then classified using the same equation.

Figure 1. The greater sciatic notch shape, top view; a = narrow, b = wide, c = symmetrical, d = asymmetrical. These four features were scored in both the dry bone and image-based methods.
Image-based method
A. Technical photography
This study used 2D photographs of the os coxae taken with a Sony α57 digital camera and a Sony DT 18–55 mm F3.5–5.6 SAM lens at a focal length of 55 mm, in autofocus mode at ISO 200. The images were saved in ARW file format. The photographs were taken from the top view, and the distance between the lens and the bone was standardized for every photo. Each left os coxa was placed on a black silk velvet background and positioned with the auricular surface of the ilium facing upwards and the pubic symphysis pointing to the top. The camera lens was parallel to the os coxa. The anatomical landmarks of the os coxa marked on the camera screen grid were the posterior superior iliac spine (PSIS) and the end of the lesser sciatic notch. These two landmarks lay in the same plane.
B. Parameters
Four parameters were assessed from the images. The first two concerned the shape of the greater sciatic notch (narrow vs. wide, symmetrical vs. asymmetrical) (Figure 1) and were scored as in the dry bone method. The last two were Ratio 1 and Ratio 2. Ratio 1 was the length of line A divided by the length of line B. Line A was measured from the tip of the iliac crest to the tip of the ischial tuberosity. Line B was measured from the posterior inferior iliac spine (PIIS) to the posterior end of the lesser sciatic notch (Figure 2). Ratio 2 was the length of line C divided by the length of line D. Line C was measured from the anterior superior iliac spine (ASIS) to line E. Line D was measured from the deepest point of the greater sciatic curve to line E. Line E was measured from the PIIS to the posterior end of the lesser sciatic notch (Figure 3). The parameters were measured in pixel units using Adobe Photoshop 2020. Following the analysis of the four parameters, the equation for sex determination was derived, and its accuracy was tested using the images of the test samples.

Figure 2. Ratio 1: line A measures from the tip of the iliac crest to the tip of the ischial tuberosity. Line B measures from the posterior inferior iliac spine (PIIS) to the posterior end of the lesser sciatic notch.

Figure 3. Ratio 2: line C measures from the anterior superior iliac spine (ASIS) to line E. Line D measures from the deepest point of the greater sciatic curve to line E. Line E measures from the posterior inferior iliac spine (PIIS) to the posterior end of the lesser sciatic notch.
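The two ratios can be illustrated with a minimal Python sketch that works on landmark pixel coordinates. This is not the procedure used in the study, where lengths were measured in Adobe Photoshop. The function names and the example coordinates are hypothetical, and the sketch assumes that lines C and D are measured perpendicular to line E, which the text does not state explicitly.

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) pixel coordinates."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def distance_to_line(point, line_start, line_end):
    """Perpendicular distance from a point to the line through two points.

    Assumption: lines C and D meet line E at a right angle.
    """
    (x0, y0), (x1, y1), (x2, y2) = point, line_start, line_end
    cross = abs((x2 - x1) * (y0 - y1) - (y2 - y1) * (x0 - x1))
    return cross / distance(line_start, line_end)

def ratio1(iliac_crest_tip, ischial_tuberosity_tip, piis, lesser_notch_end):
    """Ratio 1 = length of line A / length of line B."""
    line_a = distance(iliac_crest_tip, ischial_tuberosity_tip)
    line_b = distance(piis, lesser_notch_end)
    return line_a / line_b

def ratio2(asis, deepest_notch_point, piis, lesser_notch_end):
    """Ratio 2 = length of line C / length of line D, both taken to line E."""
    line_c = distance_to_line(asis, piis, lesser_notch_end)
    line_d = distance_to_line(deepest_notch_point, piis, lesser_notch_end)
    return line_c / line_d

# Hypothetical landmark coordinates in pixels, chosen only for illustration.
example_ratio1 = ratio1((0, 0), (6, 8), (0, 0), (0, 5))
example_ratio2 = ratio2((3, 8), (5, 2), (0, 0), (10, 0))
```

Because both ratios are quotients of lengths taken from the same image, they do not depend on the pixel scale, which fits the use of pixel units in the measurements.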
Deep learning method
A. Pre-image processing of data
The training and test images originated from the image-based method. Using Adobe Photoshop 2020, the images of both groups were cropped to the greater sciatic notch area to isolate the regions of interest (ROIs) (Figure 4).

Figure 4. Cropped images of the greater sciatic notch area pinpointing regions of interest (ROIs).
B. Deep learning algorithm
The pre-trained GoogLeNet neural network was used for the training process. The 200 images of the training set, sized 224 × 224 pixels, were first loaded into the Deep Network Designer app. The fully connected layer and the final classification output layer were then replaced with a custom head. In GoogLeNet, these layers are named ‘loss3-classifier’ and ‘output classification layer’, respectively. The output size of the new fully connected layer was changed from 10 to 2. The training data (n = 200) were then imported, and the augmentation options were set to random rotations from −30° to 30° and random rescaling from 0.9 to 1.1. The validation set was split from the training data at 30% (n = 60). For the training options, several hyperparameters were tuned to obtain the highest accuracy: initial learning rate, validation frequency, mini-batch size, maximum number of epochs, L2 regularization, and momentum. The validation accuracy (%) was obtained after the training process. The trained network was exported and applied to the test group for sex classification (Figure 5). The deep learning method was developed on a Lenovo IdeaPad Gaming 3 15IMH05 81Y400PATA notebook with 512 GB of memory and 4 GB of GDDR6 GPU memory (NVIDIA GeForce GTX 1650 Ti). The GoogLeNet CNN, consisting of 144 layers, was run in MATLAB 2020a.

Figure 5. The progressive framework of training and test samples in sex prediction.
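The data preparation described for the deep learning method (a 30% validation split and random augmentation) can be sketched in plain Python as follows. The study performed these steps in MATLAB's Deep Network Designer, so this is only an illustration. The image identifiers are hypothetical, and the sketch assumes a simple random split and uniformly sampled augmentation parameters, neither of which is stated in the text.

```python
import random

def split_training_validation(image_ids, validation_fraction=0.3, seed=0):
    """Randomly hold out a fraction of the training images for validation.

    Assumption: a simple random split; the text does not say whether the
    split was random or stratified by sex.
    """
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]

def sample_augmentation(rng):
    """Draw one random rotation (degrees) and one rescaling factor."""
    return {
        "rotation_deg": rng.uniform(-30.0, 30.0),  # random rotation range
        "scale": rng.uniform(0.9, 1.1),            # random rescaling range
    }

# Hypothetical identifiers standing in for the 200 training images.
image_ids = [f"os_coxa_{i:03d}" for i in range(200)]
train_ids, validation_ids = split_training_validation(image_ids)
augmentation = sample_augmentation(random.Random(1))
```

With 200 training images, a 30% split leaves 140 images for fitting and 60 for validation, which matches the n = 60 reported above.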
Statistical analysis
Discriminant analysis was applied to the data from the dry bone and image-based methods. The intra- and inter-observer reliabilities were estimated using Cohen's kappa coefficient 16 for the morphological method and the intraclass correlation coefficient (ICC) 17 for the morphometric method. Reliability was tested on 30 samples randomly drawn from the training set. All data were analyzed using IBM SPSS Statistics version 22.
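For readers unfamiliar with Cohen's kappa, the following minimal Python sketch shows how the coefficient is computed from two sets of categorical scores. It is not the SPSS procedure used in the study, and the example scores are hypothetical.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of samples given the same score.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)

# Hypothetical notch-width scores (0 = narrow, 1 = wide) from two observers.
observer_1 = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
observer_2 = [0, 0, 1, 1, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(observer_1, observer_2)
```

In this example the observers agree on 9 of 10 samples (observed agreement 0.9) and chance agreement is 0.5, so kappa is 0.8.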
Results
Dry bone method
The results showed that the greater sciatic notch was wide and symmetrical in females (82.7% and 89.3%, respectively) and narrow and asymmetrical in males (90% and 73.6%, respectively). The equation derived from the data analysis was Y = −0.433 + 2.268(A) − 1.303(B), where A is the score for narrowness or width and B is the score for symmetry or asymmetry. A Y-value ≥ 0 indicated female and a Y-value ≤ 0 indicated male. The validation accuracies were 91%, 81%, and 86% for females, males, and both sexes, respectively, and the test set accuracy was 80.65% (Table 1). The intra- and inter-observer reliabilities showed moderate to almost perfect agreement (Kappa = 0.67, 0.93).
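As an illustration of how the dry bone equation is applied, the following minimal Python sketch evaluates Y for the four possible score combinations. The function names are hypothetical. The text assigns Y = 0 to both sexes, so the sketch assumes that Y = 0 is classified as female. None of the four score combinations gives exactly zero.

```python
def dry_bone_score(width_score, symmetry_score):
    """Discriminant score Y = -0.433 + 2.268(A) - 1.303(B).

    width_score (A): 0 = narrow, 1 = wide.
    symmetry_score (B): 0 = symmetrical, 1 = asymmetrical.
    """
    return -0.433 + 2.268 * width_score - 1.303 * symmetry_score

def classify_dry_bone(width_score, symmetry_score):
    """Classify sex from the sign of Y (assumption: Y == 0 counts as female)."""
    y = dry_bone_score(width_score, symmetry_score)
    return "female" if y >= 0 else "male"

# Evaluate every combination of the two binary scores.
predictions = {
    (a, b): classify_dry_bone(a, b) for a in (0, 1) for b in (0, 1)
}
```

Under this equation, any wide notch is classified as female and any narrow notch as male, regardless of symmetry, because the width coefficient outweighs both the constant and the symmetry coefficient.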
Table 1. Comparison of the validation accuracy and test accuracy of the three methods.
* When training finished, the GoogLeNet output reported the training accuracy, validation accuracy, training loss, and validation loss. It could not report accuracy separately for males and females, as was done for the dry bone and image-based methods.
Image-based method
Based on the data from image analysis, the equation was derived as follows: Y = −4.213 − 1.425(A) + 1.622(B) + 0.948(ratio1) − 0.030(ratio2), where A is the score for narrowness or width and B is the score for symmetry or asymmetry. The Y-value determined the sex (female ≤ 0, male ≥ 0). The validation accuracies were 92%, 91%, and 91.5% for females, males, and both sexes, respectively, and the test set accuracy was 90.3% (Table 1). The intra- and inter-observer reliabilities showed substantial to almost perfect agreement for the scoring (Kappa = 0.80, 0.87) and good to excellent agreement for the ratio measurements (ICC = 0.74, 0.97).
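The image-based equation can be applied in the same way, as in the minimal Python sketch below. The function names and the example ratio values are hypothetical and are not taken from the study data. The text assigns Y = 0 to both sexes, so the sketch assumes that Y = 0 is classified as female.

```python
def image_based_score(width_score, symmetry_score, ratio1, ratio2):
    """Y = -4.213 - 1.425(A) + 1.622(B) + 0.948(ratio1) - 0.030(ratio2).

    width_score (A): 0 = narrow, 1 = wide.
    symmetry_score (B): 0 = symmetrical, 1 = asymmetrical.
    """
    return (-4.213
            - 1.425 * width_score
            + 1.622 * symmetry_score
            + 0.948 * ratio1
            - 0.030 * ratio2)

def classify_image_based(width_score, symmetry_score, ratio1, ratio2):
    """Classify sex from the sign of Y (assumption: Y == 0 counts as female)."""
    y = image_based_score(width_score, symmetry_score, ratio1, ratio2)
    return "female" if y <= 0 else "male"

# Hypothetical ratio values, chosen only to illustrate the arithmetic.
wide_symmetrical = image_based_score(1, 0, 3.0, 2.0)      # about -2.854
narrow_asymmetrical = image_based_score(0, 1, 3.0, 2.0)   # about 0.193
```

Note that the sign convention is the reverse of the dry bone equation: here negative scores indicate female and positive scores indicate male.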
Deep learning method
The best hyperparameters for this model were initial learning rate = 1e-4, validation frequency = 5, mini-batch size = 10, maximum epochs = 200, L2 regularization = 0.0001, and momentum = 0.9. After training, the validation accuracy of the model was 90%. The 50 test-set samples were then assessed by the model, which achieved an accuracy of 91.95% (Table 1).
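To show how the learning rate, momentum, and L2 regularization interact, the following minimal Python sketch performs one weight update of stochastic gradient descent with momentum. The text does not name the solver, so this sketch assumes SGD with momentum, as suggested by the momentum hyperparameter. The function name and the example weights and gradients are hypothetical.

```python
def sgdm_step(weights, gradients, velocity,
              learning_rate=1e-4, momentum=0.9, l2=1e-4):
    """One SGD-with-momentum update with L2 regularization.

    Assumption: the solver was SGD with momentum; defaults are the
    hyperparameter values reported in the text.
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, gradients, velocity):
        g_reg = g + l2 * w                          # L2 term added to gradient
        v_new = momentum * v - learning_rate * g_reg
        new_velocity.append(v_new)
        new_weights.append(w + v_new)               # move along the velocity
    return new_weights, new_velocity

# Hypothetical single weight and gradient, starting from zero velocity.
weights, velocity = sgdm_step([1.0], [2.0], [0.0])
```

With a small learning rate such as 1e-4, each step changes the weights only slightly, which is a common choice when fine-tuning a pre-trained network on a small dataset.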
Discussion
In the current study, the greater sciatic notch (GSN) of females was generally wide and symmetrical, whereas that of males tended to be narrow and asymmetrical. The validation accuracies were 86%, 91.5%, and 90% for the dry bone, image-based, and deep learning methods, respectively. The test set accuracies were 80.65%, 90.3%, and 91.95% for the dry bone, image-based, and deep learning methods, respectively. According to these results, the image-based and deep learning methods achieved acceptable accuracy (> 90%). The intra- and inter-observer reliabilities showed moderate to almost perfect agreement (Kappa = 0.67, 0.93) 16 for the dry bone method. For the image-based method, this study found substantial to almost perfect agreement for GSN shape scoring (Kappa = 0.80, 0.87) 16 and good to excellent agreement for the metric data (ICC = 0.74, 0.97). 17 Therefore, our results indicated that the image-based method was slightly more reliable than the dry bone method.
In previous studies, the greater sciatic notch shape of the os coxa was the best parameter for sex determination.9,10,13 Using the morphological method, Patriquin et al. 9 studied sex determination in South African whites and blacks by examining the shape of the greater sciatic notch. They noted that the greater sciatic notch shape achieved the highest accuracy for classifying sex, averaging 87.5%, followed by pubic shape at 84.5%. Moreover, Wangdee et al. 10 also reported high accuracy (98.2%–98.7%) and agreed that the greater sciatic notch shape was one of the best sex predictors (accuracy = 98.6%) in the Thai population. The present study also observed the shape of the greater sciatic notch but obtained lower accuracy (86%) despite identical methodology and bone collection, which indicates that this method is prone to subjective error. Generally, the sexual dimorphism of the greater sciatic notch shape is evident: a wide and symmetrical GSN is more common in females, whereas a narrow and asymmetrical GSN is frequently seen in males. Consistent with that notion, the female shape in this study was wide and symmetrical, and the male shape was narrow and asymmetrical. However, Steyn et al. 11 found that although South African black males had the typical narrow shape, white males showed a very wide GSN shape, while both black and white females had typical wide notches. Ethnic factors may therefore also contribute to variation in the greater sciatic notch shape.
The morphometric method was developed to increase the accuracy and reliability of sex estimation from the GSN. Based on the current findings, the intra-observer reliabilities of the dry bone and image-based methods differed markedly, whereas their inter-observer reliabilities were more similar, because the image-based method added measurement of the os coxa images. Measurement may therefore introduce more inter-observer error than scoring alone. However, the inter-observer reliability of the image-based method was acceptable (ICC = 0.74), and its intra- and inter-observer reliabilities were higher than those of the dry bone method. In past studies, the sliding caliper was used for dry bone measurement. For example, Steyn et al., 11 Singh and Potturi, 13 Mahakkanukrauh et al., 7 and Arun et al. 18 used the sliding caliper to measure the os coxa. Their results showed high validation accuracy, ranging from 94.5% to 96.7%. The present study used 2D images of os coxae for both visual examination and measurement. The validation accuracies of (a) observing the shape of the greater sciatic notch only and (b) measuring the parameters in 2D os coxa images only were 88% and 87.5%, respectively. As already summarized, higher accuracy (91.5%) was achieved when the two approaches, visual scoring and measurement of images, were combined. Therefore, these two approaches should be carried out together to obtain high accuracy when using the image-based method.
The aim of this paper was to increase the accuracy and reliability of the sex determination tool based on the os coxa. Hence, we implemented a deep convolutional neural network (GoogLeNet) in this study. Deep learning technology has been widely applied to medical image analysis, including classification of images for diagnosis, localization for anatomical study, detection of lesions or tumors, segmentation of images to focus on regions of interest, and registration. 15 The performance of CNNs is also generally exceptional. Therefore, AlexNet and GoogLeNet have seen widespread usage in medical analysis.19,20 A GoogLeNet module called Inception V3 has also recently been utilized in image analysis. Anatomical areas to which deep learning has been applied include the brain, eyes, chest, breast, heart, abdomen, digital pathology and microscopy, and the musculoskeletal system, with high predictive accuracy. 19 In the current study, the training samples were limited, the hyperparameters were adjusted before training, and the image set was not enlarged with additional samples. Despite those limitations, the predictive accuracies were 90% and 91.95% for the validation and test groups, respectively.
Our study demonstrated that two of the three methods, namely the image-based and deep learning methods, performed effectively in sex classification. The deep learning method was less time-consuming for assessing the test set, requiring approximately one minute per case. The image-based method achieved test accuracy similar to that of the deep learning method but was more time-consuming, requiring approximately five minutes per case. The morphological method yielded more errors in male prediction, whereas the deep learning method performed better in classifying the greater sciatic notch shape images. For that reason, the deep learning method may be the first-choice tool for sex analysis of the GSN shape owing to its time efficiency and high accuracy. The findings also suggest that diagnosis of hip bone fractures affecting the greater sciatic notch area could be done remotely by experts via image analysis using a deep learning model. Furthermore, the image-based method is the most viable option when researchers have limited access to a deep learning program. Further studies should apply deep learning technology in other fields that also use images of dry bone or other items of interest for prediction, and should use more training samples to yield higher accuracy.
Conclusion
In this study, observation of the greater sciatic notch shape produced more erroneous predictions in males, but the image-based and deep learning methods were exceptional in classifying the sexes. The image-based method achieved a high degree of accuracy, similar to that of the GoogLeNet network used in the deep learning method. Although the image-based method showed high accuracy, it required a significant amount of time for the assessment. In contrast, the GoogLeNet network was rapid, easy to use, highly accurate, and viable for remote consultation with experts. For future work, it is recommended that deep convolutional neural networks be applied to image analysis for determining sex, age, stature, or ancestry from Thai skeletons.
Footnotes
Acknowledgements
The authors are grateful to the Faculty of Medicine, Chiang Mai University for financial support, and especially to the Forensic Osteology Research Center, Faculty of Medicine, Chiang Mai University for the samples used in this study. Special thanks to Patara Rattanachet for the linguistic review.
Declaration of conflicting interests
The authors declared no potential conflicts of interest concerning the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Faculty of Medicine, Chiang Mai University [Grant No. 012-2564]. The first author received a TA/RA scholarship from the Graduate School, Chiang Mai University.
