Abstract
The age threshold of 14 years is relevant in Italy as the minimum age for criminal responsibility. It is of utmost importance to evaluate the diagnostic accuracy of every odontological method for age evaluation considering the sensitivity, or the ability to identify the true positive cases, and the specificity, or the ability to identify the true negative cases. The research aims to compare the specificity and sensitivity of four commonly adopted methods of dental age estimation – Demirjian, Haavikko, Willems and Cameriere – in a sample of Italian children aged between 11 and 16 years, with an age threshold of 14 years, using receiver operating characteristic curves and the area under the curve (AUC). In addition, new decision criteria are developed to increase the accuracy of the methods. Among the four odontological methods for age estimation adopted in the research, the Cameriere method showed the highest AUC in both female and male cohorts. The Cameriere method shows a high degree of accuracy at the age threshold of 14 years. To adopt the Cameriere method to estimate the 14-year age threshold more accurately, however, it is suggested – according to the Youden index – that the decision criterion be set at the lower value of 12.928 years for females and 13.258 years for males, obtaining a sensitivity of 85% and specificity of 88% in females, and a sensitivity of 77% and specificity of 92% in males. If a specificity level >90% is needed, the cut-off point should be set at 12.959 years (82% sensitivity) for females.
Introduction
Dental age estimation (DAE), which analyses the maturation of the radicular structure of the teeth, has been a well-known and widely utilised procedure for a long time.1–4 Several methods of DAE in subadults are based on the study of seven permanent teeth, while others, which are dedicated mostly to the DAE of individuals older than 16, rely on analysis of the third molar.5–10 Every method, however, has different accuracy, sensitivity and specificity in the estimation when considered at different age thresholds.11,12 To date, very little research has focused on the sensitivity and specificity indices and the connected rate of false age classifications (false attribution over/under the threshold), especially for the age threshold of 14 years. 13 Moreover, receiver operating characteristic (ROC) curves have seldom been used to evaluate the performance of dental methods for age assessment.14,15
According to Italian law, there are two different age thresholds to consider. The minimum age for criminal accountability is 14 years. Every case involving individuals between 14 and 18 years of age must then be established individually using different criteria, depending mainly on the assessment of the psychological maturity of each subject. An individual older than 18 years of age is considered a fully responsible and accountable adult.
In criminal law, one of the most important issues which often arises during the analysis of the age estimation procedures is the specificity of the adopted method, which determines the rate of false positives (1-specificity). 16 Even if an age misclassification over the threshold does not lead to an erroneous guilty verdict, it could imply heavy legal and ethical consequences, since it could result in a minor being treated as an accountable juvenile (age > 14 years) or as an adult (age > 18 years). 17 Hence, in criminal proceedings, only methods which are characterised by very high specificity and which can provide estimations with a very high probability (at least 90%) can fulfil the legal requirements and support the criminal court in the process of age assessment. 18 By contrast, in a civil law context, the operator is allowed to take a different approach to the specificity and sensitivity of the method of age estimation, since in Italy and many other countries, a probability of 51% may suffice for age assessment in civil proceedings. 19
The present study is a continuation of previous research 16 which explored and compared the accuracy in estimating the age of a sample of Italian subadults using four different methods: the Cameriere (C),20,21 the Demirjian (D),22,23 the Haavikko (H) 24 and the Willems (W)25–27 methods. As in the previous research, methods were applied based on the maturation of the dentition up to the second molar and excluding the third. Therefore, Cameriere's research which considered all the right mandibular teeth is not considered here. 28 Starting from the specificity and sensitivity indexes evaluated for the same methods (D, W, H and C), the present research explores the performance of these methods in predicting the attainment of the age threshold of 14 years. Moreover, a statistical analysis based on ROC curves is performed and new decision criteria are developed, thereby improving the performance of the methods with respect to their best values of specificity and sensitivity, which are inherently linked.29,30
Materials and methods
A total of 501 digital orthopantomographs (OPGs) of Italian children of Caucasian origin were taken from three selected clinical radiology offices from northern, southern and central Italy. The inclusion criteria were an age between 11 years (4015 days) and 15 years and 364 days (5839 days), Italian nationality, Caucasian ethnic group, an unremarkable medical history and the availability of digital files and good-quality images. Exclusion criteria were systemic diseases; premature birth; congenital anomalies; oral pathologies such as tooth agenesis, endodontic treatments, large carious lesions involving the dental pulp, gross mandibular pathologies; X-rays of poor quality; and incomplete data regarding sex or chronological age.
Table 1. The composition of the sample.
Four methods were adopted to provide dental age estimations, and each one took into consideration the maturation of dentition up to the second molar (excluding the third molar). The methods were:
The original Demirjian (D) method for seven teeth, with values taken at the 50th percentile.
The Willems (W) conversion score elaborated with the polynomial regression system for Belgian individuals. The estimated age was calculated at the 50th percentile.
The Cameriere (C) method, using the European formula, as presented in the AgEstimation Project website (http://agestimation.unimc.it).
The Haavikko (H) method, with values taken at the 50th percentile.
Statistics
The intra-rater agreement was calculated for each method using the intra-class correlation coefficient, rescoring 50 radiographs six weeks after the first assessment of dental development. Inter-rater agreement was assessed with two other examiners, again using the intra-class correlation coefficient for each method.
Using the age threshold of 14 years, the area under the ROC curve (AUC)29,30 was calculated for each method and each sex. The binomial exact 95% confidence intervals (CI) for the AUC were also calculated.
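As an illustration of what an exact binomial interval is, the following minimal sketch computes the Clopper-Pearson interval for a generic proportion of k successes out of n trials. It is not the authors' software, and how the interval was applied to the AUC is not specified in the text; the function names and the counts in the usage line are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, iters=100):
    """Bisection on [0, 1] for f(p) = target, with f monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    # Lower bound: the p for which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: the p for which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Hypothetical counts, for illustration only (not taken from the study).
low, high = clopper_pearson(17, 20)
print(f"17/20: 95% CI {low:.3f} to {high:.3f}")
```

The interval is called exact because it inverts the binomial distribution itself rather than relying on a normal approximation, which matters when the proportion is close to 0 or 1.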
A ROC curve is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the total actual positives (true positive rate or sensitivity) versus the fraction of false positives out of the total actual negatives (false positive rate or 1-specificity), at various threshold settings. In the present context, the sensitivity represents the rate of subjects correctly classified as ≥14 years old. The specificity represents the rate of the subjects correctly classified as <14 years. An AUC of 1 represents a perfect test; an AUC of 0.5 represents a worthless test.
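The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the statistical software used in the study: a subject is counted as positive when the chronological age is at least 14 years, and is classified as positive when the estimated dental age is at least the decision criterion. The function names and the four-subject data set are hypothetical.

```python
def roc_points(true_ages, estimated_ages, threshold=14.0):
    """Return (false positive rate, true positive rate) pairs obtained by
    sweeping the decision criterion over every observed estimated age."""
    positives = [t >= threshold for t in true_ages]
    n_pos = sum(positives)
    n_neg = len(positives) - n_pos
    points = [(0.0, 0.0)]  # criterion above every estimate: nobody is positive
    for c in sorted(set(estimated_ages), reverse=True):
        tp = sum(1 for e, p in zip(estimated_ages, positives) if e >= c and p)
        fp = sum(1 for e, p in zip(estimated_ages, positives) if e >= c and not p)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: chronological and estimated ages of four subjects.
true_ages = [12.0, 13.0, 14.5, 15.0]
estimated_ages = [12.1, 13.8, 13.5, 14.2]
print(auc(roc_points(true_ages, estimated_ages)))  # 0.75
```

In this toy example three of the four possible pairs of one subject aged 14 or older and one younger subject are ranked correctly by the estimated age, which is why the area is 0.75.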
Comparisons in AUC for C, D, H and W methods were performed for females and males using the test of De Long. 31
A criterion was chosen by maximising the Youden index (sensitivity-[1-specificity]), and sensitivity, specificity, accuracy (the proportion of all correct results) and positive predictive value (PPV) were calculated. 32 Sensitivity and specificity are measures of intrinsic diagnostic accuracy because they are not affected by the prevalence of the condition. Conversely, the accuracy and PPV are influenced by the prevalence of the condition. This new criterion was chosen to improve the estimate of the age threshold of 14 years in the Italian population, and for the best method, a new criterion with a specificity of 90% was calculated.
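The selection of a criterion by the Youden index can be sketched as below. The sketch assumes that a subject is classified as 14 years or older when the estimated age is at least the criterion, and that the candidate criteria are the observed estimated ages; both are assumptions, as are the function names and the toy data, and the sample must contain subjects on both sides of the threshold.

```python
def classification_metrics(true_ages, estimated_ages, criterion, threshold=14.0):
    """Sensitivity, specificity, accuracy, PPV and Youden index at one criterion."""
    tp = fp = tn = fn = 0
    for t, e in zip(true_ages, estimated_ages):
        actual = t >= threshold      # truly 14 years or older
        predicted = e >= criterion   # classified as 14 years or older
        if actual and predicted:
            tp += 1
        elif actual:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "youden": sensitivity - (1 - specificity),
    }


def youden_criterion(true_ages, estimated_ages, threshold=14.0):
    """Observed estimated age that maximises the Youden index
    (the lowest such value if several tie)."""
    return max(
        sorted(set(estimated_ages)),
        key=lambda c: classification_metrics(
            true_ages, estimated_ages, c, threshold)["youden"],
    )


# Hypothetical data in which the method underestimates age near the threshold.
true_ages = [12.0, 13.0, 14.5, 15.0]
estimated_ages = [12.1, 12.9, 13.5, 14.2]
best = youden_criterion(true_ages, estimated_ages)
print(best, classification_metrics(true_ages, estimated_ages, best))
```

When a method tends to underestimate age close to the threshold, as in this toy data set, the criterion that maximises the Youden index falls below 14 years.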
Results
Intra-class correlation coefficients (ICC) for intra-rater agreement.
ICC for inter-rater agreement.
The sample was almost equally divided into males (244) and females (257), and nearly equally divided into age cohorts from 11 to 16 years (Table 1). One hundred and three subjects (40%) in the female group, and 109 (45%) in the male group, were 14 years old or older.
AUC, Youden criterion, sensitivity, specificity, accuracy and PPV for females.
AUC: area under the curve; Accuracy: proportion of all correct results; PPV: positive predictive value.
AUC, Youden criterion, sensitivity, specificity, accuracy and PPV for males.

Receiver operating characteristic (ROC) curve for each method in females.

ROC curve for each method in males.
Difference in AUC between methods in females.
Difference in AUC between methods in males.
A ROC curve is a chart that plots the fraction of true positives (the sensitivity) against the fraction of false positives (1-specificity). True positives make the curve grow upward, while false positives make the curve shift to the right. Every curve encloses an area below it, called the area under the curve (AUC); the higher the value of the area, the higher the predictive value of the test in terms of the combination of sensitivity and specificity.
Moreover, with the need to minimise the false positive cases, as they are the worst possible error in DAE in criminal law cases, it can be useful to change the cut-off point of the different methods to get the best results in terms of the combination of sensitivity and specificity. The cut-off point is the value that discriminates between positive and negative results.
The Youden criterion for the Cameriere method in females was 12.928 years. The sensitivity was 85% [95% CI: 77–93%] and the specificity was 88% [95% CI: 81–92%]. Setting a specificity of 90%, the criterion becomes 12.959 years with a sensitivity of 82% [95% CI: 73–88%]. The Youden criterion for the Cameriere method in males was 13.258 years. The sensitivity was 77% [95% CI: 68–85%], and the specificity was 92% [95% CI: 86–96%]. There is no need to set a specificity of 90% in this case because the Youden criterion already has a specificity of 92%.
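One way to obtain a criterion with a prescribed minimum specificity is to take the lowest candidate criterion whose specificity reaches the target, since lowering the criterion can only raise the sensitivity. The sketch below illustrates this idea; it is an assumption about the procedure, not the authors' documented method, and the function name and data are hypothetical.

```python
def criterion_for_specificity(true_ages, estimated_ages,
                              min_specificity=0.90, threshold=14.0):
    """Lowest observed estimated age whose specificity is at least
    min_specificity, returned as (criterion, sensitivity, specificity).
    Returns None if no observed value reaches the target."""
    negatives = [e for t, e in zip(true_ages, estimated_ages) if t < threshold]
    positives = [e for t, e in zip(true_ages, estimated_ages) if t >= threshold]
    # Specificity never decreases as the criterion rises, so the first
    # candidate that meets the target also has the highest sensitivity.
    for c in sorted(set(estimated_ages)):
        specificity = sum(e < c for e in negatives) / len(negatives)
        if specificity >= min_specificity:
            sensitivity = sum(e >= c for e in positives) / len(positives)
            return c, sensitivity, specificity
    return None


# Hypothetical sample: ten subjects younger than 14 and four aged 14 or older.
true_ages = [11.0, 11.3, 11.6, 12.0, 12.3, 12.6, 13.0, 13.3, 13.6, 13.9,
             14.1, 14.4, 14.8, 15.2]
estimated_ages = [11.0, 11.2, 11.5, 11.8, 12.0, 12.2, 12.5, 12.8, 13.0, 13.6,
                  13.2, 13.4, 13.7, 14.0]
print(criterion_for_specificity(true_ages, estimated_ages))
```

Compared with the Youden criterion, a specificity-constrained criterion trades some sensitivity for fewer false positives, which matches the pattern reported for females (12.959 years at 82% sensitivity versus 12.928 years at 85%).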
Discussion
There has been great interest for many years in the evaluation of age in subadults via the analysis of dental structure calcification and maturation, with a particular focus on specific age thresholds (i.e. 14, 16 and 18 years). There is strong evidence in the literature of the accuracy of the methods for DAE based on the analysis of OPGs.33–39 Each, however, offers a different specificity and sensitivity when applied to specific age thresholds, and most research has focused on the third molar and the higher age thresholds (16, 18 and 21 years). The issue of specificity, moreover, is of special importance – especially in a criminal law context – to avoid false positive attributions.
This study therefore compared four of the most common methods for DAE in an Italian sample distributed around the age threshold of 14 years (11–16 years). With the aid of ROC curve analysis, it aimed to find a method of evaluation that combines the highest possible specificity, which is necessary in the context of criminal case analysis, and the highest obtainable sensitivity in order to limit the percentage of false negatives. ROC curves are very helpful in determining the diagnostic capabilities of different methods. With the AUC value, it is easy to establish which of the methods gives the best combination of sensitivity and specificity.
The ROC curves examined were divided by sex and method.
The results of the present study show that the Cameriere method, which is based on the maturation of dentition up to the second molar and uses the European formula, provides the highest AUC value both for males and females. The Cameriere method shows a highly discriminative accuracy for the age threshold of 14 years, even if it was not developed for this purpose. Cameriere himself proposed an alternative approach that also included the wisdom tooth when a cut-off age of 14 years is used. 15 In fact, the European formula for a subject with a complete calcification of the seven mandibular permanent teeth allows a maximum attribution of 13.689 years in males and 14.064 years in females.
In terms of the Haavikko and Cameriere methods, it is necessary to set the cut-off point at a lower level to have greater accuracy of estimation at the age threshold of 14 years, while in the Demirjian and Willems methods, there is no need for a cut-off point change.
Therefore, to estimate more effectively the age threshold of 14 years in the Italian population, the Cameriere method can be used with a proper adjustment. In fact, setting the cut-off point at 12.928 years for females and 13.258 years for males, it is possible to obtain a sensitivity of 85% and a specificity of 88% in females, and a sensitivity of 77% and a specificity of 92% in males. If a specificity level >90% is needed, the cut-off point can be set at 12.959 years (82% sensitivity) for females.
Conclusions
After examining a sample of Italian children aged between 11 and 16 years using ROC curves, it can be stated that:
In the case of the Haavikko and Cameriere methods, it is necessary to set the cut-off point at a lower level to have greater accuracy of estimation at the age threshold of 14 years. Considering the Youden index, in the Cameriere method, the cut-off point should be set at 12.928 years in females and 13.258 years in males; in the Haavikko method, the cut-off point should be set at 12.20 years in females and 12.53 years in males.
The Cameriere method has the highest AUC value in both males and females. This method shows a high discrimination accuracy for the age threshold of 14 years.
In the Cameriere method, by setting the cut-off point at 12.928 years for females and 13.258 years for males, it is possible to obtain a sensitivity of 85% and specificity of 88% in females, and a sensitivity of 77% and specificity of 92% in males. If a specificity level >90% is needed, the cut-off point can be set at 12.959 years (82% sensitivity) for females.
Future research should validate the proposed approach in a larger sample and investigate whether the predictions can be improved by applying Cameriere's method with the apex opening of the third molars also taken into consideration.
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
