Inter- and Intraobserver Agreement in the Assessment of Thyroid Nodule Ultrasound Features and Classification Systems: A Blinded Multicenter Study

Abstract

Background:

Single-center trials have demonstrated a moderate-to-substantial level of interobserver agreement in the evaluation of ultrasound (US) features of thyroid nodules. Multicenter studies on US agreement, however, are scarce, and data on intraobserver agreement are limited. The aim of the study was to assess inter- and intraobserver agreement between different thyroid centers and different specialists.

Methods:

A blinded analysis of 100 electronically recorded thyroid nodule US images was conducted in three large-volume thyroid centers by seven radiologists and endocrinologists. The evaluation was repeated after randomization 4 months later. The following US characteristics were evaluated: composition, echogenicity, margins, intranodular echogenic spots, vascularity, and shape. Thyroid nodules were also classified according to AACE/ACE/AME, EU-TIRADS, ATA, and ACR-TIRADS US classifications. Intra- and interobserver agreement was calculated using cross-tabulation expressed as mean Cohen's Kappa.

Results:

Interobserver agreement for US features: K-coefficient was 0.53 for composition, 0.47 for echogenicity, 0.46 for intranodular vascularity, and 0.33 for margins of the nodules. For echogenic foci, the K-coefficient was 0.47 for microcalcifications, 0.38 for macrocalcifications, and 0.11 for the subcategory comet-tail artifacts; for shape, it was 0.42. Operators were uncertain about the definition of hyperechoic foci in 16% of cases and described them as “hyperechoic foci of uncertain significance.” Interobserver Cohen-K for US classification systems was 0.44 for AACE, 0.42 for ACR-TIRADS, 0.39 for EU-TIRADS, and 0.34 for ATA. Intraobserver agreement: the K-coefficient for nodule US features was 0.62 for intranodular vascularity, 0.58 for composition, 0.60 for echogenicity, 0.54 for macrocalcifications, 0.55 for microcalcifications, 0.47 for comet tails, 0.39 for margins, and 0.35 for shape. Intraobserver Cohen-K for US classification systems was 0.54 for AACE, 0.49 for ACR-TIRADS, 0.38 for ATA, and 0.33 for EU-TIRADS.

Conclusions:

Intraobserver reproducibility for thyroid nodule US reporting and US classification systems appears fairly adequate, while the interobserver agreement between different centers is lower than that assessed in single-center trials. The reporting and rating ability of thyroid US examiners still appears inconsistent. A unified lexicon of thyroid US features, a simplified method of classification, and dedicated training in the description of thyroid US findings may increase observer agreement and the predictive value of US classification systems in real-world practice.

Introduction

Thyroid nodules are a common clinical finding, being detected in 19–67% of the general population (1,2). As most thyroid lesions are benign and may be safely managed with a surveillance program, the main goal of their initial assessment is the identification of the minority of nodules that could harbor a clinically significant cancer (3).

Thyroid ultrasound (US) examination is widely used as the first diagnostic tool for the management of thyroid nodular disease because it may demonstrate the presence of a few well-established findings suspicious for malignancy (4,5). In the majority of cases, however, thyroid sonography reveals less clear-cut US features with a low predictive value. For these reasons, major endocrine and radiological societies produced US classification systems that were reported to provide a fairly good prediction of malignancy and, potentially, a more accurate selection of the lesions to be submitted to fine-needle aspiration (FNA) biopsy (6–10).

Single-center trials demonstrated a good level of interobserver agreement, from moderate to substantial, in the evaluation of the US features of thyroid nodules (3,5,11–15). Yet, thyroid US is a rather subjective imaging method and is dependent on the specific expertise of the operators and the quality of their US equipment (5,11,16). Therefore, wider multicenter studies are needed to establish the actual clinical usefulness of US classification systems and their applicability to real-world practice. Unfortunately, studies on the level of agreement between different thyroid centers are scarce, and data on intraobserver agreement for thyroid nodule features are limited (5,11–16).

The aim of this study was to assess the inter- and intraobserver agreement between different thyroid centers and different specialists (endocrinologists and radiologists with specific thyroid expertise) in the evaluation of thyroid nodule US features and the definition of the US classification scores.

Materials and Methods

One hundred thyroid nodules, either solitary or in multinodular goiter, referred for surgery to the Regina Apostolorum Thyroid Center, underwent preoperative assessment with two similar state-of-the-art US machines (Esaote Twin, Genoa, Italy) equipped with a 5–15 MHz linear transducer. Sample size was determined according to the guidelines proposed by Cantor (17), which yielded a minimum sample size of 50 cases. US images were acquired during preoperative neck evaluation by two operators with specific thyroid expertise who did not participate in the subsequent trial. The scanning protocol included both transverse and longitudinal real-time multiplane imaging of each thyroid lesion. Static gray-scale and color-Doppler images were saved to PACS, and their labeling was removed before the trial. A blinded analysis of the electronically recorded US images was conducted separately in three high-volume centers for thyroid diseases (Regina Apostolorum, Rome; Catholic University, Rome; and Santa Maria Nuova, Reggio Emilia, Italy) by seven thyroid imaging experts (two radiologists and five endocrinologists), each with at least 15 years of experience.

The mean nodule size was 2.7 cm (range 0.6–8.0 cm). There were 30 (30%) malignancies with a mean size of 2.2 cm (range 0.6–5.3 cm), of which 24 (80%) were papillary carcinomas, 5 (16.7%) were follicular carcinomas, and 1 (3.3%) was a medullary cancer. The benign nodules had a mean size of 2.9 cm (range 0.9–8.0 cm). The structure of the nodules, according to the EU-TIRADS Lexicon (8), was as follows: solid: 64, predominantly solid: 22, predominantly cystic: 9, spongiform: 3, and cystic: 2.

Before starting the trial, all the participants received the EU-TIRADS lexicon for the definition of thyroid nodule US findings and the schemes of the four US classification systems under evaluation (8). After 1 month their understanding of the reporting modalities of the trial was assessed by an external monitor (A.P.).

The experts were requested to fill in, for each set of images, a multiple-answer electronic questionnaire based on the EU-TIRADS lexicon and, subsequently, to stratify the risk of malignancy of the lesions on the basis of the 5-class ACR-TIRADS, ATA, and EU-TIRADS systems and the 3-class AACE/ACE/AME US classification (7–10). While the ACR-TIRADS classification is based on points, the other US systems are based on patterns.

The following US characteristics, defined in accordance with the EU-TIRADS terminology (8), were evaluated: composition (solid, predominantly solid, predominantly cystic, and cystic); echogenicity (hyperechoic, isoechoic, mildly and deeply hypoechoic); margins (well-defined, ill-defined, microlobulated, and spiculated); vascularity (no vascular signals, perinodular or slight intranodular flow, and marked intranodular flow); echogenic foci (microscopic, macroscopic, continuous or interrupted egg shell calcifications, and comet-tail artifacts); and shape (round/oval, taller than wide) (8). In case of uncertainty, the experts were also required to define the hyperechoic foci as “spots of uncertain significance” (7).

After 4 months, the experts performed a second blinded evaluation of the same images, which had been re-randomized into a different order. No consultation was allowed, and the blinding of the trial was supervised. The study flow chart is summarized in Figure 1. This retrospective observational study received Institutional Review Board review and approval and followed the tenets of the Declaration of Helsinki.

FIG. 1.

Flowchart of the study.

Statistical analysis

Statistical analysis was performed using the Statistical Package for Social Science (IBM-SPSS), release 21.0. Continuous variables are expressed as mean ± standard deviation, while categorical variables are displayed as frequencies.

The interobserver agreement was calculated using cross-tabulation expressed in Cohen's Kappa. Kappa values were evaluated, according to the standard proposed by Landis and Koch, as follows: 0–0.20 poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.0 almost perfect agreement (18).
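As an illustration of the statistic described above, the following is a minimal Python sketch of Cohen's kappa for one pair of raters, together with the agreement bands quoted in the text. It is not the SPSS procedure used in the study: the function names, the example labels, and the treatment of negative kappa values as "poor" are assumptions made for illustration only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


def agreement_level(kappa):
    """Map kappa to the bands quoted in the text (after Landis and Koch).

    Assumption: values below 0 are also reported as "poor".
    """
    k = round(kappa, 2)  # the bands are stated to two decimals
    bands = [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if k <= upper:
            return label
    return "almost perfect"


# Hypothetical composition calls by two observers on ten nodules.
observer_1 = ["solid"] * 6 + ["cystic"] * 4
observer_2 = ["solid"] * 4 + ["cystic"] * 5 + ["solid"]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f} ({agreement_level(kappa)})")
```

With seven observers, a mean kappa would presumably be obtained by averaging such pairwise values; the text does not state the averaging scheme, so that step is left out of the sketch.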

For each US feature and classification system, the distribution of the different grades of concordance was also calculated. Concordance was defined as the frequency of nodules for which at least six of the seven observers agreed. Mean Cohen's K and mean concordance were calculated from each repeated intrarater observation (test–retest reliability). The intraobserver agreement was estimated as the percentage of answers that coincided between the two examinations performed by the experts.
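The two percentage measures defined in this paragraph can be sketched in a few lines of Python. This is only an illustrative reading of the definitions (at least six of seven observers choosing the same category; identical answers on the first and second reading); the function names and the toy data are hypothetical and do not come from the study.

```python
from collections import Counter


def concordance_rate(ratings_per_nodule, min_agree=6):
    """Percentage of nodules on which at least `min_agree` observers agree.

    `ratings_per_nodule` holds one list per nodule with each observer's category.
    """
    concordant = 0
    for ratings in ratings_per_nodule:
        # Size of the largest group of observers who chose the same category.
        largest_group = Counter(ratings).most_common(1)[0][1]
        if largest_group >= min_agree:
            concordant += 1
    return 100.0 * concordant / len(ratings_per_nodule)


def retest_agreement(first_reading, second_reading):
    """Percentage of items given the same answer by one observer on both readings."""
    if len(first_reading) != len(second_reading) or not first_reading:
        raise ValueError("readings must cover the same non-empty set of items")
    same = sum(a == b for a, b in zip(first_reading, second_reading))
    return 100.0 * same / len(first_reading)


# Hypothetical composition calls by seven observers on four nodules.
example_ratings = [
    ["solid"] * 7,                    # 7/7 agree: concordant
    ["solid"] * 6 + ["cystic"],       # 6/7 agree: concordant
    ["solid"] * 5 + ["cystic"] * 2,   # 5/7 agree: not concordant
    ["solid"] * 4 + ["cystic"] * 3,   # 4/7 agree: not concordant
]
inter_concordance = concordance_rate(example_ratings)

# Hypothetical first and second readings by a single observer.
first = ["solid", "solid", "cystic", "solid"]
second = ["solid", "cystic", "cystic", "solid"]
intra_agreement = retest_agreement(first, second)
print(inter_concordance, intra_agreement)
```

Unlike kappa, neither percentage is corrected for chance agreement, which may help explain why rare findings can show high concordance alongside a low kappa in Tables 1 and 2.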

Results

The level of interobserver agreement for the different US features of thyroid nodules is reported in Table 1.

Table 1.

Interobserver Agreement for the Ultrasound Features and Ultrasound Classification Systems of Thyroid Nodules

US features	Mean Cohen's K	Level	Mean concordance (%)
Composition	0.53	Moderate	71
Echogenicity	0.47	Moderate	48
Margins	0.33	Fair	64
Intranodular vascularity	0.46	Moderate	51
Microscopic calcifications	0.47	Moderate	93
Macrocalcifications	0.38	Fair	94
Egg shell calcification	0.65	Substantial	99
Comet	0.11	Poor	87
Taller than wide	0.47	Moderate	93
AACE/ACE/AME	0.44	Moderate	51
ATA	0.34	Fair	26
EU-TIRADS	0.39	Fair	33
ACR	0.42	Moderate	36

Mean concordance: defined as a consistent definition by seven out of seven observers (100%) or by six out of seven observers (86%).

US, ultrasound.

The mean K coefficient was 0.53 (moderate level of agreement) for composition, 0.47 (moderate) for echogenicity, and 0.46 (moderate) for intranodular vascularity. For the echogenic foci, the K-coefficient was 0.47 (moderate) for microscopic calcifications, 0.38 (fair) for macrocalcifications, and 0.11 (poor) for comet-tail artifacts. The operators were uncertain about the conclusive definition of the hyperechoic foci in 16% of cases and preferred to describe them as “hyperechoic foci of uncertain significance.”

Finally, the K-coefficient was 0.33 (fair) for the description of the margin of the nodules and was 0.42 (moderate) for the shape (oval vs. round).

The interobserver mean Cohen-K for US classification systems was 0.44 for AACE (moderate level of agreement), 0.42 (moderate) for ACR-TIRADS, 0.39 (fair) for EU-TIRADS, and 0.34 (fair) for ATA.

The results of the intraobserver analysis are reported in Table 2 as the mean K-coefficient and the mean of the concordance degree.

Table 2.

Intraobserver Agreement for the Ultrasound Features and Ultrasound Classification Systems of Thyroid Nodules

US features	Mean Cohen's K	Level	Mean concordance (%)
Composition	0.58	Moderate	84.6
Echogenicity	0.60	Moderate	73.0
Margins	0.39	Fair	75.4
Intranodular vascularity	0.62	Substantial	77.5
Microcalcifications	0.55	Moderate	93.6
Macrocalcifications	0.54	Moderate	95.7
Egg shell calcification	0.96	Almost perfect	99.7
Comet	0.47	Moderate	95.3
Taller than wide	0.35	Fair	92.7
ATA	0.38	Fair	64.7
AACE	0.54	Moderate	75.7
EU-TIRADS	0.33	Fair	55.0
ACR tot	0.49	Moderate	64.5

The K-coefficient for nodule US features was 0.62 (substantial level of agreement) for intranodular vascularity, 0.58 (moderate-substantial) for composition, 0.60 (moderate-substantial) for echogenicity, 0.54 (moderate) for macrocalcifications, 0.55 (moderate) for microcalcifications, 0.47 (moderate) for comet tails, 0.39 (fair) for margins, 0.35 (fair) for taller-than-wide shape, and 0.35 (fair) for the shape (oval vs. round). The mean Cohen-K coefficient for US classification systems was 0.54 (moderate level of agreement) for AACE, 0.49 (moderate) for ACR-TIRADS, 0.38 (fair) for ATA, and 0.33 (fair) for EU-TIRADS.

Discussion

Several thyroid nodule US classification systems from different scientific societies are currently available for the evaluation of the risk of malignancy and the indication for cytological assessment (7–10). A few recent studies demonstrated that the main US classification systems have a high predictive value for malignancy in high-risk US categories and that they are effective for ruling out the indication for FNA in low-US-risk nodules (3). The 3- and 5-category classifications released by the ETA, ATA, AACE/ACE/AME, American College of Radiology, and KSThR (7–10,19) showed a rather similar diagnostic accuracy and a substantial interobserver agreement (Table 3) in single-center trials (3,5,11,14). However, a few problems and some potentially limiting aspects should be considered. The available data derive mostly from single-center trials, generally performed by examiners with similar training, while the evidence from controlled multicenter studies is at present limited (5,14,15). Thyroid US examination, moreover, is an operator-dependent diagnostic procedure and is influenced by the specific expertise of the operator and the quality of the US equipment, both relevant for the accurate definition of some characteristics. Finally, the intraobserver agreement has not, until now, been systematically assessed and was evaluated either in radiological or in endocrinological services. This issue is relevant because the concordance of the same operator in repeated examinations could be less than optimal when reporting a few subtle differences in US findings, such as the presence of ill-defined, spiculated, or lobulated margins or the actual source of intranodular echoic spots.

Table 3.

Interobserver Agreement for the Different Ultrasound Classification Systems in Multi- and Single-Center Studies

US classification system	Multicenter	Single center
US classification system	Present study	Persichetti et al. (2018)	Grani et al. (2018)	Hoang et al. (2018)	Pang et al. (2019)
AACE/ACE/AME	0.44	0.82	0.73	—	—
ATA	0.34	0.76	0.75	—	0.51
EU-TIRADS	0.39	—	0.68	—	—
ACR	0.42	—	0.61	0.51	—

Interobserver agreement is expressed with Cohen's K.
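The comparison in Table 3 can be made explicit with a short arithmetic sketch in Python. The values are transcribed from the table; assigning every non-present-study column to the single-center group follows the table header, and the variable and function names are chosen here for illustration.

```python
# Cohen's K values transcribed from Table 3.
present_study = {"AACE/ACE/AME": 0.44, "ATA": 0.34, "EU-TIRADS": 0.39, "ACR": 0.42}

# Values reported by the other studies listed in Table 3.
single_center = {
    "AACE/ACE/AME": [0.82, 0.73],  # Persichetti et al., Grani et al.
    "ATA": [0.76, 0.75, 0.51],     # Persichetti et al., Grani et al., Pang et al.
    "EU-TIRADS": [0.68],           # Grani et al.
    "ACR": [0.61, 0.51],           # Grani et al., Hoang et al.
}


def gap_to_lowest_single_center(system):
    """Lowest single-center K minus the present multicenter K for one system."""
    return round(min(single_center[system]) - present_study[system], 2)


gaps = {system: gap_to_lowest_single_center(system) for system in present_study}
for system, gap in gaps.items():
    print(f"{system}: present K = {present_study[system]:.2f}, gap = {gap:+.2f}")
```

A positive gap for every system means that the present multicenter value lies below even the lowest single-center value listed for that system.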

The present study addressed the inter- and intraobserver agreement in the definition of the main thyroid nodule US findings using four major US classification systems.

Interobserver agreement

The level of agreement among operators from different thyroid centers in the description of US features of thyroid nodules was only partially satisfactory, as it ranged from fair to moderate, only partly confirming the results from former single-center studies. The lowest level of consistency was found for the characteristics of margins, with a Cohen's K = 0.33, similar to the previously reported data, ranging from 0.13 to 0.61 (13,20). The agreement on the presence of microcalcifications and the composition of the nodules was rather lower than the values found in former single-center trials, which ranged from 0.51 to 0.54 and from 0.62 to 0.81, respectively (11–13,21). The low level of agreement for microcalcifications, interrupted rim calcifications, and shape is in accordance with a recent trial on the predictivity of the ATA classification (14).

The interobserver agreement for the four thyroid nodule US classification systems was not completely satisfactory either. The AACE/ACE/AME and ACR had a moderate level of agreement (Cohen's K = 0.44 and 0.42, respectively), while the ATA and EU-TIRADS demonstrated only a fair interobserver concordance (Cohen's K = 0.34 and 0.39, respectively). These results appear less acceptable for clinical practice than those found in former single-center trials, as reported in Table 3.

Not surprisingly, these results are in accordance with the studies concerning the discrepancy in the US assessment of breast masses with the BI-RADS US lexicon (19,20,22,23). In analogy with thyroid US data, the interobserver agreement in breast lesions ranged from 0.32 to 0.37 for margins, from 0.36 to 0.41 for the echo pattern, and from 0.48 to 0.51 for the presence of microcalcifications. The choice of an indeterminate report for intranodular echoic spots could represent a warning about the possible presence of a potentially malignant finding and could be considered part of an intermediate risk category. The performance rate was lower for less trained operators, but was improved by a dedicated training, similarly to what was reported for thyroid nodule classifications (11,16).

Intraobserver agreement

Intraobserver reproducibility was generally higher than interobserver agreement but was not satisfactory for a reliable use in clinical practice. The agreement was, on average, moderate for nodule composition, echogenicity, and microcalcifications (Cohen's K = 0.58, 0.60, and 0.55, respectively) and only fair for the definition of margins (Cohen's K = 0.39). Similarly, the intraobserver agreement for the US classification systems was moderate for AACE/ACE/AME and ACR-TIRADS (Cohen's K = 0.54 and 0.49, respectively) and only fair for the ATA and EU-TIRADS classifications (Cohen's K = 0.38 and 0.33, respectively).
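The item-by-item relation between the two kinds of agreement can be checked directly from Tables 1 and 2. The Python sketch below subtracts the interobserver mean K from the intraobserver mean K for each row; the values are transcribed from the tables, while the harmonized row labels (the two tables name some rows slightly differently) and the variable names are choices made here for illustration.

```python
# Mean Cohen's K per item, transcribed from Table 1 (inter) and Table 2 (intra).
inter = {
    "Composition": 0.53, "Echogenicity": 0.47, "Margins": 0.33,
    "Intranodular vascularity": 0.46, "Microcalcifications": 0.47,
    "Macrocalcifications": 0.38, "Egg shell calcification": 0.65,
    "Comet-tail artifacts": 0.11, "Taller than wide": 0.47,
    "AACE/ACE/AME": 0.44, "ATA": 0.34, "EU-TIRADS": 0.39, "ACR": 0.42,
}
intra = {
    "Composition": 0.58, "Echogenicity": 0.60, "Margins": 0.39,
    "Intranodular vascularity": 0.62, "Microcalcifications": 0.55,
    "Macrocalcifications": 0.54, "Egg shell calcification": 0.96,
    "Comet-tail artifacts": 0.47, "Taller than wide": 0.35,
    "AACE/ACE/AME": 0.54, "ATA": 0.38, "EU-TIRADS": 0.33, "ACR": 0.49,
}

# Positive values mean the same observer agrees with themselves more
# than different observers agree with each other.
differences = {item: round(intra[item] - inter[item], 2) for item in inter}

# Items where the tabulated intraobserver K is lower than the interobserver K.
exceptions = [item for item, diff in differences.items() if diff < 0]
largest_gain = max(differences, key=differences.get)

for item, diff in differences.items():
    print(f"{item}: {diff:+.2f}")
print("Lower intra than inter:", exceptions)
```

On the tabulated values, most items show a higher intraobserver K, with taller-than-wide shape and EU-TIRADS as the exceptions, so the overall statement holds for the majority of items rather than for every row.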

As a whole, the results of the present study demonstrate that the interobserver agreement between thyroid US experts operating in different centers ranges from fair to moderate for both the definition of the single US features and the rating of the nodules according to the US classification systems.

Variability in evaluating thyroid nodule US features was highest for margins and echogenic foci, except for macrocalcifications, confirming the data reported in a recent study that demonstrated a Cohen-K value ranging from 0.25 to 0.39 (4). The classifications with fewer and less articulated classes showed a better inter- and intraobserver reliability than the more complex ones. As the description of specific features may markedly modify the rating of the risk of malignancy, the definition of these findings should be carefully characterized, and the option of an “indeterminate report” should be considered for high-risk findings (such as the presence or absence of microcalcifications) in case of operator uncertainty.

Because intraobserver reproducibility was better than interobserver reproducibility, even if barely adequate for clinical practice, the operators showed that their personal criteria for reporting and classifying thyroid nodule US features were consistent, although in incomplete agreement with those of the examiners at the other centers.

Limits of the study

Even if the study was designed to prevent major bias, a few limitations are present. Observers were blinded to the conclusive pathological findings, but they were aware that all the nodules under examination were submitted to surgery and had pathologic confirmation. In their everyday practice, the observers mostly used the AACE/ACE/AME and the EU-TIRADS systems. Despite the preliminary training, this uneven initial familiarity with the different classification methodologies could have influenced the agreement results. The use of video clips might have improved the study by bringing it closer to the real conditions of clinical practice. Therefore, a randomized study comparing the agreement between static images and clips would be useful for future considerations. Finally, the study provides information about results obtained by expert thyroid US operators, but the agreement among less experienced sonographers could be different and, possibly, less satisfactory (24).

In conclusion, the present study highlights that, even among experts, the interobserver agreement among multiple centers is low and that more work is needed. In the community and in centers without specific thyroid expertise, the situation is probably even less satisfactory.

To improve the inter- and intraobserver agreement, a universal lexicon and classification system should be released by the major scientific societies of the field, and a consensus conference should be held among the different professional organizations. Moreover, even if we did not specifically address this issue, on the basis of the present and previous studies assessing the predictivity of classification systems, a 4-tier system might provide an appropriate balance between the complexity of the classification and the accuracy of risk scoring. Finally, on the basis of initial results (25), the use of artificial intelligence appears to be a promising tool for the improvement of the diagnostic accuracy and consistency of thyroid US reporting.

Footnotes

Author Disclosure Statement

No competing financial interests exist.

Funding Information

No funding was received for this article.

References

1. Dean, Gharib. 2008. Epidemiology of thyroid nodules. Best Pract Res Clin Endocrinol Metab, 22:901–911.

2. Guth, Theune, Aberle, Galach, Bamberger. 2009. Very high prevalence of thyroid nodules detected by high-frequency (13 MHz) ultrasound examination. Eur J Clin Invest, 39:699–706.

3. Persichetti, Di Stasio, Guglielmi, Bizzarri, Taccogna, Misischi, Graziano, Petrucci, Bianchini, Papini. 2018. Predictive value of malignancy of thyroid nodule ultrasound classification systems: a prospective study. J Clin Endocrinol Metab, 103:1359–1368.

4. Papini, Guglielmi, Bianchini, Crescenzi, Taccogna, Nardi, Panunzi, Rinaldi, Toscano, Pacella. 2002. Risk of malignancy in nonpalpable thyroid nodules: predictive value of ultrasound and color-Doppler features. J Clin Endocrinol Metab, 87:1941–1946.

5. Hoang, Middleton, Farjat, Teefey, Abinanti, Boschini, Bronner, Dahiya, Hertzberg, Newman, Scanga, Vogler, Tessler. 2018. Interobserver variability of sonographic features used in the American College of Radiology Thyroid Imaging Reporting and Data System. AJR Am J Roentgenol, 211:162–167.

6. Kim, Park, Chung, Oh, Kim, Lee, Yoo. 2002. New sonographic criteria for recommending fine-needle aspiration biopsy of nonpalpable solid nodules of the thyroid. AJR Am J Roentgenol, 178:687–691.

7. Gharib, Papini, Garber, Duick, Harrel, Hegedus, Paschke, Valcavi, Vitti P; AACE/ACE/AME Task Force on Thyroid Nodules. 2016. American Association of Clinical Endocrinologists, American College of Endocrinology, and Associazione Medici Endocrinologi medical guidelines for clinical practice for the diagnosis and management of thyroid nodules – 2016 update. Endocr Pract, 22:622–639.

8. Russ, Bonnema, Erdogan, Durante, Ngu, Leenhardt. 2017. European Thyroid Association guidelines for ultrasound malignancy risk stratification of thyroid nodules in adults: the EU-TIRADS. Eur Thyroid J, 6:225–237.

9. Yoon, Lee, Kim, Moon, Kwak. 2016. Malignancy risk stratification of thyroid nodules: comparison between the Thyroid Imaging Reporting and Data System and the 2014 American Thyroid Association Management Guidelines. Radiology, 278:917–924.

10. Tessler, Middleton, Grant, Hoang, Berland, Teefey, Cronan, Beland, Desser, Frates, Hammers, Hamper, Langer, Reading, Scoutt, Stavros. 2017. ACR Thyroid Imaging, Reporting and Data System (TI-RADS): white paper of the ACR TI-RADS Committee. J Am Coll Radiol, 14:587–595.

11. Grani, Lamartina, Cantisani, Maranghi, Lucia, Durante. 2018. Interobserver agreement of various thyroid imaging reporting and data systems. Endocr Connect, 7:1–7.

12. Park, Park, Choi, Kim, Son, Lee, Yoon, Kim, Moon, Kwak. 2012. Interobserver variability and diagnostic performance in US assessment of thyroid nodule according to size. Ultraschall Med, 33:166–190.

13. Choi, Kim, Kwak, Kim, Son. 2010. Interobserver and intraobserver variations in ultrasound assessment of thyroid nodules. Thyroid, 20:167–172.

14. Pang, Margolis, Menezes, Maan, Ghai. 2019. Diagnostic performance of 2015 American Thyroid Association guidelines and inter-observer variability in assigning risk category. Eur J Radiol Open, 6:122–127.

15. Itani, Assaker, Moshiri, Dubinsky, Dighe. 2019. Inter-observer variability in the American College of Radiology Thyroid Imaging Reporting and Data System: in-depth analysis and areas for improvement. Ultrasound Med Biol, 45:461–470.

16. Kim, Park, Jung, Kang, Kim, Choi, Kim, Oh, Kim, Jeong, Yim. 2010. Observer variability and the performance between faculties and residents: US criteria for benign and malignant thyroid nodules. Korean J Radiol, 11:149–155.

17. Cantor. 1996. Sample-size calculations for Cohen's Kappa. Psychol Methods, 1:150–153.

18. Landis, Koch. 1977. The measurement of observer agreement for categorical data. Biometrics, 33:159–174.

19. Shin, Baek, Chung, Ha, Kim, Lee, Lim, Moon, Na, Park, Choi, Hahn, Jeon, Jung, Kim, Kwak, Lee, Park, Sung JY; Korean Society of Thyroid Radiology (KSThR) and Korean Society of Radiology. 2016. Ultrasonography diagnosis and imaging-based management of thyroid nodules: revised Korean Society of Thyroid Radiology consensus statement and recommendations. Korean J Radiol, 17:370–395.

20. Wienke, Chong, Fielding, Zou, Mittelstaedt. 2003. Sonographic features of benign thyroid nodules: interobserver reliability and overlap with malignancy. J Ultrasound Med, 22:1027–1031.

21. Park, Kim, Jung, Kang, Kim, Choi, Sung, Yim, Jeong. 2010. Observer variability in the sonographic evaluation of thyroid nodules. J Clin Ultrasound, 38:287–293.

22. Schwab, Redling, Siebert, Schotzau, Schoenenberger, Zanetti-Dallenbach. 2016. Inter- and intra-observer agreement in ultrasound BI-RADS classification and real-time elastography Tsukuba score assessment of breast lesions. Ultrasound Med Biol, 42:2622–2629.

23. American College of Radiology. 2013. ACR BI-RADS Atlas: Breast Imaging Reporting and Data System, 5th ed. American College of Radiology, Reston, VA.

24. Koh, Kim, Lee, Kim, Kwak, Moon, Yoon. 2018. Diagnostic performances and interobserver agreement according to observer experience: a comparison study using three guidelines for management of thyroid nodules. Acta Radiol, 59:917–923.

25. Wildman-Tobriner, Buda, Hoang, Middleton, Thayer, Short, Tessler, Mazurowski. 2019. Using artificial intelligence to revise ACR TI-RADS risk stratification of thyroid nodules: diagnostic accuracy and utility. Radiology, 292:112–11.