Grading lumbar disc degeneration: a comparison between low- and high-field MRI

Abstract

Background

More advanced disc degeneration on magnetic resonance imaging (MRI) is found in individuals with low back pain. However, it is unclear whether the grading of disc degeneration is independent of the scanner’s field strength.

Purpose

To compare disc degeneration on high- versus low-field MRI.

Material and Methods

Patients with low back pain were enrolled to undergo high-field (3 T) MRI, followed by low-field (0.25 T) MRI of the lumbar spine within 3 h. Three radiologists graded the disc degeneration on Pfirrmann’s grading scale, with a hiatus of 3 months between the high- and low-field readings. A subsample was regraded 6 months later. Reproducibility was measured by weighted kappa statistics (using the AGREE option of the TABLES statement in PROC FREQ, SAS), absolute agreement (i.e. 1:1 agreement/the total number) and the difference in the prevalence of grades (McNemar test).

Results

Moderate to substantial agreement (κ = 0.52–0.62) and absolute agreement of 43.8–66.1% were found between field strengths. Low-field MRI tended to yield more grades at both the lower and the higher end of the scale than high-field MRI, resulting in a significant difference in the prevalence of grades (p < 0.001). Both field strengths resulted in moderate to substantial inter-reader agreement (low-field: κ = 0.63, 0.63, 0.54 and high-field: κ = 0.55, 0.43, 0.53) and intra-reader agreement (high-field: κ = 0.57, 0.77, 0.67 and low-field: κ = 0.51, 0.50, 0.70). Only the reader with the shortest experience had better agreement on high-field than on low-field MRI.

Conclusions

There was a significant difference in the prevalence of disc degeneration grades between 0.25 T and 3 T MRI. Therefore, field strength should be taken into consideration when comparing studies that use disc degeneration grading as an outcome.

Keywords

Disc degeneration, magnetic resonance imaging, low back pain, magnetic fields, reliability, agreement

Introduction

Degenerative findings in the lumbar spine are common in individuals with or without low back pain (LBP),^1–6 and the intervertebral disc undergoes progressive morphologic and cellular changes with age.^1,7 Nevertheless, more advanced disc degenerative changes are found in individuals with LBP compared to individuals of the same age without LBP.^3,5,8 Non-specific LBP is believed to be initiated by degenerative processes in the disc, amplified by inflammation and infection.⁹ Magnetic resonance imaging (MRI) of the lumbar spine is often requested to identify the possible source of pain in patients with non-specific LBP.¹⁰ This makes a reliable and feasible MRI grading of the disc degeneration relevant in a clinical setting.^11,12

Disc degeneration is believed to be generated by a catabolic process,¹³ with loss of the hydrophilic glycosaminoglycan (GAG) primary structure within the nucleus pulposus.¹⁴ These changes result in reduced water content in the disc,^9,15,16 which can be seen on MRI as decreased T2 signal (hypointense).^11,17,18 For lumbar spine imaging, high-field MRI scanners (>1 T) are usually preferred by radiologists and clinicians due to better image quality related to a higher signal-to-noise ratio.¹⁹ Despite this, low-field scanners do not necessarily yield lower diagnostic potential in LBP evaluation.^3,20,21 Low-field MRI scanners have the advantages of lower purchase and maintenance costs,¹⁹ but a previous study has indicated that grading of degenerative changes in the vertebral endplates (Modic classification) is affected by the field strength of the MRI scanner.²¹ Thus, it would be useful to know the diagnostic capability of low-field scanners compared with high-field scanners for grading degeneration of the intervertebral disc in the lumbar spine. This study aimed to compare lumbar disc degeneration grading from a high-field (3 T) scanner and a low-field (0.25 T) MRI scanner.

Material and Methods

Design

This paper, which is reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology guidelines (STROBE),²² includes patients with LBP referred to conventional high-field MRI at the Department of Radiology, Frederiksberg Hospital, Denmark in 2012. The study was approved and conducted according to the local ethics committee (KF 01-045/03) and the Danish Data Protection Agency (01758 FRH-2012-003).

Study Group Selection

Patients 18–65 years of age with LBP with or without sciatica/radiculopathy who were referred to conventional high-field MRI of the lumbar spine were asked to participate in this study. The exclusion criteria were pregnancy, prior spine surgery, suspected fracture, cancer, or ‘red flag’ symptoms. The patients were consecutively included to undergo first high-field MRI followed by supine low-field MRI within 3 hours on the same day. Also, all patients had a clinical examination and were interviewed about their back pain by a physician (BBH).

Image Acquisition

The high-field scan (Siemens Verio 3 Tesla, Erlangen, Germany) included a sagittal T2-weighted (T2w) and a sagittal T1-weighted (T1w) turbo spin echo (TSE) sequence followed by an axial T2w TSE sequence covering the five lumbar disc levels. The subsequent low-field scan (ESAOTE G-Scan 0.25 Tesla, Genoa, Italy) included a sagittal T2w and a sagittal T1w TSE sequence, as well as an axial 3D T2w isotropic gradient echo sequence (3D-Hyce) covering the lumbar spine from L2 to S1. The MRI sequences used in this study were part of the manufacturers’ standard protocols, so that the examinations were as close to clinical practice as possible (Table 1).

Table 1.

Details of the magnetic resonance imaging sequences of the lumbar spine.

	High-field 3 T scan			Low-field 0.25 T scan
	Sagittal TSE-T2	Axial TSE-T2	Sagittal TSE-T1	Sagittal TSE-T2	Axial GRE-T2	Sagittal TSE-T1
TR, ms	3000	6200	638	4370	10	590
TE, ms	100	117	9.5	120	5	20
ST, mm	4	3.0	4	4	2	5
SBS, mm (%)	0.4 (10)	0.48 (16)	0.4 (10)	0.5 (12)	0	0.5 (10)
FOV, mm	300*300	200*200	300*300	224*200	210*210	224*200
Acquisition matrix	384*288	384	384*288	224*200	180*180	256*168
Interpolated matrix	384*384	768*768	384*384	512*512	512*512	256*256
Time, min	2.56	3.51	2:39	5.28	5.5	4.44

TSE-T2 = T2-weighted turbo spin-echo; GRE-T2 = T2-weighted 3D hybrid contrast enhancement gradient echo; TSE-T1 = T1-weighted turbo spin-echo; Time = acquisition time; TR = repetition time; TE = echo time; ST = slice thickness; SBS = spacing between slices; FOV = field of view.

MRI Evaluation

One senior consultant neuroradiologist (A), one junior consultant radiologist (B) and one senior consultant musculoskeletal radiologist (C) graded the disc degeneration on the midsagittal T2-weighted images according to Pfirrmann’s five-point grading system. The disc degeneration was subjectively graded as Grade I, where the nucleus pulposus was homogeneous with high T2-signal intensity and a clear distinction between the nucleus and annulus; Grade II, where the nucleus pulposus was slightly inhomogeneous with a clear distinction between the nucleus and annulus with or without horizontal hypointense bands; Grade III, where the nucleus pulposus was inhomogeneous with unclear distinction between the nucleus and annulus, and a slightly decreased disc height; Grade IV, where the nucleus pulposus was inhomogeneous with a low signal intensity without distinction between the nucleus and annulus, and a manifest disc height reduction; Grade V, where the nucleus pulposus was inhomogeneous and hypointense, without distinction between the nucleus and annulus, and a fully collapsed disc space.¹⁷ The evaluation was performed on a certified radiologic workstation monitor using IMPAX® software (AGFA®). Because the limited field of view of the low-field MRI scanner resulted in geometric distortion of the L1/L2 level in some tall individuals, the grading used for comparison only included the L2/L3 to L5/S1 lumbar levels. The radiologists were blinded to any clinical information, and no communication regarding the grading between the readers was allowed. The high-field and low-field MR images were graded with a hiatus of three months between the two readings. A subsample of 20 high- and low-field MRIs was regraded by all radiologists 6 months later.
In addition, the most experienced spine radiologist (C) evaluated all scans for disc herniations, high intensity zones (HIZ), subchondral endplate changes (Modic changes type 1 and type 2), spondylolisthesis, and spinal stenosis.¹¹

Statistical Analysis

Descriptive data are reported as point estimates (proportions and mean values ± standard deviations). Agreement of the scores between field strengths and readers was assessed by: (1) weighted kappa statistics for ordinal data using the AGREE option of the TABLES statement in PROC FREQ for SAS (SAS Institute Inc., version 9.4, 64-bit edition for Windows), which by default uses Cicchetti–Allison weights, and (2) absolute agreement (i.e. levels with 1:1 agreement/the total number of levels).²³ Based on previous MRI agreement studies,²⁴ and the accepted guidelines for kappa values, 0–0.2 indicates slight agreement, 0.2–0.4 fair agreement, 0.4–0.6 moderate agreement, 0.6–0.8 substantial agreement and >0.8 almost perfect agreement.²⁵ Frequency distributions of the Pfirrmann grades were reported for each reader, and the McNemar test was used to compare the prevalence of each grade between field strengths.
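The two agreement measures described above can be sketched in a few lines of Python. This is a minimal illustration rather than the SAS procedure used in the study: it assumes equally spaced Pfirrmann grades, so that the Cicchetti–Allison (linear) weight for grades i and j is 1 − |i − j|/(k − 1), and the function names and the example gradings are hypothetical.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Weighted kappa with linear (Cicchetti-Allison type) agreement weights.

    Assumes the categories are ordered and equally spaced (e.g. grades 1-5).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Joint and marginal counts of the two sets of gradings
    joint = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    row = Counter(index[a] for a in ratings_a)
    col = Counter(index[b] for b in ratings_b)

    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1.0 - abs(i - j) / (k - 1)  # 1 on the diagonal, 0 at the extremes
            observed += weight * joint[(i, j)] / n
            expected += weight * (row[i] / n) * (col[j] / n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)


def absolute_agreement(ratings_a, ratings_b):
    """Number and proportion of levels graded identically (1:1 agreement)."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches, matches / len(ratings_a)


# Hypothetical gradings of eight discs on high- and low-field MRI
high_field = [1, 2, 2, 3, 3, 4, 4, 5]
low_field = [1, 1, 2, 3, 4, 4, 5, 5]
kappa = linear_weighted_kappa(high_field, low_field, categories=[1, 2, 3, 4, 5])
matches, proportion = absolute_agreement(high_field, low_field)
print(f"weighted kappa = {kappa:.2f}, absolute agreement = {matches}/8 ({proportion:.1%})")
```

In this made-up example the three disagreements are all one grade apart, so the weighted kappa stays relatively high while the absolute agreement is only five of eight levels. This illustrates why the two measures are reported side by side.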

Results

Patient Characteristics

Seventy-five patients accepted participation and were first scanned in the high-field (3 T) scanner and subsequently scanned in the low-field (0.25 T) scanner. One patient could not enter the low-field scanner, five patients had a history of spine surgery, and seven patients could not complete both scans due to accentuated pain. Six patients were excluded because the low-field MR examination had severe motion artefacts or because the scans did not include both the L2/L3 and L5/S1 disc levels. Therefore, the following analyses are based on 56 patients (29 females) aged 21–65 years. Back pain was reported on a 100 mm line using the visual analogue scale (VAS), where 0 indicates no pain and 100 indicates the worst pain imaginable. Back pain ranged from 18 to 100 mm during activities and 1 to 85 mm during rest. A majority of the patients (70%) had additional radicular symptoms (Table 2).

Table 2.

Patient characteristics.

Patients (n = 56)
Age in years, mean (SD)	39.5 (±11.2)
Females, no. (%)	29 (51%)
Patients with LBP > 6 months, no. (%)	34 (61%)
LBP during activities (VAS), mean (SD)	65.8 (±21.2)
LBP in rest (VAS), mean (SD)	46.8 (±19.4)
LBP and radicular symptoms, no. (%)	40 (70%)
Straight leg raise test, no. (%)	26 (38%)
Any neurologic deficit	8 (14%)
MRI findings^a
Disc bulging, no. (%)	47 (21%)
Disc protrusion, no. (%)	46 (21%)
Disc extrusion, no. (%)	7 (3%)
Modic changes type 1, no. (%)^b	8 (4%)
Modic changes type 2, no. (%)^b	9 (4%)
Spinal stenosis, no. (%)	4 (2%)
Spondylolisthesis, no. (%)	10 (4%)

^aBased on the evaluation by one radiologist (ZR) and reported as number of levels with the specific MRI finding.

^bIncludes the superior or inferior endplate or both.

LBP, low back pain; VAS, 0–100 mm visual analogue scale; HIZ, high intensity zones.

Disc degeneration – high-field vs. low-field MRI

The three radiologists graded the same 224 lumbar discs on both high- and low-field MRI with a hiatus of 3 months. For all three radiologists, low-field MRI resulted in numerically more grades at both the lower and the higher end of the scale than high-field MRI (Fig. 1). There was moderate to substantial agreement between field strengths, although there was a significant difference in the prevalence of the grades between low- and high-field MRI (Table 3).

Fig. 1.

The histogram illustrates the number of disc degeneration grades for 3 T high-field MRI (dark grey) and 0.25 T low-field MRI (light grey).

Table 3.

Number of grades, agreement and absolute agreement between high- and low-field MRI for disc degeneration evaluations.

	3 T MRI, no. (%)	0.25 T MRI, no. (%)	κ (95% CI)	Absolute agreement, no. (%)	p-value
Reader A
Grade I	51 (22.8%)	54 (24.1%)	0.53 (0.46–0.60)	108 (48.2%)	<.001
Grade II	77 (34.4%)	64 (28.6%)
Grade III	73 (32.6%)	49 (21.9%)
Grade IV	22 (9.8%)	49 (21.9%)
Grade V	1 (0.5%)	8 (3.6%)
Reader B
Grade I	31 (13.8%)	83 (37.1%)	0.52 (0.45–0.58)	98 (43.8%)	<.001
Grade II	85 (38.0%)	50 (22.3%)
Grade III	53 (23.7%)	38 (17.0%)
Grade IV	49 (21.9%)	48 (21.4%)
Grade V	6 (2.7%)	5 (2.2%)
Reader C
Grade I	1 (0.5%)	17 (7.6%)	0.62 (0.55–0.69)	148 (66.1%)	<.001
Grade II	97 (43.3%)	96 (42.9%)
Grade III	88 (39.3%)	61 (27.2%)
Grade IV	38 (17.0%)	49 (21.9%)
Grade V	0	1 (0.5%)
All Readers
Grade I	83 (12.4%)	154 (22.9%)	0.55 (0.52–0.59)	354 (52.6%)	<.001
Grade II	259 (38.5%)	210 (31.3%)
Grade III	214 (31.8%)	148 (22.0%)
Grade IV	109 (16.2%)	146 (21.7%)
Grade V	7 (1.0%)	14 (2.1%)

Agreement (κ) between the 3 T high-field and 0.25 T low-field MRI scanners was calculated by weighted kappa statistics for ordinal data. The McNemar test was used to compare the relative prevalence of the different grades and is given as p-values. Reader A = senior neuroradiologist; Reader B = junior radiologist; Reader C = senior musculoskeletal radiologist.
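The text does not specify how the McNemar test was applied to the five-category grading. One common generalisation to square tables larger than 2 × 2 is Bowker's test of symmetry, which reduces to the uncorrected McNemar statistic for a 2 × 2 table. The sketch below shows that statistic under this assumption. The function name and the cross-tabulation are hypothetical, and the counts are not study data.

```python
def bowker_symmetry(table):
    """Bowker's test of symmetry for a square table of paired gradings.

    table[i][j] = number of discs graded i on one scanner and j on the other.
    Returns the chi-square statistic and its degrees of freedom, k(k-1)/2.
    Pairs of cells that are both empty contribute nothing to the statistic
    (some implementations also drop them from the degrees of freedom).
    """
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")
    statistic = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total > 0:
                statistic += (table[i][j] - table[j][i]) ** 2 / total
    return statistic, k * (k - 1) // 2


# Hypothetical 3 x 3 cross-tabulation (rows: high-field grade, columns: low-field grade)
example = [
    [20, 5, 1],
    [9, 30, 4],
    [3, 10, 18],
]
statistic, df = bowker_symmetry(example)
print(f"Bowker chi-square = {statistic:.2f} on {df} degrees of freedom")
```

The statistic is compared with a chi-square distribution with the stated degrees of freedom. A large value indicates that gradings shift systematically in one direction between the two scanners, which is the kind of difference in prevalence that the p-values in Table 3 refer to.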

Inter- and intra-reader agreement

Table 4 presents the inter- and intra-reader agreement and absolute agreement for each reader, with corresponding 95% confidence intervals (CIs), for both the high- and low-field MRI scanners. There was moderate to substantial inter- and intra-reader agreement for both field strengths. Between readers A and C, inter-reader agreement was higher for low-field MRI than for high-field MRI. The reader with the shortest experience evaluating low-field MRI (B) showed higher intra-reader agreement for high-field MRI than for low-field MRI.

Table 4.

Agreement and absolute agreement of 3 T and 0.25 T MRI between and within radiologists.

	3 T MRI		0.25 T MRI
	κ (95% CI)	Absolute agreement	κ (95% CI)	Absolute agreement
Inter-reader
Reader A vs. B	0.55 (0.48–0.61)	114 (50.9%)	0.63 (0.57–0.69)	121 (54.0%)
Reader A vs. C	0.43 (0.37–0.49)	102 (45.5%)	0.63 (0.57–0.69)	133 (59.4%)
Reader B vs. C	0.53 (0.46–0.59)	123 (54.9%)	0.54 (0.48–0.60)	106 (47.3%)
Intra-reader
Reader A	0.57 (0.46–0.67)	39 (48.7%)	0.51 (0.40–0.61)	32 (40.0%)
Reader B	0.77 (0.68–0.85)	58 (72.5%)	0.50 (0.39–0.62)	35 (43.8%)
Reader C	0.67 (0.55–0.79)	56 (70.0%)	0.70 (0.61–0.80)	58 (72.5%)

The inter- and intra-reader agreement was calculated by weighted kappa statistics for ordinal data. Absolute agreement is given as number (no) and percent (%); Reader A = senior neuroradiologist; Reader B = junior radiologist; Reader C = senior musculoskeletal radiologist.
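As a reading aid, the verbal labels used for the kappa values can be applied mechanically. The sketch below assigns the Landis and Koch bands cited in the Statistical Analysis section to the inter-reader point estimates from Table 4. The function name is hypothetical, and treating each upper bound as inclusive (e.g. 0.60 counts as moderate) is an assumption, since the text lists the bands with shared boundaries.

```python
def landis_koch_band(kappa):
    """Verbal label for a kappa value following the Landis and Koch bands."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # Upper bounds are treated as inclusive (an assumption, see lead-in)
    for upper, label in [(0.2, "slight"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Inter-reader point estimates from Table 4: (3 T, 0.25 T)
inter_reader = {
    "A vs. B": (0.55, 0.63),
    "A vs. C": (0.43, 0.63),
    "B vs. C": (0.53, 0.54),
}
for pair, (high, low) in inter_reader.items():
    print(f"{pair}: 3 T {landis_koch_band(high)}, 0.25 T {landis_koch_band(low)}")
```

The labels apply to the point estimates only. Several of the confidence intervals in Table 4 straddle the 0.6 boundary between moderate and substantial.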

Discussion

Principal Findings

This study presents the reproducibility of Pfirrmann’s disc degeneration grading on a high-field (3 T) and a low-field (0.25 T) MRI scanner in a population similar to that of the original test of the scale (Fig. 2).¹⁷ Low-field MRI grading resulted in numerically more grades at both the lower and the higher end of the scale than high-field MRI, resulting in moderate to substantial agreement and a significant difference in the prevalence of the grades between field strengths. The radiologist with the shortest experience evaluating low-field MRI had better intra-reader agreement with high-field MRI. These are important findings, as Pfirrmann’s five-point grading system is considered a reliable and clinically feasible in vivo evaluation of disc degeneration and is therefore widely used in research.^11,12

Fig. 2.

Examples of Pfirrmann’s disc degeneration grades on 3 T (a) and 0.25 T (b) sagittal T2-weighted magnetic resonance images in the same patient and lumbar disc level.

Disc Degeneration – High-field vs Low-field MRI

The signal-to-noise ratio increases with higher field strengths and this can be used to either shorten the scan time or increase the image quality, which, in theory, should result in better reader performance. A previous study investigated the diagnostic reproducibility of high- versus low-field MRI for structural degenerative changes in the spine (i.e. disc herniation, central canal, lateral recess and foraminal stenosis as well as nerve root compression).²⁰ The authors found almost perfect agreement between experienced radiologists (κ-values: 0.71–0.92) and no significant difference between field strengths. In this study, we investigated a tissue property, which may be more dependent on the T2 signal. Even so, it is notable that we found only moderate to substantial agreement, low absolute agreement, and a significant difference in the proportions of the grades between field strengths. Another study comparing high-field MRI to low-field MRI of another tissue property (Modic changes) also found a significant difference in proportions and reproducibility between field strengths.²¹ The authors argued that more pronounced subchondral degeneration could be difficult to identify on high-field MRI because of increasing inhomogeneity of the signal in the endplate. Whatever the reason (field strength, tissue properties and/or sequence choice), these parameters seem to affect the disc degeneration grading. Therefore, studies investigating degenerative disc grading on MRI with different field strengths can be expected to generate different results. This adds another potential limitation to Pfirrmann’s grading system, which has been criticized for its subjectivity and poor definition of reduced disc height.¹² This calls for better standardisation of disc degeneration grading with better reproducibility to avoid bias in clinical studies.
Ideally, disc degeneration grading should be a continuous and maybe computerised measure, which would enable clinicians to follow the degenerative process during treatment and have the potential to distinguish early painful degenerative changes from age-related changes.^9,26 Imaging modalities such as T2-mapping, T1rho, dGEMRIC, spectroscopy or sodium MRI have been considered for this purpose. However, the clinical implications of such new sequences on a patient level as well as their reproducibility and associations to the clinical LBP are still widely unknown.²⁶

Inter-reader Agreement

The initial test of the grading system by Pfirrmann et al. was performed on a 1.0 T MRI scanner and found inter-reader agreement (κ-values) between 0.84 and 0.90 and absolute agreement between 88.0 and 92.3%.¹⁷ In this study, we tried to make the readings as close to a clinical setting as possible, and therefore, no initial consensus training of the radiologists was conducted. In addition, only the senior musculoskeletal radiologist (C) was subspecialized in the interpretation of degenerative pathologies in the spine. These considerations may explain our lower inter-reader agreement and absolute agreement compared to the original reliability test of the grading system. In another clinical study, disc degeneration was graded on a four-item grading scale and comparable moderate inter-reader agreement (κ-value: 0.59) was found on a low-field (0.2 T) MRI scanner.²⁷ Another similar study also graded the disc degeneration on Pfirrmann’s grading scale and found a similar inter-reader agreement with κ-values ranging from 0.63 to 0.70 on high-field MRI (1.5 T).²⁴ We observed a tendency towards better overall inter-reader agreement for low-field MRI compared with high-field MRI. This might be due to the lower signal-to-noise ratio and lower spatial resolution of low-field MRI, resulting in fewer details on the mid-sagittal images.

Intra-reader Agreement

We observed a tendency towards better intra-reader agreement than inter-reader agreement, which has also been found in previous studies.^24,27 In the original study by Pfirrmann et al.,¹⁷ the intra-reader agreement varied between 0.74 and 0.81 and the absolute agreement between 80.0% and 85.0%, which is better than our results. However, our intra-reader agreement is comparable with that reported for grading other degenerative pathologies in the spine.^24,27 A likely reason for our lower intra-reader performance could again be the radiologists’ different experience evaluating degenerative spine diseases – especially on low-field MRI. This interpretation is supported by the results showing that the radiologist with the shortest experience evaluating low-field MRI had a better intra-reader performance with high-field MRI.

Limitation of the Study

For logistical reasons, all the patients were scanned first in the high-field MRI scanner and then in the low-field MRI scanner. This may represent a limitation, as the water content in the discs is known to decrease during the day because of the upright position. Ideally, the patients should have been randomised to either high- or low-field MRI as the first examination. However, this limitation may be offset by the fact that the low- and high-field MRI were performed on the same day with a maximum time span of three hours. Patients above age 65 were not included to ensure a higher prevalence of low disc degeneration grades in the study population. This may explain the fairly low mean age, which may limit the ability to generalise our MRI findings to other studies; however, this has not affected the overall aim of the study. Another limitation could be the use of a 3 T scanner as the representative for high-field MRI. In a clinical setting, 1.5 T MRI scanners are often used for spine imaging, which may also limit the ability to generalise our findings into a clinical context. Further, 3 T MRI scanners might have a different reproducibility in grading degenerative discs compared to 1.5 T or 1 T high-field MRI scanners. Finally, the kappa value for ordinal data depends on the prevalence in each category, which may make it difficult to compare kappa values with those of other studies with a different prevalence in the categories.²⁸
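The dependence of kappa on prevalence can be made concrete with a small worked example. The sketch below uses unweighted Cohen's kappa on two hypothetical 2 × 2 tables rather than the weighted kappa and five grades used in this study. Both tables have the same absolute agreement (80%), but in the second table one category dominates. The function name and all counts are made up for illustration.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa from a square table of paired ratings."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_marginals = [sum(table[i]) / n for i in range(k)]
    col_marginals = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance from the marginal prevalences
    expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: both tables have 80 of 100 ratings on the diagonal
balanced = [[40, 10], [10, 40]]  # both categories equally prevalent
skewed = [[75, 10], [10, 5]]     # one category strongly dominates

print(f"balanced prevalence: kappa = {cohen_kappa(balanced):.2f}")
print(f"skewed prevalence:   kappa = {cohen_kappa(skewed):.2f}")
```

With identical absolute agreement, kappa falls from 0.60 in the balanced table to about 0.22 in the skewed one, because chance-expected agreement rises when one category dominates. The same mechanism applies when comparing studies with different distributions of Pfirrmann grades.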

In conclusion, a significant difference in the prevalence of disc degeneration grades was found between low- and high-field MRI of the lumbar spine when using Pfirrmann’s disc degeneration grading system. Moderate inter- and intra-reader agreement and absolute agreement were found in the current study, highlighting the need for dedicated training before the disc degeneration grading can be used with higher precision in both clinical practice and research settings.

Footnotes

Acknowledgements

The authors thank all participants and the MRI staff of the Department of Radiology.

Declaration of conflicting interests

The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Professor Boesen has received travel grants from ESAOTE, Genoa, Italy to hold invited lectures at ESSR 2015 and 2017 and ECR 2015. The other authors report no conflicts of interest.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by The Oak Foundation, Copenhagen University Hospital, Bispebjerg and Frederiksberg, Savværksejer Jeppe Juhl og Hustru Ovita Juhls Mindelegat, Minister Erna Hamiltons Legat for Videnskab og Kunst and the Danish Rheumatism Association.

References

1. Kjaer, Leboeuf-Yde, Korsholm, et al. Magnetic resonance imaging and low back pain in adults: a diagnostic imaging study of 40-year-old men and women. Spine 2005; 30:1173–1180.

2. Jarvik, Hollingworth, Heagerty, et al. The Longitudinal Assessment of Imaging and Disability of the Back (LAIDBack) Study: baseline data. Spine 2001; 26:1158–1166.

3. Hansen, Bendix, Grindsted, et al. Effect of lumbar disc degeneration and low-back pain on the lumbar lordosis in supine and standing. Spine 2015; 40:1690–1696.

4. Deyo RA. Diagnostic evaluation of LBP. Arch Intern Med 2002; 162:1444.

5. Cheung KMC, Karppinen, Chan, et al. Prevalence and pattern of lumbar magnetic resonance imaging changes in a population study of one thousand forty-three individuals. Spine 2009; 34:934–940.

6. Lurie, Doman, Spratt, et al. Magnetic resonance imaging interpretation in patients with symptomatic lumbar spine disc herniations. Spine 2009; 34:701–705.

7. Haefeli, Kalberer, Saegesser, et al. The course of macroscopic degeneration in the human lumbar intervertebral disc. Spine 2006; 31:1522–1531.

8. de Schepper EIT, Damen, van Meurs JBJ, et al. The association between lumbar disc degeneration and low back pain: the influence of age, gender, and individual radiographic features. Spine 2010; 35:531–536.

9. Adams, Lama, Zehra, et al. Why do some intervertebral discs degenerate, when others (in the same spine) do not? Clin Anat 2015; 28:195–204.

10. Lotz, Haughton, Boden, et al. New treatments and imaging strategies in degenerative disease of the intervertebral disks. Radiology 2012; 264:6–19.

11. Fardon, Williams, Dohring, et al. Lumbar disc nomenclature: version 2.0. Recommendations of the combined task forces of the North American Spine Society, the American Society of Spine Radiology and the American Society of Neuroradiology. Spine J 2014; 14:2525–2545.

12. Kettler, Wilke HJ. Review of existing grading systems for cervical or lumbar disc and facet joint degeneration. Eur Spine J 2006; 15:705–718.

13. Burke, Watson, McCormack, et al. Intervertebral discs which cause low back pain secrete high levels of proinflammatory mediators. J Bone Jt Surg Br 2002; 84:196–201.

14. Roughley PJ. Biology of intervertebral disc aging and degeneration: involvement of the extracellular matrix. Spine 2004; 29:2691–2699.

15. Adams MA, Dolan. Intervertebral disc degeneration: evidence for two distinct phenotypes. J Anat 2012; 221:497–506.

16. Adams MA, Roughley PJ. What is intervertebral disc degeneration, and what causes it? Spine 2006; 31:2151–2161.

17. Pfirrmann, Metzdorf, Zanetti, et al. Magnetic resonance classification of lumbar intervertebral disc degeneration. Spine 2001; 26:1873–1878.

18. Bendix, Kjaer, Korsholm. Burned-out discs stop hurting. Spine 2008; 33:E962–E967.

19. Coffey, Truong, Chekmenev EY. Low-field MRI can be more sensitive than high-field MRI. J Magn Reson 2013; 237:169–174.

20. Lee RKL, Griffith, Lau YYO, et al. Diagnostic capability of low- versus high-field magnetic resonance imaging for lumbar degenerative disease. Spine 2015; 40:382–391.

21. Bendix, Sorensen, Henriksson GAC, et al. Lumbar Modic changes: a comparison between findings at low- and high-field magnetic resonance imaging. Spine 2012; 37:1756–1762.

22. von Elm E, Altman, Egger, et al. Strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. BMJ 2007; 335:806–808.

23. Kottner, Audigé, Brorson, et al. Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol 2011; 64:96–106.

24. Carrino JA, Lurie, Tosteson ANA, et al. Lumbar spine: reliability of MR imaging findings. Radiology 2009; 250:161–170.

25. Landis, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33:159–174.

26. Hansen, Carrino, et al. Imaging in mechanical back pain: anything new? Best Pract Res Clin Rheumatol 2016; 30:766–785.

27. Sorensen, Kjaer, Jensen, et al. Low-field magnetic resonance imaging of the lumbar spine: reliability of qualitative evaluation of disc and muscle parameters. Acta Radiol 2006; 47:947–953.

28. Sim, Wright CC. The kappa statistic in reliability studies: use, interpretation, and sample size requirements. Phys Ther 2005; 85:257–268.