Abstract
BACKGROUND:
Pressure pain threshold (PPT) is decreased in several musculoskeletal disorders, giving indirect evidence regarding pain status. Although PPT has already been shown to be reliable in patients with acute conditions, methods and results vary widely across studies, and there is little evidence confirming its reliability in chronic conditions.
OBJECTIVE:
The objective of this study was to determine the test-retest reliability of PPT in the neck and low back regions and its ability to discriminate individuals with neck or low back pain from healthy individuals. Additionally, a secondary aim was to establish the minimum detectable change (MDC) and the standard error of measurement for future clinical studies and interventions.
METHODS:
This reliability study included 74 individuals (15 in the neck pain group and 17 in the neck control group; 21 in the low back pain group and 21 in the low back control group). PPT was measured in the neck region (suboccipital, trapezius and supraspinal muscles) and in the lower back region (paraspinal muscles at the levels of L1, L3 and L5). Intra-rater reliability was assessed using the intraclass correlation coefficient and Bland-Altman analysis.
RESULTS:
Excellent intra-rater reliability was observed for both (ICC of 0.874 for the neck pain versus ICC of 0.895 in neck control group; ICC of 0.932 for the low back pain group versus ICC of 0.839 for the control group). A small bias was observed for all groups (
CONCLUSION:
It may be suggested that the protocol with PPT is reliable and able to discriminate individuals with and without neck and low back pain with a minor measurement error. Therefore, this method may be used to detect possible progress after interventions in patients with neck or low back pain.
Keywords
Introduction
Neck and low back pain are common musculoskeletal disorders (MSD) that affect people of different ages and lead to frequent medical visits and work absenteeism [1, 2]. Considering that pain is a multidimensional phenomenon with a subjective component, differences in pain intensity in similar conditions may be reported [3]. These contrasting results may be explained by the use of data from diverse sources such as self-reported questionnaires or visual analogue scales. The usual problem with these instruments is the lack of standardized evaluation methods, which may make the data difficult to interpret and prevent clinicians from assessing the real improvement produced by interventions [4, 5]. Therefore, the inclusion of more objective measures of pain may be helpful for both research and clinical practice.
Pressure pain threshold (PPT) may be defined as the threshold at which gradually increasing pressure causes pain [6, 7]. Although it is not a direct method to assess pain, PPT is decreased in a variety of musculoskeletal pain conditions [8, 9] and may be applied to different muscle regions [10], giving indirect evidence regarding pain status. Thus, it may be extremely useful in determining the therapeutic effects and follow-up of some MSD, especially neck and low back pain [11, 12, 13, 14]. PPT has already been shown to be reliable in patients with acute conditions. However, evaluation methods and results vary widely, and data confirming its reliability in chronic conditions are lacking [14].
Reliability is an important issue in classification, scaling and instrument development as well as in clinical studies [15]. Reliability may be defined as the ratio of variability between subjects (e.g., patients) or objects (e.g., computed tomography scans) to the total variability of all measurements in the sample; it reflects the ability of a measurement to differentiate among subjects or objects. Intra-rater agreement/reliability (also referred to as test-retest) may therefore be understood as the same rater, using the same scale, classification, instrument or procedure, assessing the same subjects or objects at different times [16]. Results of reliability studies provide information about the amount of error inherent in any diagnosis, score, or measurement, where the amount of measurement error determines the validity of the study results or scores [16].
Hence, the aim of this study was to determine the reliability of PPT in the neck and low back regions and its ability to discriminate individuals with neck or low back pain from healthy individuals. Additionally, a secondary objective was to establish the minimum detectable change (MDC) and the standard error of these measures for future clinical studies and interventions.
Methods
Study design and ethical procedures
This cross-sectional study (intra-rater reliability) was based on the Guidelines for Reliability and Reporting Agreement Studies (GRRAS) checklist [15]. This study was approved by the Institutional Ethics in Research Committee (Protocol # 913643). All procedures were performed in accordance with the Declaration of Helsinki, and all participants agreed with the research objectives and signed written informed consent.
Sample size calculation
Walter et al. [17] provided a robust mathematical approach to estimating the required number of participants for reliability studies. The hypothesis was that the test-retest reliability would be clinically relevant if an intraclass correlation coefficient of at least 0.7 was detected. Using this information along with 80% statistical power and a 95% confidence interval (
Sample
Seventy-four subjects, comprising individuals with low back or neck pain and their respective control groups, were selected from the community and enrolled in this study. Neck pain was defined as pain perceived as arising in a region bounded superiorly by the superior nuchal line, laterally by the lateral margins of the neck, and inferiorly by an imaginary transverse line through the T1 spinous process [18, 19]. The inclusion criteria for the neck pain group were: non-specific neck pain of average to moderate intensity (VAS 4-7/10) lasting more than three months [19], age between 18 and 40 years, and agreement to take part in the study. The criteria for the neck control group were: absence of neck pain, age between 18 and 40 years, and agreement to take part in the study.
Chronic low back pain is defined as pain of non-mechanical origin in the low back area with more than 3 months of symptoms [20]. The inclusion criteria for the low back pain group were: age between 18 and 40 years, agreement to join the study, and low back pain of mechanical origin with average to moderate intensity (VAS 4-7/10) for at least three months. For the low back control group, the criteria were: age matched to the low back pain group, absence of low back pain, and agreement to take part in the study.
Participants who had undergone previous back surgery were excluded, as were those who had chronically used medications with a possible effect on pain discrimination (analgesics, benzodiazepines, antidepressants, antiepileptic drugs and anti-inflammatories) within 6 months before the assessment. All subjects reported no pain or neurological disorder in the upper limbs, no previous surgery, and no history of injury in the three months prior to the assessments of this study. The subjects were asked not to use stimulants, drugs or alcohol, or to participate in physical activity, in the 8 hours before the assessment.
Procedures
Initially, participants answered a questionnaire containing demographic information (age, gender, height, weight, medication consumption, symptom frequency and educational level). The evaluations in the cervical region were carried out on the sub-occipital, trapezius and supraspinal muscles bilaterally (Fig. 1), whereas for the lumbar spine the evaluations were carried out on the paraspinal muscle bellies (longissimus back and erector spinae) at the levels of L1, L3 and L5 bilaterally (Fig. 1). The order of evaluation was standardized from the uppermost point to the lowest, in both the cervical and lower back regions. The interval between applications at the pre-determined points was 30–60 seconds. Before testing, the participant was familiarized with the procedure through the application of pressure on the muscle belly of the biceps femoris. The interval between the two evaluations was 24 hours to evaluate the intra-rater reliability. All evaluations were conducted in a laboratory with controlled temperature (24
Assessment points for the pressure pain threshold (PPT).
The evaluation of the PPT took place through an electronic algometer (EMG Brand System
Characteristics of the sample
Side-to-side comparison for all tested groups
The cut-off point used as a reference to determine the exact moment of pain intensity was 1.0, later analyzed after processing the data through a routine in MATLAB
Statistical analysis
Statistical analysis was performed using IBM SPSS version 21.0 and G Power version 3.1.3, with an alpha level of 0.05. All variables were normally distributed, after Shapiro-Wilk test (
To assess the test-retest reliability across the two sessions, intraclass correlation coefficients (ICCs) and the standard error of measurement (SEM) were calculated based on a two-way random effects model as described by Fleiss [22] and Shrout and Fleiss [23].
Intraclass correlation coefficient and standard error of measurements (SEM) for all tested groups
* Statistically significant.
Bland-Altman comparison of pressure pain threshold (PPT) between each condition and their respective control group. 
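As a rough illustration of what a Bland-Altman analysis summarizes, the sketch below computes the bias (mean test-retest difference) and the 95% limits of agreement. It assumes the conventional definition (bias ± 1.96 × SD of the differences); the function name `bland_altman` and the PPT values are hypothetical and are not taken from this study.

```python
import numpy as np

def bland_altman(session_1, session_2):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes the conventional 95% limits of agreement:
    bias +/- 1.96 * SD of the paired differences.
    """
    a = np.asarray(session_1, dtype=float)
    b = np.asarray(session_2, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both sessions must contain the same subjects")
    diff = a - b                # per-subject test-retest difference
    bias = diff.mean()          # systematic difference between sessions
    sd = diff.std(ddof=1)       # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical PPT values (kgf) for four subjects on two days
day_1 = [3.2, 4.1, 5.0, 3.8]
day_2 = [3.0, 4.3, 4.9, 3.9]
bias, lower, upper = bland_altman(day_1, day_2)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

A bias close to zero with narrow limits would indicate little systematic difference between the two sessions.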
The ICC gives the ratio of variances due to differences between test-retest evaluations. We used the ICC classification proposed by Fleiss [22], under which an ICC below 0.40 indicates “poor” reliability; between 0.40 and 0.75 “fair to good” reliability; and above 0.75 “excellent” reliability.
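A minimal sketch of how such an ICC could be computed and classified is given below. It assumes the single-measurement, two-way random effects form, ICC(2,1), from Shrout and Fleiss; the text does not state which ICC form was used, so this choice is an assumption. The function names and the example data are hypothetical.

```python
import numpy as np

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is an (n subjects x k sessions) array. Assumed form; the
    study does not specify which Shrout-Fleiss variant was applied.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between sessions
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def classify_icc(icc):
    """Fleiss classification as quoted in the text."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"

# Hypothetical PPT values (kgf): rows are subjects, columns are the two days
ppt = [[3.2, 3.0], [4.1, 4.3], [5.0, 4.9], [3.8, 3.9], [6.1, 5.8]]
value = icc_2_1(ppt)
print(f"ICC(2,1) = {value:.3f} ({classify_icc(value)})")
```

ICC(2,1) penalizes systematic shifts between sessions as well as random error, which is why it is often preferred for test-retest designs; other forms would give different values on the same data.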
The SEM quantifies the precision of the individual measurements and gives an indication of the absolute reliability according to the following equation: SEM
The formula SEM
The Bland-Altman plots were used to assess agreement between the measurements and the systematic differences. The graph presents the magnitude of disagreement (including systematic differences) [25, 26]. To check the differences between pathological and control groups in the PPT, we used the first evaluation as a measure and the t-test for independent analysis was subsequently performed to calculate the effect size by testing Cohen’s
We evaluated 74 individuals of both genders (30 men, 44 women). In this sample, 15 subjects had chronic non-specific neck pain and 17 constituted the cervical control group, while 21 individuals had chronic non-specific low back pain and 21 constituted the low back control group. The characteristics of the groups are shown in Table 1. No differences were observed in the side-to-side comparisons for any of the tested groups (neck pain and low back pain) or their respective control groups; these data are presented in Table 2. No differences in gender or age were observed within the groups (unpaired
High intraclass correlation coefficients (ICC) with small standard error of the measurements were found in all groups, indicating that this protocol is reproducible (
The minimum detectable change (MDC) was used to quantify the variation attributable to measurement error. The values observed were 0.63 kgf for neck pain and 1.21 kgf for low back pain.
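For readers who wish to derive comparable values, the sketch below shows one common way to obtain the SEM and the MDC from an ICC. It assumes the widely used definitions SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; these are not confirmed to be the exact formulas applied in this study, and the input values are hypothetical.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)

def mdc95(sem):
    """Minimum detectable change at 95% confidence,
    assuming MDC95 = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem

# Hypothetical pooled SD of PPT scores (kgf) and a hypothetical ICC
sem = sem_from_icc(sd=1.5, icc=0.90)
print(f"SEM = {sem:.2f} kgf, MDC95 = {mdc95(sem):.2f} kgf")
```

Under these definitions, a change between two assessments larger than the MDC is unlikely to be explained by measurement error alone.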
When analyzing the degree of agreement through Bland-Altman plot (Fig. 2), it was observed a bias of
To test the discriminant validity of this method, the PPT between the pathological groups and their respective controls were compared. Concerning the neck pain group, statistically significant differences were observed (neck group mean 3:47
Comparison of pressure pain threshold (PPT) between each pathological group and their respective control.* Indicates statistically significant difference between the means (
A receiver operating characteristic (ROC) curve analysis was performed to identify the PPT cut-off point that discriminates individuals with and without low back pain. The cut-off point was 6.6, with an area under the curve of 0.76 (sensitivity: 80.65%; specificity: 53.23%). For the neck pain group, on the other hand, the area under the curve was 0.67. Although it is statistically significant (
This study analyzed the reliability of PPT and the discriminatory capacity of this method to identify individuals with neck and low back pain. The Bland-Altman plot showed that the PPT evaluation method has high agreement, regardless of the individual’s condition [26]. The ICC demonstrated excellent reliability in all the conditions tested [27]. In addition, the method showed discriminant validity, because it presented statistically significant differences between the pathological groups and their respective controls, with a strong effect size [25]. These data agree with Jorgensen et al. [28] and Walton et al. [13], who found good reliability when using the PPT to discriminate patients with neck pain from control individuals. These authors also concluded that algometry is quite simple and can be performed by experienced and inexperienced evaluators after just one hour of training.
Our findings achieved accurate values even with 24-hour intervals, which is a positive aspect since reliability evaluations on separate days may vary considerably [13, 14]. It is sometimes difficult to determine the exact moment when the pain starts, probably because of bias from the evaluator’s reaction time in different groups [5, 14]. However, this aspect was controlled in this study because all data analysis was carried out with a precision of 0.001 second.
As the MDC was small and in agreement with other studies [13, 17, 27], we can assume that a minor variation due to measurement error has occurred.
Thus, this assessment can be used for clinical purposes to determine any improvement related to an intervention. Moreover, individuals with neck pain had lower PPT, as evidenced in other studies, regardless of the age of the subjects [17, 19]. This may indicate peripheral nociceptive hypersensitivity caused by the neck pain [29]. In addition, PPT has shown good validity when compared to pain and disability questionnaires [30].
The same discriminant validity was also observed in low back pain, since individuals with low back pain had lower PPT than individuals without the condition (mean 4.88 versus 6.87), as observed in other studies [31, 32]. This evidence is reinforced by the ROC curve data, which yielded a possible cut-off point of 6.6, indicating that individuals with low back pain would be expected to show values lower than 6.6.
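To make the role of such a cut-off concrete, the sketch below computes sensitivity and specificity for a given threshold, treating a PPT below the cut-off as a positive result for low back pain. The function name and the PPT values are hypothetical and are used only for illustration; they do not reproduce the data of this study.

```python
def sensitivity_specificity(ppt_pain, ppt_control, cutoff):
    """Sensitivity and specificity of the rule 'PPT below cutoff = pain'.

    ppt_pain: PPT values of individuals with low back pain
    ppt_control: PPT values of pain-free controls
    """
    if not ppt_pain or not ppt_control:
        raise ValueError("both groups must contain at least one value")
    true_pos = sum(1 for v in ppt_pain if v < cutoff)       # pain, flagged
    true_neg = sum(1 for v in ppt_control if v >= cutoff)   # control, not flagged
    return true_pos / len(ppt_pain), true_neg / len(ppt_control)

# Hypothetical PPT values evaluated at the cut-off reported in the text (6.6)
pain_group = [4.0, 5.0, 6.0, 7.0]
control_group = [6.0, 7.0, 8.0, 9.0]
sens, spec = sensitivity_specificity(pain_group, control_group, cutoff=6.6)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

Raising the cut-off flags more individuals as positive, which increases sensitivity at the cost of specificity; the ROC curve summarizes this trade-off across all possible thresholds.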
This may be explained by the fact that injured tissues release several inflammatory mediators, which excite nociceptors in the injury area, maintaining the local sensory excitation and nociceptive hypersensitivity [33]. Thus, there will be an increased sensitivity to mechanical stimulation and consequently PPT will suffer a decrease [4]. This process, known as peripheral sensitization, is expressed with spontaneous heating, extension of neuronal receptive area, increased sensitivity to mechanical stimulation and spontaneous reduction of the threshold [34].
The strength of this approach is that PPT is an accurate measure to assess pain indirectly at different spine regions. Moreover, this method could be applied to other musculoskeletal disorders. Additionally, as the minimum detectable change observed was small, one can assume that only a minor variation was due to measurement error. Therefore, this parameter may be used for clinical purposes, since a value above the MDC may be understood as clinically significant.
On the other hand, the limitations of this study include the small sample and its composition of mainly young adults, which decreases the external validity of these findings. However, recent studies have reported that neck and low back pain are being identified at earlier ages [35]. Moreover, it is important to note that PPT is an indirect and somewhat complex method to assess pain. Therefore, we reinforce the need for future studies to confirm the hypothesis presented in this article.
The interpretation of the pain reported during algometry is somewhat complex, since the nociceptive system may operate in different states of sensitivity and excitability [35]. In addition, changes in tissue sensitivity may occur in response to neurobiological and biopsychosocial factors [36]. Thus, biopsychosocial disorders can perpetuate or even exacerbate nociceptive hypersensitivity [37], resulting in a global and diffuse experience of pain [35] in several chronic pain conditions.
Conclusion
Pressure pain threshold evaluation is reliable and able to discriminate individuals with and without neck and low back pain with a minor measurement error. Therefore, this method may be used to detect possible progress after interventions in patients with neck or low back pain.
Footnotes
Conflict of interest
None to report.
