Abstract
Objective
The objective of this study was to assess the intra- and inter-rater reliability of manual muscle testing (MMT) and hand-held dynamometer (HHD) in the measurement of isometric wrist strength in asymptomatic and symptomatic (distal radius fractures [DRF] and non-specific wrist pain [NSWP]) populations.
Method
Thirty-nine adults participated in an intrasession, repeated-measures, crossover study. Isometric wrist strength was tested with MMT and HHD in six standardized test positions by two raters.
Results
Poor-to-excellent intra- and inter-rater reliability with MMT was found in all patient populations (ICC = 0.04–1.00). With HHD, intra-rater reliability was excellent in the DRF (ICC = 0.86–0.95) and NSWP (ICC = 0.92–0.97) populations, and inter-rater reliability was excellent in the asymptomatic (ICC = 0.77–0.93) and DRF (ICC = 0.82–0.95) populations. Fair to excellent intra-rater reliability with HHD was seen in the asymptomatic population (ICC = 0.71–0.94) and fair to excellent inter-rater reliability in the NSWP population (ICC = 0.59–0.90).
Conclusion
MMT is shown to have variable reliability when assessing isometric wrist strength and is insensitive to small strength changes. HHD has been shown to be an objective and reliable measure of isometric wrist strength in specific positions in asymptomatic, DRF and NSWP populations. Further studies are required to ensure adequate dynamometry stabilization and obtain an optimal testing procedure for these populations.
Introduction
Approximately one-fifth of all hand injuries involve the wrist, 1 with distal radius fractures (DRFs) the most common consequence of wrist trauma. 2 The incidence of DRF within the British population is 37 females and nine males per 10,000 people each year, 3 with an estimated overall annual cost of treatment of 20 million pounds in 2006. 2 Pain arising from an unidentified musculoskeletal source within the wrist (termed non-specific wrist pain [NSWP]) is common. 4 NSWP and hand pain have not been separately investigated, with the combined prevalence reported as 12% of females and 9% of males. 4 The rehabilitation required for these two conditions will benefit from the determination of a reliable wrist strength assessment technique. A precise assessment tool will aid the therapist's specificity in prescription of a strengthening programme and support the evaluation of treatment and overall outcome. It is expected that this would also result in a patient's earlier return to function and work.
Measurement of strength is an essential element of a therapist's assessment 5 and is often used as the primary outcome measure following a wrist injury. 6 Rehabilitation requires specific strengthening to obtain an optimal functional result and the specificity in prescription of exercises requires a reliable assessment technique. A basic requirement of any assessment tool is its reliability. This is established when it has demonstrated that repeated measurements are consistent. 7 Hand-held dynamometers (HHDs) have been shown to be reliable and superior in detecting small objective changes in muscle strength in comparison with manual muscle testing (MMT). 8
MMT is widely used in clinical settings. The grading largely relies on the rater's subjective judgement, allowing for variability in testing. 9 The rater must develop an internal criterion for comparing the test results, which is reliant on personal experience and skill. 9 There is a significant improvement in reliability when the rater has more experience and training. 10 MMT does not objectively detect small changes in muscle strength. 11 The inability to detect a 50% loss of strength in the knee extensors was shown by Beasley. 11 The sensitivity of MMT is less than 75%, 12 which does not meet the recommended level of ≥90% for a screening tool. 13
Dynamometry is able to objectively detect strength changes with small increments, 14 and is reliable and valid in the measurement of muscle strength. 12,15 Intraclass correlations (ICC) and 95% confidence intervals (CI) are used to measure correlation and agreement between repeated measures. 15,16 ICC values >0.75 indicate excellent reliability, 0.40–0.75 fair and <0.40 poor. 17 A narrow CI represents greater precision in the estimate of reliability. 18 Reliability determined through correlation may be insensitive to systematic variation. 19 A two-way analysis of variance (ANOVA) for repeated-measures tests for significant differences (P < 0.05) between raters over repeated measures. 19
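As a minimal illustration of the reliability bands quoted above, the following Python sketch maps an ICC value to the categories used in this paper (>0.75 excellent, 0.40–0.75 fair, <0.40 poor). The function name `classify_icc` is hypothetical and is not part of the study's analysis.

```python
def classify_icc(icc: float) -> str:
    """Return the reliability band for an ICC using the cut-offs cited in the text:
    >0.75 excellent, 0.40-0.75 fair, <0.40 poor."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair"
    return "poor"


if __name__ == "__main__":
    # ICC values taken from the ranges reported in the abstract.
    for value in (0.04, 0.59, 0.71, 0.77, 0.95):
        print(f"ICC = {value:.2f}: {classify_icc(value)}")
```

A value of exactly 0.75 falls in the fair band here, because the text defines excellent as strictly greater than 0.75.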
Findings for wrist dynamometry reliability studies are detailed in Table 1. 16,19–25 The results show fair to excellent reliability with HHD testing in asymptomatic and symptomatic individuals. Intra-rater reliability has been shown to be higher than inter-rater reliability, 25 and isometric strength testing with an HHD more reliable than isotonic tests. 8 Previous studies have not looked at strength of wrist deviation or forearm rotation. Direct comparison of studies is difficult as there are varied symptomatic groups, methods of testing, devices and statistical analysis.
Table 1. Summary of results from hand-held dynamometry wrist reliability studies
WE, wrist extension; WF, wrist flexion; ICC, intraclass correlation coefficients; r, Pearson correlation coefficient values
The results of reliability studies should be applied to the particular device and specific populations tested. 15 The portable HHD JTech Commander PowerTrack II (JTech – JTECH Medical, Salt Lake City, UT, USA) correlates (r = 0.81–0.9) with a static dynamometer in the measurement of isometric shoulder strength. 26 Isometric testing of symptomatic shoulders with the JTech has shown fair to excellent inter-session reliability (ICC = 0.79–0.96) and inter-rater reliability (ICC = 0.79–0.92). 27 It has not been used in the study of wrist strength (Table 1).
Variability in HHD testing occurs due to differences in the testing protocols, difficulties in providing adequate limb stabilization and the rater's strength. 28 The testing procedure requires standardization of the joint or muscle position to ensure that repeated measurements are conducted in the same way to minimize measurement error. When using an HHD the rater is required to meet the force of the patient and stabilize the dynamometer against the limb. When the tested limb is adequately stabilized reliability improves. 14 Raters with a low physical strength or grip strength may find stabilization of the HHD more difficult. 9 Reliability reduces as the tested muscle strength increases. 29 If an isometric test measures greater than 60 lb of force adequate stabilization by the rater is unachievable. 30,31 No relationship has been found between HHD testing and gender, 28 and tester experience has not been shown to affect reliability. 29
The purpose of this study was to investigate the intra- and inter-rater reliability of MMT and HHD in measurement of isometric wrist strength in asymptomatic and symptomatic (DRF and NSWP) participants.
Methods
Thirty-nine participants were tested (19 following DRF and 20 with NSWP) by two raters on two occasions in an intrasession, repeated-measures, crossover study design. Symptomatic participants were recruited from the hand therapy department and wrist rehabilitation class. Those included were aged ≥16 years and had a clinically healed DRF (defined as a fracture within 4 cm of the wrist joint, healed with radiographic evidence) at ≥10 weeks post-injury, or had NSWP, wrist pain with no pathology on MRI, or wrist pain with ligament laxity. Patients were excluded if they were pregnant. Asymptomatic participants were recruited from the symptomatic population, using the contralateral wrist for testing purposes. All participants entered the study following review of the patient information sheet and once informed consent had been taken. Ethical approval was granted by Great Ormond Street Ethics committee.
The participants were randomized (by a computer-generated randomization series) to begin with rater 1 or rater 2. Wrist strength was measured in six standardized positions (flexion, extension, ulnar deviation, radial deviation, pronation and supination), beginning with either MMT or HHD. The positions, and MMT or HHD, were randomized using a Latin square. 32 Randomization reduces the effect of rater or test order.
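The text does not describe how the Latin square was constructed, so the following Python sketch is only one possible approach: a cyclic square, in which each row is the previous row rotated by one place. The names `POSITIONS`, `cyclic_latin_square` and `position_order`, and the fixed seed, are assumptions made for illustration.

```python
import random

POSITIONS = ["flexion", "extension", "ulnar deviation",
             "radial deviation", "pronation", "supination"]


def cyclic_latin_square(items):
    """Build an n x n Latin square: row i is the item list rotated left by i."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def position_order(participant_index, square):
    """Assign a participant to a row of the square, cycling through the rows."""
    return square[participant_index % len(square)]


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the example is repeatable
    shuffled = POSITIONS[:]
    rng.shuffle(shuffled)      # randomise which position heads the first row
    square = cyclic_latin_square(shuffled)
    for participant in range(3):
        print(participant + 1, position_order(participant, square))
```

In a cyclic square every position appears once in each row and once in each testing slot, but each position is always followed by the same neighbour, so it balances position order rather than carry-over between consecutive positions.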
MMT was performed once and HHD three times for each position. The optimal number of repetitions for isometric wrist strength is unknown. 6 Grip strength dynamometer testing shows an average taken from three assessments produces reliable results, 33 supported by published HHD studies. 9,14,20,34 One MMT is shown to be clinically acceptable. 9,12 A 5-second break was allocated between each HHD or MMT test, and a change of testing position required a 30-second break. A rest period of 30 seconds when taking fewer than 15 measurements is recommended. 6 A 10-minute break was allocated between each rater to reduce the effects of fatigue and pain; this is considered sufficient for the repletion of muscle energy. 35 The testing procedure was standardized (Appendix 1). The verbal instructions allow for a 4–6-second contraction, which is when maximum strength peaks. 6
Pain on a Visual Analogue Scale (VAS) was measured prior to testing and between each test session. This pain rating measure is valid, reliable and appropriate for use in clinical practice. 36 If the VAS increased by more than 20 mm participants were withdrawn from testing. This was to avoid an exacerbation of pain that would prevent participants from continuing therapy.
Testing position
Standardized testing positions are described in Appendix 1. In symptomatic participants mid-range joint positions are ideal as they are the easiest to obtain, 6 and allow quick and easy standardization by clinicians. Pronation and supination were tested in neutral forearm rotation. For ease of dynamometer placement flexion was tested in supination and extension in pronation. Deviation movements were tested with forearm pronation and neutral deviation for ease of dynamometer placement. Proximal joints to the wrist will affect the ability of the wrist musculature to generate strength, and therefore these were manually stabilized. 6 The raters verbally encouraged relaxation of the fingers to reduce the effects of the extrinsic finger muscles.
Raters
Two female physiotherapists (rater 1: height 167 cm, weight 56 kg; rater 2: height 162 cm, weight 60 kg) who had not previously used the JTech performed the testing. They were instructed on the use of the JTech and the testing procedure by an experienced clinician in a one-hour training session. Grip strength was assessed by a Jamar measurement according to American Society of Hand Therapy guidelines 33 (grip strength: rater 1: left = 15 kg, right = 24 kg; rater 2: left = 24 kg, right = 30 kg).
Equipment
The JTech is a portable, battery-powered, computerized HHD with a range of 0–100 lb and measures to the nearest 0.5 lb.
Sample size
For true reliability >0.7 the study required a sample size of 19 participants, based on a 5% significance level and a power of 80% for two raters. 18 The sample size was increased if patients were withdrawn from testing.
Data recording
The MMT score was verbally reported to a scribe who documented the result on a data collection sheet. Each test with the HHD was viewed and documented by the scribe and cleared prior to repeat testing. The tester and the subject were blinded to the screen to reduce any motivational effect. A new data collection sheet was used for each repeat test session.
Outcome measures
The primary outcome measure was pounds of force, measured in 0.5 lb increments. MMT was graded using the Oxford Manual muscle test for isometric strength. 37 Ability to maintain a position against gravity is grade 3, with minimal resistance 3+, with less than moderate resistance 4−, with moderate resistance 4 and with maximal resistance 5. 37
Data analysis
Two-tailed paired sample t-tests were applied to determine the difference in age, length of problem or VAS within the sample populations. Raw data were analysed from the single MMT and the mean of three HHD trials over both test sessions and for each subject group, using a generalized linear model. ICC and 95% CI were analysed using model 2,1 for MMT and 2,k for HHD, with IBM SPSS version 19 (Statistical Package for the Social Sciences Inc, Chicago, IL, USA). The mean magnitude of force (lb) measured by each rater during each session was used to calculate the two-way ANOVA using the statistical programme SAS version 9.1.3 (SAS Institute Inc, Cary, NC, USA).
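The ICC models named above can be sketched from the standard two-way ANOVA mean squares. The Python below is a minimal illustration, not the SPSS procedure used in the study: the function name `icc_two_way_random` and the example readings are hypothetical, and each column is assumed to hold the scores from one rater or one session.

```python
def icc_two_way_random(scores):
    """Compute ICC(2,1) and ICC(2,k) from a subjects x raters table.

    scores: list of rows, one row per subject, one column per rater or session.
    Returns (icc_single, icc_average).
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of raters (or repeated measurements)
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                 # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    # Single-measure form, ICC(2,1).
    icc_single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)
    # Average-measure form, ICC(2,k).
    icc_average = (ms_rows - ms_error) / (
        ms_rows + (ms_cols - ms_error) / n)
    return icc_single, icc_average


if __name__ == "__main__":
    # Hypothetical force readings (lb): five participants, two raters.
    readings = [[20.5, 21.0], [15.0, 14.5], [30.0, 31.5],
                [25.5, 24.0], [18.0, 19.0]]
    single, average = icc_two_way_random(readings)
    print(f"ICC(2,1) = {single:.2f}, ICC(2,k) = {average:.2f}")
```

If every participant receives the same score, all the mean squares are zero and the ratio is undefined (the sketch raises `ZeroDivisionError`), so no coefficient can be reported for that position.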
Results
The participants' demographics are presented in Table 2. One participant with NSWP failed to complete the testing procedure due to pain and their results were excluded from the study data. Student's t-tests performed on demographic data found a significant difference (P < 0.026) in age between DRF and NSWP groups. Length of problem and pain were not significantly different between groups (Table 2).
Table 2. Participant demographics excluding participant withdrawn from testing due to pain
DRF, distal radius fracture; NSWP, non-specific wrist pain; R, right; L, left
Manual muscle testing
Table 3 reports the ICC, 95% CI and ANOVA P values for all MMT.
Table 3. Intra-rater and inter-rater reliability coefficients for manual muscle test
Asymp, asymptomatic participants; DRF, distal radius fracture; NSWP, non-specific wrist pain participants; WE, wrist extension; WF, wrist flexion; Rad Dev, radial deviation; Ulna Dev, ulna deviation; Pro, pronation; Sup, supination; ICC, intraclass correlation; 95% CI, 95% confidence interval; NS, not significant; x, unable to calculate score; Significant differences in raters (P < 0.05) are denoted with an asterisk (*)
Hand-held dynamometer
Table 4 reports the ICC, 95% CI and ANOVA P values for all HHD tests.
Table 4. Intra-rater and inter-rater reliability coefficients for hand-held dynamometer
Asymp, asymptomatic participants; DRF, distal radius fracture; NSWP, non-specific wrist pain participants; WE, wrist extension; WF, wrist flexion; Rad Dev, radial deviation; Ulna Dev, ulna deviation; Pro, pronation; Sup, supination; ICC, intraclass correlation; 95% CI, 95% confidence interval; NS, not significant; Significant differences in raters (P < 0.05) are denoted with an asterisk (*)
Discussion
There was no statistically significant difference (at the P < 0.05 level) in intra-rater reliability of manual testing of the wrist in the asymptomatic population. This result is unsurprising, as we would expect all participants to have full strength, be symptom free and therefore achieve the highest score possible. Individual raters are likely to have formed an internal criterion by which to grade the manual strength and therefore with repeated measures by the same rater this criterion would remain unchanged. Inter-rater MMT showed no significant difference between rater scores in the asymptomatic population in all positions except for wrist flexion on the first test session. This result (and the wide range of ICCs) indicates large variability in inter-rater MMT, as it is a subjective score dependent on a rater's previous experience. The significant difference between raters was eliminated on the repeat test sessions. This is possibly due to the participants becoming familiar with the test position. On repeat testing the participants provided a reliable maximum contraction. This is considered to be a learning effect. In the DRF and NSWP populations there was no significant difference in scores with intra- and inter-rater testing in all wrist positions (on repeat testing).
Multiple ICC reliability values could not be calculated in the asymptomatic population as the MMT score was the same for all the participants (grade 5). This has been seen in a previous MMT study 9 and is a clear demonstration that manually scoring strength on a limited and ordinal scale does not allow for small incremental differences in strength.
HHD has fair to excellent reliability when repeat measures of wrist strength are taken by one or two therapists in an asymptomatic population. Significant differences between raters were shown in one test position (ulnar deviation). This improved on the second test session, which can be attributed to a learning effect of the participants. A significant difference was demonstrated on the repeat test session in wrist extension; this movement produced the largest magnitude of force on testing. The production of a strong wrist movement will have an effect on the rater's ability to stabilize the dynamometer.
Fair to excellent intra- and inter-rater reliability was shown in the DRF population. A significant difference in intra-rater testing in DRF participants was shown in all positions except for radial deviation, pronation and supination and with inter-rater testing on the second test session, except in radial deviation and pronation. On measuring rotational movements the dynamometer was placed close to or over the fracture site. All fractures were clinically healed; however, the pressure of the dynamometer may have inhibited the participant's ability to produce a maximum strength contraction. The significant difference in raters' measures in DRF participants could be due to the testing position. Pain and the average length of problem were not significantly different between the symptomatic groups. Age was significantly different between symptomatic groups. The DRF group had a greater mean age, which would reduce the participant's ability to produce high maximum strength measurements as there is a known significant correlation between muscle strength and age. 33 A reduction in strength (associated with age) would enable the rater to provide enhanced stabilization of the dynamometer, producing a more reliable measure.
Excellent reliability was shown with intra-rater testing of the NSWP participants, with a significant difference only in pronation. This position produced the highest dynamometry force reading of all positions. Fair to excellent reliability was shown with inter-rater testing and no significant difference between raters on the second test session.
The strength of the rater can impact the rater's ability to stabilize the dynamometer against the limb being tested. 29 In an isometric test the rater is required to ‘match’ the resistance of the participant. The higher the magnitude of force produced by the participant, the less able the rater is to adequately stabilize the dynamometer. 20,29 If the participant is stronger than the rater and stabilization is lost, less strength than the participant's maximum will be recorded. Measurements will therefore vary between raters with different strengths. Testing in symptomatic participants is more reliable as they are generally weaker than asymptomatic subjects. 8,16 Dynamometer stabilization is not achievable when force is greater than 60 lb; 30,31 the maximum measurement within this study was 41.5 lb. Raters therefore had adequate physical and grip strength to provide the manual stabilization required.
Excellent inter-rater reliability (ICC = 0.93) in wrist extension of the asymptomatic population was shown; however, there was a significant difference between the raters' measurements. Extension produced the maximum mean strength measurement. This can also be shown with significant differences in the measurements in the DRF population on pronation (intra- and inter-rater) and supination (intra-rater) and in the NSWP group in pronation (intra-rater). These testing positions, which yield high strength measurements, will require enhanced stabilization to improve reliability.
Testing positions and procedures for isometric wrist strength within the literature are widely varied and there is currently no consensus as to the ‘ideal’ position. 6 Significant differences between raters in the measurement of wrist extension in an asymptomatic population within this study and previously 19,25 suggest that an alternative testing position is required. A mid-rotation position for extension could be considered, although this position will require increased manual stabilization. Extension in the asymptomatic population produced the greatest mean strength without the contribution of Extensor Carpi Ulnaris (ECU), as in a pronated position ECU works as an ulnar deviator; but when in mid-rotation it will work with the radial extensors to produce a strong extension movement. 38 The mid-rotation position will also require the participant to provide enhanced contractile force by using the deep head of pronator quadratus (PQ) to work as a dynamic stabilizer of the distal radio-ulnar joint (DRUJ). 39 In the pronated position the plinth will provide external stability and PQ will be in a mechanically advantageous position. Conversely, skeletal stability will be superior in a mid-rotation position as this is where the DRUJ has maximal congruency and compression. 40
The main limitation in symptomatic subjects is the inability to assume the testing position, and therefore it is suggested that testing should take place at mid-range joint positions. 6 In this study extension and deviation movements were measured in pronation, which is often difficult to fully obtain following a DRF; three participants were tested in mid forearm rotation to obtain these measures. However, it must be considered that ECU will not work as an ulnar deviator in mid-position, but as a wrist extensor. 37
MMT and HHD showed differences between the raters in the first and second test sessions in all population groups attributable to a learning effect, seen in previous dynamometry studies. 9 Before testing the raters asked participants to demonstrate (with no resistance) the specific movement required for each test session. Poor understanding of the testing, apprehension over the testing procedure and the possible instigation of pain with application of the resistance may have resulted in the poor reliability on the first test session. Increased familiarity with the testing procedure is likely to produce consistently reliable measures. Future research and clinicians should incorporate a ‘practice session’ with applied resistance to reduce this effect.
External factors may affect maximum contraction such as patient motivation and cooperation. 9 Patients who volunteer for studies are potentially more likely to be sincere in their effort and therefore improve reliability testing. Verbal motivation from the rater was standardized within this study.
Poor reliability may be related to muscle fatigue or pain. In this study, we minimized these variables through allocation of rest periods and monitoring of the VAS. Participants with higher pain scores may be unable to complete testing or produce less reliable results. Rater bias from the initial to the second test was reduced by blinding the rater to the JTech measurements, but this was not possible with the MMT scores. This study measured reliability within a test and retest over a short period of time; these results may not be applicable to repeated tests performed days or weeks apart, and this would need to be studied further.
Further research to produce an optimal and reliable testing procedure for the six wrist measurements is necessary, as well as further work on optimizing adequate stabilization. Normative values within an asymptomatic population will be of benefit in returning patients to pre-injury levels, and there remains a requirement to calculate the standard error of measurement to enable therapists to know if an actual change in strength following rehabilitation has been achieved. It would be of interest to test muscles in varied positions, as the alteration of muscle length and stability will present different strength patterns that can be precisely rehabilitated following specific injuries or surgery.
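The standard error of measurement was not calculated in this study. As a hedged illustration of how it is conventionally derived from a reliability coefficient, the Python sketch below uses the common formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. The function names and the input values are hypothetical and are not study data.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC), in the same units as the measurement."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem: float) -> float:
    """MDC95 = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical inputs: between-participant SD of 4.0 lb and an ICC of 0.75.
    sem = standard_error_of_measurement(4.0, 0.75)
    mdc = minimal_detectable_change_95(sem)
    print(f"SEM = {sem:.2f} lb, MDC95 = {mdc:.2f} lb")
```

Under these formulas, a change in strength smaller than the MDC95 cannot be distinguished from measurement error at the 95% level.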
Conclusions
MMT shows variable reliability in the assessment of isometric strength of asymptomatic and symptomatic wrists. It is not sensitive enough to detect small changes in a patient's strength. HHD shows fair to excellent reliability in the objective measurement of isometric wrist strength. The reliability of this strength assessment technique will benefit patients through the therapist's ability to modify therapy according to objective changes and therefore prescribe specific strengthening exercises. Further research into the most reliable testing procedure with adequate stabilization of all movements is required to produce the most reliable results for all population groups.
Footnotes
Acknowledgements
This research received a research grant from the British Association of Hand Therapy Ltd.
Competing interests: None declared.
Appendix 1 Standardized testing positions
The participant was seated on a four-legged chair with a standard plinth on the side being tested. The humerus was stabilized adjacent to the body by use of a soft band (Velfoam). The elbow was measured at 90° with a Promedics clear international goniometer (Promedics Orthopaedic Company, Port Glasgow, UK), and the plinth was adjusted to allow the elbow to rest at this position. The forearm was stabilized on the plinth using Dycem non-slip material (Dycem Limited, Bristol, UK), with the wrist over the edge of the plinth at a 5 cm mark, measured from the dorsal, distal end of the radius (by a Promedics universal finger goniometer). The rater faced the participant and was seated on a four-legged chair. The HHD was placed perpendicular to the limb segment being tested.
Before each test the rater demonstrated to the participant the movement required, asking the participant to reproduce the movement with no resistance applied.
The instructions to the participant were standardized and given at the same time as applying pressure manually or with the HHD as: ‘I want you to match my resistance. Now push as hard as you can, keep pushing, keep pushing, keep pushing, and relax’.
Wrist extension: Forearm in full available pronation, wrist at 0°. Placement of HHD or manual pressure at dorsal and distal metacarpal shaft of middle finger. Tester uses one hand to test and the other hand to manually stabilize the forearm on the plinth at the 5 cm mark. If the participant is unable to achieve full pronation, the subject is measured with forearm in mid-rotation.
Wrist flexion: Forearm in full supination, wrist at 0°. Placement of HHD or manual pressure volarly across metacarpal heads. Tester uses one hand to test and the other hand to manually stabilize at the 5 cm mark. If the participant is unable to achieve full supination, the subject is measured with forearm in mid-rotation.
Radial deviation: Forearm in full pronation, wrist at 0° of deviation. Placement of HHD or manual pressure over radial border of the shaft of second metacarpal. Tester uses one hand to test and the other hand to manually stabilize the forearm on the plinth at the 5 cm mark. If the participant is unable to achieve full pronation, the subject is measured with forearm in mid-rotation.
Ulnar deviation: Forearm in full pronation, wrist at 0° of deviation. Placement of HHD or manual pressure over ulnar border of the shaft of fifth metacarpal. Tester uses one hand to test and the other hand to manually stabilize the forearm on the plinth at the 5 cm mark. If the participant is unable to achieve full pronation, the subject is measured with forearm in mid-rotation.
Pronation: Forearm in mid-rotation, wrist at 0°. Placement of HHD or manual pressure on the volar surface of the wrist just distal to the 5 cm mark. Tester uses one hand to test and the other hand to manually stabilize the forearm on the plinth just distal to the elbow.
Supination: Forearm in mid-rotation, wrist at 0°. Placement of HHD or manual pressure on the dorsal surface of the wrist just distal to the 5 cm mark. Tester uses one hand to test and the other hand to manually stabilize the forearm on the plinth just distal to the elbow.
