Abstract
Background
Epidemiological evidence underscores low back pain (LBP) as a prevalent and consequential musculoskeletal disorder, posing a significant public health challenge. Patient-reported outcome measures (PROMs) play a crucial role in the diagnostic process for LBP, with the Roland-Morris Disability Questionnaire (RMDQ) being a commonly utilized tool in evaluating LBP.
Objective
This cross-sectional study aimed to cross-culturally adapt and validate the Indonesian version of the 24-item-RMDQ among nonspecific LBP (NSLBP) patients.
Methods
The RMDQ underwent forward-backward translation, followed by readability and content validity assessments. Psychometric testing was conducted with NSLBP patients (n = 137; mean age 38.6 ± 11.8 years; 59% female) and included assessments of internal consistency and 1-week test-retest reliability, as well as convergent validity with the pain numeric rating scale (PNRS) and the Physical Component Summary (PCS) and Mental Component Summary (MCS) of quality of life (Short Form 12). Construct validity was assessed using confirmatory factor analysis (CFA).
Results
The translated instrument showed good internal consistency (Cronbach α = 0.80). Repeatability estimates for the individual RMDQ items were moderate to good, and the ICC of the total RMDQ score was 0.90 [95% CI (0.85–0.94)]. The correlations of the instrument with PNRS, PCS, and MCS were 0.54, 0.60, and 0.23, respectively. The goodness-of-fit test further affirmed an acceptable fit of the data, although low factor loadings were found in several RMDQ items.
Conclusion
Although the factor structure of the RMDQ scale warrants further investigation, the overall findings support its suitability for clinical application in Indonesian NSLBP patients.
1. Introduction
Low back pain (LBP) is a significant global public health concern that affects the quality of life and work absenteeism of individuals across all age groups [1]. While healthcare utilisation for LBP varies by region, LBP is globally one of the most prominent causes of primary care visits and has a high impact on the workforce and finances [2], thus imposing substantial disability and financial burdens [3]. As a leading cause of disability globally [4, 5], LBP also ranks fourth in disability-adjusted life years (DALYs), with over half a billion people worldwide affected by this condition [6]. The burden of LBP is projected to increase, particularly in low-income and middle-income countries, including Indonesia. Recent studies in Indonesia indicate a 12-month prevalence of 44% among middle-aged adults [7], while the estimated overall prevalence is between 7.6% and 37% [8]. The exact prevalence of LBP in Indonesia, however, remains uncertain, partly due to diagnostic limitations in clinical settings.
LBP diagnosis often draws on patient-reported outcome measures (PROMs), with the Roland-Morris Disability Questionnaire (RMDQ) being a prevalent choice for assessing, monitoring, and evaluating LBP [9]. The RMDQ is widely used in clinical and research settings for quantifying disability, monitoring changes over time, assessing treatment effectiveness, and comparing the impact of LBP across different groups. The RMDQ comes in several versions, which differ in the number of questions assessing LBP’s impact on daily functioning [10]. The 24-item version offers the most comprehensive assessment, covering a wide range of activities and allowing the most detailed examination [11]. Shorter versions, such as the 15- and 16-item versions, provide a time-efficient alternative. All versions capture diverse dimensions of disability, including personal care, lifting, walking, sitting, standing, sleeping, and social interactions. The 24-item version, however, is currently the recommended RMDQ version.
The RMDQ is known for its ease of understanding and has been validated in various settings, including in a Hindi version [12, 13]. The Urdu version of the RMDQ has also been found to be a valid and reliable instrument for evaluating disability associated with chronic nonspecific LBP [14]. The Amharic version of the RMDQ was found to have good reliability and validity in assessing disability related to LBP in the Ethiopian population [15], as were the Hausa [16] and Igbo [17] versions in Nigerian LBP patients. Despite evidence supporting the RMDQ’s validity in assessing disability in LBP patients across various populations and cultures, it has yet to undergo cross-cultural adaptation to ensure its validity and reliability in Indonesia. This adaptation is crucial for facilitating direct comparisons with global research and for improving generalizability and our understanding of LBP’s worldwide impact. Given that most LBP cases fall under nonspecific LBP (NSLBP), where the cause of the LBP is not definite [18], this study focuses on cross-culturally adapting and evaluating the validity and reliability of the 24-item RMDQ among individuals experiencing NSLBP in Indonesia.
2. Methods
2.1. Study design, participants, and ethical clearance
This cross-sectional study focuses on individuals in Indonesia experiencing nonspecific low back pain (NSLBP). Participants for the readability test were recruited from a health clinic, while those for psychometric analysis were recruited online through social media platforms, relying on the networks of the authors and research assistants. The inclusion criteria were having NSLBP and being able to understand Indonesian. Additionally, participants were questioned about symptoms related to specific LBP and provided their physician’s diagnosis, if available, to exclude specific LBP diagnoses. Those participating in the readability assessment underwent physical examinations to rule out specific causes of LBP. The study encompassed individuals with acute, sub-acute, and chronic nonspecific LBP, covering a range of symptom durations. The required sample size for assessing the instrument’s validity was determined by the ratio of cases to parameters, set at 5:1 for confirmatory factor analysis [19, 20]. Thus, with 24 items, the minimum sample required for the analysis was 120 participants. A subsample of participants who completed the initial assessment responded to the questionnaire again one week later for the 1-week test-retest reliability assessment.
This study conformed with the Declaration of Helsinki. All participants provided written informed consent embedded in the online form. The procedure received approval from the Ethics Committee at the Directorate of Research and Community Service, Universitas Negeri Yogyakarta, under protocol No. T/17/UN34.9/KP.06.07/2023.
2.2. Measures
The instrument subjected to cross-cultural adaptation in this study was the RMDQ, developed by Martin Roland and Chris Morris in 1983 to assess the impact of LBP on an individual’s daily functioning [21]. Consisting of 24 items, the questionnaire explores a diverse range of activities, encompassing fundamental movements such as walking, standing, and sitting, as well as more intricate tasks like lifting and bending. Respondents provided ‘yes’ or ‘no’ responses to each item [22]. A score of 1 was allocated for every “yes” response, indicating the extent of disability experienced [23]. The scoring process involved summing up the scores across all 24 items, resulting in a total score that ranged from 0 to 24, with the higher score representing a higher disability [24].
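The scoring rule described above can be sketched in a few lines. This is a minimal illustration, assuming responses are stored as 24 booleans; the function name `score_rmdq` and the example respondent are hypothetical and not part of the study materials.

```python
def score_rmdq(responses):
    """Return the RMDQ total score (0-24) from 24 yes/no answers.

    `responses` is a sequence of 24 booleans, where True means "yes".
    """
    if len(responses) != 24:
        raise ValueError("The 24-item RMDQ requires exactly 24 responses")
    # Each "yes" contributes 1 point; each "no" contributes 0.
    return sum(1 for answer in responses if answer)


# Hypothetical respondent who endorses the first 9 items only.
example = [True] * 9 + [False] * 15
print(score_rmdq(example))  # 9
```

A higher total indicates greater disability, in line with the interpretation given in the text.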
Participants were also asked to complete the 11-point pain numeric rating scale (PNRS) and to rate their quality of life using the Short Form 12 (SF-12), with its component scores and subscales. The SF-12 comprises 12 items within eight subscales [25, 26]. Six items from four subscales generated a physical component summary score (PCS). These subscales assess general health perception (GH), physical functioning (PF), role limitation due to physical health (RP), and bodily pain (BP) [27]. A mental component summary score (MCS) was derived from six items across the four other subscales: role limitations due to emotional problems (RE), vitality (VT), mental health (MH), and social functioning (SF). Higher scores on these subdomains indicate a higher quality of life [28].
Additional measures included demographic characteristics (age and sex) and questions regarding the onset of LBP and symptoms associated with specific LBP.
2.3. Procedures
This study adhered to a standardised instrument cross-cultural adaptation protocol, encompassing the essential steps of forward and backward translation for all five scales, from the source language (English) to the target language (Indonesian) [29]. The process also included readability and content validity evaluations, followed by comprehensive psychometric testing [21]. Six independent translators carried out the forward and backward translations. The readability study was conducted with 10 LBP patients. They were provided with paper copies of the translated scales and were asked to review each item, responding to those they comprehended and identifying items with unclear wording. Following established norms, if fewer than 80% of participants understood all scale items, the translators revised the items until this criterion was met [21].
A panel of five experts, consisting of three general practitioners, a physiotherapist, and a public health specialist recruited from the researchers’ network, independently assessed content validity for each item, considering its relevance to the local culture and its clarity using a 4-point Likert scale. The item content validity index was calculated as the proportion of experts rating the item 3 or 4. An item was deemed acceptable if the index was 0.78 or higher. Items below this threshold underwent expert revision and were re-evaluated for readability and content validity to ensure cultural appropriateness and linguistic precision within the examined scales.
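The item content validity index described above can be illustrated as follows. This is a sketch under stated assumptions: the function names `item_cvi` and `is_acceptable` and the example ratings are hypothetical, and only the rule given in the text (proportion of experts rating 3 or 4, threshold 0.78) is encoded.

```python
def item_cvi(ratings):
    """Proportion of experts who rated the item 3 or 4 on the 4-point scale."""
    if not ratings:
        raise ValueError("At least one expert rating is required")
    relevant = sum(1 for rating in ratings if rating >= 3)
    return relevant / len(ratings)


def is_acceptable(ratings, threshold=0.78):
    """An item is acceptable when its index reaches the 0.78 threshold."""
    return item_cvi(ratings) >= threshold


# Hypothetical ratings from a panel of five experts.
print(item_cvi([4, 3, 4, 4, 2]))       # 0.8
print(is_acceptable([4, 3, 4, 4, 2]))  # True
print(is_acceptable([4, 3, 2, 2, 4]))  # False
```

With five experts, the 0.78 threshold means that at most one expert can rate an item below 3 for it to remain acceptable.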
To evaluate convergent and construct validity, the translated scales were disseminated online for participants to complete independently. The minimum sample size for the confirmatory factor analysis was determined by maintaining a ratio of 5 cases to 1 parameter, as required for statistical estimation in confirmatory factor analysis [17, 18]. Therefore, a minimum of 120 participants was required for the analysis. The number of participants enrolled in this study exceeded this requirement. A subset of the individuals who participated in the initial assessment was contacted one week later to complete the scales a second time for the 1-week test-retest reliability assessment.
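The sample size rule above amounts to simple arithmetic. The sketch below assumes that the 24 RMDQ items are counted as the parameters; the function name `min_sample_size` is hypothetical.

```python
def min_sample_size(n_parameters, cases_per_parameter=5):
    """Minimum number of participants under a cases-to-parameters ratio."""
    return n_parameters * cases_per_parameter


required = min_sample_size(24)  # 24 items at a 5:1 ratio
enrolled = 137                  # psychometric sample reported in this study
print(required)                 # 120
print(enrolled >= required)     # True
```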
2.4. Statistical analysis for psychometric testing
To evaluate internal consistency reliability, Cronbach’s α was calculated. The criterion for acceptable reliability was set at α ≥ 0.70, and a corrected item-total correlation exceeding 0.30 was considered indicative of effective discrimination for an item [22]. To assess the 1-week test-retest reliability, Cohen’s kappa was calculated for each item, while the intraclass correlation coefficient (ICC) was computed for the total score. ICC and Cohen’s kappa values below 0.50 were deemed poor, those in the range of 0.50–0.75 were considered moderate, values between 0.75 and 0.90 were regarded as good, and values exceeding 0.90 were classified as excellent [23]. For the convergent validity assessment, Spearman correlations were computed between the RMDQ and the 11-point pain numeric rating scale (PNRS), and between the RMDQ and quality of life measured using the Short Form (SF-12) with its component scores and subscales.
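For readers who want to see how the internal consistency statistics are formed, the following sketch computes Cronbach's α and a corrected item-total correlation from a respondents-by-items matrix. It is an illustration only: the analysis in this study was run in SPSS, the function names are hypothetical, the data are invented, and sample variances (n − 1 denominator) are assumed.

```python
from statistics import mean, variance


def cronbach_alpha(data):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(data[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    # Sample variance of the total scores.
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(data, item):
    """Correlate one item with the total of the remaining items."""
    item_scores = [row[item] for row in data]
    rest_scores = [sum(row) - row[item] for row in data]
    return pearson(item_scores, rest_scores)


# Invented yes/no (1/0) answers: 6 respondents, 4 items.
answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
]
print(round(cronbach_alpha(answers), 2))
print(round(corrected_item_total(answers, 0), 2))
```

The item is removed from the total before correlating so that it does not inflate its own item-total correlation, which is what "corrected" refers to.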
In the final step, a confirmatory factor analysis (CFA) was carried out to assess the construct validity of the Indonesian version of RMDQ. The model fit of the data was appraised based on the theoretical structure proposed by the original developers of each scale, utilising the maximum likelihood method. Several goodness-of-fit indices were considered for determining acceptable model fit. The first criterion involved the χ2/degree of freedom (df) ratio, where a value below 3 signified a good fit. The second criterion was the root mean square error of approximation (RMSEA), with a value below 0.08 indicating good fit, a value ranging from 0.08 to 0.10 indicating moderate fit, and a value exceeding 0.10 indicating poor fit [24]. Additionally computed were the comparative fit index (CFI), Tucker–Lewis index (TLI), and the standardised root mean square residual (SRMR) [24]. Adequate fit between the hypothesised model and the observed data was indicated by CFI and TLI values exceeding 0.90 and SRMR values below 0.08 [24].
The statistical analysis of data was conducted with SPSS® v29.0 (IBM, Chicago, IL, US), with the exception of the CFA analyses, which utilised STATA v14 (College Station, TX, US). Statistical significance was considered at a p-value below 0.05.
3. Results
3.1. Participant characteristics
The readability assessment involved ten NSLBP patients with a mean age of 37.2 ± 10.6 years. Additionally, 137 individuals with NSLBP, with a mean age of 38.6 ± 11.8 years (59% female), completed the online version of the Indonesian RMDQ for the psychometric testing. Among these, 62 participants, with a mean age of 39.1 ± 12.6 years (56% female), also completed the test-retest reliability assessment a week later. Chronic LBP was the most frequently reported category in both samples, constituting 37% of the total sample and 42% of the test-retest sample. Notably, no differences in characteristics were identified between participants who completed the test-retest reliability assessment and those who did not.
3.2. Translation, readability, and content validity assessments
No modification beyond linguistic translation by panel translators was made. All the translated scales fulfilled the criterion of at least 80% of participants comprehending each item. All experts assigned ratings of 3 or 4 for equivalency and relevance to all items, affirming their acceptable content validity. As a result, no additional modifications were deemed necessary.
3.3. Psychometric testing
As shown in Table 1, the internal consistency of the Indonesian-translated version of the RMDQ was good (Cronbach’s α = 0.80). The item-total correlations of several items were below 0.3. However, all items were retained to achieve comparability with the original instrument. The repeatability coefficients of the individual items were moderate to good, and the ICC of the total score was 0.90 [95% CI (0.85–0.94)].
Correlation matrix of the Roland-Morris disability questionnaire with numeric pain rating scale and quality of life.
3.4. Convergent validity test
As shown in Figure 1, the Spearman correlation between the RMDQ and the PNRS was 0.54, while the correlations of the RMDQ with PCS and MCS were 0.60 and 0.23, respectively. Moderate correlations were found between the RMDQ and the ‘bodily pain’ (r = 0.63) and ‘role physical’ (r = 0.56) subscales.

Correlation matrix of the Roland-Morris disability questionnaire with numeric pain rating scale and quality of life. Notes: RMDQ: Roland-Morris Disability Questionnaire, PNRS: Pain Numeric Rating Scale (Pain), GH: General Health, PH: Physical Health, RP: Role Physical, BP: Bodily Pain, MH: Mental Health, V: Vitality, SF: Social Functioning, PCS: Physical Component Summary, MCS: Mental Component Summary.
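As a reminder of how a Spearman coefficient such as those in the correlation matrix is obtained, the sketch below ranks both variables (ties receive the mean of their ranks) and then computes a Pearson correlation on the ranks. The function names and the example scores are hypothetical; the study itself used SPSS.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented RMDQ totals and pain ratings for six respondents.
rmdq_totals = [3, 7, 7, 12, 15, 20]
pain_ratings = [2, 4, 3, 6, 6, 9]
print(average_ranks(rmdq_totals))  # [1.0, 2.5, 2.5, 4.0, 5.0, 6.0]
print(round(spearman(rmdq_totals, pain_ratings), 2))
```

Ranking makes the coefficient suitable for bounded, skewed scores such as RMDQ totals, because only the ordering of the values matters.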
3.5. Construct validity test
Figure 2 further illustrates the unidimensional factor structure of the 24 RMDQ items.

Confirmatory factor analysis of the Roland-Morris disability questionnaire.
The goodness-of-fit assessment of the hypothesised unidimensional factor structure of the Indonesian RMDQ revealed an RMSEA of 0.098 and a chi-square to degrees of freedom ratio (χ2/df) of 2.85. Additionally, the CFI was 0.86, the TLI was 0.90, and the SRMR was 0.081. The goodness-of-fit indices thus imply a moderate level of fit between the model and the observed data.
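To make the comparison with the cutoffs in Section 2.4 explicit, the sketch below encodes those cutoffs and applies them to the reported indices. It is an illustration, not part of the original analysis: the function names are hypothetical, and the handling of values that fall exactly on a cutoff (strict inequalities for CFI, TLI, and SRMR; RMSEA of exactly 0.08 or 0.10 counted as moderate) is an assumption.

```python
def rmsea_band(rmsea):
    """Classify RMSEA using the bands stated in Section 2.4."""
    if rmsea < 0.08:
        return "good"
    if rmsea <= 0.10:  # assumption: boundary values count as moderate
        return "moderate"
    return "poor"


def check_fit(chi2_df, rmsea, cfi, tli, srmr):
    """Compare each fit index with its cutoff from Section 2.4."""
    return {
        "chi2/df < 3": chi2_df < 3,
        "RMSEA band": rmsea_band(rmsea),
        "CFI > 0.90": cfi > 0.90,
        "TLI > 0.90": tli > 0.90,
        "SRMR < 0.08": srmr < 0.08,
    }


# Values reported for the unidimensional model in this study.
reported = check_fit(chi2_df=2.85, rmsea=0.098, cfi=0.86, tli=0.90, srmr=0.081)
for criterion, outcome in reported.items():
    print(criterion, "->", outcome)
```

Under these assumptions, the χ2/df criterion is met and the RMSEA falls in the moderate band, while the CFI and SRMR fall short of their cutoffs. The TLI of 0.90 sits exactly on its cutoff, so its outcome depends on rounding and on whether the boundary is treated as inclusive.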
4. Discussion
This is the first study to culturally adapt and analyse the psychometric properties of the Indonesian version of the RMDQ. Findings from the readability test emphasise the clarity and accessibility of the translated version, while the content validity assessment confirms an acceptable content validity index, affirming the instrument’s linguistic integrity. Additional psychometric analyses of the translated instrument’s internal consistency, test-retest reliability, and convergent validity reveal satisfactory results. The goodness-of-fit test further affirmed an acceptable fit of the data, although low factor loadings were found in several RMDQ items. While the factor structure of the RMDQ scale warrants further investigation, the overall findings support its suitability for application in Indonesian NSLBP patients.
The following comparison considers studies on the cross-cultural adaptation and validation of the RMDQ across various linguistic and cultural contexts. The adapted Indonesian version of the RMDQ displayed satisfactory internal consistency and test-retest reliability in the current study, as evidenced by a Cronbach’s α of 0.80 and an ICC of 0.90. Similar findings were reported in a study adapting the RMDQ in the Hindi population, demonstrating excellent psychometric indices with a Cronbach’s α of 0.99 and an ICC of 0.98 [11]. Significant correlations were also observed between the Hindi RMDQ scores and pain numeric rating scale (PNRS) ratings, indicative of its convergent validity within the Indian cultural setting [11]. Similarly, the Urdu adaptation exhibited commendable psychometric properties [25, 26], including strong internal consistency (Cronbach’s α = 0.860) and test-retest reliability (ICC = 0.846) [26]. These findings were further substantiated by significant associations with pain measures, affirming the reliability and validity of the Urdu adaptation within the Pakistani NSLBP population.
The Nigerian adaptations, encompassing both Igbo [15] and Hausa [14] versions, also showcased excellent psychometric properties, rendering them valuable tools for assessing disability in Nigerian NSLBP patients. The Igbo adaptation demonstrated excellent internal consistency (Cronbach’s α = 0.91) and test-retest reliability (ICC = 0.84) [15], while the Hausa version exhibited adequate internal consistency (Cronbach’s α = 0.70) and test-retest reliability (ICC = 0.79) [14]. Moreover, significant correlations with pain measures underscored the convergent validity of both adaptations, further supporting their applicability in assessing disability within the Nigerian cultural setting. Lastly, the Ethiopian adaptation of the RMDQ in Amharic also demonstrated good internal consistency (Cronbach’s α = 0.88) and excellent test-retest reliability (ICC = 0.91) [13]. Furthermore, significant correlations with quality of life measures underscored its convergent validity, endorsing its efficacy in capturing disability experiences within Ethiopian NSLBP patients [13].
While each adaptation of the RMDQ exhibited commendable psychometric properties reflective of its reliability and validity within specific cultural and linguistic contexts, nuanced discrepancies in factorial structure and levels of reliability underscore the necessity for tailored adaptations and validation processes. The findings collectively underscore the imperative for meticulous cross-cultural adaptation to ensure the validity and reliability of the RMDQ across diverse populations with NSLBP. Further research to refine the utility of the RMDQ as a comprehensive tool for assessing disability and informing clinical interventions in the global population, particularly in Indonesia, is warranted.
Furthermore, factor analyses to confirm the underlying structure of the Indonesian RMDQ are recommended, as the CFA in this study revealed discrepancies in item loadings. This is also crucial because few studies have incorporated factor analysis. Additionally, most studies have relied on exploratory factor analysis (EFA) [13, 25] rather than CFA, although EFA is not designed to confirm whether the hypothesised unidimensional factor structure of the RMDQ is retained in the translated version. Moreover, the RMDQ cross-cultural adaptation studies that demonstrated the unidimensionality of the translated versions used shorter RMDQ versions, such as the 15-item Brazilian RMDQ for general pain [27] and a 15-item Brazilian RMDQ for NSLBP [27]. A bidimensional factor structure was also reported in the 16-item Brazilian RMDQ [27] and the 16-item English RMDQ in older adults with NSLBP [12]. The findings from these studies also suggest the necessity for further exploration of the factor structure in the 24-item Indonesian RMDQ. In line with the approaches employed in these studies, future research should consider selecting items from the original 24-item Indonesian RMDQ version to create alternative shorter RMDQ versions.
This research supports previous findings in various cultural contexts, emphasising the importance of linguistic and cultural adjustments to clarify translated assessment tools and ensure that the original RMDQ’s psychometric properties are retained in the translated version. Notably, the study rigorously evaluates the RMDQ’s suitability for assessing disability among individuals with NSLBP in the Indonesian population, contributing valuable insights for clinical practice. However, the study has several limitations that should be acknowledged. Firstly, the exclusive focus on NSLBP may limit the generalizability of its findings to other musculoskeletal disorders. Thus, caution is advised when applying these findings to different clinical contexts or populations. Secondly, this study’s cross-sectional design may overlook longitudinal changes in patients’ conditions, so the responsiveness of the measure over time could not be assessed. Therefore, exploring the RMDQ’s responsiveness to changing health conditions is recommended. Thirdly, although the ratio of participants to items/parameters in this study was 5:1, an acceptable sample size for CFA as suggested by Kline [28], the sample was smaller than the minimum of 200 participants recommended for CFA by Myers, Ahn [29]. Expanding the sample, including various age groups, and incorporating patient perspectives through qualitative methods may offer valuable insights for improving the cross-cultural adaptation of the instrument.
5. Conclusion
The Indonesian RMDQ translation demonstrated acceptable content validity and was readable for NSLBP patients. Further psychometric testing revealed good internal consistency and test-retest reliability. The instrument also demonstrated convergent validity with pain and quality of life measures. However, while the construct validity showed an acceptable fit, some RMDQ items had low factor loadings, indicating a need for further exploration of the underlying factor structure and potential selective item retentions for alternative shorter RMDQ versions. Nevertheless, the overall findings support the suitability of the Indonesian version of the RMDQ for clinical use among NSLBP patients in Indonesia. This adaptation offers a valuable tool for assessing disability and functioning in Indonesian NSLBP patients. It also has the potential to contribute to global health initiatives and cross-cultural disability research in diverse populations.
Footnotes
Author contributions
NIA, RY, JK, and HH conceived and planned the study. NIA, JK, and HH administered the survey. NIA, RY, and HH performed the statistical analysis and interpreted the data. NIA and JK drafted the manuscript. All authors reviewed and approved the final version of the manuscript.
Acknowledgments
This study was funded by the Universitas Negeri Yogyakarta Research Assignment Grant Number 01-08/UN.34/SPK/KU/2023.
Conflict of interest
No conflict of interest is declared.
