Abstract
Introduction
The Good Nursing Care Scale for Nurses (GNCS-N) measures nurses’ perceptions of nursing care quality, but no validated Indonesian version has been available.
Objective(s)
This study aimed to translate, adapt, and evaluate the construct validity and reliability of the Indonesian version of the GNCS-N (I-GNCSN).
Method
Following COSMIN-informed procedures, the GNCS-N underwent forward-back translation, expert review, and pilot testing. A cross-sectional study using convenience sampling was conducted among inpatient nurses, and 255 complete responses were retained for confirmatory factor analysis (CFA). A second-order CFA was performed in LISREL 8.72 because the GNCS-N has an established theory-based multidimensional structure. Model fit was evaluated using chi-square, RMSEA, SRMR, CFI, NNFI, NFI, GFI, and AGFI. Convergent validity was assessed using average variance extracted (AVE), and internal consistency was assessed using composite reliability and Cronbach’s alpha.
Result
The final I-GNCSN retained the original seven dimensions and 40 items. The re-estimated second-order CFA showed acceptable-to-good fit: chi-square = 1529.99, df = 708, RMSEA = 0.068 (90% CI 0.063-0.072), SRMR = 0.059, CFI = 0.98, NNFI = 0.98, and NFI = 0.96. GFI (0.77) and AGFI (0.73) remained below ideal thresholds and are therefore interpreted cautiously. Most standardized factor loadings were moderate to high, although one item showed comparatively weak performance and should be re-examined in future studies. Internal consistency remained acceptable, whereas convergent validity was weaker for constructs with AVE values below 0.50.
Conclusion
The I-GNCSN demonstrates acceptable structural validity and good internal consistency for use among Indonesian inpatient nurses. However, some fit indices remained suboptimal and convergent validity was mixed for several constructs; therefore, findings should be interpreted cautiously. Further studies should examine temporal stability, test the instrument in broader clinical settings, and re-evaluate weaker items.
Keywords
Introduction
The quality of nursing care (QNC) is a fundamental indicator of health system performance, as it directly affects patient outcomes, satisfaction, and safety, as well as nurse well-being and job satisfaction. Conceptually, QNC is a multidimensional construct that encompasses both structural and process-related factors, including the competence of nursing staff, the care environment, and patient–nurse interactions (Juanamasta et al., 2021; Leino-Kilpi, 1991).
QNC has been assessed using patient outcomes, single global ratings, and multidimensional questionnaires. Outcome indicators (Aiken et al., 2002; McHugh et al., 2021) and single-item measures (Aiken et al., 2012, 2013; Sloane et al., 2018) are useful for monitoring performance, but they do not adequately capture the breadth of nursing care or provide strong psychometric evidence for the construct itself. For this reason, multidimensional instruments are more appropriate when the goal is to examine nurses’ perceptions of care quality and to evaluate specific domains that may require improvement.
Among available instruments, the Good Nursing Care Scale for Nurses (GNCS-N) was selected because it is theory-based, multidimensional, and has been used internationally in several language and cultural contexts (Gaalan et al., 2019; Gröndahl et al., 2019; Leino-Kilpi, 1992; Xue et al., 2023). Unlike single-item measures, it captures several domains of nursing care quality and therefore offers stronger potential for quality improvement and cross-context comparison. The GNCS-N was also preferred over locally developed alternatives because its conceptual framework and item structure have already been established and tested in prior studies.
Despite its relevance, no validated Indonesian version of the GNCS-N has been available. This represents an important gap because QNC assessment in Indonesia requires an instrument that is not only linguistically translated but also culturally adapted and psychometrically tested. The present study therefore aimed to translate, culturally adapt, and evaluate the structural validity and reliability of the Indonesian GNCS-N among inpatient nurses.
Literature Review
Quality nursing care has been measured in several ways other than patient perception, including indicators, questionnaires, and single questions. Several studies used indicators or patient outcomes, including Heede et al. (2007), Cimiotti et al. (2012), Carthon et al. (2012), Aiken et al. (2014), McHugh et al. (2021), and Aiken et al. (2021). Many other studies measured quality nursing care with a single question that treats quality nursing care as an outcome (Aiken et al., 2002; Aiken, Clarke, Sloane, et al., 2002; Aiken et al., 2012; Aiken et al., 2013; Brooks-Carthon et al., 2011; Nantsupawat et al., 2011; Sloane et al., 2018). However, a single item does not allow psychometric properties to be evaluated. In addition, two studies used self-administered questionnaires: Liu and Aungsuroch (2018) and Koy et al. (2020).
In contrast, Lucero et al. (2010) measured quality nursing care as a process, using unmet nursing care needs based on RNs’ reports of necessary nursing care left undone. Other studies used the “Good Nursing Care Scale” (Leino-Kilpi, 1992), including Gaalan et al. (2019), Gröndahl et al. (2019), and Istomina et al. (2012). In addition, Tsogbadrakh et al. (2021) developed a quality nursing care scale in Mongolia. Regardless of whether a process or an outcome perspective was taken, some studies developed instruments and tested their psychometric properties to fit local norms, cultures, and beliefs, constructing questionnaires appropriate to the conditions of their own countries (Koy et al., 2017; Liu et al., 2021; Tsogbadrakh et al., 2021).
The Good Nursing Care Scale was developed by Leino-Kilpi (1992). The main categories of good nursing care are the actor, characteristics of the actor, task-oriented and human-oriented activities, modes of activity, preconditions, and aims. After three modifications, the scale comprised 40 items loading onto seven dimensions: 1) Staff characteristics; 2) Task-oriented activities (Physical activities, Education activities, and Supporting initiative); 3) Human-oriented activities (Respect, Caring, and Encouragement); 4) Preconditions; 5) Progress of nursing care; 6) Environment (Physical environment and social environment); and 7) Cooperation with relatives. The GNCS-N has been widely used and translated into several languages. The most recent validity test showed that the GNCS-N provides evidence of unidimensionality with adequate goodness of fit to the Rasch model (Stolt et al., 2019). Person-separation validity was acceptable and misfit was reasonable, with a Rasch-equivalent Cronbach’s α of 0.88 (nurse data).
Objectives
This study aimed to examine whether the Indonesian version of the GNCS-N retained the original multidimensional structure and demonstrated acceptable reliability for use among Indonesian inpatient nurses.
Method
Design
This validation study used a cross-sectional observational design with psychometric analysis.
Sample
Sample size for CFA should be linked to both the number of items and the complexity of the hypothesized model. With 40 observed items and a second-order factor structure, a sample exceeding 200 participants was considered adequate for stable estimation, consistent with commonly cited CFA recommendations (Gunawan et al., 2021; Kyriazos, 2018). Participants were recruited using convenience sampling from inpatient units. Because this non-probability approach may introduce selection bias, representativeness should be interpreted cautiously. After screening the returned questionnaires for completeness, 255 responses were retained for the final CFA.
Inclusion/Exclusion Criteria
Data were collected in July–August 2022 from a convenience sample of professional nurses working in inpatient departments (IPDs). Nurses were eligible if they held a graduate degree or diploma in nursing from a reputable university and had worked in an IPD unit for at least one year. Nurses who were not actively employed at the time of data collection were excluded.
Translation
Translation and cultural adaptation followed COSMIN-informed principles (Phongphanngam & Lach, 2019): forward translation by two bilingual translators, reconciliation into one Indonesian version, expert review of semantic and conceptual equivalence, independent back-translation by a bilingual translator blinded to the original instrument, and final harmonization. The process was intended to preserve conceptual meaning while improving linguistic and contextual suitability for Indonesian nurses.
The expert panel comprised five members, including three nursing academics/clinicians with doctoral qualifications and extensive clinical experience, one bilingual nursing expert, and one authorized translator. Panel members were selected based on at least 10 years of professional experience, expertise in quality of nursing care or instrument development, and proficiency in Indonesian.
Cognitive Debriefing and Pilot Testing
A pilot test with 30 nurses was conducted to assess clarity and comprehensibility. Participants reviewed the translated items and instructions and provided feedback on wording and interpretation. No major comprehension problems were identified, so no substantive item revisions were required after pilot testing.
Data Collection Instruments
The GNCS-N comprises 40 items across seven dimensions: nursing staff characteristics (NSC; five items), care-related activities (CRA; six items), preconditions for care (PFC; five items), nursing environment (NE; five items), course of nursing process (CNP; six items), patients’ coping strategies (PCS; seven items), and collaboration with relatives (CWR; six items).
The I-GNCSN uses a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). Descriptive score categories were retained only to aid practical interpretation of mean scores in service evaluation; they were not used as evidence of validity.
The Indonesian version of the GNCS-N (I-GNCSN) yields a mean score on a scale from 1 to 5, which is interpreted at five levels. The interval width is calculated as (x̅ max - x̅ min)/k = (5 - 1)/5 = 0.80, where k is the number of levels. To ensure non-overlapping intervals, an increment of 0.01 was added to each succeeding lower limit (Polit & Lake, 2013). Mean scores are interpreted as follows: a very poor level of nursing care ranges from 1.00 to 1.80; a poor level from 1.81 to 2.60; a fair level from 2.61 to 3.40; a good level from 3.41 to 4.20; and a very good level from 4.21 to 5.00.
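The interval rule can be illustrated with a short sketch. This is not part of the original scoring materials: the function names `interval_bounds` and `interpret_mean` are hypothetical, and rounding mean scores to two decimals before classification is an assumption made so that no score falls between adjacent limits.

```python
# Minimal sketch of the five-level interpretation of I-GNCSN mean scores.
# Function names are hypothetical; rounding to two decimals is an assumption.

LABELS = ["very poor", "poor", "fair", "good", "very good"]


def interval_bounds(x_min=1.0, x_max=5.0, k=5, step=0.01):
    """Return k (lower, upper) limits of equal width (x_max - x_min) / k.

    Every lower limit after the first is raised by `step` so that
    adjacent intervals do not overlap.
    """
    width = (x_max - x_min) / k
    bounds = []
    for i in range(k):
        lower = x_min + i * width + (step if i > 0 else 0.0)
        upper = x_min + (i + 1) * width
        bounds.append((round(lower, 2), round(upper, 2)))
    return bounds


def interpret_mean(score):
    """Map a mean score between 1 and 5 to its descriptive level."""
    bounds = interval_bounds()
    score = round(score, 2)
    if not bounds[0][0] <= score <= bounds[-1][1]:
        raise ValueError("mean score must lie between 1.00 and 5.00")
    for label, (_, upper) in zip(LABELS, bounds):
        if score <= upper:
            return label


if __name__ == "__main__":
    for lower, upper in interval_bounds():
        print(f"{lower:.2f} to {upper:.2f}")
    print(interpret_mean(3.40))  # fair
    print(interpret_mean(3.41))  # good
```

As noted for the descriptive categories, such a mapping only aids practical interpretation of mean scores and carries no validity evidence of its own.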
Data Collection Procedure
After receiving an in-depth explanation of the study from a member of the research team, each participant was given the questionnaire and asked to complete it as thoroughly as possible. Participants deposited their completed questionnaires into a secure container located in each ward, which only the research team could access.
Statistical Analysis
A second-order CFA was performed in LISREL 8.72 to test whether the Indonesian version retained the original seven-dimension structure of the GNCS-N (Istomina et al., 2012; Leino-Kilpi, 1991, 1992). A second-order model was considered appropriate because the GNCS-N is theory-based and conceptualizes good nursing care as a higher-order construct represented by related first-order dimensions. Model fit was evaluated using multiple indices rather than any single cutoff, including chi-square, chi-square/df, RMSEA, SRMR, CFI, NNFI, NFI, GFI, and AGFI (Hair et al., 2018). Factor loadings of at least 0.30 and t-values greater than 1.96 were considered acceptable.
Convergent validity was assessed using average variance extracted (AVE). AVE values above 0.50 were considered supportive of convergent validity, although values below 0.50 were interpreted in conjunction with composite reliability, consistent with Fornell and Larcker (1981).
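For readers who wish to reproduce these quantities, the following sketch computes AVE and composite reliability from standardized loadings using the Fornell and Larcker (1981) formulas. The loadings are hypothetical values chosen for illustration, not estimates from this study, and the function names are assumptions.

```python
# Minimal sketch: AVE and composite reliability from standardized loadings.
# The loadings below are hypothetical and are NOT taken from the I-GNCSN results.


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    With standardized loadings, each error variance is 1 - loading^2.
    """
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


if __name__ == "__main__":
    example = [0.55, 0.60, 0.62, 0.58, 0.65]  # hypothetical five-item dimension
    print(f"AVE = {average_variance_extracted(example):.2f}")
    print(f"CR  = {composite_reliability(example):.2f}")
```

With these illustrative loadings the AVE is about 0.36 while the composite reliability is about 0.74, which is the pattern in which AVE falls below 0.50 yet reliability remains adequate.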
Internal consistency was evaluated using composite reliability and Cronbach’s alpha, with values above 0.70 considered acceptable (Hair et al., 2018; Plichta et al., 2013). Because convenience sampling and self-report may influence estimates, reliability findings were interpreted together with the broader psychometric evidence.
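As an illustration of the alpha coefficient, the sketch below applies the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), to a small set of hypothetical responses. The data and the function name `cronbach_alpha` are assumptions for demonstration only.

```python
# Minimal sketch of Cronbach's alpha for one dimension.
# The responses are hypothetical, not study data.
from statistics import variance


def cronbach_alpha(rows):
    """rows: list of respondents, each a list of item scores for one dimension."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    responses = [  # five hypothetical nurses, three items on a 1-5 scale
        [4, 5, 4],
        [3, 3, 2],
        [5, 5, 5],
        [2, 3, 3],
        [4, 4, 5],
    ]
    print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Sample variances (n − 1 in the denominator) are used for both the items and the total score, so the choice of denominator cancels in the ratio.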
Modification indices were consulted only after the initial model estimation and were applied conservatively. Residual correlations were added only within the same factor and only when item wording or content overlap provided a clear theoretical justification.
Ethical Considerations
The Declaration of Helsinki guided the ethical conduct of this study. The research protocol was approved by the National Research and Innovation Agency of the Republic of Indonesia (BRIN) and the hospital director granted permission for the study. Written informed consent was obtained from all participants before data collection. Participation was voluntary, no incentives were provided, and participants were free to withdraw at any time.
Results
Demographic of Respondents
The mean age of the nurses was 30 years (range 22 to 57 years), and 30 years was also the most common age. Most nurses were married (81.1%), and nearly three quarters were female (73.4%). In addition, 74.3% worked in non-intensive care settings. More than half of the respondents held a bachelor’s degree (57.2%), 42.3% held a diploma in nursing, and a master’s degree and a specialist qualification were each held by 0.3%.
CFA Assumption
Normality, linearity, and absence of multicollinearity were assessed before performing CFA. First, the critical ratio (CR) for skewness and for kurtosis was used to assess univariate normality. The CR for skewness ranged from -8.93 to -1.58, and the CR for kurtosis ranged from -13.24 to -2.78. Absolute CR values above 1.96 indicate a significant departure from normality at p < 0.05 (Hair et al., 2018); the kurtosis CRs exceeded this threshold throughout, and the skewness CRs did so at all but the upper end of their range. The normality assumption was therefore violated. However, Beauducel and Wittmann (2005) reported that small distortions from simple structure did not lead to misfit in RMSEA and SRMR. Second, a scatterplot matrix was used to verify linearity, and it showed linear relationships between all pairs of variables. Third, correlations among the seven latent variables ranged from 0.58 to 0.83, and correlations among the 40 items ranged from 0.15 to 0.80. Bivariate multicollinearity is present when the absolute value of any correlation between two variables exceeds 0.90 (Hair et al., 2018; Kline, 2015). Therefore, no multicollinearity was detected.
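The screening logic can be sketched as follows. This is an illustrative approximation rather than the procedure run in the statistical software: the critical ratios use the large-sample standard errors sqrt(6/n) for skewness and sqrt(24/n) for excess kurtosis, which is an assumption, and the function names and example data are hypothetical.

```python
# Minimal sketch of univariate normality and multicollinearity screening.
# Standard errors sqrt(6/n) and sqrt(24/n) are large-sample approximations
# (an assumption); names and data are hypothetical.
import math


def critical_ratios(values):
    """Return (skewness CR, kurtosis CR) for one item."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("item has zero variance")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness / math.sqrt(6.0 / n), excess_kurtosis / math.sqrt(24.0 / n)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.90):
    """Return variable pairs whose absolute correlation exceeds the threshold."""
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if abs(pearson_r(columns[first], columns[second])) > threshold:
                flagged.append((first, second))
    return flagged


if __name__ == "__main__":
    item = [5, 5, 4, 5, 3, 5, 4, 5, 2, 5]  # hypothetical negatively skewed item
    skew_cr, kurt_cr = critical_ratios(item)
    print(f"skewness CR = {skew_cr:.2f}, kurtosis CR = {kurt_cr:.2f}")
    print("violates |CR| <= 1.96:", abs(skew_cr) > 1.96 or abs(kurt_cr) > 1.96)
```

An empty list from `collinear_pairs` corresponds to the situation reported here, in which no correlation exceeded 0.90.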
Construct Validity
I-GNCSN Goodness of Fit Statistics (n = 255)
Note. Normal Theory Weighted Least Squares Chi-square = 1529.99; df = 708; p = 0.00. Although GFI and AGFI are presented for completeness, greater emphasis was placed on RMSEA, SRMR, NFI, NNFI, and CFI because GFI and AGFI are sensitive to sample size and model complexity.
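As a consistency check, the chi-square/df ratio and RMSEA can be recomputed from the reported chi-square, degrees of freedom, and sample size. The sketch assumes the common point-estimate formula RMSEA = sqrt(max(chi-square - df, 0) / (df × (n - 1))); software packages differ in whether n or n - 1 is used, so this is an approximation of, not a substitute for, the program output.

```python
# Minimal sketch: recomputing two fit indices from the reported statistics.
# The RMSEA formula with (n - 1) is an assumption; some programs use n instead.
import math


def chi_square_ratio(chi2, df):
    """Normed chi-square (chi-square divided by degrees of freedom)."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Point estimate of the root mean square error of approximation."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


if __name__ == "__main__":
    chi2, df, n = 1529.99, 708, 255  # values reported in the note above
    print(round(chi_square_ratio(chi2, df), 2))  # 2.16
    print(round(rmsea(chi2, df, n), 3))          # 0.068
```

Under this formula the recomputed RMSEA agrees with the reported value of 0.068.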

Second-order confirmatory factor analysis of I-GNCSN
I-GNCSN Factor Loading, Coefficient of Determination, Composite Reliability, Average Variance Extracted, and Cronbach Alpha (n = 255)
Note. B = standardized factor loading; R² = coefficient of determination; Error = 1 - R²; ρc = composite reliability (CR); ρv = average variance extracted (AVE). AVE values of 0.50 or higher are generally preferred. However, when AVE is below 0.50 but CR exceeds 0.60, convergent validity may still be considered acceptable.
One item (X5.3) showed comparatively weak performance (loading = 0.32; R-squared = 0.11), suggesting that it should be monitored in future validation studies rather than automatically removed from this adaptation study. Further study should consider modifying the item.
Construct Reliability and Average Variance Extracted
Internal consistency estimates remained acceptable across dimensions. Composite reliability values ranged from 0.768 to 0.896, indicating satisfactory construct reliability across all seven dimensions. Average variance extracted values ranged from 0.359 to 0.616 (Table 2). Three dimensions, namely nursing environment, patients’ coping strategies, and collaboration with relatives, achieved AVE values above 0.50. The remaining dimensions had AVE values below 0.50; however, because all corresponding composite reliability values exceeded 0.60, convergent validity was considered acceptable although not optimal. In line with Fornell and Larcker (1981), constructs with AVE values below 0.50 may still be considered acceptable when composite reliability remains adequate, but the evidence is not uniformly strong and should be confirmed in future studies.
Discussion
This study translated and validated the GNCS-N for Indonesian inpatient nurses using a structured cross-cultural adaptation process and a re-estimated second-order CFA. The modified analysis provided more stable support for the original seven-dimension structure than the earlier model. Fit indices such as RMSEA, SRMR, CFI, NNFI, and NFI supported acceptable-to-good model fit, although GFI and AGFI remained below ideal thresholds and should not be overlooked. Because GFI and AGFI are known to be sensitive to sample size and model complexity, greater interpretive weight was placed on RMSEA, SRMR, CFI, and NNFI when evaluating model fit (Nozawa et al., 2026; Sharma et al., 2005). Accordingly, the structural validity of the I-GNCSN is better described as acceptable rather than excellent.
The findings are broadly consistent with international validation studies showing that the GNCS-N can be adapted across settings while retaining its multidimensional structure. For example, the Persian version showed CFI = 0.97 and RMSEA = 0.039 (Esmalizadeh et al., 2024). Internal consistency in the present study (Cronbach’s α = 0.73–0.96) is comparable to values reported for the Persian (0.93) (Esmalizadeh et al., 2024), Mongolian (0.94) (Gaalan et al., 2019), and Finnish (0.71–0.97) (Istomina, 2011) versions, supporting the robustness of the scale across diverse cultural and healthcare systems.
High inter-construct correlations suggest that the dimensions of nursing care quality are closely related, which is theoretically plausible in a higher-order model. At the same time, this overlap means that discriminant validity should be examined further in future work using additional approaches such as the heterotrait-monotrait ratio and measurement invariance testing across hospital and ward types.
One notable finding concerns Item 24 (“Patient can stay longer to recover”), which displayed the highest residual variance. This may reflect Indonesia’s Ministry of Health regulations, which standardize patient length of stay based on diagnosis. In contrast, settings with more flexible discharge policies may rate this item differently. Cultural and policy-specific constraints may therefore influence the relevance of certain items, suggesting the need for contextual interpretation or potential modification in future iterations.
By providing a valid and reliable measure, the I-GNCSN offers nurse managers, hospital administrators, and policymakers a practical tool to evaluate multiple dimensions of nursing care quality. This supports not only quality monitoring but also targeted interventions to improve care processes and patient outcomes.
Strengths and Limitations
This study provides the first Indonesian cross-cultural adaptation and psychometric evaluation of the GNCS-N for inpatient nurses. Strengths include the use of COSMIN-informed procedures, a structured translation process, and theory-driven CFA. The reanalysis also resolved the earlier inadmissible parameter estimate, yielding a more defensible structural model.
Several limitations affect interpretation. Convenience sampling may have introduced selection bias and limits generalizability beyond inpatient settings similar to those included here. Self-report data may be affected by response bias, and temporal stability could not be evaluated because test-retest reliability was not assessed. In addition, formal CVI/CVR indices were not calculated during expert review, some model fit indices remained below ideal thresholds, and one item showed weak performance. These limitations suggest that the I-GNCSN should be considered a promising but still developing tool whose performance should be re-examined in broader and more representative samples.
Implications for Practice
In Indonesian healthcare settings, the I-GNCSN could be used by nurse managers and quality teams to identify specific dimensions of care requiring improvement, such as the care environment, the nursing process, or collaboration with relatives. For example, ward-level results could be reviewed alongside incident reports, staffing data, or patient complaints to guide targeted quality-improvement plans.
The instrument may also support hospital accreditation and internal quality monitoring by providing structured evidence on nurses’ perceptions of care quality. Repeated administration could help evaluate whether managerial interventions, staff development, or workflow changes are associated with improvement in specific GNCS-N dimensions. These applications are most appropriate in inpatient settings comparable to those represented in this study.
Conclusion
The Indonesian GNCS-N demonstrates acceptable structural validity and good internal consistency in a sample of Indonesian inpatient nurses. However, several constructs showed weaker convergent validity, some fit indices remained below ideal thresholds, and one item performed relatively weakly. The instrument therefore appears suitable for cautious use in inpatient settings comparable to the study sample, while further studies should assess test-retest reliability, strengthen content validity evidence, and re-evaluate the scale in broader clinical contexts.
Footnotes
Acknowledgments
The copyright of the original instrument is retained by Professor Helena Leino-Kilpi, RN, MEd, PhD (©Leino-Kilpi 2013).
Ethical Considerations
This study adhered to the ethical standards outlined in the Declaration of Helsinki. The study protocol was reviewed and approved by the National Research and Innovation Agency of the Republic of Indonesia (BRIN) with Ref. No: 176/KE.01/SK/8/2022. In addition, the hospital director granted permission to carry out the study. Each participant provided written informed consent before enrolling in the study. Participants were free to decline or to withdraw from the study at any time during data collection.
Author Contributions
IGJ, RP, YA, and MLF contributed to the conceptualization and study design. Data collection was performed by IGJ, ASI, BA, and JG. IGJ and RP conducted the data analysis. Study supervision was provided by RP, YA, and MLF. IGJ drafted the manuscript. Critical revisions for important intellectual content were undertaken by RP, YA, MLF, BA, and JG.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The Second Century Fund (C2F) Chulalongkorn University.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
