Abstract
BACKGROUND:
Service-learning (SL) has been widely implemented and has grown as a pedagogy in the rehabilitation professions. However, no assessment is available of the quality of evidence for the effectiveness of SL on student learning outcomes, or of the scope of SL activities related to the occupation of work in the rehabilitation professions.
OBJECTIVE:
This systematic review aimed to evaluate the methodological quality of SL studies and the scope of SL activities related to the occupation of work in the rehabilitation professions.
METHODS:
We performed a systematic on-line electronic literature search of nine bibliographic databases available through the university library system to identify peer-reviewed journal articles on SL provided by the tri-alliance of rehabilitation professional students, with the primary or secondary outcome on the evaluation of student SL experiences. Twenty-two SL articles using experimental design between 1995 and 2016 were extracted as they qualified for the methodological appraisal. Appraisal of each article was performed independently by four investigators using the Effective Public Health Practice Project Quality Assessment Tool for Quantitative Studies.
RESULTS:
In six of the 22 SL studies (27%), service provided by the rehabilitation professional students was related to the occupation of work (i.e., assessment, prevention of illness, injury, and disability, and intervention). There was a significant increase in the number (and percent) of SL studies related to the occupation of work compared to that of a previous systematic review (0%, P = 0.03, Fisher’s exact test). Results from the evaluation of the methodological quality of these 22 reviewed articles revealed that all received a global rating score of weak. The low methodological quality rating of the reviewed articles was mainly attributed to not controlling for confounders (22 articles), non-blinding (21), and using outcome measures which did not have evidence to support their validity (14). Inability to control for confounders was related to weak research design as more than 77% of the reviewed articles used quasi-experimental designs without a control group. Non-blinding was related to the self-report nature of the outcome measures.
CONCLUSIONS:
A significant increase in the number of SL studies related to the occupation of work was found, which may provide an indirect indication of an increase in the capacity to provide (work) rehabilitation services. However, the selected studies demonstrated a high risk of bias, which prevents firm conclusions from being drawn from the reported findings on SL in the curricula of the tri-alliance of rehabilitation professions.
Introduction
Service-learning (SL) is a form of experiential education in which students engage in organized, meaningful volunteer service activities through reciprocal collaborative partnerships with organizations in the community in response to their identified needs and concerns (e.g., health-related, and social and/or economic injustice issues) [1, 2]. The service activities have to meet students’ academic course learning objectives. In addition, students are involved in structured, ongoing, deliberate reflections to enrich their service provision experience [1, 2]. The three critical elements that constitute SL and partly differentiate it from other types of pedagogy are that it is experiential, reciprocal, and reflective [1, 3].
Since the launch of the Health Professions Schools in Service to the Nation (HPSISN) program in 1995, which fosters partnerships between universities and communities to improve health care services across the country [4], SL has been widely implemented and has grown as a pedagogy in a number of health professions/science academic curricula [5]. The model of SL has expanded and amalgamated with other pedagogies [6], which include international cultural immersion [7], inter-generational [8], inter-professional [9] and inter-disciplinary education [10]. Because SL includes the three critical elements (i.e., it is experiential, reciprocal, and reflective), health professional students are provided with opportunities for civic engagement (i.e., through the experiential learning) [11, 12]. Through self-reflection, students gain further understanding of course content, a broader appreciation of the discipline they are studying, the distinctive perspective of a particular culture and community, and an enhanced sense of civic responsibility [13]. Community engagement and reciprocal knowledge exchange between students and community members prepare students to practice in the rapidly changing health care environment [14].
Medicine and nursing have been the primary leaders of SL to date, as illustrated in the systematic review conducted by McMenamin and associates [12]. Of the 37 discipline-specific SL articles that McMenamin and associates evaluated, nineteen (51%) were from nursing and medicine, ten (27%) were from allied health/rehabilitation professions, four (11%) were from social work, and four (11%) were from pharmacy [12]. This leaves a gap in the understanding of SL pedagogy in allied health/rehabilitation professions. The tri-alliance of rehabilitation professions consists of three professions (audiology/speech-language pathology, occupational therapy, and physical therapy), which have often joined together to address policy issues that affect all three professions. These three professions work in similar service delivery and education models and use comparable theoretical and practical approaches in education, which differ from medical and nursing academic curricula [15–17]. This suggests that findings related to rehabilitation professions would be masked in systematic reviews if all the studies from disparate professions (i.e., medicine, nursing, allied health/rehabilitation, and others) were analyzed together.
To date, no systematic review has been conducted on SL in the tri-alliance of rehabilitation professions alone. Several systematic reviews in the field of medical and health professions have been conducted which aimed to assess the evidence for the effectiveness of SL related to student learning outcomes [5, 12]. Of these systematic reviews, one did include several articles from rehabilitation professions, but the timeframe of the selected articles was up to the year of 2011, which was at least five years ago [12]. In addition, this review only touched upon study designs without a systematic evaluation of the methodological quality [12]. Therefore, it is important and timely to conduct a systematic review which evaluates the rigor of methodological quality of SL studies provided by the tri-alliance of rehabilitation professional students so as to generate a more accurate and comprehensive appraisal of the effectiveness of this educational strategy.
In addition, none of the volunteer service activities provided by the tri-alliance of rehabilitation professional students in the SL studies included in McMenamin et al.’s systematic review was related to the occupation of work, with 75% of them in the scope of intergeneration or gerontology and geriatrics [12]. One of the objectives in the recent World Health Organization (WHO) meeting on “Rehabilitation 2030: A Call for Action” was to increase the capacity to provide (work) rehabilitation [18]. Therefore, the secondary aim of this study is to evaluate whether there are any increases in volunteer service activities related to the occupation of work that are provided by the tri-alliance of rehabilitation professional students in SL since the previous systematic review conducted by McMenamin and associates.
Methods
Data sources and literature search strategies
This systematic review was conducted according to the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses [19]. The process began with a systematic on-line electronic literature search of the following nine bibliographic databases available through the university library system: Academic Search Premier, Cumulative Index to Nursing and Allied Health Literature (CINAHL), Education Full Text, Education Resources Information Center (ERIC), Embase, Health Source: Nursing/Academic Edition, PsycINFO, PubMed, and Scopus. Articles published from the inception of each database until December 2016 were included. Queries to identify potentially relevant publications on SL for students from occupational therapy, physical therapy, and audiology / speech-language pathology were based on Boolean combinations of the following search terms in either the title or the abstract of the paper: (“service learning” OR “community-based education” OR “community-based learning”) AND “student*” AND (“therap*” OR “physiotherap*” OR “speech language” OR “audiolog*” OR “communication disorder*” OR “rehab*” OR “allied health” OR “health profession*” OR “interprofession*”). Reference lists of the retrieved review articles related to SL from health professions were also searched.
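The Boolean logic of the search can be illustrated with a short Python sketch. This is only an approximation written for illustration: the function and variable names are hypothetical, the trailing asterisk is emulated with a regular expression, and real bibliographic databases apply their own rules for field restriction, truncation, and hyphenation (for example, this sketch would not match "service-learning" or "speech-language" written with a hyphen).

```python
import re

# Term groups copied from the search strategy; a trailing * is a truncation wildcard.
PEDAGOGY = ["service learning", "community-based education", "community-based learning"]
POPULATION = ["student*"]
PROFESSION = ["therap*", "physiotherap*", "speech language", "audiolog*",
              "communication disorder*", "rehab*", "allied health",
              "health profession*", "interprofession*"]


def term_to_regex(term):
    """Build a case-insensitive pattern; a trailing '*' matches any word ending."""
    body = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else r"\b"
    return re.compile(r"\b" + body + suffix, re.IGNORECASE)


def any_match(terms, text):
    """True if at least one term in the group occurs in the text (the OR part)."""
    return any(term_to_regex(term).search(text) for term in terms)


def matches_query(title_or_abstract):
    """Apply (pedagogy terms) AND student* AND (profession terms) to one text field."""
    return (any_match(PEDAGOGY, title_or_abstract)
            and any_match(POPULATION, title_or_abstract)
            and any_match(PROFESSION, title_or_abstract))


# Hypothetical titles, used only to exercise the logic.
print(matches_query("Service learning outcomes in physical therapy students"))
print(matches_query("Service learning outcomes in nursing students"))
```

The first hypothetical title satisfies all three groups, whereas the second fails the profession group and would not be retrieved by this query.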
Study eligibility criteria
Articles were included if they met the following criteria: (1) the SL experience was coordinated through an academic course or program and linked to specific academic content through a credit-bearing course, and the course content included a student reflection component of SL activities; (2) the primary or secondary outcome was the evaluation of student SL experiences; and (3) the majority of the study sample (i.e., >50%) included students in one or more of the tri-alliance of rehabilitation professional programs.
Articles were excluded if they (1) were editorials, letters, commentaries, case reports, or review articles; (2) employed primarily qualitative research methods in data collection and analysis; or (3) employed a one-group posttest-only research design (i.e., reported only student evaluation information at the conclusion of the SL project without pre-test assessment data or a non-SL control group). In addition, publications were limited to peer-reviewed journal articles (excluding dissertations, book chapters, and conference proceedings) written in English, with full text available.
Study selection and data extraction
A two-stage screening process of the retrieved articles was conducted. The first stage involved reviewing the title and abstract of all the retrieved articles to determine if they met the eligibility criteria. When the abstract provided sufficient information indicating that the article appeared to meet the eligibility criteria, the full text of the article was downloaded for verification, which was the second stage. Two authors (HKY and LKV) independently conducted the review in each of the two stages to reduce the chance of missing any retrieved article that met the eligibility criteria. The bibliographic software EndNote X8 was used to manage the retrieved papers, including removal of duplicates. The flow diagram in Fig. 1 describes the process that we used to select articles for this study, and the results of the literature search and data extraction.

Fig. 1. Flow diagram of the selection process and study search results.
Once the final set of eligible articles was identified for appraisal, all four authors independently evaluated the methodological quality of each article using the Effective Public Health Practice Project (EPHPP) Quality Assessment Tool for Quantitative Studies, a standardized critical appraisal instrument for use in systematic literature reviews with strong evidence of inter-rater agreement in rating the methodological quality of studies [20]. The tool consists of eight components: A. selection bias; B. study design; C. confounders; D. blinding; E. data collection methods; F. withdrawal and drop-outs; G. intervention integrity; and H. analyses appropriate to question. Components A to F were rated using a three-point scale (1 = strong, 2 = moderate, and 3 = weak), whereas components G and H were rated qualitatively (yes, no, can’t tell). From the scores of the first six (i.e., A to F) components, each article was assigned a global rating of strong (i.e., no weak ratings from A to F components), moderate (one weak rating), or weak (two or more weak ratings) in accordance with the EPHPP protocol [21]. Additionally, as per the protocol, any disagreements related to the assigned quality rating of the articles were discussed among authors until consensus was reached.
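The global rating rule described above can be written as a minimal Python sketch. The function name, the dictionary layout, and the example ratings are hypothetical and serve only to illustrate the counting rule; they are not part of the EPHPP tool itself.

```python
# Components A to F of the EPHPP tool, each rated 1 (strong), 2 (moderate), or 3 (weak).
COMPONENTS = ("A", "B", "C", "D", "E", "F")
WEAK = 3


def ephpp_global_rating(ratings):
    """Return the global rating from the number of weak ratings in components A to F."""
    missing = [c for c in COMPONENTS if c not in ratings]
    if missing:
        raise ValueError(f"missing component ratings: {missing}")
    weak_count = sum(1 for c in COMPONENTS if ratings[c] == WEAK)
    if weak_count == 0:
        return "strong"      # no weak ratings
    if weak_count == 1:
        return "moderate"    # exactly one weak rating
    return "weak"            # two or more weak ratings


# Hypothetical article: weak on confounders (C) and blinding (D).
example = {"A": 2, "B": 2, "C": 3, "D": 3, "E": 1, "F": 2}
print(ephpp_global_rating(example))  # prints "weak"
```

Because two weak component ratings are enough for a global rating of weak, an article that is weak on both confounders and blinding receives a weak global rating regardless of its other component scores.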
Fisher’s exact test was used to test whether the proportion of SL reviewed articles with certain characteristics for two groups was equal or not. All hypothesis tests were based on a two-sided alpha of 0.05. Data analyses were conducted using IBM SPSS Statistics for Windows, version 22 (www.spss.com).
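For readers who wish to see how such a comparison of proportions works, the following is a minimal pure-Python sketch of a two-sided Fisher's exact test. It is not the SPSS procedure used in this review; the function name is hypothetical, and the two-sided p-value is computed under the common convention of summing the probabilities of all tables that are no more likely than the observed one. The example uses the counts reported in this review for studies related to the occupation of work (6 of 22 in the current review versus 0 of 16 in the previous review).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def ways(k):
        # Number of ways to place k of the col1 "events" in the first row.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = ways(a)
    # Integer counts are compared, which avoids floating-point ties.
    extreme = sum(ways(k) for k in range(lowest, highest + 1) if ways(k) <= observed)
    return extreme / comb(total, col1)


# Rows: current review, previous review. Columns: work-related, not work-related.
p_work = fisher_exact_two_sided(6, 16, 0, 16)
print(round(p_work, 2))  # prints 0.03
```

Applied to these counts, the sketch yields a two-sided p-value of about 0.03, which agrees with the value reported for this comparison.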
Results
Twenty-two articles, spanning more than 20 years from 1995 to 2016, qualified for the methodological appraisal. Sixteen of them (73%) were published within the last 10 years (i.e., between 2006 and 2016). Table 1 presents summarized data from the reviewed articles in the following areas: study design, professional discipline of student participant, sample size, participation rate, type of volunteer services, supervision of student services, service frequency and duration, outcome measurement instrument, and findings. In six of the 22 SL studies, the type of volunteer service activities provided by the rehabilitation professional students was related to the occupation of work, with four related to ergonomics, health and wellness programs, prevention of work-related musculoskeletal disorders, and treatment of work-related injury [22–25]; one related to Americans with Disabilities Act (ADA) compliance and access, and vocational rehabilitation services [26]; and one related to ergonomics, hearing aids and workers [27]. There was a significant increase in the number (percent) of SL articles studying the impact of work rehabilitation volunteer service activities provided by rehabilitation professional students when compared to that of a previous systematic review conducted by McMenamin and associates (p = 0.03).
Table 1. Description of the summarized data from the reviewed articles (n = 22)
Note. ADA = Americans with Disabilities Act; ASHA = American Speech-Language-Hearing Association; AUD = audiology; CCAI = Cross Cultural Adaptability Inventory; CSSES = Community Service Self-Efficacy Scale; FAQ = Palmore’s Facts on Aging Quiz I; GAA = General Ability Assessment; IAPCC-SV = Inventory for Assessing the Process of Cultural Competence among Healthcare Professions-Student Version; IEPS = Interdisciplinary Education Perception Scale; Kogan’s Scale = Kogan’s Attitudes toward Old People Scale; MSASLHP = Methods and Strategies for Assessing Service-Learning in the Health Professions questionnaire; NGSES = New General Self-Efficacy Scale; OT = occupational therapy; OTSCMHPS = Occupational Therapy Student Comfort with the Mental Health Population Scale; PGI = Personal Growth Initiative; PT CPI = Physical Therapist Clinical Performance Instrument for Students: Version 2006; PPTCV SA = Professionalism in Physical Therapy: Core Values Self-Assessment; PT = physical therapy; SAAS = Survey of Attitudes on Aging Scale; SDTLI = Student Development Task and Lifestyle Inventory; SL = service-learning; SLP = speech-language pathology; SRM-SF = Sociomoral Reflection Measure-Short Form; WGCTA = Watson-Glaser Critical Thinking Appraisal. (a) Percentage of students in a course who participated in SL. (b) One group pre + post = one-group pretest-posttest design; Two group pre + post = nonequivalent control group pretest-posttest design; Static-group comparison = nonequivalent control group with posttest-only design. ∓ Included a static-group comparison in post-hoc analysis. § The original design was a two group pre + post, and the research question was to investigate whether the use of structured, written reflections affects personal understanding and community self-efficacy more than the use of non-structured reflections among occupational therapy students. † About 12–18% of DPT students were selected to participate in the international SL experience.
None of the reviewed studies included randomization for group allocation (i.e., all utilized a quasi-experimental design). Seventeen articles (77%) employed a one-group pretest-posttest design, three employed a nonequivalent control group with posttest-only design, and two employed a nonequivalent control group pretest-posttest design. Of the two articles employing a nonequivalent control group pretest-posttest design, which provides stronger evidence to indicate that the observed learning outcome may be causally related to the SL, one was published between 1995 and 2005, and the other between 2006 and 2016. However, there was no significant difference (p = 0.48) in the proportion of reviewed articles utilizing this quasi-experimental design between the two decade periods (1995–2005 versus 2006–2016).
Results from the evaluation of the methodological quality of the 22 reviewed articles on SL using the EPHPP revealed that all received a global rating score of weak. The rating distribution of the six (A to F) components for the 22 reviewed articles is shown in Table 2, and the individual rating scores of all eight (A to H) components for each of the 22 articles are presented in the Appendix.
Table 2. Rating distribution of the six components and global rating of the reviewed articles (n = 22) using the Effective Public Health Practice Project (EPHPP) Quality Assessment Tool for Quantitative Studies
The low methodological quality rating of the reviewed articles was mainly attributed to not controlling for confounders, non-blinding, and using outcome measures that did not have evidence to support their validity. Inability to control for confounders (i.e., incomparability of baseline groups) was related to weak research design as more than 77% of the reviewed articles used quasi-experimental design without a control group (i.e., one group, pretest-posttest design). Lack of intervention integrity and inappropriate data analysis strategies were additional areas of concern that also contributed to low methodological quality of the reviewed studies.
Issues related to research design, confounders and blinding
Findings from the majority of the reviewed studies, especially those employing a one group, pretest-posttest design, demonstrated significant improvement in scores of student learning outcome measures after students participated in SL. However, causal inference about the effect of SL on student outcome measures was weak due to various confounders that bias the conclusion of the findings. Events, in addition to SL, which occurred between the pretest and posttest period could affect student outcome measures. These included course assignments and classroom activities related to the SL experience, and interactions with the same type of clients as those in the SL activities during internships such as level-I fieldwork or practicum experiences [28, 29]. Only a couple of studies [30, 31] specifically stated that none of the content in concurrent courses would affect the outcome measures. Other confounders that could have produced an inflated favorable outcome at posttest include maturation of the students, regression to the mean, and social desirability bias when students, who were not blind to the study purpose and design, completed the self-report outcome measure. Only one reviewed study [22] had one of the two student learning outcome assessments completed by clinical educators. More such standardized assessments should be conducted by masked evaluators. Also, only a few reviewed studies [26, 33] included factual/cognitive knowledge assessments related to SL experiences with scores counted towards student grades as the outcome measure. Such assessments would reduce the social desirability bias.
With the nonequivalent control group with posttest-only design (or static-group comparison design), students allotted to the SL and non-SL groups may have different scores prior to engaging in the SL (i.e., allocation bias, a systematic difference between student participants allocated to the SL or the non-SL group). Three reviewed studies employed static-group comparison designs and did not collect any relevant demographic variables which could have been used to control for potential allocation bias when analyzing the data [22, 31]. It does not appear that the authors of any of the 22 reviewed studies collected demographic information directly from student participants. Authors reported demographic information only for the students who participated in SL, which could be extracted from the class roster database, but not for the student participants who completed the post-SL outcome measure.
Of the two studies employing a nonequivalent control group pretest-posttest design [32, 34], one did not state whether or not the pretest scores and any demographic characteristics of the student participants were equivalent between the SL and the non-SL groups [32]. In Beling’s study [32], the author used a preexisting division of the class into lab sections to assign students into the SL and the non-SL groups, which may reduce allocation bias. However, it is unclear how the students initially were assigned to the lab sections and how students in different lab sections were allotted to the SL or the non-SL group. Students in the non-SL group could exhibit resentful demoralization when completing the self-report outcome measure (i.e., posttest), and contamination bias through discussions among students between groups was unavoidable. In addition, differences in the number of student withdrawals between groups (i.e., differential attrition rate) in both studies could bias the findings.
Issues related to credibility of data collection tools
Only eight studies used instruments that had evidence to support their reliability and validity as the outcome measure; several studies used more than one instrument to measure the outcome, with one instrument having evidence to support validity, but not the other [22, 36]. In one study [37], the authors mistakenly stated that the instrument they used had evidence to support validity. Even though some of the instruments employed in the reviewed studies had been used in previous work, there was insufficient evidence to support their validity; the available evidence consisted of face validity and limited content validity only. Authors tended to develop their own instruments for the SL study that they undertook without going through the process of validation. The paucity of appropriate psychometrically sound instruments that the investigators could draw from prevented accurate measurement of the impact of SL on rehabilitation professional students’ perceptions related to social and cultural dimensions of health [38].
Issues related to sample representativeness
By definition, SL has to link to an academic course, and most of the courses in the reviewed studies were compulsory; therefore, selection bias (i.e., a systematic difference between students who were selected to participate in SL and those who were not) may not be a significant issue [39]. However, ten reviewed studies (45%) provided very limited or no description of the student samples’ demographic characteristics [3, 41], and not all students who engaged in SL completed the pretest and posttest of the outcome measures (i.e., participated in the study); thus, the extent to which the sample was representative of the population was not clear [42]. Given that all reviewed studies used purposive sampling and the sample size was relatively small (i.e., <100), generalizability of the findings was limited.
Issues related to intervention integrity
In eight reviewed studies (36%), students completed the SL experience without direct supervision from a faculty member or community mentor [26, 44]. For some of the interactions between students and recipients living in the community, such as obtaining oral histories and social companionship, direct supervision by faculty or representatives from community partners did not seem feasible [31, 43]. However, none of the eight reviewed studies explicitly stated how the faculty verified the authenticity of the service (including frequency and duration) that the students provided.
Issues related to data analysis
Inappropriate use of statistical procedures for analyzing data was common in the reviewed studies, ranging from not checking the statistical test assumptions before conducting parametric statistics to not controlling for the family-wise error rate. In one study [22], the authors combined data from students who participated in SL and those who indicated interest in participation but did not participate when analyzing the data. Such a combination of responses violated one of the three basic premises of SL, namely that it is experiential education. In addition, there was no control for the family-wise error rate when the authors conducted more than 170 unplanned, post hoc analyses [22], which undoubtedly would increase the probability of detecting significant findings simply by chance. In another study [25], there was no identifier in the pre- and post-SL surveys to link individuals’ scores on the outcome measures between the two surveys, which prevents the use of paired inferential statistical analysis.
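The effect of uncontrolled multiple testing can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not taken from the reviewed study: the tests are treated as independent, each is run at an alpha of 0.05, and 170 is used as a lower bound for the "more than 170" analyses. The Bonferroni correction is shown as one common remedy, not as the only appropriate one, and the function names are hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / num_tests


num_tests = 170  # lower bound for "more than 170" post hoc analyses
print(f"{familywise_error_rate(num_tests):.4f}")
print(f"{bonferroni_alpha(num_tests):.6f}")
```

Under these assumptions the chance of at least one spurious significant result is roughly 0.9998, and a Bonferroni-adjusted threshold would be about 0.0003 per test, which shows why unadjusted findings from so many unplanned comparisons are difficult to interpret.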
None of the 16 reviewed studies in McMenamin et al.’s review (which included students from multiple health care professions such as nursing, pharmacy, medicine, social work, and the tri-alliance of rehabilitation professions) was related to the scope of the occupation of work. In the majority of these studies (12/16 or 75%), the service activities provided by the students were in the scope of intergeneration, and gerontology and geriatrics. Only six of these 16 studies met the eligibility criteria to be included in the current review. On the other hand, 27% (6/22) of the SL studies in the current review were related to the scope of the occupation of work. Such a large increase in the emphasis of SL on this type of service is encouraging, and should be continued in order to meet the objective of increasing the capacity to provide (work) rehabilitation services proposed in the WHO meeting “Rehabilitation 2030: A Call for Action”. This review provides evidence of an advance in the knowledge of SL studies, especially for student activities related to the occupation of work in the rehabilitation professions, and supports practice in the occupation of work through SL among rehabilitation professions students.
Limitations
We acknowledge several limitations in this review. First, the search terms that we used may have been too restrictive; as a result, we may have missed some appropriate publications related to SL provided by the tri-alliance of rehabilitation professional students. Such publications may have been indexed in the databases under alternative terms. Second, even though two authors conducted the screening independently, it is still likely that we missed some appropriate publications, as we did not screen the full text of all the retrieved articles. Third, while the EPHPP includes a dictionary to clarify questions related to the components and guide assessors in making appropriate rating judgements [20], ambiguity in interpreting the components and ratings does exist when rating studies employing a quasi-experimental design with self-report assessment as the outcome measure. Also, we based our ratings solely on information published in the articles and did not contact authors to request clarifications. Therefore, some elements in the components that we rated may have been misinterpreted.
Due to the small sample size, further sub-analyses involving the six reviewed SL studies related to the occupation of work (as the student volunteer service activities) were not performed, as findings from such analyses are not likely to be valid. Finally, meta-analysis was considered, but we decided that it was not appropriate due to the diversity of constructs used in the outcome measures employed in the reviewed studies. Pooling the available data of the 22 studies using meta-analysis may generate findings with poor statistical conclusion validity; therefore, the results were presented in narrative form.
Future studies
In order to show stronger evidence on the potential benefits of SL for rehabilitation professional students, studies with more robust research designs and rigorous methodology are needed. Future studies should (a) consider implementation of random allocation of SL and non-SL groups alternating between student cohorts or academic years, (b) collect relevant student information to control for confounders, (c) develop more psychometrically sound instruments to measure SL outcomes, which are necessary to permit valid comparisons between studies and to provide evidence on the effectiveness of SL, (d) use psychometrically sound assessments, preferably objective, that allow blinded evaluators to assess student learning outcomes, and (e) collect the same set of student learning outcome data from several programs that implement similar SL activities across the country and over several years to increase the representativeness of the sample and generalizability of the findings.
Conflict of interest
There is no potential conflict of interest relevant to this article, nor was there any external financial support.
