Abstract
This systematic review and meta-analysis investigated the unique effects of credible sources on performed behavior, rather than behavioral antecedents (e.g., attitudes), and the contexts in which credible sources are most effective. Six databases were searched to June 2024, yielding 40 effect sizes (N = 7,995; 58.42% female; mean age = 17.75 years; 75.81% White) from 34 papers. A random effects model indicated a small positive effect, d = 0.14 (95% CI [0.04, 0.23]). Moderator analyses showed significant positive effects when the credible source had a medical professional qualification, when the source communicated with participants in person, when the intervention used verbal or combined verbal and written messages, when the behavior occurred once, and when the behavior immediately followed the intervention. Because of the small effect, the costs associated with generating credible sources should be balanced against their effectiveness.
Introduction
Credible source is a frequently used behavior change technique (Hall et al., 2021; Heron et al., 2016; Presseau et al., 2015; Roberts et al., 2017; Stockwell et al., 2019). However, the unique effects of credible source on behavior – defined as observable actions performed by participants as a result of the intervention – are not known, as most investigations are primarily in the context of behavioral antecedents, such as attitudes. It is also not clear under what circumstances credible sources are most impactful on behavior. The aim of the present systematic review and meta-analysis was to: (a) quantify the unique effects of credible source on behavior, and (b) understand the contexts in which credible source is most effective.
The behavior change technique taxonomy (version 1) defines credible source as “present verbal or visual communication from a credible source in favor of or against the behavior” (Michie et al., 2013, p. 11). Similarly, the newer behavior change ontology defines credible source as “presents information from a credible person or organization to influence the behavior” (Human Behaviour Change Project, 2024; Marques et al., 2023). Source credibility is defined as a perceptual judgment, from an information receiver, of the message source’s expertise and trustworthiness (Hovland et al., 1953). Hovland et al. (1953) defined expertise as the extent to which the source of the information is perceived to be competent in providing valid assertions within the relevant domain. Trustworthiness refers to the extent of the audience’s confidence in the source’s intent to provide information that the source believes to be most valid (Hovland et al., 1953). Sources who are perceived as having higher expertise and trustworthiness within the relevant domain are perceived as more credible than sources that are perceived as lower in these characteristics (Pornpitakpan, 2004).
The effects of source credibility have been well established in relation to the antecedents of behavior. Theories of persuasion, such as the Elaboration Likelihood Model, have suggested the avenues by which source credibility may best influence persuasion through both peripheral and central processing, discussing both mediators and moderators of these effects (Petty & Cacioppo, 1986; Petty & Wegener, 1998). Narrative reviews of the source credibility effect generally support the notion that highly credible sources are more effective than lower credibility sources, particularly when assessed in terms of the recipient’s attitudes, opinions, intentions, perceptions, and hypothetical decision-making (Pornpitakpan, 2004; Sternthal et al., 1978b). However, it remains unclear how these effects translate to behavior, because the evidence base has primarily investigated the antecedents of behavior. Wilson and Sherrell’s (1993) review showed that only 6% of included studies investigated the effect on behavior, and the magnitude of the effect on the behavioral outcomes reported in that review is unclear. Moreover, the review included the effect of other source characteristics, including attractiveness and similarity, as opposed to credibility alone. It is therefore unclear which effects were due to credible source and which were due to other source factors such as perceived similarity to the source (i.e., attitudinal, demographic, and experiential similarity between the receiver of a message and its source). One of the aims of the present systematic review and meta-analysis is to focus solely on the effects of highly credible sources (i.e., sources who are both expert and trustworthy within the relevant domain) on observable behavior.
A second limitation of previous reviews of credible sources is that the included interventions have typically combined other behavior change techniques with credible sources, thus confounding the measurement of the unique effect on behavior. For example, Tang et al. (2021) systematically reviewed social norms interventions, including the credible source behavior change technique. Overall, the review suggested that a credible source was more effective than control conditions, with a small effect. However, the interventions included in that review often differed from the control conditions in more than the credible source. As suggested by Armitage et al. (2020), to develop evidence-based, effective interventions, behavior change techniques must be tested in isolation to understand their unique effects. Thus, a second aim of the present systematic review and meta-analysis is to examine the unique effects of credible sources on behavior to understand: (a) the extent of the unique effect of credible sources on behavior, (b) the populations for which credible sources are effective, and (c) the circumstances under which credible sources are effective (e.g., intervention characteristics). From this, recommendations to improve the implementation of credible sources to change behavior can be suggested. This review will also identify directions for future research.
Methods
Selection of Studies and Inclusion Criteria
Electronic databases included: Cochrane Central Register of Controlled Trials, PubMed, Web of Science Core Collection, PsycINFO, and The Allied and Complementary Medicine Database through Ovid and ProQuest ASSIA (see Supplemental Material 1 for full search strategy). Four search filters, relating to source credibility, behavior, included study designs, and excluded designs, were used to retrieve records from inception until May 2022. A final search was conducted in June 2024. Other methods included: manual ascendancy (i.e., studies that have cited included studies) and descendancy (i.e., reference lists of included studies) searching of references; searching previous reviews (Pornpitakpan, 2004; Sternthal et al., 1978b; Wilson & Sherrell, 1993); and searching the publication lists and contacting the first authors of the included studies.
The review included studies that tested the unique effect of credible sources on behavior. The credible source condition had to be compared to a comparison condition that was identical, minus the use of credible sources (i.e., credible source + message vs. message only) or using a less expert and/or less trustworthy credible source. Comparisons were excluded if the conditions varied other factors such as the behavior change techniques, message, and/or methods of delivery. For example, Dell (1973) was excluded as it compared an expert interviewer to a lower credibility source, but the quality and delivery of the intervention varied.
Highly credible sources were defined by both their expertise and trustworthiness within the persuasive communication domain. To be included in the review, high credibility sources needed to be high in both perceived expertise and trustworthiness. Theories of persuasion broadly operationalize expertise as the perceived competence of the source in providing valid assertions within the relevant domain. Expertise was evaluated in relation to the source’s description, for example, their expert status, professional qualifications, experience, and training. Trustworthiness is broadly operationalized as the audience’s confidence in the source’s intent to provide information that the source believes to be most valid (Hovland et al., 1953; Pornpitakpan, 2004; Sternthal et al., 1978b). Trustworthiness was evaluated regarding the likelihood that they are ethical, honorable, honest, and likely to provide valid information. Conceptualizations of these constructs were informed by theories (Hovland et al., 1953), existing measures (McCroskey & Teven, 1999), and literature reviews (Pornpitakpan, 2004).
Regarding the selection of the credible source and comparison source, where successful manipulation checks of credibility were available, the most and least credible sources were used. Where a manipulation check only checked one of expertise or trustworthiness, the face validity of the other construct was checked to determine its appropriateness. Where a manipulation check failed (e.g., Primason, 1982), a discussion took place to decide inclusion (e.g., source regarded as highly credible in other studies) or exclusion (e.g., low face validity) of the paper. Where no manipulation checks were provided, sources with high face validity as credible were selected as the credible source. For example, a qualified psychologist was selected as the credible source compared to a lower credibility non-qualified psychologist (Crisci & Kassinove, 1973). Each reviewer (JH and DH) made judgments of credibility independently using McCroskey and Teven’s (1999) expertise and trustworthiness scales as guidance. Discrepancies were resolved through consultation with a third reviewer (TE) until agreement was reached.
The credible source must provide explicit visual or verbal support for or against the target behavior (Michie et al., 2013). For example, Weick et al. (1973) established expertise through the source’s musical excellence, but as the source did not promote the target behavior of limiting errors, which was the dependent variable, this was excluded. Studies needed an extractable measure of behavior targeted by the credible source. Behavior was defined as the performance of any observable behavioral outcome that resulted from the intervention (e.g., brushing teeth, signing a petition, compliance with a request). Studies that investigated only the antecedents of behavior (e.g., attitudes, intentions, judgments, and opinions), hypothetical scenarios, or measures that only provided a combined measurement of both antecedents and behavior were excluded. There were no limitations in terms of the target behavior.
Lastly, studies were required to randomize participants to conditions (i.e., individually or cluster). Where the use of randomization was not clear, the corresponding author was emailed. If the author was not contactable or did not provide a sufficient response, the study was excluded.
All screening was conducted independently by two authors (JH, DH; MSc Psychology). Inter-coder reliabilities at each stage of screening were established via kappa (κ) coefficients. All papers where at least one screener suggested full text were sent to the full text stage. JH completed 100% and DH completed 50% of the title and abstract screening. For full text screening, JH screened 100% and DH screened 37% of the papers sent for full text screening. There were no disagreements at this stage.
Figure 1 (see Supplemental Material 2) represents the flow of studies throughout the review (Page et al., 2021). The search revealed k = 25,525 potentially relevant articles. Following the removal of duplicates (k = 3,723), k = 21,802 articles were screened for title and abstract eligibility, and k = 20,456 were removed at this stage. This left k = 1,346 for full text review. The main reasons for exclusion at the abstract stage included: did not investigate the unique effect of credible sources, did not measure behavior, or was not a randomized controlled trial (RCT). A further k = 1,326 were excluded at full text review. Reasons for exclusion included: conditions differed beyond credible sources (k = 430), did not investigate the unique effect of credible sources (k = 189), did not measure behavior (k = 372), or was not an RCT (k = 335). The remaining articles (k = 20) met the inclusion criteria for the analysis, yielding k = 25 comparisons. Searches of relevant textbooks, reviews investigating credible sources, and reference lists of included articles yielded an additional 14 articles, with 15 comparisons. Overall, this yielded 34 articles, with 36 studies and 40 comparisons.

Figure 1. Flow diagram of papers included in the meta-analysis.
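The counts reported for the flow of studies can be checked for internal consistency with simple arithmetic. The following is a minimal sketch in Python; the function and variable names are illustrative and are not part of the review's materials, and the numbers are those reported in the text.

```python
def check_flow():
    """Recompute the study-flow totals from the counts reported in the text."""
    identified = 25_525
    duplicates = 3_723
    screened = identified - duplicates

    removed_at_title_abstract = 20_456
    full_text = screened - removed_at_title_abstract

    # Reasons for exclusion at full text review, as reported.
    full_text_exclusions = {
        "conditions differed beyond credible sources": 430,
        "did not investigate unique effect": 189,
        "did not measure behavior": 372,
        "not an RCT": 335,
    }
    included_from_search = full_text - sum(full_text_exclusions.values())

    # Additional articles and comparisons found through other methods.
    articles_total = included_from_search + 14
    comparisons_total = 25 + 15

    return {
        "screened": screened,
        "full_text": full_text,
        "included_from_search": included_from_search,
        "articles_total": articles_total,
        "comparisons_total": comparisons_total,
    }


print(check_flow())
```

Each derived total matches the corresponding figure in the text (21,802 screened, 1,346 at full text, 20 included from the search, 34 articles, and 40 comparisons).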
Meta Analytical Strategy
Cohen’s d (Cohen, 1992) was employed as the effect size metric for the analysis. For continuous outcomes, means, standard deviations, and ns for the experimental and comparison conditions were used to calculate the effect size. If these data were not available, t-scores, F-values, or F-ratios were used to calculate the effect size. Where data were provided for dichotomous outcomes, χ² frequency tables were used for comparison of the experimental and comparison groups. Where these were not available, the percentages of participants within each group were used. If frequencies were not available, then the χ² value was used. Where comparisons were reported as having no effect or no significant differences between the experimental and comparison condition and the primary author did not provide data, a conservative imputation strategy was used, where an effect size of 0 was assumed. A sensitivity analysis was performed to assess the robustness of this imputation strategy.
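The conversions described above can be illustrated with commonly used textbook formulas. The sketch below is in Python and is an assumption about how such conversions are typically done; the review does not state the exact formulas used, and the function names are hypothetical.

```python
import math


def d_from_means(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d from group means, SDs, and ns, using the pooled SD."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def d_from_t(t, n1, n2):
    """Cohen's d from an independent-samples t statistic.

    For a two-group F test (1 numerator df), t = sqrt(F).
    """
    return t * math.sqrt(1 / n1 + 1 / n2)


def d_from_chi2(chi2, n_total):
    """Cohen's d from a 1-df chi-square via r = sqrt(chi2 / N).

    The result is unsigned; the direction must be set from the data.
    """
    r = math.sqrt(chi2 / n_total)
    return 2 * r / math.sqrt(1 - r**2)


# Hypothetical inputs, not taken from any included study.
print(d_from_means(10.0, 2.0, 20, 9.0, 2.0, 20))
print(d_from_t(2.0, 50, 50))
print(d_from_chi2(4.0, 100))
```

An imputed null result, as in the conservative strategy described above, would simply enter the analysis as d = 0.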
A random effects model, weighted by inverse sample size, was used to calculate an overall effect size, plus 95% confidence intervals (CI), significance of heterogeneity (Q), and the extent of heterogeneity (I²) for behavior using the metan command in Stata Version 14 (StataCorp, 2015). A random effects model was chosen to account for heterogeneity. To understand the robustness of the model, sensitivity checks were used where (a) only powered studies were included, (b) an Egger test was performed, (c) Winsorized data were used, and (d) imputed data were removed (see Supplemental Material 3 for full details on sensitivity checks).
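To make the pooling step concrete, the following Python sketch implements a DerSimonian-Laird random effects estimate. This is an illustration only: the analysis itself was run with Stata's metan command, the sketch assumes conventional inverse-variance weights, and the function name and input values are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random effects pooling of effect sizes.

    Returns the pooled effect, its 95% CI, Cochran's Q, and tau^2.
    """
    k = len(effects)
    w = [1 / v for v in variances]  # fixed effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * di for wi, di in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed effect mean.
    q = sum(wi * (di - fixed) ** 2 for wi, di in zip(w, effects))

    # Between-study variance, truncated at zero.
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects weights add tau^2 to each study's variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * di for wi, di in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "d": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "tau2": tau2,
    }


# Hypothetical effect sizes and sampling variances.
result = random_effects_pool([0.10, 0.30, 0.05], [0.04, 0.04, 0.02])
print(result)
```

When Q does not exceed its degrees of freedom, tau² is set to zero and the estimate coincides with the fixed effect result.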
Following this, a moderation analysis was performed. Discrete variables (e.g., experimental setting) were allocated into separate subsets (e.g., university, school, and public health facility) for each moderator once data were extracted. Where appropriate, to determine if subsets of the moderators differed significantly, the effect sizes and standard errors for each subset of the moderator variable were meta-analyzed using the metan command, and the Q statistic was examined. Significant heterogeneity in these comparisons indicated a significant difference between the subsets of the moderator variable. To explore the effect of continuous variables (e.g., age), behavior was regressed onto each moderator variable using the revised metareg command in a random effects model. From this, the estimated increase in the effect size per unit increase in the moderator was calculated, as was the adjusted R² value to understand the percentage of heterogeneity explained by the covariate. Moderator variables with subsets that included 10 cases or more were examined within this analysis. Where the subset of the moderator was below 10, the direction of effects was reported in the Supplemental Tables but no meaningful conclusions were drawn.
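The comparison of moderator subsets can be sketched as a heterogeneity test on the subset-level effect sizes and standard errors. The Python example below is a simplified illustration under the assumption of inverse-variance weights; it is not the Stata procedure used in the review, and the function names and input values are hypothetical.

```python
import math


def q_between(subset_effects, subset_ses):
    """Q statistic for differences between moderator subsets.

    Each subset contributes its pooled effect size and standard error.
    Returns (Q, degrees of freedom).
    """
    w = [1 / se**2 for se in subset_ses]
    mean = sum(wi * di for wi, di in zip(w, subset_effects)) / sum(w)
    q = sum(wi * (di - mean) ** 2 for wi, di in zip(w, subset_effects))
    return q, len(subset_effects) - 1


def p_value_df1(q):
    """Upper-tail p value for a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(q / 2))


# Hypothetical subset estimates for a two-level moderator.
q, df = q_between([0.30, 0.10], [0.10, 0.10])
print(q, df, p_value_df1(q))
```

The closed-form p value applies only to two-subset comparisons (1 df); moderators with more subsets would need a general chi-square distribution function.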
Study Coding
Where there were multiple low credibility comparison groups, an effect size was calculated for the experimental condition compared to each comparison group. To preserve independence, the n of the experimental condition was adjusted for the total number of comparisons. For example, Binder et al. (2020) compared “experts” to “celebrities” and “peers”; two effect sizes were calculated for the “expert” group (n divided by two), one using each of the lower credibility groups for comparison. These effect sizes were treated as two independent comparisons (cases) within the analysis. Moreover, where studies used multiple high credibility sources, an effect size was calculated where the total number of participants who performed the desired behavior for each credible source was aggregated relative to those who did not perform the desired behavior and then compared to the lower credibility condition.
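The adjustment for a shared experimental group can be illustrated as follows. This Python sketch divides the shared group's n across comparisons and shows the consequence for the sampling variance of d, using a standard large-sample variance approximation; the function names and sample sizes are hypothetical and are not taken from Binder et al. (2020).

```python
def split_shared_n(n_shared, n_comparisons):
    """Divide a shared group's n evenly across its comparisons."""
    return n_shared / n_comparisons


def variance_of_d(d, n1, n2):
    """Approximate sampling variance of Cohen's d for two groups."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))


# Hypothetical: one expert group (n = 60) compared with two other groups.
n_expert = split_shared_n(60, 2)
print(n_expert)
print(variance_of_d(0.2, n_expert, 60))  # with the adjusted n
print(variance_of_d(0.2, 60, 60))        # if the full n were reused
```

Splitting the n increases each comparison's variance, so the shared participants are not counted twice when the two comparisons are treated as independent cases.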
Where the analysis involved independent variables other than the credible source’s expertise and trustworthiness, the independent variables were collapsed, and the main effect of credible sources was used. For example, Meyer (1977) provides data that were used to calculate an effect size at the source level, removing the influence of the other independent variables (e.g., amount of advice, level of achievement, and the child’s grade level). If information for the main effect was not provided, an effect size for each independent variable was calculated, and a mean effect size using these values was created. For example, Jones et al. (2004) manipulated message framing and a credible source. We calculated an effect for each level of message framing (e.g., high credible source/positive message frame vs. low credible source/positive message frame) and then calculated a mean of the effect sizes.
Where studies reported data over multiple time points, a conservative strategy was adopted by calculating the effect size using the final time point. If the study included multiple dependent variables, effect sizes were calculated for each behavior, and a mean effect size was used within the meta-analysis. For example, Neimeyer et al. (1989) used two measures for behavior (i.e., the percentage of relevant applications and the range of applications); an effect size was calculated for each behavior measure, and the mean was used in the present analysis. Where an effect size could not be calculated due to small frequencies (Cochran, 1954), the measure was excluded from the analysis. For example, in Lawal et al. (2021), post-intervention counts dropped below five participants for one of the frequency table cells. Where the outcomes used both binary and continuous measures, effect sizes were calculated for each outcome and combined to produce an overall mean behavior effect size.
The behavior change techniques used in both the intervention (alongside a credible source) and comparison condition (without a credible source) were extracted by coding intervention elements using the Behavior Change Technique Taxonomy version 1 (Michie et al., 2013). Once extracted, each behavior change technique was coded as 0 if the groups did not include it and 1 if they did. These codes were then analyzed following the procedure for continuous variables.
The risk of bias of the included studies was assessed using the Revised Cochrane risk-of-bias tool for randomized trials and cluster randomized trials (RoB2; Sterne et al., 2019). Risk of bias was assessed through its five standard domains for all studies, with an additional domain for cluster randomized trials (see Supplemental Material 4 for full risk-of-bias criteria).
Data Extraction and Moderator Variables
Information about potential moderators was extracted from each study (see Supplemental Material 5 for full information on data extraction and moderator variables). The variables were chosen for their theoretical importance, practical relevance, and commonly described features. Extraction included characteristics of the sample (e.g., age, gender, and ethnicity), intervention (e.g., credible source type, comparator source type, and intervention duration), and study (e.g., the study duration, attrition rate, and randomization).
Coding Reliability
All data extraction was conducted independently by two authors (JH and DH). Kappa coefficients were used to calculate inter-coder reliabilities for categorical variables and intra-class correlation (ICC) for continuous variables. Coder reliabilities were acceptable for both categorical (mean κ = 0.92, range = 0.74–1.00) and continuous variables (mean ICC = 1.00, range = 0.99–1.00). For variables where kappa and ICC could not be used (i.e., where there were too few observations in each cell, such as for Hispanic ethnicity, or where there was no disagreement), percentage agreement was calculated; in these cases, there was 100% agreement between the coders. Discussion between the authors was used to resolve disagreements for data extraction. If agreement could not be reached, a third reviewer (TE) was consulted. Assessment of risk of bias was conducted independently by JH and TE. Agreement on risk-of-bias classification ranged from 76.92% to 100%, with a mean agreement of 90.11%. If agreement could not be reached, a third reviewer (DH) was consulted (see Supplemental Table S1 for the full intercoder reliabilities).
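For readers unfamiliar with the reliability statistics, the sketch below computes Cohen's kappa and percentage agreement for two coders in Python. It is an illustration with hypothetical codes and function names, not the review's data or software. Kappa is undefined when both coders assign a single category to every item, which is one situation in which percentage agreement serves as a fallback.

```python
from collections import Counter


def percentage_agreement(coder1, coder2):
    """Percentage of items on which two coders gave the same code."""
    matches = sum(a == b for a, b in zip(coder1, coder2))
    return 100 * matches / len(coder1)


def cohens_kappa(coder1, coder2):
    """Cohen's kappa for two coders; None when chance agreement is 1."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    counts1, counts2 = Counter(coder1), Counter(coder2)
    categories = set(coder1) | set(coder2)
    # Agreement expected by chance from each coder's marginal proportions.
    expected = sum(counts1[c] * counts2[c] for c in categories) / n**2
    if expected == 1:
        return None
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions from two coders.
jh = ["include", "include", "exclude", "exclude"]
dh = ["include", "include", "exclude", "include"]
print(cohens_kappa(jh, dh), percentage_agreement(jh, dh))
```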
Transparency and Openness
The review adhered to the PRISMA 2020 guidelines for systematic reviews (Page et al., 2021; see Supplemental Material 2). All data and research materials are available upon request to the corresponding author. The review was preregistered at https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=333372. Amendments to the protocol were made to ensure clarity within the methods of the review, including: changes in anticipated completion, author details, clarity regarding search strategies (i.e., searching the reference lists of relevant reviews), clarity within the communication of comparison groups (i.e., that we are interested in only the unique effects of credible source), and clarity within study screening (i.e., handling of papers at the title/abstract stage of screening).
Results
Characteristics of the Included Studies
Thirty-four articles met the inclusion criteria, which included 36 studies and yielded 40 cases. The samples had more females than males (M = 58.42% females, SD = 21.32), had a mean age of 17.75 years old (SD = 8.17) with mostly White ethnicity (M = 75.81%, SD = 31.59). The types of populations sampled varied; university students were 47.50% (k = 19) of the cases (see Supplemental Table S2 for full sample characteristics).
The highly credible source typically had a medical qualification (k = 26, 65%). The credible source was most frequently compared to a low credibility source (k = 21, 52.5%). The less-credible sources were non-expert professionals in 12 (30%) cases, were non-expert peers in 12 (30%) cases and were defined as “other” sources in 16 cases (40%). The type of behaviors targeted differed between the cases. Single (k = 20, 50%) and repeated (k = 15, 37.5%) were the most frequently investigated behaviors, as opposed to combinations of the above. Of these, 21 (52.5%) cases investigated a delayed behavior, 13 (32.5%) investigated an immediate behavior, and the remaining 6 (15%) cases investigated both an immediate and delayed behavior. The behavioral domain was typically a health behavior (k = 25, 62.5%), and preventive oral health behavior (k = 7, 17.5%) was the most typically investigated behavior type.
Contact with the source was typically achieved through non-in-person means (k = 21, 52.5%), with 12 (30%) cases using written means. Seventeen (42.5%) cases used in-person communication between the source and participant. The intervention was most frequently delivered through a combination of both written and verbal delivery (k = 16, 40%). The mean intervention duration was 7.63 weeks (SD = 16.38), the source had a mean contact with the participant for 2.00 (SD = 2.00) sessions, with the interventions having a mean of 2.13 (SD = 2.04) sessions (see Supplemental Tables S3 and S4 for full intervention characteristics). Sixteen additional behavior change techniques were used: 5.1 information about health consequences (k = 19, 47.5%) and 4.1 instruction on how to perform a behavior (k = 17, 42.5%) were the most frequent (see Supplemental Table S4 for full behavior change technique inclusion).
The intervention was most frequently set in a university (k = 18, 45%) or school (k = 14, 35%) setting, with a mean study duration of 14.88 weeks (SD = 30.59) and follow-up period of 6.50 weeks (SD = 13.02). Of the 40 cases, 32 (80%) were published in peer-reviewed journals. Thirty-one (77.5%) used individual randomization procedures to allocate participants to their conditions. Twenty-two (55%) reported a successful manipulation check of source credibility. The attrition rate within the studies was generally low, with 20 (50%) cases reporting no attrition. Of the 20 (50%) cases reporting attrition, attrition ranged from 3.64% to 46.26% (M = 10.30%, SD = 10.54; see Supplemental Table S5 for full study characteristics).
The quality of studies varied in their risk of bias. Overall, 2 cases (5%) received a high risk of bias, 33 (82.5%) received some concerns, and 5 (12.5%) received a low risk of bias. For the individual domains, 28 (70%) cases received a “some concerns” classification for the randomization process. For cluster randomized trials, all applicable cases (k = 9, 100%) received a low risk-of-bias for the timing of identification or recruitment of participants in the trial. Bias arising due to deviations from the intended intervention where effects of assignment to the intervention were considered were generally quite low, with 37 (92.5%) cases receiving a “low” risk-of-bias classification. Bias relating to missing outcome data was generally low across the cases, with 37 cases (92.5%) receiving the low designation. Most cases used appropriate measures to measure the outcome, with 37 (92.5%) cases receiving a low risk-of-bias designation. In general, procedures for the selection of the reported result were not widely reported in the cases; as such, 30 (75%) cases indicated some concerns on this domain (see Supplemental Table S6 for full risk-of-bias outcomes).
Impact of Credible Source on Behavior
Across the 40 cases (N = 7,995), a credible source had a positive effect on behavior (d = 0.14, 95% CI [0.04, 0.23]; see Supplemental Table S6 for effect size statistics and Figure 2 for the forest plot). This effect size is small according to Cohen’s (1992) criteria for effect sizes (Gignac & Szodorai, 2016). Sensitivity tests were used to understand the robustness of this effect. The effect was essentially unchanged (k = 40, d = 0.13 [0.04, 0.22]) when one study was Winsorized (van de Ridder et al., 2015). When only powered studies were taken into account, the effect was also essentially unchanged (k = 30, d = 0.13 [0.04, 0.22]). This was supported by the Egger test, which showed no significant small-study effects (B = −0.38, SE = 0.74, p = .606). Moreover, when imputed effect sizes were removed, the effect remained similar (k = 35, d = 0.15 [0.05, 0.25]; see Supplemental Table S7 for details of the impact of credible source on behavior and Supplemental Material 6 for the funnel plot for the Egger test).

Figure 2. Forest plot of effect sizes for main analysis.
Moderator Analysis
The 40 cases were significantly heterogeneous, Q = 132.61, p < .001; the variance attributed to heterogeneity was large, I² = 70.60% (Higgins & Thompson, 2002; see Supplemental Tables S8–S10 for details of moderator analyses). Consequently, a moderator analysis was conducted to explore the heterogeneity.
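The relationship between Q and the heterogeneity index can be shown with the standard Higgins and Thompson formula, in which the index is the proportion of Q in excess of its degrees of freedom (k − 1). The Python sketch below applies it to the reported values; the function name is illustrative.

```python
def i_squared(q, k):
    """Heterogeneity index (as a percentage) from Cochran's Q and k cases."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Reported values: Q = 132.61 across k = 40 cases.
print(round(i_squared(132.61, 40), 2))
```

With the rounded Q reported in the text, this gives approximately 70.6%, in line with the reported value.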
Sample Characteristics
Credible source effects on behavior were not moderated by gender (k = 35, β = −.00, p = .393), age (k = 24, β = .01, p = .597), percentage of the sample that was White (k = 10, β = .01, p = .069) or type of sample population (university students, k = 19, d = 0.12, 95% CI [−0.02, 0.25]; school children, k = 13, d = 0.10 [−0.02, 0.23]). Due to a dearth of studies, conclusions could not be drawn for other ethnicities or sample types.
Intervention Characteristics
Sources with a medical professional qualification had a positive effect (k = 26, d = 0.13, 95% CI [0.02, 0.24]). There were too few comparisons for analysis of other source types. The effect was significant when a highly credible source was compared to a moderate credibility source (k = 17, d = 0.25 [0.09, 0.41]) and non-significant when compared to a low credibility source (k = 21, d = 0.07 [−0.03, 0.18]). There were too few cases to compare a high credibility source and no source. The effects of a credible source were significant when compared to a non-expert professional (k = 12, d = 0.21 [0.03, 0.38]). However, the effects were not significant when compared to non-expert peers (k = 12, d = 0.11 [−0.08, 0.29]) or other non-credible sources (k = 16, d = 0.11 [−0.00, 0.22]).
A credible source was significantly more effective than lesser credibility sources if the behavior was a single behavior (k = 20, d = 0.20, 95% CI [0.07, 0.33]); however, the effect was non-significant when the behavior was repeated (k = 15, d = 0.04 [−0.08, 0.15]). Multiple single behaviors and a mix of single and repeated behaviors provided too few cases to draw conclusions from these comparisons. When the proximity of the behavior was considered, a credible source had a significant positive effect on immediate behavior (k = 13, d = 0.20 [0.06, 0.33]); however, the effect was non-significant when the behavior was delayed (k = 21, d = 0.08 [−0.04, 0.20]). Studies that used both delayed and immediate outcomes provided too few cases to analyze. There was no significant effect for studies in the domain of health (k = 25, d = 0.07 [−0.04, 0.17]), and there were too few cases relating to all other behavior domains. Similarly, in relation to the health behavior type, preventive oral health, mental health-related behaviors, diet behavior, physical activity, sexual health, childcare/family planning, medication counseling, alcohol-related behavior, and vaccination behavior had too few comparisons for meaningful analysis (see Figure 3 for the forest plot relating to behaviors in the health domain).

Figure 3. Forest plot of effect sizes for studies relating to health behavior.
Credible sources were significantly more effective than lesser credibility sources when participants had in-person communication with the source (k = 17, d = 0.26, 95% CI [0.10, 0.43]); however, the effect was non-significant for non-in-person communication (k = 21, d = 0.06 [−0.04, 0.16]). The use of written (k = 12, d = 0.03 [−0.09, 0.15]) communication was nonsignificant. Other means of communication had too few cases to allow for meaningful conclusions to be drawn.
Verbal messages (k = 12, d = 0.26, 95% CI [0.03, 0.48]) and using a combination of written and verbal messages (k = 16, d = 0.13 [0.01, 0.25]) had significant effects on behavior. The Qb analysis showed that verbal messages were more effective than written and verbal messages, Qb = 5.29, p = .021.
The intervention duration (k = 35, β = −.00, p = .826), the total number of sessions in the intervention (k = 40, β = .00, p = .856), and the total number of sessions with source contact (k = 40, β = −.00, p = .863) did not moderate the effect of credible source.
A credible source was significantly less effective when paired with the behavior change technique 5.1 information about health consequences (k = 19, β = −.26, p = .008). However, 4.1 instruction on how to perform a behavior (k = 17, β = .08, p = .470) did not significantly moderate the effect of a credible source. Other behavior change techniques (k = 1–8) yielded too few studies that included the behavior change technique for meaningful conclusions to be made (see Supplemental Table S10 for details).
To explore the apparent contradictory findings of a nonsignificant effect of credible source in the health domain but a significant negative effect when BCT 5.1 information about health consequences was used, further exploratory analysis was conducted. This analysis of BCT 5.1 information about health consequences in the context of health behaviors only (see Supplemental Table S7 for details) suggested that credible source was not significant when providing 5.1 information about health consequences in the context of health behavior (k = 19, d = 0.01, 95% CI [−0.07, 0.10]) for those studies that included both (see Figure 4 for Forest plot).

Figure 4. Forest plot of effect sizes for studies including both a health behavior and BCT 5.1 information about health consequences.
Study Characteristics
The effect of credible source was nonsignificant when the intervention was delivered in a university (k = 18, d = 0.12, 95% CI [−0.02, 0.27]) or school (k = 14, d = 0.12 [−0.01, 0.24]). There were too few cases to draw conclusions about other settings. The study duration (k = 33, β = −.00, p = .425) and follow-up period (k = 36, β = −.01, p = .131) did not moderate the effect of credible source.
The date of publication (k = 40, β = .00, p = .717) and the percentage attrition rate of participants (k = 40, β = −.00, p = .853) did not moderate the effect of credible source. The effect was nonsignificant among published studies (k = 32, d = 0.10, 95% CI [−0.01, 0.21]) and among cases using individual randomization (k = 31, d = 0.10 [−0.01, 0.20]). There were too few theses or cluster randomized trials to draw meaningful conclusions.
Cases without a successful manipulation check were effective at changing behavior (k = 18, d = 0.21, 95% CI [0.06, 0.37]). Cases with a successful manipulation check (k = 22, d = 0.08 [−0.02, 0.18]) were not effective at changing behavior. Cases with an overall risk of bias of some concerns were effective (k = 33, d = 0.16 [0.06, 0.27]). There were too few studies to explore the effect of low or high risk of bias.
Discussion
Main Findings of the Study
The present review and meta-analysis sought to understand the unique effects of credible sources on behavior and the contexts where credible sources are most effective. The meta-analysis suggested that a credible source had a positive but small effect on behavior, and sensitivity analyses supported the robustness of this effect. The analysis suggested that credible source is more effective when: (a) the credible source has a medical professional qualification, (b) the source communicated in person with the participants, (c) the intervention was verbal or used a combination of verbal and written messages, (d) the behavior occurred once, and (e) the behavior was performed immediately after the intervention. A credible source also appeared most effective under particular study design choices: (f) the comparison source was a non-expert professional, (g) the credible source was compared to a moderate credibility source, and (h) studies did not use manipulation checks. These findings should be interpreted in context: although the extant literature reports success in influencing the antecedents of behavior, the goal is ultimately to change actual behavior. Viewed in this light, even small effects can be meaningful, particularly where an individual may repeat the behavior or where behaviors are aggregated across groups. Because the effects of credible sources are relatively small and deploying a credible source is expensive, intervention developers should weigh the cost of credible sources against the benefits of using them as message providers.
What Is Already Known and What This Study Adds
As outlined, it is generally accepted within the persuasion literature that a credible source strongly affects persuasive outcomes (Pornpitakpan, 2004). This study suggested positive effects, but they were small. Much of the persuasion literature that demonstrates strong effects focuses on the antecedents of behavior, specifically on motivational outcomes such as attitudes and intentions (Pornpitakpan, 2004; Sternthal et al., 1978b). However, moving from motivation to behavior often depends on volitional processes (e.g., habit formation and action planning; Schwarzer, 2008). As demonstrated by the intention-behavior gap, antecedents of behavior (e.g., intentions) do not consistently translate into behavior enactment (Sheeran, 2002). Consequently, it is possible that a credible source plays a more important role in motivational processes than in volitional processes. This distinction in the role of a credible source may explain why findings appear weaker when behavior is assessed.
Previous narrative reviews (Pornpitakpan, 2004; Sternthal et al., 1978b) of credible sources suggest that a credible source is generally more effective than lower credibility sources. The present review confirms these findings in relation to behavior. A previous meta-analysis of source characteristics (Wilson & Sherrell, 1993) suggests that source effects are stronger when the source communicates orally than when other delivery modes are used. The present review supports these findings, suggesting that a credible source is significantly more persuasive than lower credibility sources when communication is in person, but not when it is not in person. Furthermore, Wilson and Sherrell's (1993) review suggested that source effects were significantly more persuasive in student populations; however, the present meta-analysis did not find support for this finding and instead suggested that a credible source was no more effective in any one sample type.
The present review builds on the above research by empirically synthesizing source credibility research and updating prior reviews. Through investigation of source credibility, rather than wider source factors, the review suggests that a credible source is effective in changing behavior. The findings suggest that while a credible source is effective, the effects on behavior are small. As such, when developing interventions, developers should be mindful of the increase in behavior relative to the costs of implementing a highly credible source within their intervention. When credible sources are used within interventions, they should be used in the contexts where they are suggested to be most effective, including where: (a) the credible source had a medical professional qualification, (b) the source communicated in person with the participants, (c) the intervention was verbal or used a combination of verbal and written messages, (d) the behavior occurred once, and (e) the behavior was performed immediately after the intervention. The analysis of different source types suggested that a credible source was most effective when medical experts provided communication in medical domains. This suggests that medical communication should be delivered by experts within the medical domain.
Interestingly, the analysis suggested a null finding when a credible source is paired with BCT 5.1 information about health consequences. This finding is of particular interest, as within the health domain credible sources are likely to be providing information about the health consequences of behaviors. However, this finding needs to be approached with caution. It does not necessarily suggest that credible sources should not use information about health consequences. Rather, it suggests that, in general, credible sources should not inform about health consequences if there is no need to. This is reinforced by the nonsignificant findings of studies that used information about health consequences when investigated in the context of health behaviors. The finding suggested that when used in a health context, the effectiveness of a credible source is equal to that of a lower credibility source. Therefore, health professionals within health contexts should provide information about health consequences where appropriate. However, there may be health contexts where lesser credibility sources are equally effective (e.g., a less-credible source providing individuals with the consequences of not using preventative oral health techniques rather than having a dentist deliver the same message).
In line with these findings, it was suggested that credible sources had a non-significant effect on health behavior outcomes. From the evidence, it is suggested that health behavior types are not widely researched in relation to credible source. While a pooled health behavior effect size suggests no significant effect of credible source on health behavior, there is no single behavior type that has sufficient comparisons to understand the effects of credible source on health outcomes. Furthermore, there is not enough evidence to provide meaningful insight into the specific behaviors that are targeted within the interventions. As outlined, a credible source is most effective when the behavior is immediate to the intervention context and when a single rather than a repeated behavior is targeted. However, studies with these outcomes account for only a small proportion of those investigating health outcomes. This may lead to a credible source being applied where it is not most effective. For example, while the preventative oral health domain provides insight into oral hygiene in general, it does not provide insight into the specific behaviors where a credible source may be most effective (i.e., at a point of decision relating to which toothpaste brand to use).
The present review develops the current understanding of the unique effects of behavior change techniques on behavior. Previous reviews of the unique effects of behavior change techniques found small to medium effects. Medium effects were found by Gollwitzer and Sheeran (2006) for action planning relating to any goal-directed behavior (d = 0.65), and small to medium effects were found for goal setting relating to the performance of any behavior (d = 0.34) and self-affirmation relating to performance of health behaviors (d = 0.32; Epton et al., 2015, 2017). The effect of self-incentives relating to the performance of any behavior (d = 0.17; Brown et al., 2018) was small and similar in size to that of credible source.
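To give a rough sense of scale for the effect sizes compared above, Cohen's d can be converted to a probability of superiority (the common language effect size), that is, the probability that a randomly chosen participant from the intervention group scores higher than a randomly chosen participant from the comparison group. The sketch below is an interpretive aid only and is not part of the review's analyses; it assumes two normally distributed groups with equal variances, and the function name is hypothetical.

```python
from math import erfc

def probability_of_superiority(d):
    """Convert Cohen's d to the common language effect size, Phi(d / sqrt(2)).

    Assumes two independent, normally distributed groups with equal
    variances. Phi(x) = 0.5 * erfc(-x / sqrt(2)), so Phi(d / sqrt(2))
    simplifies to 0.5 * erfc(-d / 2).
    """
    return 0.5 * erfc(-d / 2.0)

# Effect sizes cited in the text (credible source d = 0.14 from this review)
reported_d = {
    "action planning": 0.65,
    "goal setting": 0.34,
    "self-affirmation": 0.32,
    "self-incentives": 0.17,
    "credible source": 0.14,
}

for technique, d in reported_d.items():
    print(f"{technique}: d = {d:.2f}, P(superiority) = "
          f"{probability_of_superiority(d):.3f}")
```

Under these assumptions, d = 0.14 corresponds to a probability of superiority only slightly above the chance level of .50, which is consistent with describing the effect of credible source as small.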
Limitations and Future Research
While the present systematic review with meta-analysis successfully evaluates the evidence relating to credible source effects on behavior, the study has several potential limitations with key areas for future research to enhance the understanding of credible source effects on behavior. As previously outlined, there is a relative dearth of studies investigating key moderators, for example, relating to the sample, settings, and source types. Furthermore, theoretical mediators and moderators have been suggested within source credibility literature (e.g., initial opinion, Sternthal et al., 1978b; issue involvement, Pornpitakpan, 2004), but there was not enough research to allow for sufficient data extraction. Moreover, despite the wide range of studies on credible sources, there are limited studies on behavioral performance. As such, future research should investigate the moderators and contexts of credible sources more extensively. This will allow for information on the behavioral contexts where a credible source is most effective. By doing this, the researchers will (a) add to the evidence base of credible source effects, and (b) be able to use a message source that is effective within their intervention context.
Furthermore, much of the research was performed in educational settings with students as the population of interest. As outlined, there is a dearth of evidence relating to other populations, specifically those where a credible source is most likely to be applied in practice, such as the general public. As such, future research should investigate the behavioral effects of credible sources within general population samples. This will allow for a more thorough understanding of credible sources and their appropriate application in practice.
Moreover, as the review had a specific aim relating to understanding whether highly credible sources are effective in changing behavior, the review included studies with no restrictions on study ages. Many of the studies included in the review are greater than 30 years old, with few studies investigating credible sources in recent years. This finding suggests that there is an acceptance that credible sources are more effective than lesser credibility sources. However, as suggested by the findings, the effects of a credible source may be contextual, and there is not enough evidence to suggest the exact contexts in which a credible source should be used. As such, future research should seek to further understand the contexts and behaviors for which a credible source is most effective.
As suggested by the findings in relation to health behavior types and BCT 5.1 information about health consequences, there is no single behavior type that can be meaningfully analyzed to draw conclusions. Moreover, the behavior types could not be meaningfully broken down to understand the specific behaviors therein. This poses a problem as different behavior change techniques are suggested to affect behaviors differently depending on context (i.e., where one behavior change technique may be effective, another may not be). Furthermore, it is of interest to understand the specific types of behavior where different sources of credibility matter more. For example, when considering multiple roles of persuasive variables (Petty & Wegener, 1998), expert sources may be best served in analytical and intellectual decisions where the source is used as argument support, whereas trustworthy sources may be more effective for affective and emotional decisions. As such, future research needs to understand the specific behaviors in which a credible source is most effective. For example, this will allow for a credible source to be applied contextually when targeting a health behavior domain, as the specific behaviors where a specific credible source is most effective can be targeted.
As the review sought to understand highly credible sources, other sources that may be credible, but are not highly credible, were not considered within the present review; an example is an expert presenting information that is not congruent with their expertise (Bull et al., 2021). As such, other reviews may seek to understand the effects of sources that are credible in one domain but present information within another domain. Similarly, source credibility was operationalized by a source's expertise and trustworthiness. While these are the most widely accepted constructs of credibility, there is debate as to other constructs of credibility, for example, source bias (Wallace et al., 2020). Researchers should consider how other characteristics, such as source bias, relate to credibility and how they may affect behavioral outcomes.
Moreover, while the review suggested that a credible source was most effective in medical domains, the variable of credible source type could only be meaningfully broken down into three general categories (i.e., medical professional qualification, other professional qualification, and not a professional qualification). As such, future empirical research should seek to investigate further the role of credible source type. This will allow for a more in-depth understanding of which domains a credible source is most effective within.
Building on this, the review investigated only the effects of credible sources on behavior. As such, the influence of other source types (e.g., celebrities) and other source characteristics (e.g., source attractiveness) was not considered within this review. Consequently, the effects of these sources are not known in relation to behavior, and future research should seek to clarify the effects of these source types and characteristics on behavior. From this, a more evidence-based approach to intervention development can be used, where the most effective source in the given context can be deployed to change behavior.
Only 16 unique behavior change techniques were operationalized alongside a credible source. Of these, only two had enough comparisons for meaningful analysis. Therefore, future research should further investigate the behavior change techniques used alongside credible sources. For example, research should attempt to understand how to maintain the effects of a credible source over a longer term. One method to achieve this would be to test the effect of a credible source with other behavior change techniques. Understanding which behavior change techniques interact with credible source to enhance its effects will allow for a more effective deployment of credible source within behavior change interventions, as it will inform intervention developers of (a) when it is most effective to deploy credible sources and (b) alongside which other behavior change techniques credible sources are most effective.
Conclusions
Credible source has only a small effect on behavior and is effective only in limited circumstances, for example, when the credible source has a professional medical qualification, uses in-person communication with a verbal component, and is used for encouraging single and immediate behavior. Interventionists need to consider whether the benefits of using credible sources outweigh the expense. Further empirical research is required to understand the theoretical and contextual factors that may influence credible source effectiveness.
Supplemental Material
Supplemental material, sj-docx-1-psp-10.1177_01461672261445474 for How Effective Are Credible Sources in Changing Behavior? A Systematic Review with Meta-Analysis by Jack Hamer, Tracy Epton, Danielle Hamer and Christopher J. Armitage in Personality and Social Psychology Bulletin
Footnotes
Author Note
The project was pre-registered on PROSPERO (Hamer et al., 2022). The findings of the review were presented as a poster at the Division of Health Psychology Annual Conference 2024 and the European Health Psychology Conference 2024. Armitage is supported by the NIHR Manchester Biomedical Research Centre and NIHR Greater Manchester Patient Safety Research Collaboration. The views of the authors do not necessarily represent those of the NIHR, Department of Health and Social Care, or the Greater Manchester Mental Health NHS Foundation Trust. The research is not affiliated with the NHS or Greater Manchester Mental Health NHS Foundation Trust.
Ethical Considerations
This article is a systematic review with meta-analysis and does not collect primary data from human participants. Therefore, ethics committee approval, informed consent, and consent for publication are not required.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was sponsored by the University of Manchester.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Original data files are available upon request.
Supplemental Material
Supplemental material is available online with this article.
