Abstract
Most research on developmental prevention programs stems from Anglo-American countries. However, German-speaking European countries also offer a broad range of family-oriented programs to promote child development and prevent behavioral problems. This article presents a meta-analysis of n = 79 studies on family-based prevention that had a comparison group and were performed in Germany, Austria, or Switzerland. Overall, the data covered 10,667 parents and showed a significant positive mean effect of the programs (d = 0.31). The mean effect for parent-related outcomes (d = 0.40) was larger than for measures of child behavior (d = 0.20). There was much heterogeneity across studies, and very few had follow-ups of more than one year. Moderator analyses revealed particular influences of methodological study characteristics, e.g., larger effects in smaller samples and in less well-controlled studies. Most results of our meta-analysis are similar to what has been found in the English-speaking world. However, as in international practice, the evaluated programs do not seem to be representative of everyday prevention practice, in which many programs are not evaluated at all. Our study confirms the need for more high-quality and long-term evaluation as well as cross-national comparisons and replicated moderator analyses.
Keywords
Development-related prevention takes place in different contexts, such as communities, schools, or families (Farrington et al., 2017; Malti et al., 2009; Welsh & Farrington, 2009). Due to various micro- and macro-social factors, family-oriented prevention programs have become very popular in industrialized countries (Lösel & Bender, 2016). Family-oriented prevention programs comprise different approaches such as parent training, open family meetings, toddler groups, or prenatal classes. These programs attempt to improve a variety of outcome variables such as parenting behavior and attitudes, child development, child problem behavior, or the parent-child relationship. International research on family-based prevention has strongly increased since the 1990s, and a number of systematic reviews, meta-analyses, and even meta-meta-analyses have been carried out (Bennett et al., 2013; Chen & Chan, 2016; Farrington et al., 2017; Kil & Antonacci, 2020; Pontes et al., 2019; Smedler et al., 2015). These show positive effects that typically range between 0.20 and 0.40 (Cohen’s d). However, the results are very heterogeneous, which is partly due to the influence of various moderators (Lösel, 2018). Some moderators, such as the mode of prevention and sample size, are well replicated: Selective, secondary, or tertiary prevention programs show larger effects than universal approaches, and smaller samples often go along with stronger effects (Bakermans-Kranenburg et al., 2003; Layzer et al., 2001; Lundahl et al., 2006). The evidence on the moderating role of other program characteristics, such as the theoretical background or program intensity, is less conclusive (Lundahl et al., 2006; Manning et al., 2010; Pinquart & Teubert, 2010). Therefore, as in psychology as a whole (Open Science Collaboration, 2015) and in other fields, replication is a key issue in developmental prevention research (Kumpfer et al., 2020; Lösel, 2018; Valentine et al., 2011).
Most studies in the above-mentioned meta-analyses have been conducted in Anglo-American countries and the transferability across countries and cultures is an important issue that has not yet received much attention (Abate et al., 2020; Barnhart et al., 2020; Gardner et al., 2016; Knerr et al., 2013; Koehler et al., 2013; Lich et al., 2013; Scheithauer et al., 2016; Sundell et al., 2008). One cannot assume that prevention concepts ‘work’ the same way regardless of the cultural environment (e.g., concerning parenting attitudes and practices). Some research suggests that basic program issues can be generalized (e.g. Knerr et al., 2013); but overall we need more differentiated evidence about family-oriented programs and their culture-specific adaptation (Haupt et al., 2014; Knerr et al., 2013; Lösel et al., 2019; Sundell et al., 2016).
Against this background, the present study investigates the findings on effects and moderators of family-oriented prevention programs in German-speaking countries. Whereas a previous meta-analysis published in German could only cover older studies of often weak methodological quality (Lösel et al., 2006; Weiss et al., 2015), there has been a substantial increase in research on family-oriented prevention over the last decade. Therefore, the present study aims to provide an up-to-date systematic review and also includes data from other German-speaking countries (Austria, Switzerland). The decision to add studies from Austria and Switzerland to our original German study pool was motivated by the similarity of family-based prevention services in these three German-speaking countries (Hänggi et al., 2014; Klepp et al., 2008; Lösel et al., 2006), indicating a common study population. The aims of our meta-analysis are: (1) To estimate the short-term and long-term effects of family-based prevention activities in German-speaking countries on parent and child outcomes. Based on the results of Anglo-American meta-analyses, we expect small to moderate mean effect sizes. (2) To investigate the influence of moderating factors. Based on the international literature, we expect higher effects in studies that use (cognitive-)behavioral programs, selective or indicated prevention, smaller samples, and lower methodological quality. Furthermore, we examine the moderating role of other program characteristics (e.g., intensity, theoretical background) in an exploratory manner. (3) To test for publication bias.
Materials and Methods
We searched for studies that met the following criteria: (1) Evaluation of a prevention program based on an educational or psychosocial concept that aims at improving the situation in families with children (aged 0 to 18), excluding individual counseling or therapy. (2) Aimed at either parents or families (not at children only). (3) Conducted in Germany, Austria, or Switzerland. (4) Using a comparison group that has not received the respective program. (5) Published up to the year 2020. Since our first literature search (Lösel et al., 2006), we have updated our database twice (first update reported in Weiss et al., 2015). As our previous reports were only published in German, we do not restrict this article to the latest update but report data on the complete study sample.
We followed a broad search strategy including a systematic search in relevant psychological and educational databases (see footnote 1). Search strings combined a multitude of search terms, e.g., prevention, family education, parent training, parent groups, parenting. Additionally, we scanned systematic and narrative reviews, research reports, and primary studies, and searched homepages of prevention projects and research institutions for unpublished papers. The literature search yielded about 7,900 publications, 346 of which seemed relevant after first inspection. After excluding duplicates, mere program descriptions, studies with insufficient data reporting, studies on child-oriented programs, and uncontrolled evaluation studies, 79 studies qualified for further analysis (see Fig. 1). Nine studies included more than one independent comparison from multiple training and control groups, leading to k = 88 comparisons from n = 79 studies.

Figure 1. PRISMA flow chart.
The studies were coded concerning formal aspects (e.g., year and type of publication), contents and structure of the prevention program (e.g., theoretical background, number of sessions), characteristics of the participants (e.g., age, gender, socioeconomic status), and methodological aspects (e.g., research design, sample sizes). As we also included quasi-experimental studies, we considered the quality of the research design by coding two aspects: group assignment (randomization/matching vs. uncontrolled assignment), and pre-test differences between study and comparison group (no vs. minor vs. major differences). The coding followed a detailed coding manual. To check interrater reliability, 27 studies were independently coded by two team members. Interrater agreement ranged between 72 and 100 percent for categorical variables and between r = 0.77 and r = 0.99 for metric variables. Coding discrepancies were discussed and resolved within the research team.
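The two reliability indices mentioned above can be illustrated with a minimal sketch. It assumes simple percent agreement for categorical codes and a Pearson correlation for metric codes; the function names and the example codes are hypothetical and are not taken from the coding manual.

```python
from math import sqrt


def percent_agreement(codes_a, codes_b):
    """Percentage of studies on which two raters chose the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty code lists of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


def pearson_r(x, y):
    """Pearson correlation between two raters' metric codes."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical codes from two raters for four studies (illustration only).
design_rater_a = ["randomized", "uncontrolled", "uncontrolled", "randomized"]
design_rater_b = ["randomized", "uncontrolled", "randomized", "randomized"]
sessions_rater_a = [8, 10, 12, 6]
sessions_rater_b = [8, 10, 11, 6]

print(percent_agreement(design_rater_a, design_rater_b))  # 75.0
print(round(pearson_r(sessions_rater_a, sessions_rater_b), 2))
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would be a stricter alternative.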
The dependent variables were categorized as parent-related outcomes (parental skills; parental sense of competence; personality characteristics; social integration), child-related outcomes (externalizing behaviour; socio-emotional outcomes, e.g., internalizing problems; cognitive variables), or parent-child relations (attachment, quality of relationship).
We used the standardized mean difference (Cohen's d) as effect size. In primary studies where data were not reported sufficiently, we used standard procedures to estimate the effect sizes from test statistics (see Lipsey & Wilson, 2001). If this was not possible, we contacted the authors and asked them to provide the necessary data. If a comparison included more than one outcome measure, these were integrated into a single effect size for the respective aggregate level. Analyses were conducted using SPSS with macros written by David Wilson (Lipsey & Wilson, 2001), and using JASP Statistics (https://jasp-stats.org). We also computed the I² statistic proposed by Higgins and Thompson (2002) to quantify the amount of heterogeneity, conducted moderator analyses, and used the ‘trim and fill’ procedure to assess publication bias (Duval, 2005).
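As an illustration of the effect size computation, the following sketch shows the standardized mean difference from group means and standard deviations, its estimation from an independent-samples t statistic, and its approximate sampling variance. These are the textbook formulas; the function names and example values are hypothetical, and the article does not state which specific conversion was applied in each primary study.

```python
from math import sqrt


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / sqrt(pooled_var)


def d_from_t(t_value, n_1, n_2):
    """Estimate d from an independent-samples t statistic."""
    return t_value * sqrt((n_1 + n_2) / (n_1 * n_2))


def d_variance(d, n_1, n_2):
    """Approximate sampling variance of d (used for inverse-variance weights)."""
    return (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2))


# Hypothetical program and comparison groups (illustration only).
d = cohens_d(12.0, 4.0, 50, 10.0, 4.0, 50)
print(d)                               # 0.5
print(round(d_from_t(2.5, 50, 50), 3)) # 0.5
print(round(d_variance(d, 50, 50), 5)) # 0.04125
```

The sign convention here assumes that higher scores are favorable for the program group; for problem-behavior scales the difference would have to be reversed.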
Results
Descriptive Analyses
A list of all 79 studies is included in the appendix. A table with detailed information is available on request. The oldest publication we found dated from 1976, the latest from 2020; 28 studies were published after 2010. Forty-three studies were published in journals or books, 19 studies were published dissertation theses, and 17 studies were unpublished reports. Most studies were conducted and published in Germany (n = 72; Switzerland n = 7; Austria n = 1).
Programs
The studies covered a wide range of prevention programs, from parent training programs and prenatal classes or early childhood prevention programs to comprehensive programs for the prevention of school problems. However, the majority of the studies evaluated parent training programs (n = 62). Most programs were based on (cognitive-)behavioural principles (n = 40); fewer programs had other (e.g., attachment theory; n = 17) or eclectic theoretical backgrounds (n = 22).
Format
The programs applied a wide range of didactic methods: Apart from group discussion (72 %) and information (95 %), they often implemented role-play (89 %) or homework (63 %), thus referring to the participants’ everyday experience. Most programs used a group setting (n = 51), with group sizes of 4 to 21 participants (M = 10). The remaining 28 studies included individual sessions to some level. Program duration was mostly up to 20 sessions (M = 9.88, SD = 9.36); session duration ranged between 25 and 360 minutes (M = 126.28 min; SD = 60.74 min).
Participants
The full study pool included data from more than 10,000 parents (5,624 trained, 5,043 control). Sixty-one programs addressed mothers and fathers, and 16 programs were targeted at mothers only (data missing: n = 2). In 16 prevention programs, parents and children took part together. Most programs aimed at families of preschool (n = 38) or primary school children (n = 18). Nine studies dealt with adolescence-related issues, and 14 studies did not aim at a specific age group. Over 60 % of the studies (n = 49) offered selective (n = 37) or indicated prevention (n = 12) for families at risk. These risks referred to family structure (n = 18; e.g., single parents, multi-ethnic families), parental risk factors (n = 10; e.g., general educational problems), or child-related risks (n = 21; e.g., attention deficits, premature delivery). On average, 28 % of the samples had a low socio-economic background, although the percentage varied widely from study to study (SD = 31 %), and information on socio-economic status was often sparse (data missing: n = 36 studies).
Methodological quality
Thirty-four studies compared the trained group to a non-equivalent control group, whereas 28 studies used equivalent program and control groups. On the individual study level, the median sample size was 64 parents for the program groups, and 57 parents for the comparison groups. Post-testing took place up to three months after the program. Forty-eight studies also conducted follow-up measurement, mostly up to one year after program termination. Seven studies reported long-term follow-up later than one year after program termination (Baldus et al., 2016; Hahlweg & Schulz, 2018; Hanisch et al., 2010; Heinrichs et al., 2009; Heinrichs et al., 2014; Lösel et al., 2013; Petzold, 1998; Schaub et al., 2019; Sidor et al., 2015). About 50 % of the studies assessed at least some aspects of treatment implementation (e.g., drop out of participants, satisfaction with participation).
Outcome variables
Most comparisons used multiple outcome measures: 70 comparisons used parental measures, such as educational skills (k = 58), parental sense of competence (k = 38), general personality aspects (e.g., anxiety, self-confidence; k = 24), or partnership/social integration (k = 15). Child outcomes, such as externalizing behavior (k = 28), internalizing and other social/emotional (k = 37), and/or cognitive variables (k = 17), were assessed in k = 61 comparisons. Seventeen comparisons reported parent-child relationship as outcome variable.
Treatment Effects
Post effects
Eighty-four comparisons reported post-test data. The remaining four comparisons reported data that were assessed more than three months after program termination and were therefore classified as follow-up data (see section ‘Follow-up’). Figure 2 depicts an overview of all included comparisons (k = 84) and their average post effect sizes. Concerning parent outcome variables (k = 68), all effect sizes were positive, ranging from d = 0.02 to d = 2.98. Twenty comparisons found small effects (d < 0.20), whereas nine comparisons found strong effects (d > 0.80), according to Cohen (1988). Overall, the effects were very heterogeneous (Q = 134.95, p < 0.001; I² = 50.4 %). Therefore, we integrated them under the random-effects model, which accounts for unobserved heterogeneity (Hedges & Olkin, 1985). In parent outcome variables, we found a highly significant total effect of d+ = 0.40 (z = 10.53, p < 0.001; see Table 1). A differentiated analysis showed that the prevention programs mainly improved (self-reported) educational skills (d+ = 0.47) and parental sense of competence (d+ = 0.40). Smaller effects were found for general aspects of parents’ personality (d+ = 0.32), and parents’ social integration and partnership quality (d+ = 0.23). Effects on parent-child relationship, e.g., family conflicts or parent-child attachment, did not reach statistical significance (d+ = 0.09).

Figure 2. Forest plot of all k = 84 comparisons (overall effects, post, random effects model).
Table 1. Integrated Post Effect Sizes
Notes. k = number of comparisons; d+ = weighted mean effect size (random effects model); 95 % CI = 95 % confidence interval; Q = homogeneity statistic; I² = amount of heterogeneity due to non-random factors; *p < 0.05, **p < 0.01, ***p < 0.001.
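The heterogeneity statistics and the random-effects integration can be sketched as follows. The sketch assumes inverse-variance weights and the DerSimonian-Laird method-of-moments estimator of the between-study variance, which is a common choice but is not specified in the text; the function names and the three example effect sizes are hypothetical. The only value taken from the article is the reported Q for the parent outcomes (Q = 134.95, k = 68), from which I² follows as 100 × (Q − df) / Q with df = k − 1.

```python
from math import sqrt


def i_squared(q, k):
    """I² in percent from the homogeneity statistic Q and k comparisons."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


def random_effects_summary(effects, variances):
    """Pool effect sizes with DerSimonian-Laird random-effects weights."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")
    w = [1.0 / v for v in variances]                      # fixed-effects weights
    sum_w = sum(w)
    fixed_mean = sum(wi * d for wi, d in zip(w, effects)) / sum_w
    q = sum(wi * (d - fixed_mean) ** 2 for wi, d in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)                    # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    sum_w_star = sum(w_star)
    d_plus = sum(wi * d for wi, d in zip(w_star, effects)) / sum_w_star
    se = sqrt(1.0 / sum_w_star)
    return {"d_plus": d_plus, "se": se, "z": d_plus / se,
            "q": q, "tau2": tau2, "i2": i_squared(q, k)}


# Reported values for parent outcomes: Q = 134.95 with k = 68 comparisons.
print(round(i_squared(134.95, 68), 1))  # 50.4

# Hypothetical effect sizes and variances (illustration only).
summary = random_effects_summary([0.2, 0.4, 0.6], [0.04, 0.04, 0.04])
print(round(summary["d_plus"], 2))      # 0.4
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is set to zero and the random-effects result coincides with the fixed-effects result.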
Child outcome measures (k = 56) revealed lower effect sizes than parent outcome measures, ranging from –0.35 to 2.45. The mean effect was highly significant (d+ = 0.20, p < 0.001), but again, the effect size distribution was very heterogeneous (Q = 103.02, p < 0.001; see Table 1). Effect sizes concerning externalizing behaviour (d+ = 0.27), and internalizing and other social/emotional aspects (d+ = 0.20) were statistically significant and larger than those reflecting cognitive variables (d+ = 0.14).
As one study (Wünsche & Reinecker, 2006) reported extremely high effect sizes (d = 2.98 for parent outcomes), analyses were repeated excluding this outlier, yielding slightly lower effect sizes (parent outcomes: d+ = 0.38, z = 11.15, p < 0.001; child outcomes: d+ = 0.19, z = 5.11, p < 0.001). Studies reporting effects for both parent and child measures (k = 41) showed that parent and child effect sizes were clearly correlated (rs = 0.57, p < 0.001).
Follow-up
Mean effect sizes remained significant in follow-up measurements, both for parent and child outcome measures (parent: d+ = 0.29, z = 5.09, p < 0.001; child: d+ = 0.17, z = 3.50, p < 0.001). An analysis excluding very long follow-up periods (longer than 3 years) left the follow-up effects unchanged.
Moderator Analyses
We conducted moderator analyses to check whether the observed heterogeneity in study outcomes could be attributed to systematic differences in study characteristics. Since data on parent outcomes were reported in most studies, and since the child outcomes were clearly correlated to parent measures (see above), we limited the moderator analyses to parent post-training outcome measures (k = 68). We used both the fixed-effects as well as the random-effects model to examine moderator variables. With heterogeneous effect size distributions, the fixed-effects model is less conservative than the random-effects model (Overton, 1998), whereas the results converge in case of increasing effect size homogeneity.
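For a categorical moderator, the fixed-effects comparison between categories can be sketched as an analog to a one-way analysis of variance: the between-groups statistic Qb weights the squared deviation of each category mean from the grand mean by the summed inverse-variance weights of that category. This is a minimal sketch under that assumption; the function names and the two example categories are hypothetical, and the p value shortcut shown applies only to two categories (df = 1).

```python
from math import erfc, sqrt


def q_between(groups):
    """Fixed-effects Q-between for a categorical moderator.

    `groups` maps a category label to (effects, variances).
    Returns (Qb, degrees of freedom).
    """
    group_weights, group_means = [], []
    for effects, variances in groups.values():
        w = [1.0 / v for v in variances]
        sum_w = sum(w)
        group_weights.append(sum_w)
        group_means.append(sum(wi * d for wi, d in zip(w, effects)) / sum_w)
    total_w = sum(group_weights)
    grand_mean = sum(wj * mj for wj, mj in zip(group_weights, group_means)) / total_w
    qb = sum(wj * (mj - grand_mean) ** 2
             for wj, mj in zip(group_weights, group_means))
    return qb, len(groups) - 1


def p_value_df1(q):
    """Upper-tail chi-square probability for df = 1 only."""
    return erfc(sqrt(q / 2.0))


# Hypothetical comparisons grouped by design (illustration only).
example = {
    "non-equivalent groups": ([0.5, 0.5], [0.05, 0.05]),
    "controlled assignment": ([0.3, 0.3], [0.05, 0.05]),
}
qb, df = q_between(example)
print(round(qb, 2), df)            # 0.8 1
print(round(p_value_df1(qb), 2))
```

Under the random-effects variant, the same computation would use weights of 1 / (v + tau²), which makes the test more conservative when effects are heterogeneous.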
There were no significant effects concerning type and year of publication, although later studies tended to find lower effects (see Table 2). Study design was a significant moderator, at least under the fixed-effects model. The highest effects were found in studies comparing non-equivalent training and comparison groups (d+ = 0.47), the lowest effects in studies using controlled group assignment (d+ = 0.33). Sample size also influenced effect sizes: Smaller samples were associated with larger effects (r = –0.25, z = –2.13, p < 0.05). Moreover, studies that did not report treatment implementation (e.g., drop out of participants, satisfaction of participants) yielded higher effect sizes (Table 2). Parent training programs had higher effects than prenatal classes/early childhood prevention programs. Behavioural and non-behavioural/eclectic programs did not differ in their effects (see Table 2). However, studies evaluating behavioural prevention programs had higher methodological standards: 51 % of the behavioural vs. 21 % of the non-behavioural programs pursued controlled group assignment strategies (χ² = 8.69, p < 0.05). The method of instruction, setting, and intensity of the training did not influence effect sizes significantly (data not shown). Selective and indicated prevention programs did not show significantly higher effects than universal prevention programs (Table 2).
Table 2. Moderator Analysis on Parent-Related Post Effects (k = 68 Comparisons)
Note: h = significant heterogeneity remaining in the categories; k = number of comparisons; Qb = testing for difference between the categories; d+ = mean weighted effect size (random effects model); †p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001.
Sensitivity analysis
While moderator analyses did not indicate a publication bias, the funnel plot was considerably asymmetrical (Kendall's tau = 0.22, p < 0.01; see Fig. 3). As effect sizes indicating low or even negative effects were missing, this asymmetry is an indicator of publication bias. We used the ‘trim and fill’ method to check whether the originally significant mean effect would be stable in case of publication bias (Duval, 2005). The imputed effect sizes are shown as transparent dots on the left-hand side of the funnel plot (Fig. 3). Including those imputed effect sizes, the mean effect size on post parent outcomes decreased slightly to d+ = 0.33 [95 % CI: 0.24–0.41].

Figure 3. Funnel plot of post effect sizes on parent outcomes by sample size (black dots) and after ‘trim and fill’ imputation (transparent dots).
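A rank correlation test of funnel plot asymmetry can be sketched as follows. The sketch assumes the approach of Begg and Mazumdar, in which standardized deviations of the effect sizes from the fixed-effects mean are correlated with their sampling variances; the text does not state which variant produced the reported tau, so this is an illustration rather than a reconstruction. Kendall's tau is computed here as tau-a without a correction for ties, and all names and data are hypothetical.

```python
from math import sqrt


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs; ties count as neither."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def funnel_asymmetry_tau(effects, variances):
    """Rank correlation between standardized effect deviations and variances."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * d for wi, d in zip(w, effects)) / sum_w
    # Variance of each deviation from the pooled mean: v_i - 1 / sum of weights.
    standardized = [(d - fixed_mean) / sqrt(v - 1.0 / sum_w)
                    for d, v in zip(effects, variances)]
    return kendall_tau_a(standardized, variances)


# Hypothetical data in which less precise comparisons report larger effects.
effects = [0.1, 0.2, 0.3, 0.5, 0.9]
variances = [0.01, 0.02, 0.04, 0.08, 0.16]
print(funnel_asymmetry_tau(effects, variances))  # 1.0
```

A positive tau indicates that less precise (typically smaller) comparisons tend to report larger effects, which is the pattern expected under publication bias.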
Discussion
This article reports a comprehensive meta-analysis of family-based prevention in German-speaking countries. Our research questions concerned the effects of these prevention programs on parent- and child-related outcomes, the influence of program and methodological moderator variables, and the amount of publication bias. Compared to our previous smaller meta-analyses published in German, the current study pool was much larger, indicating a considerable increase in research in this field. With 79 studies, the study pool has almost tripled since Lösel et al. (2006). In addition, newer studies used more elaborate strategies of group assignment, larger sample sizes, and longer follow-up periods. Thus, not only the amount of research has increased but also its methodological quality, which reflects a turning point after a long-lasting delay in evaluation research on developmental prevention programs in Europe (Junger et al., 2007).
One of our key questions was the comparability of German findings to international research from mainly Anglo-American countries. The effects reported in our studies are consistent with what has been found in international meta-analyses on family-based prevention. With a mean overall effect size of d = 0.31, our results showed a small to moderate, but significant impact of family-based prevention. This is in the same range as the typical effects found in international meta-analyses. Concerning the research context, the transfer of mainly Anglo-American concepts to other countries seems to work (see also Gardner et al., 2016; Pedersen et al., 2019). Our data confirmed that family-oriented prevention is basically effective and leads to a small, but practically relevant improvement in parent- and child-related outcomes (Farrington et al., 2017; Lösel & Bender, 2016). For parental outcome measures, the mean effect was slightly higher (d = 0.40), whereas we found only a small overall effect (d = 0.20) in child outcomes. This difference may partially reflect that parental outcome measures are often based on parent self-report questionnaires, implying a risk of social desirability or impression management. One of our findings was that parent- and child-related outcomes were correlated, indicating that programs that led to an increase in parental skills showed higher effects on child-related outcome measures. The pattern of follow-up effects also suggested that the child effects were mediated by the improvement of parenting skills: Child-related effects remained quite stable in follow-up measurements, whereas parent-related effects decreased over time. This decrease may be due to a deterioration of initial changes in parenting behavior over time after program participation.
The moderator analyses yielded only some clear results, particularly for methodological factors. Even after taking into account all kinds of moderating variables, the effect size distribution remained heterogeneous. This is in accordance with international research, where even identical programs show different effects and there is no sound explanation for such variation (Lösel, 2018). For example, we found higher effects in parenting programs than in early childhood programs. Although some American pre-birth and very early family-oriented prevention programs showed positive effects (Olds et al., 2010), one cannot yet generalize such effects to other approaches and across outcomes (see also Pinquart & Teubert, 2010). We did not find a significant effect of theoretical background, program intensity, or setting, which is in line with the heterogeneous findings from other reviews. In contrast to previous findings, although effects of selective or indicated prevention programs tended to be larger than those of universal prevention programs, the difference did not reach statistical significance. This is also different from international research on other types of developmental prevention such as child skills training programs (Beelmann & Lösel, 2021). As there was much heterogeneity of effect sizes, the influence of methodological factors may have played a role in this finding. Methodological characteristics had a substantial influence on effect sizes: As in the international literature, we found lower effects in studies with stronger designs and larger sample sizes. Therefore, the mean effect sizes might overestimate the real effects of prevention programs. Although published and unpublished studies did not differ in effect size, the pattern of effect sizes suggested some influence of publication bias.
A sensitivity analysis using the ‘trim and fill’ procedure revealed a lower estimate of the mean effect, but this was still statistically significant and confirmed the general effectiveness of family-based prevention programs.
Limitations
First, in order to obtain a comprehensive overview of all research activity in the field of German-speaking family-based prevention, we could not restrict the study pool to studies with a rigorous evaluation design. Second, we found only a few studies with a long-term follow-up, which may call into question how far our results can be generalised from a developmental perspective. Third, and perhaps most important, our study pool is not representative of the many programs that are carried out in routine practice but have never been systematically evaluated (e.g., Lösel et al., 2006). This problem is similar to the international experience (Mihalic & Elliott, 2015). For example, only one study in our review dealt with toddler groups, and this was published almost forty years ago (Knödel, 1983). In practice, low-structured parent-child groups or toddler groups are by far the most frequent prevention format in German-speaking countries (Hänggi et al., 2014; Klepp et al., 2008; Lösel et al., 2006). As in other countries, structured parent training was over-represented in our study pool compared to everyday routine (Gewirtz & Youssef, 2017; Junger et al., 2007; Lösel et al., 2006). Additionally, whereas a large part of the studies in our meta-analysis evaluated programs of selective or indicated prevention, many programs in everyday practice aim for universal prevention (Lösel et al., 2006).
In conclusion, our meta-analysis of studies on family-oriented prevention in the German-speaking countries showed similarity to the findings in the Anglo-American countries. This is encouraging with regard to the cultural transfer of knowledge. However, we found similar problems as reported internationally, in particular, a need for more high-quality and long-term evaluations as well as replicated and differentiated knowledge about outcome moderators (Lösel, 2018).
Author Note
A part of this study was funded by the German Federal Ministry for Family Affairs, Senior Citizens, Women and Youth.
Data can be requested from the authors. The authors declare that they have no conflict of interest.
Footnotes
Bio Sketches
Dr. Maren Weiss is a senior researcher at the Institute of Psychology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany. She has carried out research on developmental and clinical prevention, psycho-oncology, and youth delinquency.
Dr. Martin Schmucker is a senior researcher at the Institute of Psychology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany. He has carried out research on sexual offender treatment, developmental prevention, and program evaluation.
Professor Dr. Friedrich Lösel is an Emeritus Professor at the Institute of Criminology, University of Cambridge, UK, and at the Institute of Psychology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany. He has carried out research on juvenile delinquency, prisons, offender treatment, developmental prevention, football hooliganism, school bullying, personality disordered offenders, protective factors and resilience, close relationships, child abuse, family education and program evaluation.
Appendix
1. Databases searched: PsycInfo, PSYNDEX, Psychology and Behavioral Sciences Collection, The Philosopher's Index, Exceptional Child Educational Research, Education Index, FIS Bildung database
