Abstract
Perinatal epidemiology research is concerned with identifying the effects of events during pregnancy on pregnancy outcomes, including maternal, fetal, and neonatal health outcomes. Randomized trials in perinatal research face many challenges, including randomization difficulties, ethical considerations, and inadequate statistical power due to the small number of subjects eligible for participation. For these reasons, most epidemiological studies conducted in this research field are observational and are subject to different types of bias. This review describes the key methodological difficulties in the design and analysis of randomized and observational studies in perinatal epidemiology, and provides potential corrective approaches.
Introduction
One of the most vulnerable periods of human life is the period of intrauterine growth and development. Events during pregnancy have important influences on the outcome of pregnancy and the health and wellbeing of the newborn. 1 What happens in pregnancy and the very early stages of childhood will have a profound impact on child and adolescent development. 2 There is also increasing evidence for the role of early adverse experiences during pregnancy in childhood and adult health, as well as the possibility of intergenerational effects of events (i.e. effects of events during pregnancy on the outcomes of pregnancy and health in subsequent generations). 1 The “fetal origins hypothesis” posits that maternal health and nutrition in the prenatal period send signals to the fetus about the relative harshness of the world into which he or she will be born. 3 For instance, supporting this hypothesis, several studies have found associations between low birth weight and long-term health outcomes such as diabetes and heart disease.4–7 Perinatology is a medical specialty that was established to provide integrated care to the woman and fetus and to bridge the gap between the obstetrician’s concern for the pregnant woman and the pediatrician’s concern for the infant. 1 Building on the existence of perinatology as a medical specialty, perinatal epidemiology has developed as a subspecialty of epidemiology. 1 Perinatal epidemiology research is concerned with identifying the effects of events during pregnancy on pregnancy outcome, including maternal, fetal, and neonatal health outcomes. 8 It also encompasses the study of the effects of factors inherent to the pregnant woman such as age and ethnicity, voluntary harmful exposures during pregnancy (e.g. smoking and alcohol use), environmental exposures, diet, genetic constitution, the effects of illness, and the use of medications. 1
The following discussion briefly highlights some of the special concerns in conceptualizing research in the field of perinatal epidemiology, the methodological challenges that accompany these issues, and potential ways to address them.
Traditional epidemiology vs. perinatal epidemiology
Perinatal epidemiology research differs from traditional epidemiological studies focused on measuring disease incidence and prevalence in at least three ways. 9 First, the broader view of health rather than disease is especially appropriate in perinatal research. Pregnancy as an outcome is a healthy life transition, where changes in social and role function are expected, and many of the symptoms of pregnancy, such as first trimester nausea or third trimester backache, are considered “normal”. Hence, the model is not one of curing the disease, and outcomes evaluations should consider the normal process of childbearing and its impact on normative functioning. 9 Second, with pregnancy, as opposed to most acute and chronic diseases, there is a predictable progression and time course, which is generally 40 weeks’ gestation (±2 weeks, from the last menstrual period), with a key definable outcome to the health state: delivery of the infant. Third, during the perinatal period, there are two patients, the woman and the baby, and measures to assess outcomes need to include both patients. 9
Methodological and design issues
One important methodological challenge in the design and conduct of perinatal research is that randomizing women is not always feasible. The importance of evidence from well conducted and reported randomized clinical trials (RCTs) is now widely recognized, as they are considered the most appropriate way to evaluate the impact of an intervention in clinical practice. RCTs are often referred to as the gold standard of research methods due to their ability to account for confounding factors and selection bias.10–13 Randomization is, theoretically, the ideal way to draw strong inferences about the effect of an exposure on maternal, fetal, neonatal, and infant outcomes, or to evaluate complex interventions, since it ensures that the intervention and comparison group(s) are comparable in terms of factors other than the one being studied.1,10 However, randomizing women is not always feasible and sometimes may be ethically questionable. In addition, attaining sufficient enrollment for adequately powered RCTs in a reasonable amount of time can also be challenging. For instance, many factors that affect the outcome of pregnancy cannot be assigned at random, and consequently when these factors are of interest, RCTs cannot be conducted. Factors such as age, ethnicity, and genetic constitution are non-modifiable, and as a result, they cannot be studied in randomized trials as they are not subject to manipulation by the researcher. 1 In addition, exposure to harmful risk factors in pregnancy, including cigarette smoking, alcohol use, and cocaine and heroin use, cannot be assigned at random for practical or ethical reasons. 1 Likewise, it is very difficult to detect clinically important effects in RCTs examining rare adverse or unintended effects of interventions in pregnancy because of the sample sizes needed.
Further obstacles in perinatal research can be the small number of pregnant subjects eligible, or likely to volunteer, to participate in clinical investigations, especially if the research topic is sensitive. Even when an exposure like medication use can be assigned at random, it can be challenging to attain enough enrollment, in a reasonable amount of time, for adequately powered RCTs to detect significant differences in clinical outcomes. 14 RCTs in perinatal research may have additional recruitment barriers compared to other RCTs. Women’s concerns about the risks associated with a trial may be greater during pregnancy, which leads them to decline to participate. Such concerns may include, for example, lack of choice, uncertainty concerning fetal safety, and anxiety about random allocation of treatment.15–18 Likewise, in studies on the effectiveness of treatments, strong patient or clinician preferences can make it difficult to recruit for RCTs. For instance, it is estimated that asthma occurs in approximately 5% of pregnancies. A randomized trial comparing two medications for the treatment of asthma that sought to enroll 200 women (100 in each group) would require a base population of 4000 pregnant women if all the women with asthma were eligible for the study and consented to enroll. However, eligibility criteria and unwillingness to participate would reduce the number of pregnant women available for a trial. Therefore, a study involving 200 pregnant women with asthma would require a large base population (e.g. 16,000–20,000) of pregnant women to be successful. 1 Consequently, because large numbers of pregnant women are required to achieve adequate statistical power, inadequate power must always be considered when interpreting non-significant results from clinical trials.
The issue of clinical trials with small sample sizes was also highlighted in previous reviews that summarize the evidence of published systematic reviews and meta-analyses of RCTs in the perinatal epidemiology field.19–22
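The arithmetic behind the asthma example can be made explicit with a short sketch. The function name and the `enrolled_fraction` parameter are illustrative assumptions rather than anything defined in the cited sources; the values 0.25 and 0.2 are simply back-calculated from the 16,000–20,000 range quoted above and are not reported eligibility or consent rates.

```python
from fractions import Fraction
import math


def base_population_needed(target_enrolled, prevalence, enrolled_fraction=1):
    """Pregnancies needed in the base population to enroll `target_enrolled` women.

    prevalence        -- proportion of pregnancies with the condition (0.05 for asthma)
    enrolled_fraction -- hypothetical proportion of affected women who are both
                         eligible and willing to take part (1 = every woman enrolls)
    """
    # Convert through str() so decimal inputs such as 0.05 are handled exactly.
    prevalence = Fraction(str(prevalence))
    enrolled_fraction = Fraction(str(enrolled_fraction))
    if not (0 < prevalence <= 1 and 0 < enrolled_fraction <= 1):
        raise ValueError("proportions must lie in the interval (0, 1]")
    # Each pregnancy yields prevalence * enrolled_fraction participants on average.
    return math.ceil(target_enrolled / (prevalence * enrolled_fraction))


if __name__ == "__main__":
    for fraction in (1, 0.25, 0.2):
        print(fraction, base_population_needed(200, 0.05, fraction))
```

Under these assumptions, full enrollment gives the 4000 pregnancies stated in the text, and the 16,000–20,000 range corresponds to only 20–25% of women with asthma being both eligible and willing to participate.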
An additional challenge in perinatal research is associated with the inclusion of multiple births and subsequent births from the same mother. When a study includes women with multiple pregnancies, this may cause issues in the analysis because the outcomes of twins and infants from higher order multiple births are non-independent.23–25 Siblings, especially those from the same gestation, share genetic, environmental, and iatrogenic exposures that certainly could cause the outcomes of multiples to be correlated. 23 Therefore, it is possible that non-independent data influence the point estimates of effect size and confidence intervals in such populations, with the degree of impact increasing as the proportion of multiple pregnancies in the study increases or when the outcomes of infants from multiple pregnancies are more closely correlated.23,24
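How the proportion of multiples and the strength of within-pregnancy correlation erode the information in a sample can be illustrated with a rough sketch. It uses one common approximation for the effective sample size when estimating a mean under an exchangeable within-cluster correlation; the cohort composition and the intracluster correlation (ICC) values are hypothetical, and binary outcomes or regression coefficients may behave differently.

```python
def effective_sample_size(cluster_sizes, icc):
    """Approximate number of independent observations in clustered data.

    cluster_sizes -- infants per pregnancy (1 = singleton, 2 = twins, ...)
    icc           -- assumed intracluster correlation of the outcome
    """
    if not 0 <= icc <= 1:
        raise ValueError("icc must lie in the interval [0, 1]")
    # A pregnancy with m infants contributes m / (1 + (m - 1) * icc):
    # m when outcomes are uncorrelated, 1 when they are perfectly correlated.
    return sum(m / (1 + (m - 1) * icc) for m in cluster_sizes)


if __name__ == "__main__":
    # Hypothetical cohort: 900 singletons and 50 twin pairs (1000 infants).
    cohort = [1] * 900 + [2] * 50
    for icc in (0.0, 0.5, 1.0):
        print(icc, round(effective_sample_size(cohort, icc), 1))
```

Treating all 1000 infants as independent overstates the available information whenever the ICC is above zero, which is why confidence intervals from a naive analysis tend to be too narrow.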
Bias and confounding in observational studies
Because randomization is often not possible, and because of ethical considerations, lack of resources, or lack of equipoise, most epidemiological studies conducted in perinatal research are observational. 26 The concern with observational (non-experimental) studies is bias, which might arise from flaws in the study design, recruitment, conduct of the study, or in the presentation of the results. 1 In fact, this issue was recently explored in umbrella reviews that evaluated the credibility of evidence of previously published systematic reviews and meta-analyses of observational studies in crucial pregnancy complications, such as preeclampsia and gestational diabetes, and predictors for subsequent disease such as birth weight.27–29 In particular, the credibility of each proposed statistically significant association (e.g. risk factor or intervention) derived from already published systematic reviews and meta-analyses was assessed using a transparent and replicable set of prespecified methodological criteria and statistical tests (e.g. the significance of the summary effect, 95% prediction interval, presence of large heterogeneity, small study effects, etc.) and then categorized into different levels of epidemiologic credibility. According to their results, only a minority of the examined associations had strongly significant results and achieved a high credibility level, as a large proportion of the existing associations displayed very large between-study heterogeneity, hints of excess significance bias, and small-study effects. As shown in these umbrella reviews, the risk of reporting, selection, and other inherent biases may exaggerate the reported associations in the existing studies, highlighting the need for cautious interpretation of observational evidence.27–29
Obtaining unbiased effect estimates requires researchers to identify and control confounding. Confounding, the mixing of the effect of the exposure on the outcome with the effect of a third factor (the confounder), occurs when the confounder is an independent risk factor for the outcome and is independently associated with the exposure of interest.30,31 A confounder should also not be an intermediate step on the pathway between exposure and outcome. 31 Depending on the interrelation of the confounder with the exposure and the outcome, uncontrolled confounding leads to over- or underestimation of the measure of association and consequently to erroneous conclusions. For instance, preterm birth, birth weight, and fetal growth are difficult to study because of the multiplicity of potentially confounding variables. Likewise, miscarriage is difficult to study because fetal loss is difficult to ascertain in a specific time period, as are all of its known predictors. Not all women with a miscarriage seek medical care, and those who seek it are likely to differ in their education, income, and lifestyle, leading to the possibility of reporting bias. There is also a higher possibility of loss to follow-up among women who enroll during early pregnancy and subsequently have a miscarriage. 1 The issue of baseline population comparability, often referred to as risk adjustment or adjustment for case-mix, is a primary methodological issue in the design and conduct of perinatal outcome studies. 9 When comparisons are made across treatments, programs, providers, or institutions, the case-mix of those groups must be considered. 32 For example, comparing maternal or neonatal outcomes between women who deliver at level I versus level III regional perinatal hospitals should consider the perinatal risk of women being treated at each hospital, since the perinatal outcomes of the level III hospital would be expected to be worse, as these hospitals typically have more high-risk women. 9
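The hospital comparison above can be sketched numerically. The counts below are entirely hypothetical and were chosen so that the risk of an adverse neonatal outcome is identical at both hospital levels within each risk stratum; the Mantel-Haenszel risk ratio is used here as one simple, traditional form of case-mix adjustment, not as the method of any cited study.

```python
# Each stratum: (events at level III, women at level III,
#                events at level I,   women at level I). Hypothetical counts.
HYPOTHETICAL_STRATA = [
    (10, 1000, 40, 4000),    # low-risk women: 1% risk at both levels
    (400, 4000, 100, 1000),  # high-risk women: 10% risk at both levels
]


def crude_risk_ratio(strata):
    """Risk ratio (level III vs level I) ignoring case-mix."""
    events_1 = sum(a for a, n1, c, n0 in strata)
    total_1 = sum(n1 for a, n1, c, n0 in strata)
    events_0 = sum(c for a, n1, c, n0 in strata)
    total_0 = sum(n0 for a, n1, c, n0 in strata)
    return (events_1 / total_1) / (events_0 / total_0)


def mantel_haenszel_risk_ratio(strata):
    """Risk ratio pooled across strata with Mantel-Haenszel weights."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


if __name__ == "__main__":
    print("crude RR:", round(crude_risk_ratio(HYPOTHETICAL_STRATA), 2))
    print("case-mix adjusted RR:", round(mantel_haenszel_risk_ratio(HYPOTHETICAL_STRATA), 2))
```

In this constructed example the crude comparison makes level III hospitals look almost three times worse purely because they treat more high-risk women, while the stratified estimate shows no difference.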
Difficulties in the definition of perinatal indicators and outcomes
A cursory search of the literature indicates that there is variation in the definition, measurement, and construction of indicators in perinatal research, which may reflect the varying registration practices and definitions used in different settings, countries, and practices. Such heterogeneity prevents all studies from being included in a review and synthesis of evidence. For example, measuring preterm birth or intrauterine growth restriction requires a valid estimate of gestational age. The use of ultrasound to estimate gestational age may vary between developed countries due to country-specific practices, while gestational age is often difficult to estimate in developing countries because of the inaccessibility of early ultrasound examination, late and infrequent access to prenatal care, or insufficient documentation of the date of the last menstrual period.33,34 Lack of standardization of definitions can have a significant impact on the study findings, both on estimates of incidence and the identified risk factors, but could also impact the assessment of the efficacy of different management techniques. 35 For perfect comparability, the recording systems for measuring perinatal health should ideally be identical or at least closely aligned. 34
In addition, many traditional measures in perinatal research could be considered intermediate measures. One widely used example is low birth weight (<2500 g). Although the number of studies highlighting birthweight as a predictor of neonatal and infant mortality has increased dramatically, birthweight per se is not a disease; rather, birthweight below 2500 g is highly predictive of many diseases of the newborn, including respiratory disease, 36 cancer, 37 and psychiatric outcomes. 38 Although this research has contributed to our understanding of the predictors of neonatal health and has had considerable effects on public health programs, we must still recognize low birth weight as an intermediate outcome. 9 One more issue in trying to define outcomes is to differentiate process from outcome. An example in perinatal epidemiology is the frequent use of caesarean section as an outcome for maternal health.39–41 However, caesarean section as a dichotomous variable merely describes a procedure, not an outcome that reflects the actual health status of either the woman or the infant. 9 The matter is further complicated as a caesarean section can be voluntary or medically indicated, with considerable variation across populations.
Another methodological challenge in perinatal epidemiology is the duration of measurement. Conceptually, research attempts to move beyond the defined medical event, to examine the wider and sometimes longer-term impact of medical care on the individual or population. 9 However, because of the potentially lengthy lifetime of a woman and newborn after birth, long-term follow-up can be impractical, and a shortened period of interest might be used, such as the first few months or the first few years of life. Yet, there may be potential bias when using shorter time periods, as significant events beyond the specified period would not be accounted for. 9
The methodological issue of the multiplicity of outcomes of interest in perinatal epidemiology should also be considered. Factors that affect pregnancy outcome are complexly interrelated, and this makes the field challenging because it requires an understanding of each outcome’s pathophysiology as well as the factors that affect each one. 1 For instance, the magnitude and severity of perinatal exposures and outcomes depend on several individual factors (e.g. biological mechanisms, socioeconomic variables, etc.), contextual factors (e.g. unemployment, neighborhood conditions), as well as the accessibility of health services. The interplay between several factors affecting the outcome of interest must be recognized in the design and conduct of perinatal research studies, and it also complicates the interpretation of the findings.
Recommendations
The complexities of the perinatal research field are immense. Several methodological challenges exist in the design and conduct of epidemiologic studies, primarily because of recruitment difficulties, the frequent impossibility of randomization, the need to control confounding, limited comparability across study groups and services, and the difficulty of appropriately defining perinatal outcomes or exposures.
Notwithstanding the above-mentioned challenges, several approaches can be used in future studies to advance the evidence in perinatal research. For instance, randomized trials can address the recurring issue of inadequate statistical power by using methods such as the Bayesian approach, which uses all available data to estimate the probability of treatment benefit and harm, even for a conventionally underpowered trial. 42 In addition, group sequential designs can be helpful to reduce sample size by stopping studies early if effect sizes differ from what is expected.26,43 Adaptive and internal pilot designs are also promising: such designs can use interim power analyses to adjust the sample size for misspecification of parameters and can allow early stopping for efficacy and futility.43,44 Recruitment challenges in perinatal research could be overcome by enhancing the visibility of research in social media, enlarging the sampling frame, ensuring that the research team provides adequate information and support and explicitly addresses fetal issues to strengthen women’s self-efficacy, as well as networking with clinicians at recruitment sites.45–47 Such strategies could be implemented relatively easily into study protocols and improve recruitment and ultimately the validity of research studies.
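As a rough illustration of the Bayesian idea mentioned above, the following sketch estimates the posterior probability that a treatment lowers the risk of a binary adverse outcome in a small two-arm trial. It assumes independent uniform Beta(1, 1) priors and uses simple Monte Carlo sampling; the trial counts, the choice of prior, and the function name are hypothetical and are not taken from the cited work.

```python
import random


def prob_treatment_lowers_risk(events_t, n_t, events_c, n_c, draws=20000, seed=0):
    """Monte Carlo estimate of P(risk in treated < risk in control | data).

    With a Beta(1, 1) prior, the posterior for each arm's risk is
    Beta(1 + events, 1 + non-events).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    favourable = 0
    for _ in range(draws):
        risk_t = rng.betavariate(1 + events_t, 1 + n_t - events_t)
        risk_c = rng.betavariate(1 + events_c, 1 + n_c - events_c)
        favourable += risk_t < risk_c
    return favourable / draws


if __name__ == "__main__":
    # Hypothetical small trial: 6/30 adverse outcomes with treatment, 12/30 with control.
    print(prob_treatment_lowers_risk(6, 30, 12, 30))
```

The output is a direct probability statement about benefit, which can remain informative in trials too small to reach conventional significance thresholds. In practice the prior would need to be justified and its influence examined.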
Another approach that could support the selection, recruitment, randomization, and data collection processes of clinical trials is the use of routinely collected health data, such as electronic healthcare records, registries, or administrative claims data.48,49 Although such sources are most likely to be applicable to pragmatic trials, they still provide additional advantages as they can inform trial feasibility (e.g. reduce cost, time, and resources), expand the research agenda to research questions otherwise not amenable to trials, maximize efficient trial design, and improve external validity.48–51 However, although using routinely recorded data in clinical trials shows promise, the feasibility and efficiency of such data should be examined beforehand, ensuring that important items are accurately and fully recorded.48,49 Another method of increasing participant recruitment that offers the potential of embedding research into routine practice is the cohort multiple Randomised Controlled Trial (cmRCT) design.52,53 In the cmRCT design, patients with a condition of interest compose an observational cohort in which routine data collection occurs at regular intervals. Randomization to either treatment or control group follows, but informed consent is sought only from eligible participants who are selected for the experimental group.52,53 Although this design still has to be tested fully in terms of its ethical, clinical, and patient acceptability, cmRCT offers many potential benefits, including the potential to undertake multiple trials within the same cohort, efficient recruitment, as well as reduction of disappointment bias and attrition in the control group.52–55
Observational studies are considered an important approach in perinatal research. However, addressing confounding is considered an important methodological issue. As our understanding of the complexities of bias has progressed, it has become clear that traditional methods for confounding control, such as stratification, restriction, and matching, are limited. The application of newer methods, such as marginal structural models and propensity score calibration, is needed in order to deal with the complex confounding problems found in pregnancy studies. 56 For instance, the main advantage of marginal structural models is that they allow the consideration of time-varying exposure and confounding (e.g. other medication use and changes in disease severity), which is extremely relevant in perinatal research due to the varying degrees of fetal vulnerability that occur through the course of the pregnancy.56–58 In addition, propensity score calibration is a method based on regression calibration that allows for adjustment for multiple unmeasured confounders.56,59
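Marginal structural models are usually fitted with inverse probability weights. The sketch below shows only that weighting step for a single time point and a single binary confounder (disease severity), which is a deliberate simplification and not a full marginal structural model for time-varying exposure. All counts are hypothetical and were constructed so that the medication has no effect within either severity stratum.

```python
# Each stratum maps an arm to (adverse outcomes, women). Hypothetical counts.
HYPOTHETICAL_STRATA = [
    {"exposed": (80, 400), "unexposed": (20, 100)},  # severe disease: 20% risk in both arms
    {"exposed": (5, 100), "unexposed": (20, 400)},   # mild disease: 5% risk in both arms
]


def crude_risk_ratio(strata):
    """Risk ratio for medication use ignoring disease severity."""
    risks = {}
    for arm in ("exposed", "unexposed"):
        events = sum(s[arm][0] for s in strata)
        women = sum(s[arm][1] for s in strata)
        risks[arm] = events / women
    return risks["exposed"] / risks["unexposed"]


def iptw_risk_ratio(strata):
    """Risk ratio after inverse probability of treatment weighting."""
    weighted_events = {"exposed": 0.0, "unexposed": 0.0}
    weighted_women = {"exposed": 0.0, "unexposed": 0.0}
    for s in strata:
        n_exposed, n_unexposed = s["exposed"][1], s["unexposed"][1]
        # Propensity score: probability of exposure given the confounder level.
        propensity = n_exposed / (n_exposed + n_unexposed)
        if not 0 < propensity < 1:
            raise ValueError("each stratum needs both exposed and unexposed women")
        weights = {"exposed": 1 / propensity, "unexposed": 1 / (1 - propensity)}
        for arm, weight in weights.items():
            events, women = s[arm]
            weighted_events[arm] += weight * events
            weighted_women[arm] += weight * women
    risk_exposed = weighted_events["exposed"] / weighted_women["exposed"]
    risk_unexposed = weighted_events["unexposed"] / weighted_women["unexposed"]
    return risk_exposed / risk_unexposed


if __name__ == "__main__":
    print("crude RR:", round(crude_risk_ratio(HYPOTHETICAL_STRATA), 3))
    print("weighted RR:", round(iptw_risk_ratio(HYPOTHETICAL_STRATA), 3))
```

Weighting creates a pseudo-population in which severity no longer predicts exposure. In a real pregnancy cohort the weights would be re-estimated at each time point from models of exposure given past exposure and covariate history.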
It is also important to explore existing international collaboration networks such as the International Network of Obstetric Survey Systems and the International Network of Paediatric Surveillance Units, which can provide high quality evidence when investigating rare and severe outcomes in pregnancy and childhood.35,60 These multi-country collaboration networks provide a suitable approach for the conduct of robust population-based observational studies that are less subject to many of the biases associated with observational design, as they use sufficient sample sizes, unified definitions, and common methodologies. All of these add to the validity of studies, which ultimately improves clinical practice and the quality of care.35,60
In the past, researchers showed a tendency to ignore non-independence between infants from multiple pregnancies and to assume that each baby is an independent observation, with the potential for invalid estimates.24,61,62 However, various analytical approaches have been proposed to take account of non-independence in studies examining outcomes from multiple pregnancies, such as regression-based approaches, 62 cluster-level summary measures, 63 and mixed models. 64 More statistically complex methods, such as multilevel modeling and generalized estimating equations, could also be useful in observational studies or small RCTs where there are chance differences in baseline characteristics between the groups. However, these methods are complex and more difficult for many researchers to apply and for readers to understand.65,66 A recent simulation study that investigated the performance of several methods used to analyze datasets of binomial outcomes containing twins found that generalized estimating equations provide the best balance between estimation of the standard error and the parameter for any percentage of twins. 25 Yet, this remains a fertile area for further methodological research, and researchers need to keep in mind that the choice of analytical method should be determined primarily by the question being addressed and that the non-independence of siblings should be addressed with an appropriate analytic approach as standard practice to present the most valid results possible.23–25
In addition, sources of bias should be acknowledged and discussed in epidemiological studies and, if possible, quantified by performing sensitivity analysis. Furthermore, complete reporting of participant flow in clinical trials is essential for the generalization of results. Finally, it is also essential for future studies in this field to apply standardized terminology and protocols for exposures, outcomes, and statistical analyses, which may allow the computation of more precise estimates, enable multi-national comparison of incidences, risk factors, and outcomes, and ultimately promote clinical and public health practice.
Conclusion
In this paper, the key methodological challenges in the design and analysis of RCTs and observational studies in perinatal research were discussed. Although some of these challenges may not be exclusive to perinatal research, an understanding and careful consideration of the issues identified in this paper is important for conducting rigorous studies in this research field which will eventually improve the evidence base.
Footnotes
Summary of key recommendations
Explore methods to overcome recruitment challenges in clinical trials, such as enhancing the visibility of research, strengthening women’s self-efficacy, and increasing networking with interested parties.
Consider the use of routine data in the selection, recruitment, randomization, and data collection processes of clinical trials.
Apply novel methods such as the cmRCT design to increase participant recruitment and embed research into routine practice.
Evaluate the potential of using group sequential designs, adaptive and internal pilot designs, as well as Bayesian approaches to address statistical power in clinical trials.
Use marginal structural models and propensity score calibration as alternative methods to deal with complex confounding problems in observational studies.
Explore the potential of international collaboration networks to conduct robust population-based observational studies.
Address non-independence between infants from multiple pregnancies by default, using an analytic approach that accounts for the non-independence.
Apply standardized terminology and protocols in future studies for exposures, outcomes, and statistical analyses.
Contributorship
KG is the sole author.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethical approval
Not applicable.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Guarantor
KG is the guarantor of the present work.
Informed consent
Not applicable.
