Abstract
Discussions regarding responsible genomic data sharing often center around ethical and legal issues such as the consent, privacy, and confidentiality of individuals, families, and communities. To ensure the ethical grounds of genomic data sharing, oversight by both research ethics and Data Access Committees (DACs) across the research lifecycle is warranted. In this article, we review these oversight practices and argue that they reveal a compelling need to clarify the scope of ethical considerations by oversight bodies and to delineate core elements such as “objectionable” data uses. Ethical oversight of genomic data sharing would be considerably improved if the relevant ethical considerations by research ethics and DACs were coordinated. We therefore suggest several mechanisms to achieve greater clarification of ethical considerations by these committees, as well as greater communication and coordination between both to ensure robust and sustained ethical oversight of genomic data sharing.
Background
This is reflected in various policy statements and guidelines that recommend that data stewards and/or custodians (who may also be the data producers) and (downstream) data users obtain appropriate ethical, legal, and institutional permissions before sharing and using data, based on the ethical principle of respecting the privacy of individuals and the dignity of communities. As the Organization for Economic Co-operation and Development (OECD) Principles and Guidelines for Access to Research Data from Public Funding (2007) reiterate, “Research organizations and government research agencies should actively disseminate information on research data policies to individual researchers, academic associations, universities, and other stakeholders in the publicly funded research process.” Permissions for data sharing and access are typically granted by competent bodies that assess the adequacy of data sharing plans before data collection and, in some cases, the ethical acceptability of specific data access requests.
Typically, Research Ethics Committees (RECs) (also known in some countries as Institutional Review Boards or IRBs) are seen as the competent body to prospectively assess the ethical acceptability of data sharing plans. The guidelines adopted by National Institutes of Health (NIH)-designated data repositories (as described in the NIH Genomic Data Sharing Policy, 2014) and the European Genome-phenome Archive (EGA)—two major data sharing platforms—exemplify this approach. Accordingly, data submitters to the NIH-designated data repositories such as the database of Genotypes and Phenotypes (dbGaP) are asked to provide an Institutional Certification which indicates, among other things, that an ethics review has been conducted by a responsible oversight body and that the body has granted a favorable opinion toward the proposed plan for data sharing. 2 Similarly, the EGA requires a letter from the representative of the study, for example, the Principal Investigator, confirming that any major ethical issues have been considered and addressed before depositing data (Table 1).
Table 1 abbreviations: NIH, National Institutes of Health; REC, Research Ethics Committee.
In addition to the requirement of a prospective ethics review of data sharing plans by RECs, the nature of genomic data sharing in the current era, which allows for global downstream data uses, has led to calls for the establishment of a new tier of oversight. 3 This need mainly stems from the fact that researchers cannot foresee all downstream data access requests and uses from the outset of their initial collection. To fill this gap in ethical oversight of the genomic data sharing pipeline, Data Access Committees (DACs) have been established in different research infrastructures to assess and authorize data access requests. 4
Despite the emergence of DACs to bridge the divide in ethical oversight between initial data collection, governed largely by RECs, and downstream data use, there are no internationally accepted access review guidelines for DACs. Recent empirical research indicates, however, that many DACs engage in similar core considerations. 5 These include verifying the qualifications of the data users to ensure that they are bona fide researchers and affiliated with a reputable institution, which can be held responsible for the actions of the data user. DACs also assess the consistency of proposed data uses with particular restrictions on the data use, as defined by the consents signed by participants or information provided to those participants 6,7; some studies prohibit research for commercial gain or exploration of particular research questions that are sensitive in nature and can lead to stigmatization of participants, for example, the genetics of intelligence. Some DACs may also require the data users to obtain ethics approval from their home institution, which involves the REC/IRB of the users' home institution in the ethical oversight of genomic data sharing. In addition, some DACs evaluate the scientific feasibility of data access requests when undertaking an access review assessment. 4,5 It has been reported that such a review seeks to address three main goals, namely, protecting the participant, protecting the study, and protecting the researcher, which respond to issues of ethics, of reputation and trust, and of intellectual property, respectively. 8
Regarding the first goal, namely protecting the participants and addressing the ethical issues, we can identify three main ethical considerations for DACs when assessing data access requests: ensuring ethically appropriate downstream data uses, checking the consistency of the proposed data use(s) with the consent forms from the original data collection, and ensuring that data use applicants have obtained the relevant ethical-legal approvals (Table 2). This categorization is informed by the results of an empirical study with DAC members 5 and a review of the practices of a sample of DACs, 9 among others, which provided insights into current DAC practices.
Table 2 abbreviations: DACs, Data Access Committees.
In this article, we discuss these three ethical considerations in detail and highlight the associated concerns DACs may encounter in fulfilling this task. We argue that despite the development of overarching data sharing and access policies such as the Framework for Responsible Sharing of Genomic and Health-Related Data developed by the Global Alliance for Genomics and Health 10 and recommendations prepared by the UK Expert Advisory Group on Data Access, 4 on the whole, DACs are left with little guidance on how to undertake these ethical assessments in a consistent and robust manner. Furthermore, ethical oversight of genomic data sharing would be considerably improved if the relevant ethical considerations by DACs and RECs were to be coordinated. We therefore suggest several mechanisms to achieve greater clarification of ethical considerations by RECs and DACs, as well as greater communication and coordination between both to ensure robust and sustained ethical research oversight of genomic data sharing across the research lifecycle.
Ensuring ethically appropriate downstream data uses
To obtain access to genomic datasets that are made available through controlled-access mechanisms, applicants complete a “data access request” form and submit it to the relevant DAC. Templates of the request forms that are provided on the EGA website show that these forms may include questions about the description and aims of the proposed study and also the relevant ethical issues that may arise as a result of proposed research. One of the main reasons for these questions is to ascertain the ethical acceptability and applicants' awareness of the ethical and social implications of their work, which may need to be addressed by the DACs or which may bring the study into disrepute and, thereby, damage trust relations with study participants. The UK Expert Advisory Group on Data Access underscores this issue as a point to consider for DACs in a 2015 report on Governance of Data Access, which states: “There may be grounds for refusing applications thought likely to bring the main study into disrepute, for example, if the applicants are attempting to investigate a contentious topic in a way which cannot for scientific reasons be supported by the available data.” 4 Therefore, according to this report, concerns about the research agenda could constitute legitimate grounds for DACs to refuse the data access requests.
One issue to consider within this question is what would constitute an “objectionable” proposed research use. Currently, examples of objectionable research use provided in the literature include culturally or politically sensitive topics; ancestry studies in a small isolated population; and correlating cognitive ability and education with race. 11,12 However, the lack of guidance surrounding the definition and scope of “objectionable” proposed data uses can lead to opaque or arbitrary interpretations by DACs. This leaves all stakeholders uncertain about the a contrario definition, namely what is considered an “acceptable” research use, and eventually leads to inconsistencies between DACs.
Some have suggested that data access review by a DAC should prohibit any research that “may impact or harm dignity of the human beings in a way that is undesirable or unacceptable in a democratic society.” 13 To clarify these terms, actual incidents of objectionable data uses reported in previous studies could be consulted. For example, Fullerton and Lee's study provided instances of objectionable research uses, in the absence of consent from the research participants, namely genetic associations of addiction, mental health, and brain size with certain social identities. They concluded: “…it is not hard to imagine that some contributing participants would regard as objectionable research that attempts to correlate genetic variation with social identity or geographic location or implies ethnic differences in addiction, mental illness, or intelligence.” 12 Similarly, de Vries et al.'s study revealed the concerns of researchers, funders, and REC members regarding the risk of “ethnic stigmatization” that may arise from downstream uses of data from the MalariaGEN project (a large genomic collaboration based in Africa, Asia, and Europe examining malaria). 11 Perceptions of the general public and research participants themselves concerning objectionable research uses should also inform the discussion. 14 The implications of morally objectionable research uses could extend far beyond the individuals and concern social values; therefore, gathering the viewpoints of the public on this matter seems crucial. Moreover, objectionable data uses may vary across populations and depend on cultural issues and contextual factors, which need to be taken into consideration.
Currently, some DACs perceive ethical considerations as a core component of data access review, whereas others believe ethical considerations are best left to RECs—as their name explicitly states. 5 The latter approach aligns with the practices of some DACs that only perform a “light-touch” review that seeks to identify only those applications deemed controversial and then escalate them to a heightened level of scrutiny, often involving a REC. 7 Nevertheless, a clear referral procedure and a protocol to identify the relevant and responsible REC to review such “controversial” applications have yet to be clearly described.
One key question that persists is whether all data access requests should receive the same degree of scrutiny by DACs. An argument could be made that only access requests for so-called “high-risk data” should undergo a full committee review by DACs, whereas requests for minimal-risk data could receive a DAC review waiver or an expedited review, as is done by RECs in many jurisdictions for research studies presenting no material ethical issues. When approving data sharing plans at the beginning of a study, IRBs/RECs could determine whether a full data access review by DACs is needed. Data may be considered high risk when there is a higher possibility of reidentification of the research participants or when the data concern potentially stigmatizing genetic, phenotypic, behavioral, or social traits. 15,16 For example, higher levels of concern associated with sharing potentially stigmatizing data, such as data from HIV-positive participants, have been reported in previous empirical studies. 17
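The tiering argument in this section can be made concrete with a small sketch. The following Python is illustrative only: the `DatasetProfile` fields, the `ReviewTrack` names, and the rule that either risk factor alone triggers a full review are assumptions made for this example, not criteria taken from any DAC or REC policy.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewTrack(Enum):
    FULL_COMMITTEE = "full committee review"
    EXPEDITED = "expedited review or waiver"


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical risk flags a REC might record when approving a data sharing plan."""
    elevated_reidentification_risk: bool
    potentially_stigmatizing_traits: bool


def assign_review_track(profile: DatasetProfile) -> ReviewTrack:
    """Route access requests for a dataset to a review track.

    Assumed rule: either risk factor on its own is enough to require
    a full DAC committee review; otherwise the lighter track applies.
    """
    if profile.elevated_reidentification_risk or profile.potentially_stigmatizing_traits:
        return ReviewTrack.FULL_COMMITTEE
    return ReviewTrack.EXPEDITED
```

In practice the two flags would themselves be the product of committee judgment; the sketch only shows how a decision recorded once, at the time the data sharing plan is approved, could determine the level of scrutiny for later access requests.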
Checking the consistency of the downstream data uses with underlying consent forms
The downstream data uses that result from sharing data should be consistent with underlying consent forms or ethics approvals. In other words, follow-up studies making use of the data should be in accord with the permissions granted by the data donor at the time of original collection. This aligns with the principles of research ethics that endorse respecting the wishes of research participants in the entire course of conducting biomedical research, from the sample and data collection to future downstream data uses.
DACs often take responsibility for verifying that data use proposals are consistent with underlying consent forms and do not run against any data use limitations relating to the requested dataset(s). This is mainly because recontacting research participants for a new consent for each and every downstream data use (i.e., new research study) is perceived as disproportionate and impracticable. In doing so, DACs review the original consent form, which was obtained for the purpose of data collection. However, this is not always a straightforward task for DACs. Given that data use limitations (e.g., use of data only for certain types of research studies) may not be clearly articulated in underlying consent forms, careful interpretation is required at times. For instance, the implications of data use limitations for commercial use of data are not always clear, nor is the line between commercial and noncommercial purposes always clear-cut (particularly in modern academia). Another example arises when the consent form limits data use to a specific disease and the applicants request to use the data for research on a closely related disease or on comorbidities associated with the initial disease. These problems are often intensified in retrospective uses of previously collected data, where data sharing and potential ethical implications for research participants were not even considered at the time of data collection, which may have been many years earlier. 7
One can argue that interpretation of consent forms should be grounded in an assessment of the reasonable expectations of research participants and informed by contextual factors (e.g., the research setting and preferences of the individuals) tied to the initial data collection. However, DACs may have limited to no knowledge about the research context of the participants and original data collection beyond what is written in the consent form. To inform the DACs, collecting input from the research participants is warranted. On some occasions, the consent form may be inaccessible to DACs because the data have been collected elsewhere, for example, in a hospital, underlining the importance of attaching consent forms to datasets.
Consequently, DACs need adequate tools and expertise to carry out the task of interpreting consent forms, which may be delicate in certain cases; DACs may need to seek assistance from external experts in handling such cases. One way to address this problem could be to clearly record and standardize data use conditions at the time of data collection. To this end, Dyke et al. have suggested developing “consent codes” based on a structure for recording data use “categories” and “requirements” on the basis of existing consent provisions for major genomic databases. 18 Others, however, avoid “encoding complex data uses restrictions” in informed consent models to reduce the need for “complex review processes.” 19 Yet complex review processes are more or less the norm for many genomic data sharing projects, given the breadth and depth of the studies and geographic scale of the collaborative science.
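To illustrate how recorded data use conditions might support a consistency check, the sketch below compares a data access request against conditions captured at the time of collection. It is a minimal example under stated assumptions: the class names, fields, and decision rules are hypothetical and do not reproduce the actual consent codes proposed by Dyke et al. or the workflow of any DAC.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Optional, Tuple


class Outcome(Enum):
    CONSISTENT = "consistent"
    INCONSISTENT = "inconsistent"
    NEEDS_HUMAN_REVIEW = "needs human review"


@dataclass(frozen=True)
class ConsentConditions:
    """Hypothetical machine-readable data use conditions recorded at collection."""
    permitted_diseases: Optional[FrozenSet[str]]  # None means no disease restriction
    noncommercial_only: bool
    ethics_approval_required: bool


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical summary of a data access request."""
    disease: Optional[str]  # None means the study is not tied to one disease
    commercial: bool
    has_ethics_approval: bool


def check_request(consent: ConsentConditions,
                  request: AccessRequest) -> Tuple[Outcome, List[str]]:
    """Return an outcome plus the reasons behind it (assumed rules)."""
    reasons: List[str] = []
    if consent.noncommercial_only and request.commercial:
        reasons.append("commercial use conflicts with a noncommercial-only condition")
    if consent.ethics_approval_required and not request.has_ethics_approval:
        reasons.append("ethics approval is required but was not provided")
    if reasons:
        return Outcome.INCONSISTENT, reasons
    if (consent.permitted_diseases is not None
            and request.disease not in consent.permitted_diseases):
        # Whether a related disease or comorbidity falls within the consent
        # is a judgment call, so the sketch defers to the committee.
        return Outcome.NEEDS_HUMAN_REVIEW, ["requested disease is not listed in the consent"]
    return Outcome.CONSISTENT, []
```

The sketch deliberately routes disease mismatches to human review rather than rejecting them, since the relatedness of diseases cannot be settled by a lookup. The single `commercial` flag is likewise a simplification of a distinction that is not always clear-cut.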
Ensuring the pertinent ethical and legal approvals have been obtained
Access by users to individual-level genomic data may be conditioned upon obtaining appropriate legal and ethical approvals from their home institution. The underlying rationale for such a requirement may lie in an initial data use condition in the consent form, which requires new ethics approval for any downstream data uses. For instance, Kaye et al. report that the majority of cohorts from BioSHaRE-EU (a pan-European research consortium that aims to facilitate data sharing across multiple biobanks and databases) “require separate REC approval” from the data users. 20
However, the experience of some DACs reveals that making this a requirement in all instances could be problematic and disproportionate, leading to impediments in knowledge production, a public good that may also be considered a reasonable expectation of the participants who provide data. Several challenges are at play. First, verifying the adequacy of the evidence provided of REC approval or waiver could be challenging due to linguistic diversity, diversity of the forms, and diversity of ethical (or legal) standards in different jurisdictions. Second, the necessity of obtaining REC approval for each downstream data use is questionable and, indeed, is not required in all jurisdictions. The results of a study by Simpson et al. in the United States revealed that there are different opinions concerning the necessity of obtaining ethics approval for downstream data uses. Accordingly, some requests to access and analyze data from dbGaP must go through a full-board IRB review, while others receive an expedited review or a waiver; on still other occasions, applying for IRB approval is not required. 21 One can argue that if downstream data uses fall within the scope of the prior consent or relevant authorization, then obtaining secondary REC approval is not warranted, because it is disproportionate, in particular when one considers the significant delays that could result from meeting multiple ethics requirements before data access. 8 This approach resonates with the recent Recommendation CM/Rec(2016)6 of the Committee of Ministers of the Council of Europe on research on biological material of human origin. Article 21 of the Recommendation stipulates that a requirement of re-consent or authorization only needs to be met when the proposed data uses fall outside the scope of prior consent or authorization. Local jurisdictions have also adopted this approach. For example, Quebec's Réseau de médecine génétique appliquée (RMGA) Consolidated Statement of Principles from 2016 states: “The participant can give a broad consent for a range of research projects. Research projects within this range and consistent with the original consent constitute a primary use of data and samples.” 22
Third, different perceptions about risks associated with data sharing, such as reidentification and privacy breaches, have been observed among REC members. Lemke et al.'s investigation of the views of REC members on broad genomic data sharing revealed that their perceptions of reidentification risks and the potential harms resulting from privacy breaches vary considerably. 23 The limited technical expertise of REC members in view of “the complexity of current data-handling systems” may also challenge the ethics review. 24 The existing and well-documented inconsistency between RECs, although not unexpected, further undermines efforts toward “mutual recognition” of ethics review of international data-intensive research, which is premised on avoiding duplicative reviews and conducting a similar level of ethics scrutiny by all RECs. 25 In itself, mutual recognition does not argue against the need for RECs to consider local context (that is, morals and cultural practices within some defined “community”) in some situations (e.g., consent practices). But we should exercise some caution against this defense of local REC review; indeed, one may argue that the charge of “local values” can at times be as much rhetoric as reality, reflecting values that derive not so much from a community as from institutional and subjective personality factors. 26
In response to these limitations, some DACs consider the applicants' self-declaration in the Data Access Agreement on meeting the ethical requirements of their home institution (or relevant authority) as sufficient. Trust, it seems, backed to some degree by the contractual agreement, governs many of the relationships between DACs and data user applicants.
The Path Forward
Based on our assessment of extant practices and functions of DACs and RECs, ethical oversight of genomic data sharing by RECs and DACs should be better coordinated to ensure robust and sustained oversight across the research life cycle. Better coordination will prune the redundancies between these two bodies and can provide confidence that all steps before actual data sharing will have been scrutinized sufficiently.
Achieving this will require three mechanisms. First, delineating the scope of “objectionable” data uses and clarifying the requirements for obtaining secondary ethics approvals from users' home institutions through international guidelines and policies seem necessary. Results of our investigation of DACs reveal that they are not well guided on this matter, which could result in arbitrary and inconsistent decisions. In addition, new tools and mechanisms such as “consent codes” could facilitate reviewing the consistency of the downstream data uses with the wishes of research participants. Concurrently, self-assessment tools could help researchers learn what types of ethical-legal approvals are needed before data sharing. 18
Regardless, DACs may still need to work with RECs (those that approve the original data collection and data sharing plans) when consistency of proposed data uses with the underlying consent form is clouded with uncertainty. Given that RECs are charged with ethically approving the original data collection, consulting them in the case of any uncertainty about access could be instructive. Thus, a second mechanism is that RECs could be consulted when proposed data uses are considered ethically objectionable by DACs and when a further evaluation (or second opinion) is needed. Currently, RECs and DACs deal with the ethical considerations of data sharing separately in two silos, data collection and then downstream data sharing, suggesting that there is no established bridge of communication between these oversight bodies. To avoid gaps in ethical oversight and better assure robust protection across the research lifecycle, RECs and DACs must be seen as working together. It should be noted that DACs and the RECs that approve the original data collection or data sharing plans are sometimes located in different institutions, making interaction between them challenging and highlighting a need to adopt suitable models of interaction between DACs and RECs.
Third, both of these oversight bodies should be armed with adequate expertise to be able to fulfill the goal of robust ethics assessment. Although accommodating broad expertise in well-resourced DACs is feasible, in less-resourced DACs with a small number of incoming data access requests, this may not seem an immediate priority. Similarly, RECs may fail to adequately consider the concerns associated with data sharing due to a lack of awareness or expertise in genomics, data-intensive science, data protection regulation, and data ethics. As highlighted by Rothstein: “The increased scale of research and new computer technologies demand a more nuanced assessment of the risks and benefits of research using a range of deidentified information and biological materials.” 1 In turn, such assessments could assist in developing evidence-based guidelines with a checklist of points to consider for both oversight bodies to illuminate real risks and concerns associated with cross-border data sharing. Such a document should provide further guidance for the oversight bodies on how to proceed when confronting uncertainties in reviewing high-risk/sensitive data or when there is uncertainty regarding the ethical acceptability of the proposed uses.
To date and too often, RECs and DACs act as though they are in two separate worlds. Together, we believe that these three recommendations will foster greater synergy in the ethical oversight functions performed by RECs and DACs. Both RECs and DACs have a critical role to play in protecting the rights and interests of data donors and promoting the social value and public good of genomic data sharing. Concurrently, coordinated and well-functioning oversight bodies could address the requirements of adopting organizational measures and safeguards when processing personal data, including genomic data, as recently laid out by the EU General Data Protection Regulation, including under its research exemption provisions (Article 89). Arguably, consent or anonymization approaches toward governance of genomic data sharing would be complemented by a strengthened role for DACs and RECs.
Recommendations
• What constitutes “objectionable” data uses and requirements for obtaining secondary ethics approvals from users' home institutions should be delineated, in accordance with national laws and international policies.
• When possible, communication between DACs and RECs in assessment of controversial cases, or when there is uncertainty in interpretation of consent forms, should be facilitated.
• Data use limitations should be clearly recorded within consent forms at the time of original data/sample collections.
• DAC and REC members should be equipped with adequate expertise (including by providing them the ability to seek advice from specialist referees) concerning the informational risks associated with data-intensive research.
Author Disclosure Statement
No conflicting financial interests exist.
