Abstract
Objectives
Data sharing is well established in biological research, but evidence on sharing of clinical trial or public health research study data remains limited, in particular studies of research participants’ perspectives on data sharing. This study systematically reviewed international evidence of research participants’ attitudes towards the sharing of data for secondary research use.
Methods
Systematic search of seven databases, with author, citation and bibliography follow-up, to identify studies examining research participants’ attitudes towards data sharing. Studies were thematically analysed using NVivo v10 to identify recurring themes.
Results
Nine studies were eligible for inclusion. Thematic analysis identified four key themes: (1) benefits of data sharing, including benefit to participants or immediate community, benefits to the public and benefits to science or research; (2) fears and harms, such as fear of exploitation, stigmatization or repercussions, alongside concerns about confidentiality and misuse of data; (3) data sharing processes, in particular the role of consent in the process; and (4) the relationship between participants and research such as trust in different types of research or organization and the relationship with the original research team.
Conclusions
The available literature on attitudes towards sharing data from clinical trials or public health interventions remains scant. This study has identified four themes regarding research participants’ attitudes and preferences, which should be considered by policy makers, and explored with further research.
Background
In 2011, a ‘joint statement of purpose’ 1 from global health funding agencies, academic researchers, international organizations and journals promoted data sharing in health research. Similar statements have followed since, such as those by the International Committee of Medical Journal Editors (ICMJE) 2 and the US Institute of Medicine (IOM). 3 These highlight the ‘ethical obligation to share’ 2 and they encourage a culture where ‘data sharing is the expected norm’. 3 Many journals 1,2,4,5 now require research data to be shared after study completion, for example, through a recognized repository. 6
Pooling data or conducting secondary analysis of data already collected for another study is expected to accelerate the ‘pace of discovery’, 1 advance science and clinical knowledge 3 and identify safe and effective patient treatments more quickly. Sharing also allows independent confirmation of results 2 and minimizes repetition of research, thereby reducing associated costs. It encourages transparency and reproducibility, so increasing the overall quality of research. 7 The value of participants’ contributions is maximized 1 by ‘potentially facilitating additional findings beyond the original…outcomes’. 3 Overall, data sharing increases value for money for funders, while both honouring the contribution that participants made and fulfilling researchers’ moral obligation to participants, who may have put their health at risk to take part in research. 2
However, although there is a well-established culture of data sharing in the genetic and genomic communities, data sharing is less ingrained in public health research. 1 To increase the potential for sharing, a combination of gaining consent, anonymizing and regulating access is needed. 8 Much attention has been paid to anonymization and data control, but it is unclear how well suited the consent process is to sharing.
There is limited literature on the perspectives of research participants on data sharing, particularly those participating in clinical trials or public health interventions. Available work tends to focus on primary care, (electronic) health records or bio-bank data,9–11 where participants accept the use of their data in clinical studies, provided that the data are anonymized and consent is sought in advance.9,10–13 Evidence on genetic and health record data further suggests that participants prefer to be contacted before their data are used in subsequent research.14–16 They may also want to re-consent based on the type of secondary research to be conducted, distinguishing between ‘acceptable’ (e.g. health service) and ‘unacceptable’ (e.g. commercial) research.9,17–19
This study reviews the international literature on research participants’ attitudes towards data sharing in the context of clinical trials and other public health research. It specifically explores participants’ understanding of data sharing, their attitudes towards sharing and whether awareness of data sharing could affect consent to take part in research.
Methods
A protocol was developed using the PRISMA-P 2015 checklist 20 and followed throughout the systematic review process (online Appendix 1). The protocol was not eligible for PROSPERO registration as it did not concern health outcomes. 21
Search strategy
We piloted search terms in a Medline scoping search and then used broad search terms (online Appendix 2) relating to data sharing and participant, patient or public attitudes to interrogate the following databases: Medline, Embase, Web of Science, ASSIA, CINAHL, HMIC and PsycINFO. Key terms were taken from studies already identified and adapted for each database. Letters to the editor, books, conference proceedings and editorials were excluded. Reference and citation lists of included studies, publications of included first authors and references of systematic reviews were also searched; systematic reviews themselves were excluded.
Inclusion and exclusion criteria
To be included, studies had to report qualitative, quantitative or mixed methods empirical research. They had to address data sharing regarding secondary use of research data already collected as part of a trial, study or intervention. Included studies further had to examine attitudes of research participants or potential participants, i.e. members of the public. They had to be published between 1995 (year of publication of EU Directive 95/46/EC, the Data Protection Directive) 22 and 25 January 2017.
Studies concerning biobank data, human tissue, blood samples, routinely collected primary and secondary care data (health records) or ‘data-linkage’ were excluded.
We set no restrictions on language or country of origin.
Study selection
One reviewer (NH) screened all titles and a second reviewer (DNB) independently screened 20%, erring towards inclusion if uncertain. The same reviewers then independently screened all accepted abstracts against the inclusion criteria, noting reasons for exclusion. Reconciliation of disagreement was achieved through discussion and by erring towards inclusion. The remaining full papers were read in detail (by NH), with later group discussion including three authors (DNB, EM, NH) (Figure 1).

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram.
Data extraction and quality appraisal
Detailed data were extracted from each included study by one author (NH) according to country of origin, date of research, study design, participant characteristics, study aims and key themes identified by authors of included papers (Table 1).
Table 1. Characteristics of nine included studies addressing participant attitudes towards data sharing.
CASP: Critical Appraisal Skills Programme qualitative appraisal tool.
Each included study was assessed for quality by two reviewers (NH and either EM or DNB), using the Critical Appraisal Skills Programme (CASP) qualitative appraisal tool 23 for qualitative studies24–31 and the Best Bets Survey Checklist Quality Assessment Tool 32 for the study using quantitative methods. 33
Data synthesis and analysis
Results sections from included studies were analysed using thematic synthesis. 34 This was done in NVivo Software Version 10 by one author (NH) using a line-by-line approach, inductively highlighting all relevant quotes from participants or descriptions from the authors of the original studies to form ‘free codes’. We did not consider quotes and data from non-participants (e.g. where researchers were also interviewed) or any corresponding author description. It was possible that the same sentence was assigned more than one code. The discussion section of the sole quantitative paper 33 was also analysed to provide a greater richness of descriptive data.
Free codes thus derived were then amalgamated into descriptive groups by two reviewers (NH and ELG), using a hierarchical structure. This process was repeated until the groups became broad themes, which were interrogated by all authors. Grouping codes in NVivo was an evaluative process; the original text was consulted throughout to ensure that codes were not taken out of their intended context. There was no attempt to produce analytical themes, 34 as the purpose of this review was simply to report emerging themes, not to speculate as to why they occurred.
Results
Description of included studies
Of the 16,309 records identified by searches, nine met the inclusion criteria (Figure 1). The studies were published between 2002 and 2016 and originated from Japan, Thailand, India, Kenya, Canada, Vietnam and the USA. All but one study used qualitative methods, such as focus groups or interviews. The remaining study 33 quantitatively analysed a telephone survey. Five studies concerned data sharing in low- and middle-income countries. These were part of the same funding award, shared many of the same authors and employed common methods.
Quality appraisal
All studies scored highly on the quality appraisal (Table 1). However, three qualitative studies lacked detail about the relationship between researcher and participant.24–26,31 The study by Platt and Kardia 33 did not explicitly state sample size, response rate and non-responders.
Themes arising from qualitative analysis
Analysis of the studies identified four themes: (1) benefits of data sharing, (2) fears and harms, (3) data sharing processes and (4) relationship between participants and research. We examine each in turn.
Benefits of data sharing
In all studies, participants identified benefits of data sharing, with three main types emerging: benefit to participants or immediate community, benefits to the public, and benefits to science or research.
Most participants wanted to see the benefits of data sharing in their local community, with one participant summarizing: ‘Data sharing is acceptable if the community benefits…; there is no point in merely writing about issues’. 26 There should be ‘local translational benefits’ 28 for ‘the community that contributed’, 25 particularly if the research in question focussed on a burden the community faced. 26
The ‘expectation of benefit’ 33 from data sharing also extended to the wider public, with phrases such as ‘greater good’, 29 ‘social value’ 25 and ‘actually helping people’ 30 used in one form or another by research participants. Jao et al. 28 reported that public benefit was sometimes seen as ‘satisfied by the involvement of international institutions… such as the World Health Organization’, suggesting that the perception of benefit may be as important as actually experiencing it.
Participants also appreciated the benefits to science and research, explaining that sharing ‘increased the efficiency of research and researcher opportunities’, 29 ‘generated evidence’ and ‘avoided duplication of effort’. 26 Participants thought that local researchers should also benefit, and that their ‘careers should not be “overtaken” by others who had made less investment’. 28
Fears and harms
Participants expressed fear of exploitation, stigmatization or repercussions, with some mentioning specific ‘harms’ 33 such as being reported to social services or an attempted abduction of their child, 29 alongside more mundane concerns such as third-party contact or telemarketing. Some participants reported that they would be hesitant to share their data as they were sceptical that it would be used in the right way, and so were likely to consent somewhat ‘reluctantly’.24,29,31
Participants wanted to maintain an element of control of their data, highlighting feelings of powerlessness, as there was ‘no way for us to know whether or not our personal information is dealt with anonymously’. 24 ‘Personal information’ was described as ‘something that can let people know who you are’. 29 Concern about being identifiable or the desire for privacy/confidentiality was referred to in most of the included studies.24–26,28,29,31 Some participants talked about the distinction between ‘sensitive’ (e.g. personal details, ethnicity, 25 HIV status, history of abuse 26 ) and less sensitive data such as routine demographics. 27 The potential sensitivity of data was, however, related to its intended use. 27
Participants were concerned that data could be ‘misused’,26,28,29,31 either unintentionally (misinterpretation) or deliberately, in order to contact participants or manipulate data to suit a particular purpose. ‘Misuse’ was therefore about both confidentiality and about aligning secondary research with participants’ principles. Harm was considered more likely to occur if data were shared ‘outside the original research team’, 25 with participants worrying about identification if data were used in ways not initially anticipated. One participant reflected on the need for ‘penalties’ for secondary researchers if their data were used in ways ‘not affiliated’ with the original research: ‘…if they use it for personal gain or a third party company…’. 29
Data sharing processes
Identified barriers to data sharing included its ‘novelty’, 27 ‘limited precedent’ 26 and practicalities such as the time or work involved to prepare data for sharing.25–27,29,31 Participants recognized the resources required to implement data sharing, with phrases such as ‘resource implications’, ‘funding and capacity building’ 25 and ‘substantial work’ 26 given by authors to paraphrase participants’ views.
Studies based in low- and middle-income countries25–28,31 specifically emphasized community or stakeholder involvement, while participants’ desire to be involved in the data sharing process was identified in all studies, as was the desire to be notified when their data were (re)used and to be informed of the results of studies using their data.
Participants showed varying degrees of understanding of the consent process. Some participants saw the consent process as an informative tool that can play a ‘wider educational role’: 27 ‘Perhaps you can explain in the consent form… other researchers can access my data to do further research’. 31
Seven studies24–27,29–31 discussed different levels of consent with participants. For some, a broad initial consent would be acceptable, while others wished for ‘individual informed consent’ or ‘personal permission’. 24 Participants evaluated the practicalities of each approach but stated their preference based on ideals of respect and transparency: ‘we always like to be asked…I don’t think [the project-specific consent model is] a great idea, but I think it would make us feel good’. 30 For others, it depended on whom the data would be shared with, and they would evaluate on a ‘case-by-case basis’. 25
Re-consenting was described in one study as an ‘unnecessary inconvenience’ 27 and in another as an ‘annoyance’ or ‘irritation’, 30 which risked inviting more questions than if researchers had just shared data anyway. References were made to the practical difficulty of re-consenting participants.25,27,30,31
To be more comfortable with data sharing, participants wanted better data governance or gatekeepers, with processes to store data and manage access requests.25–27,29–31 Research data repositories (RDRs) could act as ‘stewards’ for data 29 perhaps with a committee who could oversee data sharing requests.26,27,29,30 A committee would be ‘a group trusted to make decisions’, 27 ideally with lay representatives, who could reach a consensus, and be held accountable for sharing decisions. 29
Participants identified other conditions that they would like to see in place before they could comfortably agree to share their data, including participants having understood that their data could be shared (transparency), risks being mitigated, the research being in the public’s interest and the research being congruent with the participants’ values.24–27,29,31
Relationship between participants and research
Some studies reported that participants were largely unaware that researchers might already be sharing their data.24,26–30 Participants wanted data sharing to be better publicized, or be given the option to choose whether or not to share. Where participants were subsequently informed about data sharing, it was largely accepted as a ‘necessary sacrifice’ for scientific or medical progress. 29 There were then ‘high levels of uncertainty about how data might be used once it had been shared’, 28 with a desire for transparency regarding the recipient’s intentions.
The idea that their data could be shared with a secondary researcher prompted participants to consider acceptable types of research or researcher. Although one participant was content with anyone ‘[a]s long as it’s a qualified researcher’, 29 others wanted information about the researchers before agreeing that their data could be shared, 24 based on the idea that you ‘…approve secondary researchers, not their projects’. 29
Most participants agreed that their data should not be used for commercial gain. ‘[T]hird parties’27,29 or ‘industry based researchers’ 29 were distrusted because they might use data in a way that is inconsistent with the values of the participant or attempt to contact them ‘for nefarious or unconsented purposes’ 29 (e.g. telemarketing). One participant stated that if their data were to be shared with ‘a for-profit research group or something, I would want to know and at that point I would actually probably opt out’. 29
If participants allowed their data to be shared, researchers should ensure that they make good or proper use of it.24,29 It would be ‘wrong’ to use the data in a way that the participant is unlikely to have agreed to or understood.27,28 Participants were placing a great deal of trust in researchers to share their data with appropriate collaborators.
The researcher–participant relationship was described as ‘socially unequal’, a ‘tacit agreement between the researchers and patients’ 24 and similar to the ‘patient-provider’ relationship, 33 with the researchers indebted to participants. 26 The relationship with the originating researcher was crucial because it was they who would inform, reassure and garner a willingness to share. The primary researcher was also the preferred point of contact regarding re-consent ‘…just you guys’. 30
Discussion
Previous reviews have explored participants’ attitudes towards the sharing of biological and health record data, or data linkage.11,35,36,37 Our review identifies similar concerns: participants are open to and understand the advantages of data sharing, but they lack awareness and have concerns regarding confidentiality, potential data misuse, governance and commercial data use.
Participants in the included studies wanted appropriate data protection, and they identified processes that they thought could be modified in order to promote acceptance of data sharing. There was less evidence regarding the effects of data sharing on agreement to participate in research in the first place.
Implications for research and practice
In the current global drive to accommodate data sharing from the outset of studies,1–5 this review provides evidence of research participants’ concerns and preferences, which, if acknowledged by researchers and funders, will ensure that advances in research align with the values of the participants who contribute data.
This review found that although participants lacked awareness of data sharing, once given examples or vignettes, they agreed with sharing in principle.27,28,31 They suggested that the consent process should be a tool that explains sharing 31 so that consent is ‘informed’ not just in name but also in practice. Although more evidence is needed to determine whether there is a causal relationship between information provision and acceptance of data sharing, it is possible that strategies promoting the translational benefits of sharing to existing or potential participants could allay fears and encourage participation (or reduce opt-out). Further research is required to determine how and by whom this promotion should be delivered, and it could be tied to the varying degrees of participant trust in different stakeholders.
Consistent with other research,11,38 we find that some participants expressed a preference for re-consenting before sharing, and at the very least, the majority of participants preferred a thorough initial consent process with agreed terms as opposed to ‘inferred consent’, 39 or inferred opt-in where researchers neglected to ask for explicit consent to share. In practice, researchers may not be able to re-contact and re-consent participants before secondary data use, and this review identified that participants understand these challenges. 28 However, simply giving the option to opt out at any stage could help persuade research participants to consent to data sharing. 38 The benefits of this must be weighed against the effects of opt-out on the primary study dataset. Excluding opt-out participants means the primary dataset differs from that shared, limiting reproducibility and meta-analyses. 3 Excluding participants who decline to share their data will require time on the part of the researcher (in addition to the resources already required) to prepare the dataset for secondary usage, which risks missing targets for timely sharing, such as those initially proposed by the ICMJE. 2
To refine the consent process, researchers should seek input from patient focus groups or members of the public, combined with the evidence gathered in this review.24–27,29–31 An ‘active communication’ at time of consent regarding likely future data uses 39 and types of researchers or archives with which data may be shared could be included as standard on consent forms.26,29,30
The review found that participants wanted adequate protection for their data and suggested that RDRs should be appropriately managed, with requests for data dealt with by committees 29 which include lay members, echoing a recommendation by the Institute of Medicine. 3 However, there are currently no hard rules or restrictions 2 regarding the location of shared data, which can range from supplementary material in a journal to an institution’s repository or a recognized (disciplinary) archive. 40 It also remains difficult to define exactly what an RDR is. 41 It is therefore difficult to see at this time how researchers could promise RDR management as well as community or lay representation in them. Funders and journal editors should also take account of participants’ preferences regarding location of shared data and who has access to it, and consider making exceptions. 2
Engaging participants in research that uses their own data by keeping them informed of the types of studies using their data, and the resultant outcomes (e.g. via newsletters) could enforce researcher accountability. It would ensure that the original participant’s consent is honoured and that data are used in projects that align with participants’ sensibilities or at least, and perhaps more realistically, benefit the population (disease or geographic) of which the participant is a part.25–27
Other implications for research and practice resulting from this review concern reassurance for participants that data are thoroughly anonymized prior to sharing.
Strengths and limitations
The key strength of this review is the extensive literature search, increasing the likelihood of capturing all relevant published evidence. It is the first review known to the authors concerning secondary use of trial/study data.
The included studies were found to be of reasonable or high quality, providing support for the validity of the results. There were no protocol deviations in conducting this review.
The main limitation is the paucity of studies regarding participants’ attitudes towards secondary use (sharing) of trial or study data, despite broad search criteria. It is questionable whether a review containing nine studies is truly representative of participant attitudes, although it was important to consolidate the small evidence base so that further work can be based upon its conclusions.
Five of the nine included studies25–28,31 originated from the same research team and funder, set in low- and middle-income countries. Any similarities in findings may be due to comparable methodologies or populations. Two studies by Manhas et al. used data from one research project.29,30
The included studies capture a wide range of countries including high-, middle- and low-income (but not the UK or elsewhere in Europe), potentially enhancing generalizability of findings.
Conclusions
The available literature on participant attitudes towards sharing data from clinical trials or health interventions is scant, and this review reflects that. This study identified four themes, which can be applied to policy or practice and tested with further research. Participants were clear about their conditions for data sharing, and research should move away from a culture of vague consent, which does not permit assessment by participants of the extent to which their data will be re-used, towards one of transparency. Better information about the benefits of data sharing, alongside the desired governance, may foster a willingness to share, so increasing the availability of data for secondary use.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship and/or publication of this article.
