Data Sharing and Materials Sharing

Delamothe T. Whose data are they anyway? British Medical Journal. 1996 May 18;312(7041): 1241-2.
This editorial from the deputy editor of the BMJ contends that data sharing is in the best interests of the research participants as it allows for the most benefit to come from their voluntary participation in research. These interests outweigh the interests researchers have in hoarding their data. The author suggests a number of measures which could be taken to facilitate data sharing: grant giving bodies could make funding conditional on willingness to share data, clearing houses for shared data could be set up, searchable registers of active and completed projects could be established, ethics committees could insist that protocols allow for data sharing.

Carlsson B, Fridh AC. Technology transfer in United States universities. Journal of Evolutionary Economics. 2002 March; 12: 199-232.
The authors of this paper compare the roles of the offices of technology transfer (OTTs) at 12 U.S. universities in 1998 with respect to patents, licenses, and start-up companies. The findings reveal that weighing what qualifies as success in technology transfer requires more investigation into the interactions between the universities and the surrounding non-university communities, and into the aims, culture, and organization of the OTTs.

Gardner D, Toga AW, Ascoli GA, Beatty JT, Brinkley JF, Dale AM, et al. Towards effective and rewarding data sharing. Neuroinformatics. 2003;1(3): 289-95.
The authors discuss the need for data sharing. Although they commend the NIH for its then newly introduced policy mandating data sharing for researchers receiving $500,000+ annual funding, they suggest that it does not go far enough to address some of the barriers to data sharing. The authors recommend the creation of recognized, descriptive and usable standards for data presentation. Furthermore, they establish the need for a cite and credit paradigm for shared data (similar to journal publications), and additional funding for informatics methods which enable sharing of data.

Reichman JH, Uhlir PF. A contractually reconstructed research commons for scientific data in a highly protectionist intellectual property environment. Law and Contemporary Problems. 2003; 315.
The authors discuss various legal issues and trends affecting the production, distribution, and use of scientific data. Topics include government-funded data and their commercial applications, the public domain, private sector data, scientific data as a private good and the resulting pressures on the public domain, the shifting boundaries between public and private domains, commercial exploitation of academic research, and intellectual property and contracts. The authors also offer proposals for the government, academic, and private sectors as a response to these legal and economic pressures.

Streitz WD, Bennett AB. Material Transfer Agreements: A University Perspective. Plant Physiology. 2003 September; 133: 10-13.
The authors discuss the change from a traditionally open community of material sharing among scientists to an era in which open exchange of materials is becoming more difficult. The most significant factor influencing this change has been the narrowing of the gap between fundamental research and commercial developments. The authors examine the influence of the Bayh-Dole Act on material transfers between universities and conclude that while the NSF and NIH promote and often maintain an open exchange of data and materials, these organizations are the exception rather than the norm. Furthermore, the authors argue that universities and private companies each have legitimate interests that they are trying to support when engaging in material transfers, and when these interests collide it can be very difficult to find common ground. This issue can be eased if both sectors focus on the goal of supporting research advances.

de Wolf VA, Sieber JE, Steel PM, Zarate AO. Part I: what is the requirement for data sharing? IRB: Ethics & Human Research. 2005 Nov-Dec;27(6): 12-6.
This is the first of a three-part series that aims to educate institutional review boards on how to comply with data sharing requirements. The authors provide an overview of how data sharing has been conducted in various scientific fields. They outline a number of initiatives that encourage data sharing and provide a brief discussion of the benefits of shared data. They further explain some of the different methods of sharing scientific data, such as public access data archives, restricted access archives, unilateral sharing, and reciprocal sharing arrangements.

The other two papers in the series contain more detailed information on de-identifying and preparing data for sharing, and on organizing and restricting access to data; this material is less relevant to stem cell science. These two papers are:

de Wolf VA, Sieber JE, Steel PM, Zarate AO. Part II: HIPAA and disclosure risk issues. IRB: Ethics & Human Research. 2006 Jan-Feb;28(1): 6-11.

de Wolf VA, Sieber JE, Steel PM, Zarate AO. Part III: meeting the challenge when data sharing is required. IRB: Ethics & Human Research. 2006 Mar-Apr;28(2): 10-5.

*Vogeli C, Yucel R, Bendavid E, Jones LM, Anderson MS, Louis KS, Campbell EG. Data Withholding and the Next Generation of Scientists: Results of a National Survey. Academic Medicine. 2006 February; 81(2): 128-136.
The authors provide data on the nature, extent, and consequences of withholding among life science trainees. Of the 1,077 second-year doctoral students and postdoctoral fellows surveyed, twenty-three percent reported that they had asked for and been denied access to information, data, materials, or programming associated with published research. Just over twenty percent reported the same result for unpublished research. Almost fifty-one percent of respondents reported that withholding had a negative effect on the progress of their research. The authors argue, based on their results, that data withholding has negative effects on trainees. Failure to address this issue could result in delayed research, inefficient training, and a culture of withholding among future scientists.

*Blumenthal D, Campbell EG, Gokhale M, Yucel R, Clarridge B, Hilgartner S, Holtzman N. Data Withholding in Genetics and the Other Life Sciences: Prevalences and Predictors. Academic Medicine. 2006 February; 81(2): 137-145.
The authors explore the variety and prevalence of data withholding in genetics and other life sciences, as well as the factors associated with these behaviors. They surveyed a sample of 2983 scientists and received responses from 1849 of them. Forty-four percent of geneticists and thirty-two percent of scientists in the life sciences reported having participated in some form of data withholding within the past three years. The authors explain that data withholding is common in biomedical science, can take multiple forms, and is influenced by a variety of characteristics. They encourage openness during the formative experiences of young investigators and argue that this may be critical to increased data sharing.

Noor MA, Zimmerman KJ, Teeter KC. Data sharing: how much doesn’t get submitted to GenBank? Public Library of Science – Biology. 2006 Jul;4(7): e228.
The authors discuss the issue of non-compliance with data sharing policies set out by journals. The authors conducted a study to determine what proportion of authors of genomics papers (of those published in journals that require scientists to deposit their data in shared databases) had deposited their data in the GenBank database. The rate of non-compliance was between 3% and 15%. The authors suggest that journals should be more proactive in enforcing data sharing policies.

Editors. Patenting, Licensing, and Social Responsibility. Journal of the Association of University Technology Managers. 2006 Fall; XVIII(2)
This issue of the AUTM journal contains multiple articles on the topics of patenting, licensing, and social responsibility. In “Human Embryonic Stem Cells: A Review of the Intellectual Property Landscape,” Irene Abrams discusses the complex patenting and licensing world of human embryonic stem cells (hESCs). The author provides a history of hESCs and outlines the major patents in the hESC field and their availability for licensing in both research and commercial areas. In the second article, “Technology Licensing for the Benefit of the Developing World: UC Berkeley’s Socially Responsible Licensing Program,” Carol Mimura describes Berkeley’s Socially Responsible Licensing Program. In the third paper, “Parallel Importation: A threat to Pharmaceutical Innovation?,” the authors address the industry’s concern about the smuggling of therapeutic drugs from developing countries to developed countries. In the fourth paper, “Canada’s Helping Hand: Jean Chretien’s Pledge to Africa Legislation Allowing Export of Pharmaceuticals under Compulsory License,” the authors familiarize readers with Canada’s Jean Chretien Pledge to Africa. This unique legislation allows for compulsory licenses for manufacture in Canada of lower-cost versions of patented pharmaceuticals for export to countries unable to manufacture them. In the final article, “Surveying the Need for Technology Management for Global Health Training Programs,” the authors survey technology licensing offices regarding their practices with inventions related to global health and determine that technology managers need education to effectively handle global health technologies arising from university research.

Burgoon LD. The need for standards, not guidelines, in biological data reporting and sharing. Nature – Biotechnology. 2006 Nov;24(11): 1369-73.
The author’s main argument is that productive data sharing requires clear, agreed-upon standards of reporting, and that the issuing of guidelines hampers such efforts. Burgoon uses the Minimum Information about a Microarray Experiment (MIAME) specifications in microarray science as an example of the problems associated with ambiguous guidelines. The author suggests that standards for data reporting must: 1) be clear, specific and unambiguous in their requirements; 2) be straightforward, with only one acceptable format; 3) undergo comprehensive vetting by the community; and 4) maintain a list of all prior proposed and adopted standards.

Editors. Democratizing proteomics data. Nature – Biotechnology. 2007 Mar;25(3): 262.
This editorial accompanies a statement that reads: “Beginning this month, Nature Biotechnology is recommending that raw data from proteomics and molecular-interaction experiments be deposited in a public database before manuscript submission.” The authors explain that this is to foster the exchange, comparison and reanalysis of experimental results, and promote the development of new algorithms and statistics that could improve the confidence in data and conclusions of proteomics research. They point to the success of data sharing in genomics and suggest a number of online depositories where data can be shared. Their goal is to enhance the utility, reproducibility and dissemination of proteomics data.

Piwowar HA, Day RS, Fridsma DB. Sharing detailed research data is associated with increased citation rate. Public Library of Science – ONE. 2007;2(3): e308.
The authors describe a possible benefit to researchers in making their data publicly available. The study found that “cancer clinical trials which share their microarray data were cited about 70% more frequently than clinical trials which do not”. It is suggested that this increased citation rate could be related to the decision to share data (although it is acknowledged that there may not be a causal link). In the discussion section the authors describe a number of disincentives researchers face when deciding whether to share data (effort and time involved, additional costs, de-identifying personal information), and the ways in which these disincentives are being addressed (development of sharing infrastructure and standards, funding for data sharing, software for de-identification).

Butcher J. Alzheimer’s researchers open the doors to data sharing. Lancet Neurology. 2007 Jun;6(6): 480-1.
The author discusses an instance of data sharing in the field of neuroscience. He describes how scientists in the US have decided to make data from the Alzheimer’s Disease Neuroimaging Initiative freely available via the internet. He further emphasizes the importance of having agreed-upon, standardized protocols to allow the easy collection and comparison of data from multiple locations and sources.

Jain S, George G. Technology transfer offices as institutional entrepreneurs: the case of Wisconsin Alumni Research Foundation and human embryonic stem cells. Industrial and Corporate Change. 2007 June; 16(4): 535-567.
The authors highlight the growing role of technology transfer offices (TTOs) as institutional entrepreneurs involved in building legitimacy for novel technologies. They approach this question through a qualitative study of WARF’s initiatives to support the emergence of human embryonic stem cell (hESC) technology. Through their narrative, the authors explore how the dual missions of technology transfer officers, their private and social interests, can influence how they take on different roles and eventually influence the future of developing technologies. Their account of WARF’s experiences in sponsoring hESC technology sheds light on the role that TTOs can play as institutional entrepreneurs. Their analysis reveals that involvement in this activity requires commitment, resources and imagination; is highly uncertain and risk-laden; entails conflicted behavior; and involves operating in social, cognitive, political and technical landscapes.

*Ku K, Henderson J. The MTA – Rip it up and start again? Nature – Biotechnology. 2007;25(7): 721.
In this commentary, two university licensing experts debate the pros and cons of material transfer agreements (MTAs). According to Ku, “this emphasis on legal contractual arrangements for the exchange of scientific materials and reagents has now gone too far. Some universities now require their researchers to use MTAs before sharing materials, even when the researchers want to share materials without constraints. Companies also routinely require an MTA when sharing materials with university researchers, even materials that are not valuable to the company or purchasable by anyone on the open market. The experience of my colleagues and I—and the experience of most universities—is that very few MTAs result in intellectual property (IP) of value (or any IP at all for that matter!). And yet firms often try to exploit MTAs to ‘capture’ IP value.” Ku also makes several suggestions: “universities and companies consciously decide which materials warrant an MTA and which do not… researchers should reestablish the clear understanding and tradition of respect for other people’s work in that researchers will not casually distribute materials received from others without the provider’s consent… for materials that require an MTA, a company or university should post the agreement on a website for researchers and contracts people to review.” According to Ku, these steps could save time and resources for everyone.

As a counterpoint, Henderson argues that MTAs are important in protecting the institution’s interests, including but also going beyond IP and revenue protection. “First, they ensure that the contribution of a provider scientist and the materials they transfer are properly acknowledged in downstream research… Second, MTAs provide important liability indemnification… MTAs act as a control point allowing the technology transfer office to identify any preexisting third-party obligations and determine if a transfer is legally possible.” Henderson also makes several suggestions for improving MTAs, including: “create a ‘nonprofit’ outgoing MTA template modeled on the UBMTA and authorize their investigators to execute it… Universities should include a statement in their commercial licenses that states they retain the right to transfer patented materials to other nonprofit academic institutions for research purposes.”

Samson K. Data sharing: making headway in a competitive research milieu. Annals of Neurology. 2008 Jul;64(1): A13-6.
Data sharing is presented as a positive but difficult step for the scientific community to make. The author uses anecdotal and empirical evidence to present the view that data sharing is not widespread in science today. Issues of motivation by publication and patents, and a lack of enforcement of data sharing standards are identified as impediments to sharing. Samson discusses a number of positive data sharing initiatives, but concludes that ongoing lack of funding for research will reinforce the tendency for scientists to hoard data.

Kaye J, Heeney C, Hawkins N, de Vries J, Boddington P. Data sharing in genomics–re-shaping scientific practice. Nature Reviews Genetics. 2009 May;10(5): 331-5.
The authors discuss some of the difficulties that have been associated with the shift to data sharing in genomics and how these could be addressed. The four areas of focus are: 1) The difficulties of acknowledging individual contributions to the generation of data – three possible solutions suggested are: having large numbers of authors on a paper; including the original data producers as ‘contributors’ rather than ‘authors’ on a paper; and recognizing the data sets used in papers through some system that allows the original contributors to be credited. 2) The way that these policies change the responsibilities towards participants – adequate consent requires the participant to be informed of how their data is to be used, and the protection of the trust between participant and researcher is vitally important. This is achieved by having a system where researchers can appeal that their data not be shared if doing so could compromise the confidentiality of participants. 3) The implications that this has for maintaining public trust – adequate measures of privacy protection and consent are necessary to maintain public trust in researchers. 4) The new mechanisms that have been developed for oversight of access to data – the formation of data access committees to determine who has access to which data is one method of ensuring the goals of data sharing are pursued whilst still protecting participant and researcher interests. Having international regulations as well as national and regional regulations would add another layer of bureaucracy for researchers and could be a disincentive that does little to protect participants or researchers.

Eastman Q. Proteomics researchers solidifying principles for data sharing. Journal of Proteome Research. 2009 Jul;8(7): 3220.
The author reports on a summit of proteomics researchers, journal editors and funding agency representatives which met in 2008 to discuss the future of data sharing in proteomics. The summit concluded that it was best to share raw data but recognized that the form of raw data varies with the instruments used. The development of a common data format could overcome this, but would not be available for a number of years. The participants suggested that individual researchers should submit data upon publication but larger community-funded projects should submit data as it is generated.

Rodriguez H, Snyder M, Uhlen M, Andrews P, Beavis R, Borchers C, et al. Recommendations from the 2008 International Summit on Proteomics Data Release and Sharing Policy: the Amsterdam principles. Journal of Proteome Research. 2009 Jul;8(7): 3689-92.
The authors report on a one-day summit that aimed to begin defining policies and practices that would govern the release of proteomics data into the public domain. The authors point to the success in the field of genomics of sharing research data publicly. The summit decided that raw data accompanied by appropriate metadata were the most valuable information to share. As necessary measures to improve data sharing, the authors recommend cooperation between scientists, journals, and funding agencies, and emphasize the importance of central repositories having their own standardized metrics.

*Nelson B. Data sharing: Empty archives. Nature. 2009 Sep 10;461(7261): 160-3.
The author reports on how some data sharing infrastructures have been set up yet remain underutilized. After describing a number of these data sharing failures and some new initiatives, the author suggests that a cultural shift is necessary in science before data sharing becomes common; that is, publications and data production need to be more equally valued. Another theme mentioned is that data sharing needs to be fast, easy and cheap in order to make it more successful and appealing to researchers.

Birney E, et al. Prepublication data sharing. Nature. 2009 Sep 10;461(7261): 168-70.
Rapidly releasing sequencing data before publication has allowed the field of genomics to develop and progress. However, on the basis of the opinions of participants in a Toronto meeting, policies governing prepublication release of data need to be reviewed and renewed to cater to the changing research landscape and its requirement of a culture of data sharing.

Schofield PN, Bubela T, Weaver T, Portilla L, Brown SD, Hancock JM, et al. Post-publication sharing of data and tools. Nature. 2009 Sep 10;461(7261): 171-3.
The authors focus on sharing in the field of mouse research, primarily the sharing of mouse types and murine embryonic stem cell lines. The authors suggest that there needs to be more sharing of mouse types through the use of repositories (currently only a third of all types of mice are available in this way). They point out that patenting of new mouse lines is not a financially rewarding endeavor, and suggest a shared research commons that makes data and tools more accessible to researchers. The authors advise that work needs to be done to ensure that shared data conforms to a consistent format and is accompanied by standardized metadata. Only then will data be made properly available to other researchers.

Anderson BJ, Merry AF. Data sharing for pharmacokinetic studies. Paediatric Anaesthesia. 2009 Oct;19(10): 1005-10.
The authors review some of the pros, cons and unresolved issues with sharing of scientific research data, both generally and with reference to pharmacokinetic studies. The main benefits of sharing noted are savings of cost, time and effort, the increased efficiency that this brings to research, and the practical benefits that come from this research. The reasons against data sharing noted include: primary researchers not receiving adequate recognition for their work, closing off the possibility of future publications from data once it is freely available, and the difficulty in accurately interpreting another researcher’s data. The authors go on to describe some of the practical problems of sharing, including the costs of data storage and management, the question of ownership of data, and authorship for publications using multiple data sources.

Editors. Sharing data. Nature – Cell Biology. 2009 Nov;11(11): 1273.
The editors state the journal’s position on data sharing. The editorial concludes: “Large reference datasets that benefit the wider community and that cannot be analysed efficiently by the data producers should enter the public domain without delay, as long as appropriate attribution and credit can and is given. Scientific culture has to change so that data is valued alongside publications.”

Guttmacher AE, Nabel EG, Collins FS. Why data-sharing policies matter. Proceedings of the National Academy of Sciences. 2009 Oct 6;106(40): 16894.
The authors acknowledge the advantages of data sharing but focus on the necessity of protecting the interests of those involved: research participants and especially researchers themselves. They suggest that protection of participants’ interests be provided by adequate consent and confidentiality procedures and by IRBs. They focus on the need to protect the interests of researchers by providing a period of time (6-12 months is suggested) when the producers of the data have exclusive rights to apply for publication.

Singh JA, Daar AS. Intra-consortium data sharing in multi-national, multi-institutional genomic studies: gaps and guidance. Hugo Journal. 2009 Dec;3(1-4): 11-4.
The authors focus on data sharing in an international consortium. They recommend that data sharing policies be established as early as possible and with contribution from developing world partners in the consortium. It is acknowledged that developing world researchers face more obstacles in utilizing shared data, and it is suggested that developed world partners have a moral obligation to remedy this by providing support and development of analysis skills to developing world partners. The authors recommend that consortiums have ethics committees to resolve clashes that may arise between the interests of the consortium and one of its constituent members. Specifically, they recommend that publication of site-specific data should be delayed if publication would be against the interests of the consortium, unless delayed publication would cause a significant negative impact on the progress of science.

Cormier et al. Protein Structure Initiative Material Repository: an open shared public resource of structural genomics plasmids for the biological community. Nucleic Acids Research. 2010 Jan; D743-749.
The authors discuss the Protein Structure Initiative Material Repository (PSI-MR) and how it overcomes some of the traditional problems with material transfer agreements. This repository contains plasmid annotation that includes the full-length sequence, vector information and associated publications, and is stored in a freely available and searchable database. PSI-MR has also developed an expedited MTA process, in which a network of institutions agrees to the terms of the transfer in advance of a material request, thus eliminating unwarranted delays. The authors hope that by creating such a repository they will help accelerate the accessibility and pace of scientific discovery.

Pisani E, AbouZahr C. Sharing health data: good intentions are not enough. Bulletin of the World Health Organization. 2010 Jun;88(6): 462-6.
The authors argue that data sharing in the field of public health is both necessary and possible. They point to the success of data sharing in genomics as proof that the obstacles to data sharing – a lack of incentives for researchers to share, a lack of adequate mechanisms for sharing, and the potential for breaches of consent and confidentiality – can be overcome. They suggest that it is time for some organizations to take leadership in investing in and working towards increasing incentives to share data, improving data management and forming data libraries.

Groves T. The wider concept of data sharing: view from the BMJ. Biostatistics. 2010 Jul;11(3): 391-2.
The author reiterates the position of the BMJ on data sharing. Groves contends that researchers who publish in the journal should include in their paper a statement that says what additional data is available and how to access it. Data sharing is consistent with the BMJ’s policy of open access and transparency in reporting medical research and has great potential to increase understanding and use of information. The challenges associated with data ownership, privacy and lack of incentives are acknowledged.

*Contreras JL. Prepublication data release, latency, and genome commons. Science. 2010 Jul 23;329(5990): 393-4.
A latency-based analysis of the genome commons leads to the conclusion that scientific information sharing and accessibility policies need to be tailored for the targeted discipline, as general policies may be too restrictive for certain scientific disciplines.

Sommer J. The delay in sharing research data is costing lives. Nature – Medicine. 2010 Jul;16(7): 744.
The author highlights the delay between the generation of data and its publication, and the consequences of this delay for groundbreaking scientific discoveries. The author contends that the process of medical innovation is unduly dragged out because of a lack of collaboration between researchers, and that patients’ lives are the ultimate cost.

Cragin MH, Palmer CL, Carlson JR, Witt M. Data sharing, small science and institutional repositories. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences. 2010 Sep 13;368(1926): 4023-38.

*Note: entries are presented in chronological order within each category. Entries marked with an asterisk are those that we found to be particularly helpful as we developed materials for this project.
