Scientific data archiving

From Wikipedia, the free encyclopedia

Scientific data archiving refers to the long-term storage of scientific data and methods. Some scientific journals and funding agencies have policies that require scientists to archive their data and methods so other scientists can audit the data, replicate the research and build on their findings.[citation needed] These policies are generally not enforced.[citation needed] The need for data archiving and due diligence is greatly increased when the research deals with health issues or public policy formation.[1][2]

To prevent data loss or corruption, some journals, such as Nature, require that data not be held by the researcher alone. In these cases, data must be archived at a government data center, at an accredited independent data library, or with the publisher of the journal.[3]

Policy of NSF in Grant General Conditions

36. Sharing of Findings, Data, and Other Research Products

a. NSF expects significant findings from research and education activities it supports to be promptly submitted for publication, with authorship that accurately reflects the contributions of those involved. It expects investigators to share with other researchers, at no more than incremental cost and within a reasonable time, the data, samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages awardees to share software and inventions or otherwise act to make the innovations they embody widely useful and usable.

b. Adjustments and, where essential, exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results, or the integrity of collections or to accommodate legitimate interests of investigators.[4]

Policies by journals

Nature: An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols available to readers promptly on request. Any restrictions on the availability of materials or information must be disclosed at the time of submission of the manuscript, and the methods section of the manuscript itself should include details of how materials and information may be obtained, including any restrictions that may apply. One preferred form of disclosure is a link from the methods section to a copy of the relevant Material Transfer Agreement (MTA) form, which is hosted as Supplementary Information on the journal's web site. Authors may charge a reasonable fee to cover the costs of producing and distributing materials. If materials are to be distributed by a for-profit company, this should be stated in the paper.
Any supporting data sets for which there is no public repository must be made available to referees at submission and any interested reader on and after the publication date from the authors directly, the author providing a URL to be used in the paper on publication.
Such material must be hosted on an accredited independent site (URL and accession numbers to be provided by the author), or sent to the Nature journal at submission, either uploaded via the journal's online submission service, or if the files are too large or in an unsuitable format for this purpose, on CD/DVD (five copies). Such material cannot solely be hosted on an author's personal or institutional web site.
After publication, readers who encounter a persistent refusal by the authors to comply with these guidelines should contact the chief editor of the Nature journal concerned, with "materials complaint" and publication reference of the article as part of the subject line. In cases where editors are unable to resolve a complaint, the journal reserves the right to refer the correspondence to the author's funding institution and/or to publish a statement of formal correction, linked to the publication, that readers have been unable to obtain necessary materials or reagents to replicate the findings.[5]
Science: Materials sharing. After publication, all reasonable requests for materials must be fulfilled. A charge for time and materials involved in the transfer may be made. Science must be informed of any restrictions on sharing of materials [Materials Transfer Agreements or patents, for example] applying to materials used in the reported research. Any such restrictions should be indicated in the cover letter at the time of submission, and each individual author will be asked to reaffirm this on the Conditions of Acceptance forms that he or she executes at the time the final version of the manuscript is submitted. The nature of the restrictions should be noted in the paper. Unreasonable restrictions may preclude publication.[6]

Controversies involving data archiving

Heart research

Dr. Singh published research on heart attack victims that was later questioned. The medical journal investigated for 12 years before concluding that the research was probably fraudulent. Had Dr. Singh archived his data and methods before publication, the issue might have been resolved more quickly.[7]

Academic genetics

Withholding of data has become so commonplace in academic genetics that researchers at Massachusetts General Hospital published a journal article on the subject. The study found that “Because they were denied access to data, 28% of geneticists reported that they had been unable to confirm published research.”[8]

Data archives

References

  1. "The Case for Due Diligence When Empirical Research is Used in Policy Formation" by Bruce McCullough and Ross McKitrick [1]
  2. "Data Sharing and Replication", a website by Gary King [2]
  3. "Availability of Data and Materials: The Policy of Nature Magazine" [3]
  4. "National Science Foundation: Grant General Conditions (GC-1)", published April 1, 2001 (page 17) [4]
  5. "Availability of Data and Materials: The Policy of Nature Magazine" [5]
  6. "General Policies of Science Magazine" [6]
  7. "Medical Journal Editor Finds Truth Hard to Track Down", published by the Alliance for Human Research Protection [7]
  8. "Data withholding in academic genetics: evidence from a national survey" by EG Campbell et al. [8]

Literature

  • Gauch, Hugh G. Jr. (2002) Scientific Method in Practice, Cambridge University Press [ISBN 978-0521017084]
  • Popper, K. R. (1959) The Logic of Scientific Discovery (English translation) [ISBN 978-0415278447]
  • Wilson, F. (2000) The Logic and Methodology of Science and Pseudoscience, Canadian Scholars Press [ISBN 1-55130-175-X]

External links

  • Statistical checklist required by Nature [9]
  • The US National Committee for CODATA [10]
  • Studies examine withholding of scientific data among researchers, trainees [11]
  • The Role of Data and Program Code Archives in the Future of Economic Research [12]
  • Data sharing and replication – Gary King website [13]
  • Some thoughts on disclosure and due diligence in climate science [14]
  • The Case for Due Diligence When Empirical Research is Used in Policy Formation by McCullough and McKitrick [15]
  • Thoughts on Refereed Journal Publication by Chuck Doswell [16]
  • “How to encourage the right behaviour”, an opinion piece published March 2002 [17]
  • “The Selfish Gene: Data Sharing and Withholding in Academic Genetics” by Eric Campbell and David Blumenthal, published May 31, 2002 [18]