Research Dissemination: The Narrative
A typical empirical scientific workflow goes something like this: a research experiment is designed to answer a question; data are collected, filtered, and readied for analysis; models are fit, hypotheses tested, and results interpreted; findings are written up in a manuscript which is submitted for publication. Although highly simplified, this vignette illustrates the integral nature of narrative, data, and code in modern scientific research. What it does not show is the limited nature of the research paper in communicating the many details of a computational experiment and the need for data and code disclosure. This is the subject of the sections ''Research Dissemination: Data and Raw Facts'' and ''Research Dissemination: Methods/Code/Tools.'' This section motivates the sharing of the research paper, and discusses the conflict that has arisen between the need for scientific dissemination and modern intellectual property law in the United States.
A widely accepted scientific norm, labeled by Robert K. Merton, is Communism or Communalism (Merton 1973). With this Merton described an ideal in scientific research, that property rights extend only to the naming of scientific discoveries (Arrow's Impossibility Theorem for example, named for its originator Kenneth Arrow), and all other intellectual property rights are given up in exchange for recognition and esteem. This idea underpins the current system of publication and citation that forms the basis for academic rewards and promotions. Results are described in the research manuscript which is then published, typically in established academic journals, and authors derive credit through their publications and other contributions to the research community. They do not receive financial or other material rewards beyond recognition by peers of the value of their contributions. There are many reasons for the relinquishment of property rights over discoveries in science, but two stand out. It is of primary importance to the integrity of our body of scientific knowledge that what is recognized as scientific knowledge has as little error as possible. Access not just to new discoveries, but also to the methods and derivations of candidates for new knowledge, is imperative for verification of these results and for determining their potential admission as a scientific fact. The recognition that the scientific research process is error prone—error can creep in at any time and in any aspect of research, regardless of who is doing the work—is central to the scientific method. 
Wide availability increases the chances that errors are caught: ''many eyes make all bugs shallow.'' The second reason intellectual property rights have been eschewed in scientific research is the historical understanding that scientific knowledge about our world, such as physical laws, mathematical theorems, or the nature of biological functions, is not subject to property rights but is something belonging to all of humanity. The U.S. federal government granted more than $50 billion for scientific research last year in part because of the vision that fundamental knowledge about our world isn't subject to ownership but is a public good to be shared across all members of society. This vision is also reflected in the widespread understanding of scientific facts as ''discoveries'' and not ''inventions,'' denoting their preexisting nature. Further, current intellectual property law does not recognize a scientific discovery as rising to the level of individual ownership, unlike an invention or other contribution. Here, we focus on the interaction of intellectual property law and scientific research article dissemination.
Copyright law in the United States originates in the Constitution, which states that ''The Congress shall have Power … To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries''. Through a series of laws and interpretations since then, copyright has come to automatically assign a specific set of rights to original expressions of ideas. In the context of scientific research, this means that the written description of a finding is copyrighted to the author(s) whether or not they wish it to be, and similarly for code and data (discussed in the following two sections). Copyright secures exclusive rights vested in the author both to reproduce the work and to prepare derivative works based upon the original. There are exceptions and limitations to this power, such as Fair Use, but none of these provides an intellectual property framework for scientific knowledge that is concordant with current scientific practice and the scientific norms described above. In fact, far from it.
Intellectual property law, and how this law is interpreted by academic and research institutions, means that scientific authors generally have copyright over their research manuscripts. Copyright can be transferred, and in a system established many decades ago, journals that publish the research manuscripts typically request that copyright be assigned to the publisher for free as a condition of publication. With some notable exceptions, this is how academic publication continues today. Access to the published articles requires asking permission of the publisher, who owns the copyright, and usually involves paying a fee. Typically scientific journal articles are available only to the privileged few affiliated with a university library that pays subscription fees, and articles are otherwise offered for a surcharge of about $30 each.
A transformation is underway that has the potential to make scientific knowledge openly and freely available to everyone. The debate over access to scientific publications breaks roughly into two camps. On one side are those who believe taxpayers should have access to the fruits of the research they've funded, and on the other side are those who believe that journal publishing is a business like any other, and the free market should therefore be left unfettered. The transformation started in 1991 when Paul Ginsparg, Professor of Physics at Cornell University, set up an open repository called arXiv.org (pronounced ''archive'') for physics articles awaiting journal publication. In the biosciences, a new publishing model, Open Access publishing, was brought to life in 2000 through the establishment of the Public Library of Science, PLoS. PLoS publishes scientific articles by charging the authors the costs upfront, typically about $1300 per article, and making the published papers available on the web for free. The PLoS model has been extraordinarily successful, gaining in prestige and publishing more articles today than any other scientific journal.
The U.S. government has joined in this movement toward openness in scientific literature. In 2009 the National Institutes of Health (NIH) began requiring all published articles arising from research it funds to be placed in the publicly accessible repository PubMed Central within 12 months of publication. In January of 2011, President Obama signed the America COMPETES Reauthorization Act of 2010. This bill included two key sections that step toward the broad implementation of Open Access mandates for scientific research. The Act required the establishment of an Interagency Public Access Committee to coordinate dissemination of peer-reviewed scholarly publications from research supported by Federal science agencies, and it directed the Office of Science and Technology Policy in the White House to develop policies facilitating online access to unclassified Federal scientific collections. As a result, on November 3, 2011 the White House announced two public requests for information, on ''Public Access to Peer-Reviewed Scholarly Publications Resulting From Federally Funded Research'' and ''Public Access to Digital Data Resulting From Federally Funded Scientific Research.'' As this chapter goes to press, the Office of Science and Technology Policy at the White House is gathering plans to enable Open Access to publications and to data from federal funding agencies.
These events indicate increasing support for the public availability of scientific publications on the part of both regulators and the scientists who create the content. The paradoxical publishing situation of sustained high charges for content generated (and subsidized) for the public good came about in part through the scientific norm of transparency. As mentioned earlier, establishing a scientific fact is difficult, error-prone work. Researchers must convince skeptics that they have done everything possible to root out error, and as such must expose their methods to community scrutiny in order to flush out any possible mistakes. Scientific publication is not merely an exercise in informing others of new findings; it is an active dialog designed to identify errors and maximize the integrity of the knowledge. Scientific findings and their methodologies that are communicated as widely as possible have the best chance of minimizing error.
Scientific knowledge could be spread more widely, more mistakes caught, and the rate of scientific progress improved. Scientists should be able to share their published articles freely, rather than remitting ownership to publishers. Many journals have a second copyright agreement that permits the journals to publish the article, but leaves copyright in the hands of the authors. We are in need of a streamlined and uniform way of managing copyright over scientific publications, and also copyright on data and code, as elaborated in the next section.
-  The Science Insider: news.sciencemag.org/scienceinsider/budget_2012/
-  U.S. Const. art. I, §8, cl. 8
-  Association of American Publishers Press Release: publishers.org/press/56/
-  See blogs.plos.org/plos/2011/11/plos-open-access-collection-%E2%80%93-resources-to-educate-and-advocate/ for a collection of articles on Open Access
-  See plos.org/publish/pricing-policy/publication-fees/ for pricing information
-  See scholarlykitchen.sspnet.org/2011/06/28/plos-ones-2010-impact-factor/ for recent impact factor information
-  PubMed Central: ncbi.nlm.nih.gov/pmc/
-  America COMPETES Reauthorization Act of 2010: gpo.gov/fdsys/pkg/BILLS-111hr5116enr/html/BILLS-111hr5116enr.htm
-  See whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research
-  Unsurprisingly, the journal publishers are not so supportive. Just before the 2011 winter recess, House Representatives Issa and Maloney introduced a bill that would do enormous harm to the availability of scientific knowledge and to scientific progress itself. Although no longer being considered by Congress (support was dropped the same day that publishing giant Reed-Elsevier claimed it no longer supported the bill), the ''Research Works Act'' would have prohibited federal agencies and the courts from using their regulatory powers to make scientific articles arising from federally funded research publicly available.
-  See for example Science Magazine's alternative license at sciencemag.org/site/feature/contribinfo/prep/lic_info.pdf (last accessed January 29, 2013)