
Tuesday, February 25, 2014

Why would any scientist fuss over making their data public and accessible?

Well, colour me naive. When PLoS announced their new data archiving policy a few days ago, I hardly felt like it was a "big deal". Providing a public archive of the raw data used in a published study seems like a no-brainer (except in very limited circumstances involving medical records, location data for at-risk species, etc.), and is becoming standard practice, right? Clearly my naivete knows no bounds, given some of the really negative reactions to it (here is one example, and the issue is discussed a bit more broadly here).

In the fields where I am most active (at the intersection of genomics, genetics and evolution), there have been numerous recent (and successful?) efforts to make sure the data associated with studies are archived and made publicly available in repositories. While the repositories are not perfect, data archiving seemed to be working and generally useful, and I was always happy to archive my own data. Yes, getting the data and meta-data formatted for NCBI GEO (or SRA) could be annoying at times, but the day spent making the data available to other researchers long term was a small effort relative to everything else: collecting and analyzing the data, writing the manuscript, and getting it accepted for publication. Other scientists have always sent me reagents and (when they could find it) data, so this seemed like an easy way to be helpful and in line with the scientific process (and hopefully progress).

More importantly, I have tried to get data from other researchers over the years (with "old hard drive failures" always seeming to be the reason why the data could not be made available to me). I have also recently been involved with a large meta-analysis of previously published data. Rarely were the raw data available, and because only summary statistics were available (rarely with associated measures of statistical uncertainty), we were very limited in what analyses we could do. There would be so much more we could do if the raw data had been available.

 So, I do not want other researchers to have to deal with these frustrations because of me. By archiving data generated by me or members of my lab, other researchers can get it without hassling me, and I do not have to worry about finding it at a later date (like 10 years down the road), when it might take far more time to recover than it would have taken to put it in a repository in the first place.

 When most of the journals in evolutionary biology simultaneously started a data archiving policy (generally associated with DRYAD) a few years ago, I was quite happy. Not only did I put data from new studies up in DRYAD, but also data from my older studies (see here). I naively expected most evolutionary biologists to do the same. After all, there are many long-term data sets in evolutionary biology that would be of great value, in particular for studies of natural selection and for estimating G matrices, where there is still much active methodological development. Some of those publications were based on data sets that required heroic efforts to generate, and they would be a huge community resource.

 So I was a little surprised when DRYAD was not rapidly populated with all of these great legacy datasets. I think that folks "hoarding" data are a very small minority; the majority of folks are just very busy, and this does not seem like a pressing issue to them. In any case, I have also spent some effort at my institution (Michigan State University) discussing with students the importance of data archiving. All of the benefits seem obvious: making our science more open, and making our data available to those who may be able to address interesting and novel questions in the future. Fundamentally, it is the data and the analysis (and interpretation) that represent much of the science we do. Our scientific papers are a summary of this work. Better to have it all (data, analysis and interpretation) available, no?

So, when PLoS made the announcement, this seemed like par for the course in biology. Funding agencies are mandating data management and sharing plans, and other journals are doing the same. So who could be either shocked or dismayed by this?

Like I said, I can be naive.

Even after reading (and re-reading) the posts (above) and discussion threads about concerns, I am still baffled. Yes, there will be a few "corner cases" where privacy, safety or conservation concerns need to be considered. However, for the vast majority of studies in biology this is not the case, and in many of those situations the data can be stripped of identifiers or variables to alleviate such issues.

So what's the problem?
Does it require a little work on the part of the authors of the studies? Perhaps. However, I always remind folks in the lab that the raw data they generate, and the scripts they use for analysis, will be made available. To my own benefit, I find the scripts end up much easier to read. Furthermore, keeping these issues in mind makes it that much easier to get everything organized for archiving. The readme files we generate make sure we do not forget what variable names mean, or other such details. Handling data transformations or removing outliers in scripts means we can always go back and double check the influence of those observations.
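To make that last point concrete, here is a minimal sketch (in Python, and not our actual pipeline) of what I mean by handling outliers in a script rather than by hand; the file and column names are invented for illustration.

```python
# Minimal sketch: flag outliers in a script rather than editing the raw file
# by hand, so every exclusion is documented and reversible.
# File and column names below are hypothetical.
import pandas as pd

RAW = "raw_measurements.csv"        # hypothetical raw data file (archived as-is)
OUT = "processed_measurements.csv"  # derived file used for downstream analysis

df = pd.read_csv(RAW)

# Flag (do not delete) observations more than 3 SD from the trait mean.
trait = "wing_length_mm"            # hypothetical trait column
z = (df[trait] - df[trait].mean()) / df[trait].std()
df["outlier_flag"] = z.abs() > 3

# The raw file is never modified; the derived file records which rows were
# excluded, so their influence can be re-checked later.
df.to_csv(OUT, index=False)
print(df["outlier_flag"].sum(), "observations flagged as outliers")
```

The design choice that matters is that exclusions are flagged, not deleted, so anyone re-analyzing the archived data can see exactly which observations were set aside.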

In their post, DrugMonkey suggests that the behavioural data they generate in their lab is too difficult to organize in such a fashion as to be broadly useful. While I agree that raw video (if that is what they are collecting) remains difficult (although perhaps figshare could be used, which is what we will try for our behavioural trials), we find that the text files from our "event recordings" are very easy to post, organize and generate meta-data for. Does the data need to be parsed from its "raw" format into one more useful for analysis? Sure, but we also supply (as we do in our DRYAD data packages) the scripts to do so. Perhaps there is something I am missing about their concern, but I do not concede their point about the difficulty of organizing their data for archiving. How hard should it really be to make such files useful to other experts in the field?
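For what it is worth, the parsing scripts we include alongside the raw files are usually not much more involved than this hypothetical sketch; the input format (tab-separated timestamp, individual ID and behaviour code) and the file names here are invented for illustration.

```python
# A hypothetical sketch of a parsing script included with a data package:
# it converts plain-text "event recording" logs (one event per line) into a
# single tidy, analysis-ready table, leaving the raw files untouched.
import csv
import glob

rows = []
for path in glob.glob("event_logs/*.txt"):        # hypothetical directory of raw logs
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):  # skip blanks and comment lines
                continue
            timestamp, individual, behaviour = line.split("\t")
            rows.append({"source_file": path,
                         "time_s": float(timestamp),
                         "individual": individual,
                         "behaviour": behaviour})

# Write one analysis-ready CSV; the raw .txt files stay as archived.
with open("behaviour_events_tidy.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=["source_file", "time_s",
                                             "individual", "behaviour"])
    writer.writeheader()
    writer.writerows(rows)
```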

Even for simulations, we supply our scripts, configuration files, and sometimes the data generated from the simulation (to replicate key figures that would take too long to reproduce by rerunning the whole simulation).
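Here is a toy sketch of that idea, assuming a made-up simulation: the configuration (including the random seed) is written out alongside the summary output needed to redraw a key figure, so a reader can either regenerate the plot directly or rerun the whole thing from the same settings. The parameters and file names are illustrative only.

```python
# Toy sketch: archive the configuration (with the seed) and the summary
# output used for a figure, so the figure can be reproduced without
# rerunning a long simulation. All values here are placeholders.
import json
import random

config = {"seed": 42, "n_generations": 1000, "pop_size": 500, "mutation_rate": 1e-4}

with open("sim_config.json", "w") as fh:   # archived alongside the script
    json.dump(config, fh, indent=2)

random.seed(config["seed"])

# Stand-in for a long-running simulation: record one summary value per generation.
trajectory = []
value = 0.5
for _ in range(config["n_generations"]):
    value += random.gauss(0, config["mutation_rate"])
    trajectory.append(value)

# Save the summary used for the key figure.
with open("sim_summary.csv", "w") as fh:
    fh.write("generation,value\n")
    for gen, v in enumerate(trajectory):
        fh.write(f"{gen},{v}\n")
```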

It is always worth reminding ourselves (as scientists) of this quote (attributed to Sir William Bragg):

"The important thing in science is not so much to obtain new facts as to discover new ways of thinking about them"

Thursday, October 24, 2013

My thoughts for the panel on "open access and the future of scholarly publishing"

On Tuesday, as part of Open Access Week, I participated in a panel: "Publishing, Authoring, and Teaching in the Evolving Open Access Environment: A Panel Discussion". While this is not a word-for-word write-up, it is more or less the gist of what I said.

When I was asked whether I would be willing to participate in a panel discussion here at Michigan State University on the role of open access journals and the future of academic publishing, I said yes. While I am not convinced I am particularly knowledgeable about it, I thought that it provided an opportunity to collect my thoughts into a manifesto of sorts: "how I communicate science, and why I do it that way".

While I do tweet and blog about aspects of open science, including open access publishing, I am not one of the most outspoken advocates, and only a moderate practitioner. I publish, review and edit in/for open access journals, but not exclusively. I continue to publish in many "subscription journals" that represent the journal of record for my field, or those with some inferred "prestige". I do regularly discuss issues around open science, including open access publishing, with many folks, but as you will see I am not sure where I fall on it.

I work in the basic life sciences, at the interface of evolutionary biology, genetics and genomics. The norms of scholarly communication differ substantially from field to field: what is considered productive scholarship, books vs. articles, authorship, and a host of other issues. Even within the natural sciences, scholarly communication differs between biology and, say, physics. So my experience and understanding remain narrow, and I claim no expertise.

I think that the future of scholarly publishing will be open access, in some shape or form. That is, the majority of published manuscripts will eventually be freely available to anyone with internet access. How do we get there? I have no idea. Will this be due to broad mandates from funding agencies and universities to deposit manuscripts into repositories? Will journals generally agree to make content freely available after a fixed amount of time (6-12 months), so-called green open access? Or will gold open access become the norm, where authors pay to have work reviewed or published? Likely a combination of these and other approaches, but I am not good at such guesswork.

So why do we care about OA in the first place?

Several reasons.  

This has been discussed by many before, so my thoughts on this are brief. For more detailed thoughts, check out Peter Suber's book "Open Access".

If you happen to be on the MSU library site and click through to the faculty page, you will see on the right-hand side links to a number of things, including "Crisis in Scholarly Communication". The discussion on those pages is about increasingly difficult access to scholarly publications. The basic reason is that while academic library budgets tend to be relatively flat, the cost of subscriptions to academic journals continues to increase very rapidly, often because many of the subscription (i.e. for-profit) publishers bundle their journals. So if you want journal A, you also have to subscribe to the x (pick a country) journal of y (sub-field) of z (pick an organism).

Why should scholarly work be behind a paywall, and thus inaccessible?

In particular for scientific (and medically relevant) work, it could benefit researchers, patients and doctors who would otherwise not have access. Open access allows the whole public to look at the research if they so choose. This also removes one small barrier in the perception of the ivory tower, and helps rebuild some trust with the public (more on this later).

Who is paying for the research? Generally not the publisher making money from the paper.

 At least in the sciences, research is usually paid for by grants from federal agencies, and salaries are paid from those grants or from the university (such as MSU), which is funded in part by state allocations and tuition dollars. The manuscript is then reviewed by referees, usually for free (as part of our scholarly role), under the direction of a scientific editor (also not usually a paid position, at least for associate editors). Under the current system most referees get nothing (neither money nor any other incentive) for this essential service; their pay comes from their institution and does not depend upon them performing it. The publisher may maintain the electronic system that shuttles the draft manuscript to the referees and, if the paper is accepted, performs copyediting and typesetting. There are exceptions to the rule (I have had absolutely excellent editorial advice on both the writing and the communication of the science from the managing editor of a recent paper in Trends in Genetics, for instance), but this has not generally been the case for me.

Thus the publisher is making a great deal of profit, despite having only done a fraction of the work. They (not the authors) retain the copyright on the work. This is potentially a big problem.

How I got into this

I will tell you how I got engaged with the ideas of open access publishing, as a small part of the larger endeavor to make science in academia more open, transparent, and reproducible, and to communicate scientific ideas and data more quickly and effectively. But I will also describe many of the remaining stumbling blocks that relate to views of open access journals specifically, and the nefarious concept of prestige in publishing and how that influences hiring, grants, and promotion.

The crisis in scholarship is much bigger than open access.

Why is there a crisis?

While open access to published work is certainly a factor in the crisis in scholarly communication, and the one I will speak the most about today, it is not the only factor (there are also issues with peer review, reproducible research, sharing of data and code, etc.). There has been a slew of articles, including a few in the Economist and the Guardian in the last few weeks, on aspects of this crisis. Essentially it is argued that scientific research is in a tailspin: there is no effort to do quality science, and everything is about quantity of papers and prestige (i.e. spin), with little effort to make sure the work is valid, well reviewed or replicated. That is, the incentive system for scientists is completely out of whack with the process of good science.

Is there a crisis? I am a skeptic and a cynic, so I think there are some real concerns. However, the little optimistic voice in my head also points out that there are great opportunities not only to help resolve this crisis, but to make science and scholarly research better, and far more dynamic. Indeed there is a vocal and active community trying to make this all much better.

Before I delve into the specifics about the need for open access (and what might be stopping some of us from diving into it completely), I want to speak about the larger crisis of the scholarly enterprise in general, and how open science initiatives can help fix it. In my mind this comes down to an issue of trust. Trust between collaborators. Trust between scientists working in the same field. Trust between researchers at different stages of their careers (graduate students, post-docs and PIs). Finally, trust between scientists and the general population. This is not just about how the public perceives scientists (and scholars in general) and the work they do, but about whether we do our science in an open way that supports the appropriate self-correcting mechanisms. Even beyond the large number of anti-science and anti-intellectual movements out there, there has been a substantial loss of trust in how scholars operate, and in what motivates us.

Consider the large number of research articles that get picked up by the popular press about "cures" or genes for "this, that and the other", only for such research to be shown largely (or entirely) incorrect a few months later. Combine this with the many news articles that point out the lack of repeatability of scientific studies, or examples of scientific fraud and misconduct, and our lay audience (those who ultimately help pay for our research and salaries) is perhaps becoming quite skeptical. The lack of public access to the scientific literature due to paywalls (from subscription-based journals) is simply another large nail in the coffin for the trust that scientists and science communicators have been trying to build with the public.

Frankly, our motives are questioned, and not just by the public at large. They are also questioned by our graduate students. As undergraduates (or from watching nature shows) they see this amazing, wonderful universe to study, but then come to do research as graduate students and realize that a business model has taken hold, with a culture of "scientific stardom" being the goal for many. Worse than that, they see a perverse incentive system where quantity of publications and the prestige of where articles are published have taken over, and the overall quality of scientific research is the perceived victim (has this been evaluated?). There are many folks to blame, including university administrators, the "high prestige" journals, etc., but we first need to look at ourselves (practicing scientists) for accepting and adopting this system as it has developed into the status quo.

So how do we fix it?

Let's think about the aspects of open science. Not only do we need to communicate what we do more effectively, but we need to make everything far more transparent.

Open Science

Much has been written about open science and the open science movement. It has many goals, but I would say the two most important are to increase the transparency of the scientific process and to speed up (and open up) science communication. There are many aspects to this, going all the way from open "lab notebooks" (here and here), to submitting pre-prints (papers prior to peer review and formal acceptance at a journal) to repositories to speed up science communication, to increasing openness in peer review so that all can see the comments of the peer referees and editors. This can include pre- and post-publication review (PubPeer, PubMed Commons). It also includes sharing the raw data associated with research papers, as well as all of the details of how the analysis was performed (the computer code associated with it). It all comes down to the fact that the published paper is not the science itself, but a precis (or an advert) for the actual scientific work. All of the data, the work and even the peer review process itself are part of the scholarship of science. Making some or all of this available will not only help with transparency, but will speed up the scientific process. Having access to the raw data may also help to answer all sorts of new and interesting questions. I have always loved this quote by Sir William Bragg:

"The important thing in science is not so much to obtain new facts as to discover new ways of thinking about them"

Open Access and the entanglement of scholarly publishing with prestige and other incentives

Crossroads for publishing

At least in the life sciences, it is clear that publishing articles in "high profile" journals like Nature, Science, PNAS (and a few others) can make or break a career. The prestige associated with such articles can trump many things. Having such publications can open many doors in terms of jobs, grants, tenure, invited talks and more. Indeed, just last week articles were written in Nature about this phenomenon, the so-called "golden club". Not surprisingly, most of the traditional journals with such cachet are subscription-based, although there are at least two open access journals that are certainly up there (PLoS Biology and eLife).

Since most of the prestigious journals are subscription-based, and there are such strong incentives to publish in them, it is very difficult for many researchers to move to publishing in open access journals (although they can still submit papers to institutional or disciplinary repositories). If I had the opportunity to publish in Nature or Science, would I? Yes, precisely because I know that having such a publication will open doors and aid in getting grants, promotions and raises. While I strongly support open access, I am frankly not above such concerns. Some of this may simply reflect my petty need for external validation of my science (which I can get over), but grants and raises potentially influence the quality of my life and my work. It is hard to pass that up.

Perceptions of open access journals

There remains a common misperception that many open access journals are nothing more than predatory or "vanity" publications with little or no rigorous peer review. A recent "sting" by the science journalist John Bohannon in Science has done little to help this perception. Too much has already been written about this article, mostly highly critical of his methods, biased sampling approach and lack of a control group. While it was presented as a news piece (not a scholarly article), Bohannon has stated on several occasions that his original plan was to submit it to PLoS One (an open access journal with peer review), so this remains an issue.

As Peter Suber (among many others) describes in his book, there are currently two models of open access. The first (so-called green OA) means that while papers may be published in subscription-based journals, a free version of the accepted manuscript (usually without the final copyediting and typesetting provided by the journal) is placed in a repository such as PubMed Central or an institutional archive. Often this "free" version has a 6-12 month delay before being released. Since this model of green OA is a required stipulation for projects funded by organizations such as the NIH, some journals (where the majority of authors are funded by such agencies) are now simply making all of their content open access after a 12-month embargo.

The other major model for open access publishing is gold open access. In this case, once a manuscript has gone through peer review by expert referees and academic editors and has been accepted, the authors are charged a fee for typesetting and (usually online) publication. Thus the model is that the authors, not the readers, are charged. In the life sciences the funds for this usually (more on this in a second) come from granting agencies, although many journals have fee waivers (or no fees at all).

The concern, of course, is that this may create a perverse incentive system, such that journals would increase their acceptance rate to increase profit (~$1-2K per accepted paper). Thus the rigor of peer review could be negatively impacted, resulting in so-called "vanity publications" that have the veneer of scientific rigor and peer review, but in fact lack both. Couple this with so-called predatory (scam) journals, which are much like other scam spam. But before open access journals existed, such vanity journals already existed among the subscription models. And as the Bohannon sting has shown us, journals published by well-known publishers like Elsevier are not above being "stung" by accepting faux articles with obviously flawed methodologies as well. Beyond that, somewhere on the order of 70% of all open access journals have no author-side fees.


Despite this issue, and the existence of predatory and possibly vanity journals (such as many of those found on Beall's list), the Open Access Scholarly Publishers Association has a code of conduct aimed at distinguishing reputable scholarly journals from predatory ones. From my perspective, the journal PLoS One, which in many ways represents the flagship of open access journals (peer review based entirely upon the technical soundness of the experiments and interpretation, not upon subjective assessment of novelty), was noted in the sting for how thorough its review process was (and it rejected the paper). The other worthwhile point is to take a look at the re-analysis done by Brian Wood. It seems the one thing journal impact factor might be useful for is predicting whether a reasonable amount of peer review will take place.

It is also worth pointing out that problems with insufficient peer review occur at subscription-based journals as well, and there is a history of these so-called vanity journals even among subscription publishers. In addition, many journals with subscription-based models have page and figure charges that the authors must pay, so some of the same incentives also apply to these journals. In my own personal experience, those page charges end up being about the same as the cost of publication in open access journals. So many of the charges against OA journals can equally be leveled against subscription journals.

Why have I not embraced open access completely?

It is probably clear from my perspective on all of this that I am firmly in favour of open access models of publishing, and, as I stated from the outset, I do think that this is where everything is heading, although by what model I am not sure.

 So given all of this, why don't I publish exclusively in open access journals? Well, there are two reasons, or possibly one reason arising from two different parts of my mind.

The first relates to "establishment" journals. In my field, there are several well-regarded journals that have persisted for a very long time, some for close to a century, such as Genetics. Publishing my work in journals such as Genetics or Evolution means that: A) it has a natural readership; B) these are the same journals that shaped my understanding of the field during my intellectual development, so I have a fondness for them; C) while they may not have the cachet of Nature, Science and the like, there is no doubt that in my field they are considered well-regarded journals; and D) they also represent the journals of my professional societies, which I actively support and promote above and beyond their role in scholarly communication.

As for the second set of reasons... I am not sure I am willing to be a martyr. In other words, I may be acting with a great deal of cowardice. I have been an editor at PLoS One for many years, and I stand by the rigor of the reviews by my referees and myself, yet there is no doubt that many in the community still believe that it (as a journal) accepts anything. If I choose to publish all of my work (and that of my students and postdocs) in such journals, I risk losing readership. If such views are held by university administrators, I risk losing salary raises, promotions and grants (depending on the panel).

Thus until the incentive system has changed, and this can only change through a concerted effort by university administrations, grant program officers and well-established scholars in each of our fields embracing such changes, many researchers like myself will continue using this screwed-up system because of the incentives, risking further erosion of public trust. Is my halfway attitude a cop-out? Yes. Some horrible mix of rationalization, cowardice and avarice, I suppose. Am I likely to change my behaviour? Probably not until my mortgage is paid off and my kids have finished university.


Monday, October 14, 2013

Fallout from John Bohannon's "Who's afraid of peer review"

As many, many scientists, librarians and concerned folk interested in scientific publishing and the state of peer review are aware, the whole 'verse' has been talking about the "news feature" in Science by John Bohannon entitled "Who's afraid of peer review?".

The basis of the article was a year-long "sting" operation on a "select" group of journals (that happened to be open access... more on this in a second), focusing in part on predatory/vanity journals. That is, some of the journals had the "air" of a real science journal, but in fact would publish the paper (any paper?) for a fee. Basically, Bohannon generated a set of faux scientific articles that at first (and superficial) glance appeared to represent a serious study, but upon even modest examination it would be clear to the reader (i.e. reviewers and editors for the journal) that the experimental methodology was so deeply flawed that the results were essentially meaningless.

Bohannon reported that a large number of the journals he submitted to accepted the article, clearly demonstrating insufficient (or non-existent) peer review. This, and the headline, apparently led to a large amount of popular press coverage and many interviews (I only managed to catch the NPR one, I am afraid).

 However, this sting immediately generated a great deal of criticism, both for the way it was carried out and, more importantly, for the way the results were interpreted. First and foremost (to many), ALL of the journals that were used were open access, and thus there was no control group of journals with the "traditional" subscription-based model (where libraries pay for subscriptions to the journals). In addition, the journals were sieved to over-represent the shadiest predatory journals; that is, it was not a random sample of open access journals. One thing that really pissed many people off (in particular among advocates of open access journals, but even beyond this group) was that Science (a very traditional subscription-based journal) used the summary headline "A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals.", clearly implying that there was something fundamentally wrong with open access journals. There are a large number of really useful critiques of the article by Bohannon, including ones by Michael Eisen, the Martinez-Arias lab, Lenny Teytelman, Peter Suber, and Adam Gunn (the last including a list of other blogs and comments about it at the end). There is another list of responses found here as well. Several folks also suggested that some open access advocates were getting overly upset, as the sting was meant to focus on just the predatory journals. Read the summary line quoted above, as well as the article itself, and decide for yourself. I also suggest looking at some of the comment threads, as Bohannon joins in on the comments on Suber's post, and many of the "big" players are in on the discussion.

A number of folks (including myself) were also very frustrated with how Science (the magazine) presented this (and not just the summary line): making the "sting" appear to be scientifically rigorous in its methods, but then turning around and saying it was just a "news" piece whenever any methodological criticism was raised. For instance, when readers commented on both the lack of peer review and the biased sampling of journals used for the "sting" operation in Bohannon's article, this was the response from John Travis (managing editor of News for Science magazine):



I was most interested in the fact that Science (the journal) had an online panel consisting of Bohannon, Eisen and David Roos (with Jon Cohen moderating) to discuss these issues. Much of it (especially the first half hour) is worth watching. I think it is important to point out that Bohannon suggests he did not realize how his use of only OA journals as part of the sting operation would be viewed. He suggests that he meant this largely as a sting of the predatory journals, and that if he did it again he would have included subscription-based journals as a control group. You can watch it and decide for yourself.

The panelists also brought up two other important points that seem to not get discussed as much in the context of open access vs. subscription models for paying for publication or for peer review.

First, many subscription-based journals (including Science) have page and/or figure charges that the authors of a manuscript pay to the journal. As discussed among the panelists (and I have personal experience paying for publication of my own research), these tend to be in the same ballpark as the fees for publishing open access papers. Thus the "charge" that the financial model of OA journals would lead to more papers being accepted applies to many of the subscription journals as well (in particular those that are entirely online).

Second (and the useful point to come out of Bohannon's piece) is that there are clear problems with peer review being done sufficiently well. One suggestion made by both Eisen and Roos (and suggested many times before) is that the reviews provided by the peer referees and the editor could be published alongside the accepted manuscript (or as supplemental data on figshare), so that all interested readers can assess the extent to which peer review was conducted. Indeed there are a few journals which already do this, such as PeerJ, EMBO J, eLife, F1000 Research, Biology Direct and some other BMC-series journals (see here for an interesting example), Molecular Systems Biology, and the Copernicus journals. Thanks to folks on twitter for helping me put together this list!

 This latter point (providing the reviews alongside published papers) seems so trivial to accomplish, and the reviewers' names could easily remain anonymous (or they could provide their names, gaining a degree of academic credit and credibility within the scientific community) if so desired. So why has this not happened for all scientific journals? I am quite curious about whether there are any reasons NOT to provide such reviews.

Sunday, June 17, 2012

Where to publish.... Is there still a role for society journals?

Many folks who follow scientific blogs, or science in the media, are probably aware of the recent uptick in discussion of the role of publishing companies in the dissemination of scientific papers. While the Open Access movement (in which access to scientific papers is free to all, at least those with access to the internet) has been steadily gaining steam over the past decade or so, there has also been an effort to make the publications associated with any publicly (via government sources) financed scientific work available to the public. The logic is essentially that since funds from public coffers made the work possible, the summary of the results (as presented in a scientific paper) should be available at the very least to those who paid for it (if not to everyone on the planet, in keeping with the best scientific traditions). Publishers are still able to recoup their costs, and some profit, via page charges and library subscriptions during an initial phase of "paid access only". In my own estimation, this is pretty sound logic. Of course, I am a Canadian....

The reason this effort is necessary is that most scientific journals are published via commercial publishers, and access to many of the papers in these journals is behind a paywall. If you are at an institution that pays for it, you are granted access; otherwise you need to shell out $$. This does not seem to make much sense given that public funds were already used for the work itself. Several years ago the NIH created a policy whereby published papers associated with research performed with NIH funds must be made publicly available within twelve months of publication. The reason you may have heard about some or all of this during the past six months was the "Research Works Act", which attempted, but failed, to get rid of this policy (and others that might come to fruition in the future). The retraction of the proposed legislation was in no small part due to a very irate scientific community and general public (including a well-publicized boycott).

 I will in the future add some links to the great discussions available about this. However, I think a few places (here, here and here) sum it up well. Indeed, in addition to damaging the reputation of one publisher in particular, Elsevier, it has generated a great deal of energy and discussion within the scientific community about the role of publishing companies in disseminating scientific papers, and more generally about how to open up the scientific process. My friend has a nice review of the open science model, and I also recommend looking at sites like ResearchGate for an interesting experiment in combining science and social networking. There are also some interesting points to discuss about how the movements for open access to publishing, data and reproducible research seem to have not connected well, despite some obvious shared goals (access to the raw scientific data, the analysis used with the data, and the published summary of the findings associated with the data). However, that will need to wait for another post.

However, my point for this post is somewhat different. It is easy to generate a caricature of the publishing companies as greedy corporate profiteers who use free labour (in the form of reviewers and editors of scientific papers), often charge scientists "page charges" (to copyedit and format the manuscript for publication), and then charge again for the finished product (to university libraries, and the public at large). Certainly a number of companies have lived up to this caricature. The open access publishing movement in its various forms (examples include journals by PLoS, Frontiers, BMC and the new PeerJ) is producing many new journals that counter these issues (well, the third issue, which is really what gets most people upset). I serve in an editorial role for several journals from these publishers (PLoS One and Frontiers in Genetics), and I am in general a strong supporter of them. But....

One nagging concern (other than a Hollywood-style event that simultaneously destroys all of the hard drives in the world, making all of this work vanish) is the fate of "society journals". Most scientific disciplines are backed by a society of researchers working on (often loosely) related research questions. For my own work, two of the societies I belong to are the Genetics Society of America (GSA) and the Society for the Study of Evolution (SSE). Many scientists think that their societies' primary roles are to A) organize a big annual scientific meeting, and B) publish the "journal of record" for the field, where scientific advances are summarized in publication. Again, in my field those journals would be "Genetics" and "Evolution" respectively (I know, very surprising names given the fields). However, in addition to these two roles, many societies take on others, like public outreach and education, and lobbying on behalf of scientists. Until recently, most small organizations did not have the ability to copy-edit, typeset and publish journals by themselves, so they partnered to varying degrees with private publishing companies. Now, of course, any computer-savvy individual can do all of this.

I have absolutely no idea how the proceeds/profits are split, and how much the scientific society receives (as compared to the publishing company). What I am wondering (and simply do not have any answers to) is how, assuming scientific publishing moves largely to Open Access, scientific societies will fund themselves. Do journals like Genetics and Evolution have a plan to transition to pure Open Access? Do they have a model to sustain themselves? I am a strong supporter of my societies, and I think that a lot of harm would be done if they vanished. At the same time, I am in complete support of the OA movement, and think it is likely the future. I have spoken with a number of people about this casually, most notably Michael Eisen, who has blogged a great deal on the need to move entirely to Open Access. However, so far, I have not heard of any real mechanism for this. Any ideas?