Genes Gone Wild: 2013

Monday, November 11, 2013

pictures in my head: What is that on the wing of the fly? What does it tell us about adaptation?

Over the past week or so, an absolutely amazing image has made the rounds on the internet of a fly (Goniurellia tridens) with markings on its wings that remind many viewers of ants.

Goniurellia tridens is a 3-in-1 insect [photo: Peter Roosenschoon] pic.twitter.com/i8ThAOkrvN
— Ziya Tong (@ziyatong) November 4, 2013

As described in these blog posts and articles about it (Anna Zacharias, Jerry Coyne, Morgan Jackson, Andrew Revkin, Joe Hanson, also here and here), the assumption is that the fly uses these images to mimic an ant (or more likely a spider; more on this below) and so ward off potential predators. However, there has been relatively little discussion of the context in which the fly uses them (but see Morgan Jackson's post), or any demonstration of their adaptive utility. As pointed out by many evolutionary biologists, and discussed in detail by Gould and Lewontin in one of the most famous papers in evolutionary biology (The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptationist Programme), it is easy to make up a "just so" adaptive story, but as scientists we need to perform critical experiments demonstrating the adaptive utility of this picture on the wing.

As numerous commenters on the blogs and on twitter have pointed out, this fly is part of the family of true fruit flies (Tephritidae), which includes several species that are known to startle jumping spiders (causing them to make a short retreat). This retreat is likely because the flies have evolved to mimic aggressive behaviours of the spiders themselves. This work was initially described over 25 years ago in a pair of papers in Science (one by Erik Greene, Larry Orsak and Douglas Whitman; the other by Monica Mather and Bernard Roitberg). These papers beautifully demonstrate the adaptive utility of markings on the wing combined with a rowing action of the wings that could achieve this mimicry. Neither the markings on the wings nor the rowing behaviour alone was sufficient to induce the aversion behaviour in the spiders (the retreat). Indeed, those of us who took biology courses in university in the early to mid 1990s probably remember being taught this example. What's more, it seems to be fairly widespread among species in this family of flies (each research group used a different species of Tephritid fly and of spider). Another paper (Oren Hasson 1995) tested about 18 different species of jumping spiders with the medfly (also a Tephritid), and showed that most spiders responded with the retreat as well. This suggests that this adaptive combination of wing morphology and behaviour is probably pretty ancient.

Here I want to show you a video of a picture-winged fly, with a jumping spider. This fly is from a totally different family of flies (the picture-winged flies Ulidiidae (formerly Otitidae)) than the ones discussed above (Tephritids), but apparently does the same thing to startle jumping spiders (as a way of escaping being eaten) as the true fruit flies.

A few years ago, when I was hosting a lab bbq in my backyard, we were lucky enough to get to watch the intricate little behavioural "routine" between a fly and a jumping spider (in this case the bold jumping spider, Phidippus audax). The spider approached the fly and got into its attack posture, the fly did its "wing rowing" display, the spider "retreated" (took a short jump back), and the fly took off, successfully evading getting eaten. Not too shabby, plus how often do you get to watch this for real!

Two years ago I got to watch this happen again, and this time I happened to have some collecting vials. So I collected the flies! I then put the fly in a small dish with a jumping spider (the zebra spider Salticus scenicus) so I could get some simple video of it. Here it is in all its grainy, low quality glory.

Given that I am not a great entomologist, I sent a picture of the fly off to a colleague (Jim Parsons, our collection manager in the MSU Entomology department), and he pointed out to me that this was not a true fruit fly (Tephritid) at all, but a picture-winged fly (from the family Ulidiidae). This particular fly is called Delphinia picta.

This was clearly really exciting, as it shows the potential for a whole other group of flies demonstrating a similar set of anti-predation behaviours. While both of these families belong to the same super-family, their last common ancestor probably lived about 75 million years ago (give or take several million years). Is this an example of two different groups of animals independently adapting the same way (convergence) to a similar selective pressure (not getting eaten)? Or is it an adaptation that has survived for millions of years across many species? Finally, the possibility exists that some aspects of the behaviours and wing spots allow this to evolve as an anti-predator adaptation over and over again (parallelism). Whatever it is, it suggests that something even deeper and cooler has happened in evolution, and it will be great to figure this out (hint to new graduate students seeking projects!). As my colleague Rich Lenski mentioned to me (when I showed him this video), it also makes one think carefully about the appropriate "null hypothesis" regarding putative adaptations!

In my lab, one of the things we study is the fly Drosophila melanogaster, and how it evolves in response to potential predators, including jumping spiders. Drosophila is the little fly that you used in high school or university biology. Many call it a fruit fly, even though it isn't (pomace fly and vinegar fly are both used as its common names). In Drosophila we have never observed this kind of behaviour at all. However, Drosophila does display a pretty wide range of behaviours, and we are writing up a paper about it right now. For a taste of some of it, check out my graduate student's poster over on figshare describing some of the behaviours.

Let me know if you want more, and maybe I can post some additional video. However, to whet your appetite here is another related video that we posted a while ago to youtube (of flies with a mantid). The action starts at about 2:30 into the video.

Thursday, October 24, 2013

My thoughts for the panel on "open access and the future of scholarly publishing"

On Tuesday, as part of open access week I participated in a panel "Publishing, Authoring, and Teaching in the Evolving Open Access Environment: A Panel Discussion". While this is not a word for word write-up, this is more or less the gist of what I said.

When I was asked whether I would be willing to participate in a panel discussion here at Michigan State University on the role of open access journals and the future of academic publishing I said yes. While I am not convinced I am particularly knowledgeable about it, I thought that it provided an opportunity to collect my thoughts, a manifesto of "how I communicate science, and why I do it that way".

While I do tweet and blog about aspects of open science, including open access publishing, I am not one of the most outspoken advocates, and only a moderate practitioner. I publish, review and edit in/for open access journals, but not exclusively. I continue to publish in many "subscription journals" that represent the journal of record for my field, or those with some inferred "prestige". I do happen to regularly discuss issues about open science, including open access publishing with many folks, but as you will see I am not sure where I fall down on it.

I work in the basic life sciences, at the interface of evolutionary biology, genetics and genomics. The norms of scholarly communication differ substantially from field to field, in terms of what is considered productive scholarship, books vs. articles, authorship and a host of other issues. Even within the natural sciences, scholarly communication differs between biology and, say, physics. So, my experience and understanding remain narrow and I claim no expertise.

I think that the future of scholarly publishing will be open access, in some shape or form. That is, the majority of published manuscripts will eventually be freely available to anyone with internet access. How do we get there? I have no idea. Will this be due to broad mandates from funding agencies and Universities to deposit manuscripts into repositories? Will journals generally agree to make content freely available after a fixed amount of time (6-12 months) - so called green open access? Or will gold open access become the norm where authors pay to have work reviewed or published? Likely a combination of these and other approaches, but I am not good at such guess work.

So why do we care about OA in the first place?

Several reasons.

This has been discussed by many before, so my thoughts on this are brief. For more detailed thought, check out Peter Suber's book "Open Access".

If you happen to be on the MSU library site, and happen to click the faculty page you will see on the right hand side links to a number of things including "Crisis in Scholarly Communication". The discussion on these pages is about the increasingly difficult access to scholarly publications. The basic reason is that while academic library budgets tend to be relatively flat, the cost of subscriptions to academic journals continues to increase very rapidly. Often this is because many of the subscription (i.e. for profit) publishers are commonly practicing bundling of journals. So if you want journal A, you also have to subscribe to x (pick a country) journal of y (sub-field) of z (pick organism).

Why should scholarly work be behind a paywall, and thus inaccessible?

In particular for scientific (and medically relevant) work, open access could benefit researchers, patients and doctors (who would otherwise not have access). Open access allows the whole public to look at the research if they so choose. This also removes one small barrier in the perception of the ivory tower, and helps rebuild some trust with the public (more on this later).

Who is paying for the research? Generally not the publisher making money from the paper.

At least in the sciences, research is usually paid from grants from federal agencies, and salaries are paid from those grants, or from the University (such as MSU) which in part comes from state allocations and tuition dollars. The manuscript is then reviewed by referees usually for free (as part of our scholarly role) including as scientific editor (which is also not usually a paid position, at least for associate editors). Under the current system most referees get nothing (neither money nor any other incentive) for this essential service, and their pay is from their institution (and does not depend upon them performing this service). The publisher may maintain the electronic system to shuttle the draft manuscript to the referees, and if accepted performs copyediting and typesetting. There are exceptions to the rule (I have had absolutely excellent editing advice on both the writing and communicating the science for a recent paper in Trends in Genetics from the managing editor for instance), but this has not generally been the case for me.

Thus the publisher is making a great deal of profit, despite having only done a fraction of the work. They (not the authors) retain the copyright on the work. This is potentially a big problem.

How I got into this

I will tell you about how I got engaged with the ideas of open access publishing, as a small part of the larger endeavor to make science in academia more open, transparent, reproducible, and in such a way that scientific ideas and data are communicated more quickly and effectively. But I also will describe many of the remaining stumbling blocks that relate to views of open access journals specifically, and the nefarious concept of prestige in publishing and how that influences hiring, grants, and promotion.

The crisis in scholarship is much bigger than open access.

Why is there a crisis?

While open access to published work is certainly a factor in the crisis in scholarly communication, and the one I will speak the most about today, it is not the only factor (others include issues with peer review, reproducible research, sharing of data and code, etc.). There have been a slew of articles, including a few in the Economist and the Guardian, in the last few weeks on aspects of this crisis. Essentially it is argued that scientific research is in a tailspin, there is no effort to do quality science, and everything is about quantity of papers and prestige (i.e. spin) with little effort to make sure the work is valid, well reviewed or replicated. That is, the incentive system for scientists is completely out of whack with the process of good science.

Is there a crisis? I am a skeptic and cynic, so I think there are some real concerns. However the little optimist voice in my head also points out that there are some great opportunities as well to help to not only resolve this crisis, but make science/scholarly research better, and far more dynamic. Indeed there is a vocal and active community trying to make this all much better.

Before I delve into the specifics about the need for open access (and what might be stopping some of us from diving into it completely), I want to speak about the larger crisis of the scholarly enterprise in general, and how open science initiatives can fix this. In my mind this comes down to an issue of trust. Trust between collaborators. Trust between scientists working in the same field. Trust between researchers at different stages of their careers (graduate students, post-docs and PIs). Finally, we need trust between scientists and the general population. Not just how the public perceives scientists (and scholars in general) and the work they do, but that we do our science in an open way that leads to the appropriate self-correcting mechanisms. However, even beyond the large number of anti-science and anti-intellectual movements out there, there has been a substantial loss of trust in how scholars operate, and what motivates us.

A large number of research articles get picked up by the popular press about "cures" or genes for "this, that and the other", only to have such research shown to be largely (or entirely) incorrect a few months later. Combine this with the many news articles that point out the lack of repeatability of scientific studies, or examples of scientific fraud and misconduct, and our lay audience (and those who ultimately help pay for our research and salaries) are perhaps becoming quite skeptical. The lack of access to the scientific literature for the public due to paywalls (from subscription based journals) is simply another large nail in the coffin for the trust that scientists and science communicators have been trying to build with the public.

Frankly our motives are questioned, and not just by the public at large. They are also questioned by our graduate students. As undergraduates (or from watching nature shows) they see this amazing wonderful universe to study, but then come to do research as graduate students and realize that a business model has taken hold, with a culture of "scientific stardom" being the goal for many. Worse than that, they see a perverse incentive system where quantity of publications and the prestige of where articles are published dominate, and the overall quality of scientific research is the perceived victim (has this been evaluated?). There are many folks to blame, including university administrators, the "high prestige" journals, etc., but we first need to look at ourselves (practicing scientists) for accepting and adopting this system as it has developed into the status quo.

So how do we fix it?

Let's think about the aspects of open science. Not only do we need to communicate what we do more effectively, but we need to make everything far more transparent.

Open Science

Much has been written about open science, and the open science movement. It has many goals, but I would say in general the two most important ones are to increase transparency of the scientific process and to speed (and open) up science communication. There are many aspects, ranging from open "lab notebooks" (here and here); to submitting pre-prints (papers prior to peer review and formal acceptance at a journal) to repositories to speed up science communication; to increasing openness in peer review, so that all can see the comments of the peer referees and editors (this can include pre and post publication review, such as PubPeer and PubMed Commons); to sharing the raw data associated with research papers, as well as all of the details of how the analysis was performed (the computer code associated with it). It all comes down to the fact that the published paper is not the science itself, but a precis (or an advert) for the actual scientific work. All of the data, the work and even the peer review process itself are part of the scholarship of science. Making some or all of this available will not only help with transparency, but will speed up the scientific process. Having access to the raw data may also help to answer all sorts of new and interesting questions. I have always loved this quote by Sir William Bragg:

"The important thing in science is not so much to obtain new facts as to discover new ways of thinking about them"

Open Access and the entanglement of scholarly publishing with prestige and other incentives

Crossroads for publishing

At least in the life sciences, it is clear that publishing articles in "high profile" journals like Nature, Science, PNAS (and a few others) can make or break a career. The prestige associated with such articles can trump many things. Having such publications can open many doors in terms of jobs, grants, tenure, invited talks and more. Indeed last week in Nature, articles were written about just this phenomenon, the so called "golden club". Not surprisingly most of the traditional journals with such cachet are subscription based, although there are at least two open access journals that are certainly up there (PLoS Biology and eLife).

Since most of the prestigious journals are subscription based, and there are such strong incentives to publish in them, it is very difficult for many researchers to move to publishing in open access journals (although they can still submit papers to institutional or disciplinary repositories). If I had the opportunity to publish in Nature or Science, would I? Yes, precisely because I know that having such a publication will open doors, and aid in getting grants, promotions and raises. While I feel strong support for open access, I am frankly not above such concerns. Some of this may simply reflect my petty needs for external validation of my science (which I can get over), but grants and raises potentially influence the quality of my life and my work. It is hard to pass that up.

Perceptions of open access journals

There remains a common misperception that many open access journals are nothing more than predatory or "vanity" publications with little or no rigorous peer review. A recent "sting" by the science journalist John Bohannon in Science has done little to help this perception. Too much has already been written about this article, mostly highly critical of his methods, biased sampling approach and lack of a control group. While it was presented as a news piece (not a scholarly article), Bohannon has stated on several occasions that his original plan was to submit this to PLoS One (an open access journal with peer review), so this remains an issue.

As Peter Suber (among many others) describes in his book, there are currently two models of open access. The first (so called green OA) means that while papers may be published in subscription based journals, a free version of the accepted manuscript (usually without final copyediting and typesetting provided by the journal) is placed in a repository such as PubMed Central or an institutional archive. Often this "free" version has a 6-12 month delay before being released. Since this model of green OA is a required stipulation for projects funded by organizations such as the NIH, some journals (where the majority of authors are funded by such agencies) are now just making all of their content open access after a 12 month embargo.

The other major model for open access publishing is gold open access. In this case, once a manuscript has gone through peer review by expert referees and academic editors, and it is accepted, then the authors of the manuscript are charged a fee for typesetting and (usually online) publication. Thus the model is that the authors, not the readers are charged. In the life sciences the funds for this usually (more on this in a second) come from granting agencies, although many journals have fee waivers (or no fees at all).

The concern with this of course is that it may create a perverse incentive system, such that journals would increase their acceptance rate to increase profit (~1-2K/paper accepted). Thus the rigor of peer review could be negatively impacted, resulting in so called "vanity publications" that have the veneer of scientific rigor and peer review, but in fact do not. Couple this with so called predatory (scam) journals (that are much like other scam spam), and the concern is easy to understand. However, before open access journals existed, such vanity journals already existed among the subscription models. And as the Bohannon sting has shown us, journals that are published by well known publishers like Elsevier are not above being "stung" by accepting faux articles with obviously flawed methodologies as well. Beyond that, somewhere on the order of 70% of all open access journals have no author side fees.

Despite this issue, and the existence of predatory and possible vanity journals (such as many of those found on Beall's list), the Open Access Scholarly Publishers Association has a code of conduct aimed at distinguishing reputable scholarly journals from predatory ones. From my perspective, the journal PLoS One, which in many ways represents the flagship of open access journals (peer review entirely based upon technical soundness of the experiments and interpretation, not upon subjective assessment of novelty), was noted for how thorough its review process (and rejection) was. The other worthwhile point is to take a look at the re-analysis done by Brian Wood. It seems that the one thing journal impact factor might be useful for is predicting whether a reasonable amount of peer review might take place.

It is also worth pointing out that problems with insufficient peer review occur with subscription based journals as well, and there has been a history of these so called vanity journals even among subscription publishers. In addition, many journals with subscription based models also have page and figure charges that the authors must pay, so some of the same incentives also apply to these journals. In my own personal experience, these page charges end up being about the same cost as publication in open access journals. So many of the same charges against OA journals can equally be leveled against such journals.

Why have I not embraced open access completely?

It is probably clear from my perspective on all of this, that I am firmly in favour of open access models of publishing, and like I stated from the outset, I do think that this is where everything is going to, although by what model I am not sure.

So given all of this, why don't I publish exclusively in open access journals? Well there are two reasons, or possibly one reason arising from two different parts of my mind.

The first relates to "establishment" journals. In my field, there are several well regarded journals that have persisted for a very long time, some for over a century, such as Genetics. Publishing my work in journals such as Genetics or Evolution means that A) it has a natural readership; B) these are the same journals that shaped my understanding of the field during my intellectual development, and so I have a fondness for them; C) while they may not have the cachet of Nature, Science and the like, there is no doubt that in my field they are considered well regarded journals; and D) these also represent the journals of my professional societies, which I actively support and promote above and beyond their role in scholarly communication.

As for the second set of reasons, I am not sure I am willing to be a martyr. In other words, I may be acting with a great deal of cowardice. Despite my having been an editor at PLoS One for many years (and I stand by the rigor of reviews by my referees and myself), there is no doubt that there are many in the community who still believe that it (as a journal) accepts anything. If I choose to publish all of my work (and that of my students and postdocs) there, I risk losing readership. If such views are held by university administrators, I risk loss of salary raises, promotions and grants (depending on the panel).

Thus until the incentive system has changed, and this can only change by concerted effort between university administrations, grant program officers and well established scholars in each of our fields embracing such changes, many researchers like myself will continue using this screwed up system, because of the incentives, risking further erosion of public trust. Is my half way attitude a cop-out? Yes. Some horrible mix of rationalization, cowardice and avarice I suppose. Am I likely to change my behaviour? Probably not until my mortgage is paid off and my kids have finished university.

Tuesday, October 22, 2013

How easy should it be to replicate scientific experiments?

The Economist just published a pair of articles broadly about the state of affairs in scientific research (and from their perspective everything is in a tail spin): "How Science Goes Wrong" and "Trouble at the lab". Both articles are worth reading, although few will find themselves in agreement with all of their conclusions. Neither article takes very long to read, so I will not try to sum up all of the arguments here. For two very different perspectives on these articles, check out the blog of Jerry Coyne, who largely agrees with the statements they make. For an alternative perspective on why these articles missed the mark almost entirely, see the post by Chris Waters, my colleague here at Michigan State University. Chris points out that most studies do not represent a single experiment examining a particular hypothesis, but several independent lines of evidence pointing in a similar direction (or at least excluding other possibilities).

However, instead of going through all of the various arguments that have been made, I want to point out some (I think) overlooked issues about replication of scientific experiments. Principally that it can be hard, and even under extremely similar circumstances stochastic effects (sampling) may alter the results, at least somewhat.

Let's start by assuming that the original results are "valid", at least in the sense that there was no malfeasance (no results were faked), the experiments were done reasonably well (i.e. those performing the experiments did them well with appropriate controls), and that the results from the experiments were not subject to "spin" and no crucial data was left out of the paper (that may negate the results of the experiments). In other words, ideally what we hope to see out of scientists.

Now, I try and replicate the experiments. Maybe I believe strongly in the old adage "trust but verify" (in other words be a skeptical midwesterner). Perhaps, the experimental methods or results seem like a crucial place to begin for a new line of research (or as an alternative approach to answering questions that I am interested in).

So, I diligently read the methods of the paper summarizing the experiment (over and over and over again), get all of the components I need for the experiment, follow it as best as possible, and .... I find I cannot replicate the results. What happened? Instead of immediately assuming the worst from the authors of the manuscript, perhaps consider some of the following as well.

1- Description of methodological detail in the initial study is incomplete (this has been and remains a common issue). Replication is then based on faulty assumptions introduced into the experiment because of missing information in the paper. Frankly this is the norm in the scientific literature, and it is hardly a new thing. Whether I read papers from the 1940s, the 1970s or the present, I generally find the materials and methods section lacking, from the perspective of replication. While this should be an easy fix in this day and age (extended materials and methods included as supplementary materials or with the data itself when it is archived), it rarely is.

What should you do? Contact the authors! Get them on the phone. Often email is a good start, but a phone or skype call can be incredibly useful for getting all of the details out of those who did the experiment. Many researchers will also invite you to come spend time at their lab to try out the experiment under their conditions, which can really help. It also (in my mind) suggests that they are trying to be completely above board and feel confident about their experimental methods, and likely their results as well. If they are not willing to communicate with you about their experimental methods (or to share data, or how they performed their analysis), you probably have good reason to feel skeptical about how they have done their work.

2- Death by a thousand cuts. One important issue (relating to the above) is that it is almost impossible to perfectly replicate an experiment, ingredient for ingredient (what we call reagents). Maybe the authors used a particular enzyme. So you go ahead and order that enzyme, but it turns out to be from a different batch, and the company has changed the preservative used in the solution. Now, all of a sudden the experiment stops working. Maybe the enzyme itself is slightly different (in particular if you order it from a different company).

If you are using a model organism like a fruit fly, maybe the control (wild type) strain you have used is slightly different than the one from the original study. Indeed, in the post by Jerry Coyne mentioned above, he discusses three situations where he attempted to replicate other findings and failed to do so. However, in at least two of the cases I know about, it turned out that there were substantial differences in the wild type strains of flies used. Interesting arguments ensued, and for a brief summary of it, check out box 2 in this paper. I highly recommend reading the attempts at replication by Jerry Coyne and colleagues, and responses (and additional experiments) by the authors of the original papers (in particular for the role of the tan gene in fruit fly pigmentation).

Assuming that the original results are valid, but you cannot replicate them, does it invalidate the totality of the results? Not necessarily. However, it may well make the results far less generalizable, which is important to know and is an important part of the scientific process.

3- Sampling effects. Even if you follow the experimental protocol as closely as possible, with all of the same ingredients and strains of organisms (or cell types, or whatever you might be using), you may still find somewhat different results. Why? Stochasticity. Most scientists take at least some rudimentary courses in statistics, and one of the first topics they learn about is sampling. If you use a relatively small number of independent samples (a few fruit flies in your experimental group, compared to a small number in your control group), there is likely to be a lot of stochasticity in your results because of sampling. Thankfully we have tools to quantify aspects of the uncertainty associated with this (in particular standard errors and confidence intervals). However, many studies treat large quantitative differences as if they were essentially discrete (compound A turns transcription of gene X off....). Even if the effects are large, repeating the experiment may give somewhat different results (a different estimate, even if the confidence intervals overlap).
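To make the sampling point concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than anything from a real study: the function name, the normally distributed measurements, the "true" effect of 1.0 and the group sizes are all made up. It simply repeats the same two-group experiment many times and looks at how much the estimated effect bounces around when the groups are small versus large.

```python
import random
import statistics

def simulate_effect_estimates(true_effect=1.0, sd=1.0, n_per_group=5,
                              n_experiments=1000, seed=42):
    """Repeat the same two-group experiment many times and return the
    estimated effect (difference in group means) from each repeat.

    All parameters are hypothetical; measurements are assumed to be
    normally distributed with the same spread in both groups.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_experiments):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, sd) for _ in range(n_per_group)]
        estimates.append(statistics.mean(treated) - statistics.mean(control))
    return estimates

# Same underlying biology, same protocol, only the sample size differs.
small = simulate_effect_estimates(n_per_group=5)
large = simulate_effect_estimates(n_per_group=50)

print("n=5 per group:  SD of estimates =", round(statistics.stdev(small), 2))
print("n=50 per group: SD of estimates =", round(statistics.stdev(large), 2))
print("n=5 per group:  smallest and largest estimate =",
      round(min(small), 2), round(max(small), 2))
```

Nothing about the simulated "biology" changes between runs, yet with only a handful of individuals per group the individual estimates are scattered widely around the true value. That scatter alone can make an honest replication look like a failure.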

If the way you assess "replication" is something like "compound A significantly reduced expression of gene X in the first experiment, does it also significantly reduce expression upon replication", then you may be doomed to frequently failing to replicate results. Indeed, statistical significance (based on p values etc...) is a very poor tool for this purpose, since with small samples a real effect can land on either side of the significance threshold purely by chance. Instead you can ask whether the effect is in the same direction, and whether the confidence intervals for the initial estimate and for the new estimate upon replication overlap.
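To make the contrast between these two ways of judging replication concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not taken from any real study: the true effect (a reduction of 0.8 standard deviations), the sample size (8 individuals per group), the normal approximation used for the 95% interval, and all of the function names. It simulates pairs of experiments (an "original" and a "replication") drawn from exactly the same underlying effect, and counts how often each criterion would call the replication a success.

```python
import random
import statistics


def mean_diff_ci(treated, control, z=1.96):
    """Estimate (treated - control) with an approximate 95% interval.

    Uses a normal approximation (z = 1.96), which is a simplification;
    for very small samples a t-based interval would be somewhat wider.
    """
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return diff, diff - z * se, diff + z * se


def intervals_overlap(a, b):
    """True if intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


def excludes_zero(lo, hi):
    """The usual 'statistically significant' criterion for an interval."""
    return lo > 0 or hi < 0


def run_experiment(rng, n, true_effect, sd=1.0):
    """One simulated experiment with n individuals per group."""
    control = [rng.gauss(0.0, sd) for _ in range(n)]
    treated = [rng.gauss(true_effect, sd) for _ in range(n)]
    return mean_diff_ci(treated, control)


def compare_criteria(n=8, true_effect=-0.8, pairs=2000, seed=1):
    """Simulate original/replication pairs and score both criteria.

    Returns the fraction of pairs where both experiments were
    'significant', and the fraction where the estimates had the same
    sign and overlapping intervals.
    """
    rng = random.Random(seed)
    both_significant = 0
    consistent = 0
    for _ in range(pairs):
        d1, lo1, hi1 = run_experiment(rng, n, true_effect)
        d2, lo2, hi2 = run_experiment(rng, n, true_effect)
        if excludes_zero(lo1, hi1) and excludes_zero(lo2, hi2):
            both_significant += 1
        if d1 * d2 > 0 and intervals_overlap((lo1, hi1), (lo2, hi2)):
            consistent += 1
    return both_significant / pairs, consistent / pairs


if __name__ == "__main__":
    sig_rate, consistent_rate = compare_criteria()
    print(f"both experiments 'significant': {sig_rate:.2f}")
    print(f"same direction and overlapping intervals: {consistent_rate:.2f}")
```

Under these assumed settings the "significant both times" criterion should succeed far less often than the "same direction, overlapping intervals" criterion, even though every simulated pair shares the same true effect. Changing `n` or `true_effect` shows how strongly that gap depends on sample size.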

Ask the authors of the original study for their data (if it is not already available on a data repository), so you can compute the appropriate estimates, and compare them to yours. How large was their sample size? How about yours? Can that explain the differences?

4- Finally, make sure you have done a careful job at replicating the initial experiment itself. I have seen a number of instances where it was not the initial results, but the replication itself which was suspect.

Are there problems with replication in scientific studies? Yes. Are some of them due to the types of problems discussed in The Economist or on Retraction Watch? Of course. However, it is worth keeping in mind how hard it is to replicate findings, and this is one of the major reasons I think meta-analyses are so important. It also makes it clear why ALL scientists need to make their data available through disciplinary or data-type-specific repositories like DRYAD, NCBI GEO, the Short Read Archive, or more general ones like figshare.

Monday, October 14, 2013

Fallout from John Bohannon's "Who's afraid of peer review"

As many many scientists, librarians and concerned folk who are interested in scientific publishing and the state of peer review are aware, the whole 'verse' was talking about the "news feature" in Science by John Bohannon entitled "Who's afraid of peer review?".

The basis of the article was a year-long "sting" operation on a "select" group of journals (that happened to be open access... more on this in a second) focusing in part on predatory/vanity journals. That is, some of the journals had the "air" of a real science journal, but in fact would publish the paper (any paper?) for a fee. Basically, Bohannon generated a set of faux scientific articles that at a first (and superficial) glance appeared to represent a serious study, but upon even modest examination it would be clear to the reader (i.e. reviewers and editors for the journal) that the experimental methodology was so deeply flawed that the results were essentially meaningless.

Bohannon reported that a large number of the journals he submitted to accepted this article, clearly demonstrating insufficient (or non-existent) peer review. This and the headline have apparently led to a large amount of popular press, and many interviews (I only managed to catch the NPR one, I am afraid).

However, this sting immediately generated a great deal of criticism both for the way it was carried out, and more importantly the way the results were interpreted. First and foremost (to many) was that ALL of the journals that were used were open access, and thus there was no control group of journals with the "traditional" subscription based model (where libraries pay for subscriptions to the journals). In addition, the journals were sieved to over-represent the shadiest predatory journals. That is, they did not represent a random sample of open access journals. One thing that really pissed many people off (in particular among advocates of open access journals, but even beyond this group) was that Science (a very traditional subscription based journal) used the summary headline: "A spoof paper concocted by Science reveals little or no scrutiny at many open-access journals.", clearly implying that there was something fundamentally wrong with open access journals. There are a large number of really useful critiques of the article by Bohannon including ones by Michael Eisen, The Martinez-Arias lab, Lenny Teytelman, Peter Suber, Adam Gunn (including a list of other blogs and comments about it at the end). There is another list of responses found here as well. Several folks also suggested that some open access advocates were getting overly upset, as the sting was meant to focus on just the predatory journals. Read the summary line from the article quoted above, as well as the article, and decide for yourself. I also suggest looking at some of the comment threads, as Bohannon does join in on the comments on Suber's post, and many of the "big" players are in on the discussion.

A number of folks (including myself) were also very frustrated with how Science (the magazine) presented this (and not just for the summary line): making the "sting" appear to be scientifically rigorous in its methods, but then turning around and saying this is just a "news" piece whenever any methodological criticism is discussed. For instance, when readers commented about both the lack of peer review and the biased sampling of journals used for the "sting" operation for Bohannon's article, this was a response by John Travis (managing editor of News for Science magazine):

I was most interested in the fact that Science (the journal) had an online panel consisting of Bohannon, Eisen and David Roos (with Jon Cohen moderating) to discuss these issues. Much of it (especially in the first half hour) is worth watching. I think it is important to point out that Bohannon suggests he did not realize how his use of only OA journals as part of the sting operation would be viewed. He suggests that he meant this as largely a sting of the predatory journals, and that if he did it again he would include the subscription based journals as a control group. You can watch it and decide for yourself.

The panelists also brought up two other important points that seem to not get discussed as much in the context of open access vs. subscription models for paying for publication or for peer review.

First, many subscription based journals (including Science) have page charges and/or figure charges that the author of the manuscript pays to the journals. As discussed among the panelists (and I have personal experience with paying for publication of my own research), these tend to be in the same ballpark as for the publication of open access papers. Thus the "charge" that the financial model for publication for OA journals would lead to more papers being accepted is true for many of the subscription journals as well (in particular for journals that are entirely online).

Second (and the useful point to come out of Bohannon's piece) is that there are clear problems with peer review being done sufficiently well. One suggestion that was made by both Eisen and Roos (and has been suggested many times before) is that the reviews provided by the peer referees of the manuscript and the editor could be published alongside the accepted manuscript (or as supplemental data on figshare), so that all interested readers can assess the extent to which peer review was conducted. Indeed there are a few journals which already do this, such as PeerJ, EMBO J, eLife, F1000Research, Biology Direct and some other BMC-series journals (see here for an interesting example), Molecular Systems Biology, and the Copernicus journals. Thanks to folks on twitter for helping me put together this list!

This latter point (providing the reviews alongside published papers) seems to be so trivial to accomplish, and the reviewers' names could easily remain anonymous if so desired (or they could provide their names, gaining a degree of academic credit and credibility in the scientific community). So why has this not happened for all scientific journals? I am quite curious about whether there are any reasons NOT to provide such reviews.

Friday, September 13, 2013

Learning something new: Integrating computer vision into my research.

I recently started my sabbatical year, where I get a chance to re-tool myself, and my research knowledge. The game plan is to learn how to integrate computer vision (machine vision) and machine learning approaches into my research, in particular with respect to the study of animal behaviour and the analysis of images (and videos). We study the evolutionary genetics of complex phenotypes in my lab. While this used to (mostly) mean the complex structure of the shape and size of fruit-fly wings, we are moving more into the study of animal behavior. In particular how flies evade and escape being eaten by predators (more on that at a later date).

The analysis of such data (both huge sets of wing images as well as video, which is effectively a series of images) can be time consuming and what can be done manually is somewhat limited (such as with JWatcher), in particular if you want to do "high throughput" work with many samples. I have over the past few years interacted, and begun to collaborate with scientists who are using all sorts of techniques from computer vision which have amazed me, both with respect to the speed of the analysis, but also the detailed information gleaned from such approaches. So I am trying to get up to speed and see how to utilize these approaches for my own work.

To that end I will now be posting about this experience (as well as all of the more usual genetics). This will include useful new tidbits, programming scripts, software I have tried (and tutorials), books, and anything else I can think of. Basically, this will be my research progress journal for this new endeavour. I hope that this will help me stay nice and organized, and perhaps will be useful more generally. If you start to follow this thread, and have suggestions for anything, please let me know in the comments or on twitter.

More to follow soon!

Our new pre-print: An integrative genomic approach illuminates the causes and consequences of genetic background effects

This is a guest post by Dr. Chris Chandler. Cross posted from Haldane's Sieve.

Biologists have long recognized that a mutation can have variable effects on an organism's phenotype; even introductory genetics classes often make this observation by introducing the concepts of penetrance and expressivity. More mysterious, however, are the factors that influence the phenotypic expression of a mutation or allele. We know, for instance, that introducing the same mutation into two different but otherwise wild-type genetic backgrounds can result in vastly different phenotypes. But what specific differences between these two genetic backgrounds interact with the mutation, and how? And how does gene expression fit into this puzzle? Answering these questions has not been an easy task, which is not too surprising when you realize that penetrance and expressivity are, in reality, complex quantitative traits. We therefore adopted a multi-pronged genetic and genomic approach to tease apart the mechanisms mediating background dependence in a mutation affecting wing development in the fly Drosophila melanogaster.

The phenotypic patterns seen in our model trait have already been characterized: the scalloped[E3] (sd[E3]) mutation has strong effects in the Oregon-R (ORE) background, resulting in a tiny, underdeveloped wing, while its effects in the Samarkand (SAM) background are still obvious but much less extreme, resulting in a blade-like wing.

To try to find out what causes these differences, we generated and combined a variety of datasets: whole-genome re-sequencing of the parental strains and a panel of introgression lines to map the background modifiers of the sd[E3] phenotype; transcription profiling (using two microarray datasets and one RNA-seq-like dataset), including analyses of allele-specific expression in flies carrying a "hybrid" genetic background; predictions of binding sites for the SD protein, which is a transcription factor; and a screen for deletion alleles that enhance or suppress the sd[E3] phenotype in a background-dependent fashion.

Our results point to a complex genetic basis for this background dependence. We found evidence for a number of loci that are likely to modulate the effects of the sd[E3] allele. However, some unexpected inconsistencies provide a cautionary tale for those intending to take a similar mapping-by-introgression approach for their trait of interest: do multiple replicates, and introgress in both directions, or you may inadvertently end up mapping some other trait! Although the number of candidate genes we identified was generally large, by combining those results with data from our other datasets, we were able to narrow our focus to those showing a consistent signal, yielding a robust set of candidate genes for further study. Without getting into too much detail, we also used a novel approach to show that background-dependent modifier deletions of the sd[E3] phenotype (of which there are many) involve higher-order epistatic interactions between the sd[E3] mutation, the deletion, and the genetic background, rather than quantitative non-complementation (so more than two genes were involved).

Overall, we think that an integrative approach like this could be useful for others trying to understand complex traits, including genetic background-dependence of mutations. In addition, if you're a Drosophila researcher working with the commonly used Samarkand or Oregon-R strains, our genome re-sequencing data (raw and assembled), including SNPs, will soon be available in public repositories for genetic data.

Go to the pre-print of the paper on arXiv.

Wednesday, July 31, 2013

Evolution and Core Processes in Gene Regulation – some “terminal” thoughts (guest post from David Arnosti)

Guest appearance by David Arnosti here; I’m Ian’s colleague at Michigan State University and co-organizer for the ASBMB Special Symposium.

Our symposium last week brought together some folks who don’t regularly appear at the same conferences, including the structural biologists and biochemists who’ve devoted a good part of the last 30 years to unraveling the mechanisms of the central dogma – gene transcription, RNA metabolism, and protein translation. The trend has been to identify the central machinery, publish high-impact papers in single-word-title journals, and then years later discover that things don’t necessarily work the same in all contexts, organisms, developmental settings. Not surprising in eukaryotic transcription, for instance, where the “basal machinery” comprises ~200 proteins. Zach Burton, conference participant, refers to this complexity as a “molecular horror”, but from an evolutionary perspective, it is also an opportunity.

With respect to tracking the changes inherent in functioning of regulatory circuitry, we have a more complete understanding of how bacterial systems work. Nice presentations by Saeed Tavazoie and Eduardo Groisman highlighted how easily such systems can transition between regulatory states, with just a few genetic changes in regulatory factors. Robert Landick and Seth Darst, in studies of E. coli RNA polymerase, pointed out that this bacterium actually features a novel outgrowth, an insertion of 188 amino acids that connects directly to the conserved trigger loop in the catalytic site. Nobody knows what it is doing there, but it is lineage specific, and mutations accumulate in this part of the protein when cells are grown under nutrient limitation, suggesting a short-circuit way to globally fiddle with gene expression.

Eukaryotic systems likewise have novel structures to their core machinery; Lawrence Myers described the gene amplification of Mediator subunits in Candida albicans that are linked to pathogenicity, while Jean-Marc Egly pointed out the pervasive effects of mutations in human Mediator and transcription factor TFIIH. With eukaryotes, the importance of variability in core machinery for generating important changes on the population or species level is obscure. Studies such as those presented by Ian Dworkin (host of this blog) and Aviv Regev showed how we are able to identify numerous loci involved in genetic background effects relating to development and immune function – but the overall importance of pervasive impacts generated by changes in an RNA polymerase subunit, for instance, vs. subtle changes in an enhancer is not clear. One mystery I dropped on the conferees was the special features of the RNA Pol II CTD found in Drosophila, but not other eukaryotes – a reflection of their unique developmental gene expression, discussed by Melissa Harrison and Julia Zeitlinger?

Aside from being mistaken once for Bill Gates by a visitor from Shanghai while walking around the University of Chicago, I was able to maintain my identity as a gene regulation specialist who takes to heart Theodosius Dobzhansky’s mantra that “Nothing in Biology Makes Sense Except in the Light of Evolution”. We will see how the combination of biochemical detail and evolutionary perspective can propel us into a future where gene regulation, in all its rich variation, makes sense. Many thanks to Joan Geiling and Barbara Gordon from the ASBMB for making this an outstanding conference!

Friday, July 26, 2013

Wrap-up for day one of: Evolution and Core Processes in Gene Regulation

Yesterday was the first day of the meeting on the "Evolution and core processes in gene regulation", a small conference (~75 people) for those interested in various aspects of gene regulation. As I mentioned in a previous post, the participants represent a really diverse mix of biologists interested in gene regulation (how, where and why genes are turned on and off), including many who do not speak a common "scientific" language.

To start with, Chicago surprised all of us by having truly pleasant weather for late July. Crisp, dry and not too hot. That always helps perk everyone up.

The diversity of talks was represented right from the very first session with talks ranging from using experimental evolution to examine questions relating to evolution of regulatory function (Saeed Tavazoie), to very mechanistic analysis of enhancers that regulate different tissues in different ways (from Scott Barolo). I did some fairly extensive tweeting (@IanDworkin, #genereg) on what was presented. However there were a few highlights (for me and my own work).

The work from the Barolo lab tearing apart the enhancers of the Patched gene (regulated by the transcription factor Gli, which mediates the hedgehog signaling pathway) was really neat and I am still mulling over the findings in my head. We know that some proteins bind to DNA, and some of these proteins (called transcription factors) help to regulate gene expression, sometimes turning genes on, sometimes turning them off. In this case the transcription factor Gli (or as we call it in Drosophila, cubitus interruptus) can take on two forms, one called an activator, the other a repressor. It turns out that while both of these forms bind to the same general binding sites (DNA words that sometimes can be spelled a bit differently, analogous to "colour" and "color"), how tightly they bind (depending on the spelling) matters. More importantly, it matters that some of the sites allow the protein to bind tightly, and others weakly. Having the weak binding sites matters (and it matters that they are weak).

I also really enjoyed the approach that Nir Yakoby was taking to understand the evolution of developmental mechanisms underlying how the shell of the egg for fruit flies was patterned. His lab is taking both a cool approach to thinking about regions of gene expression (gene X is expressed here, here and here, but not here), and how it relates to evolution of the eggshell.

Justin Fay (one of the organizers), also talked about one of his model systems using yeast, and how variation throughout one particular gene (FZF1) was important for yeast to grow in the presence of sulfite (think wine - yeast - sulfites....). The take home message (for me anyways) was that the whole gene had a signature of natural selection (between species), and that the differences in the DNA in many places across the gene seem to contribute to the effects with respect to how the yeast can grow with sulfite.

There was far more to the first day of the meeting (check the twitter feed for details) and I cannot do it all justice, but hopefully this gives you a small flavour of the meeting. If you are attending, post about what got you excited.

Wednesday, July 24, 2013

Genetics really is hard (to interpret)

I am sure this will not surprise most of you, but genetics research can be really hard. I don't simply mean that doing genetics experiments is hard (which it can be), but interpreting the results from genetic analysis can be difficult. This post is about an interesting story involving the analysis of a gene called I'm not dead yet (Indy) in the fruit fly Drosophila (one of the geneticists' favorite organisms) and its role in extending lifespan. This story, which has taken place over the past decade, has taken a number of interesting twists and turns involving many of the subjects that I like to discuss in this blog and my own work, including trying to make sense of the results from genetic studies, the influence of factors like genetic background and environment on mutational effects, and of course Drosophila itself. While I do not study lifespan (longevity), I have been interested in and following the story of this research over the past 5-6 years because of the implications of the influence of genetic background effects (which I do work on). I should also mention that other than being a geneticist I do not claim to have any great knowledge of the study of aging, but I will do my best on that.

I hope in this (and future) posts to accomplish a few things, so I thought I would lay them all out first (in case I start to ramble off in strange directions).

1- Describe a cool story about something important to just about everyone (who does not want to find out how to live longer?).
2- Discuss the means and logic of genetic analysis. That is, how we (geneticists) go about figuring out whether a particular gene (or variant of a gene) influences something we care about (like how long we live).
3- Context matters a lot for genetic analysis. Factors like the food used to feed your critters (among many other factors), and the genetic background (of the critters) that the mutation is studied in can profoundly change what you see (the results).
4- Scientists, even when making honest efforts to perform good, reproducible research, can get different results because of seemingly subtle differences in 2 & 3.

Not surprisingly, many scientists are interested in the biology of aging, and in particular in what factors influence longevity. In addition to it being very cool, and of obvious importance to many people on the planet, it is also important for aspects of evolutionary theory. The point being that many scientists are interested, and approach questions of aging from many different perspectives, which is great. It is also not surprising that geneticists (and again the general public) are interested in finding genes that influence the aging process (why do some people live longer than others?). So in the year 2000 (you know, when all of our computers did not shut down) when a paper entitled "Extended life-span conferred by cotransporter gene mutations in Drosophila" came out, there was a lot of buzz. The basic results suggested that reducing the function or expression of a particular gene, Indy, increased how long fruit flies lived. While we (the people) are not fruit flies, by the year 2000 research had already clearly demonstrated that there were many shared genes in all animals (including people and flies), and many seemed to have pretty similar functions. This explains the excitement and buzz. By the way, Indy is short for "I'm not dead yet", and if you do not get the reference check this out (start at 0:58 if the two minutes is too long), or here if you prefer it in musical form, or here as a cartoon.

So what did they do in this study? The punchline is that using multiple, independently generated mutations they demonstrated that as you reduced Indy expression and function, the fruit flies lived for a longer time (increased longevity) when compared to the fruit flies with normal (wild-type) copies of the Indy gene. Seems straightforward enough, and by using multiple independent mutations they demonstrate (at one level) the repeatability of the results. That is, their results are not some strange one-off random results, but can be reproduced, which provides some degree of generality to these results.

Of course, results are rarely so simple and clear, and with additional investigations layers of complexities are often demonstrated. Studying longevity can be particularly difficult, and not only because you will have to wait a long time to see when something dies of natural causes.

So does Indy actually influence lifespan? The short answer is that the results from follow up studies have been pretty mixed, so it is perhaps not as clear as hoped from the original study. More on that soon in subsequent posts!

References and links if you want more information from the original studies
Rogina B, et al. 2000. Extended life-span conferred by cotransporter gene mutations in Drosophila. Science. 290:2137–2140.

Toivonen, et al. 2007. No influence of Indy on lifespan in Drosophila after correction for genetic and cytoplasmic background effects. PLoS Genetics. 3(6):e95.

Wang, et al. 2009. Long-lived Indy and calorie restriction interact to extend life span. PNAS USA. 106(23):9262-7. doi: 10.1073/pnas.0904115106

Toivonen JM, Gems D, Partridge L. Longevity of Indy mutant Drosophila not attributable to Indy mutation. PNAS USA. doi: 10.1073/pnas.0902462106

Helfand SL, et al. 2009. Reply to Partridge et al.: Longevity of Drosophila Indy mutant is influenced by caloric intake and genetic background. PNAS USA. 106(21):E54. doi: 10.1073/pnas.0902947106

Frankel S, Rogina B. 2012. Indy mutants: live long and prosper. Frontiers in Genetics. 3:13. doi: 10.3389/fgene.2012.00013

Rogina B, Helfand SL. 2013. Indy mutations and Drosophila longevity. Frontiers in Genetics. 4:47. doi: 10.3389/fgene.2013.00047

From the meeting ASBMB:Evolution and Core Processes in Gene Regulation - Chicago (July 25th-28th)

Well, this will be something new for me. I will be both blogging and tweeting from the meeting on the evolution and core processes in Gene Regulation from Chicago over the next few days. I will also be giving a talk there. Understanding how and why genes are turned on (or off) both in space (different parts of your body), and time (when you are growing, or as you age) remains an important question that spans a great deal of biology. It turns out that even when asking seemingly similar questions related to gene regulation, the approaches (and language) that, say, a biochemist or an evolutionary geneticist uses can be completely different (and mutually incomprehensible). So meetings like this are really important for getting us all on the same page and interacting with one another.

This meeting is being organized by David Arnosti, Justin Fay and Ilya Ruvinsky. The three of them (all great scientists and fun to talk with) are bringing together a very diverse group of scientists from hardcore biochemists to straight up evolutionary biologists. We did a related symposium back in 2011, and also in 2008 (at Michigan State University). It was a lot of fun then. Should be more fun (for me anyways) now, as I am not one of the organizers this time. In any case the program for the meeting can be found here.

David, Justin and Ilya also sent this to all of the speakers (to think about for their talk):

We understand that the DNA-RNA-protein dogma is sculpted by evolutionary forces, but where have experimental insights shown HOW particular features of the central machinery are shaped? Some aspects of the process of gene expression have been tightly linked to evolutionary innovations, such as variable HOX expression and limb morphology.

In your system, which aspects, if any, of the process studied are known to reflect selective pressure? For example, for central components of gene expression machinery, are there particular features that can be linked to specific conditions or attributes of the species/developmental stage/tissue? Alternative tissue-specific or species-specific components, activities?

If you were founding a new research institute dedicated to understand how evolutionary selection acts at all levels of gene expression, how would you combine nuts-and-bolts mechanistic studies with Big Picture evolution research?

What technical breakthroughs would make the biggest impacts on such efforts? Let's think big; e.g. mosquito blood meals preserved in amber (OK, that was a joke).

Tweets start tomorrow afternoon (Thursday July 25th at around 2PM). I also hope to be doing a daily summary of the meeting on this blog. Time to get my talk ready!

Friday, January 25, 2013

Experimental test of a genetic constraint hypothesis

We have recently posted a (heavily revised) manuscript to arXiv detailing how we used the fruit fly Drosophila melanogaster (you can read here about why these little flies are so wonderful) to test a particular hypothesis about a genetic constraint, and more generally how our knowledge of development may inform us about the structure of the genetic variance-covariance matrix, G. Also we developed a really cool set of statistical models that evaluated our explicit hypotheses (more on that right at the end of the post)!

As a quick reminder (or introduction), G summarizes both how much genetic variation particular traits have, as well as how much traits co-vary genetically. This covariation can be due to "pleiotropy", which is a fancy word for when a gene (or a mutation in that gene) influences more than one trait (i.e. a mutation might influence both your eye and hair colour). These traits can also covary when two or more alleles (each influencing different traits) are physically close to each other (linked) and recombination has not had enough time to break these combinations apart. I highly recommend Jeff Conner's recent review in Evolution for a nice overview of these concepts (and others related to some issues I discuss below).

Evolutionary biology, in particular evolutionary quantitative genetics, thinks a lot about the G-matrix, and how it interacts with natural selection (or drift) to generate evolutionary change. This is summarized by the now famous equation describing the change in trait means (Δz̄) as a function of both genetic variation (and covariation) and the strength of natural selection (usually measured as a so-called selection gradient, β). This is the multivariate (more than one trait) version of the breeder's equation (made most famous by all of the seminal work by R. Lande).

Δz̄=Gβ

Why do we care so much about this little equation? It encapsulates many pretty heady ideas. First and foremost, that you cannot have evolutionary change without genetic variation. That's right, natural selection by itself is not enough. You can have very strong selection for traits (such as running speed) to survive better with a predator around, but if there is no heritable variation for running speed, no (evolutionary) change will happen in the succeeding generations (and good luck with that tiger coming your way). However, once you have to consider multiple traits (running speed, endurance and hearing), we have to think about whether there is available genetic variation for combinations of traits, and whether it is "oriented" in a similar direction to natural selection. If not, it may be that evolutionary change will be slowed considerably (even if each trait seems to have lots of heritable variation). Of course if the genetic variation for all of these traits is pointing in the same direction as selection, then evolution may proceed very quickly indeed! The ideas get more interesting and complex from there, but they are not the focus of this discussion (the paper above by Jeff Conner, and this great review by Katrina McGuigan are definitely worth reading for more on this).
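A tiny worked example may help show what "orientation" means here. The sketch below applies Δz̄ = Gβ to two traits (think running speed and endurance). All of the numbers are made up purely for illustration: a hypothetical G in which both traits have plenty of genetic variance (1.0 each) but are strongly genetically correlated (covariance 0.9), and two hypothetical selection gradients of equal strength, one favouring an increase in both traits and one favouring faster but less enduring animals.

```python
import math


def response_to_selection(G, beta):
    """Multivariate breeder's equation: delta_z = G * beta.

    G is a square list-of-lists (genetic variances on the diagonal,
    covariances off the diagonal); beta is the selection gradient.
    """
    return [sum(g_ij * b_j for g_ij, b_j in zip(row, beta)) for row in G]


# Hypothetical G: each trait has genetic variance 1.0, and the two
# traits share a strong positive genetic covariance of 0.9.
G = [[1.0, 0.9],
     [0.9, 1.0]]

# Selection aligned with the main axis of genetic variation.
along = response_to_selection(G, [1.0, 1.0])

# Equally strong selection, but pulling the two traits apart.
against = response_to_selection(G, [1.0, -1.0])

# No genetic variation at all: selection produces no response.
no_variation = response_to_selection([[0.0, 0.0], [0.0, 0.0]], [1.0, 1.0])

if __name__ == "__main__":
    print("aligned with G:   ", along, "size", round(math.hypot(*along), 3))
    print("opposed to G:     ", against, "size", round(math.hypot(*against), 3))
    print("no variation in G:", no_variation)
```

With these assumed values the response to the aligned gradient is roughly (1.9, 1.9), while the response to the opposing gradient is only about (0.1, -0.1), even though each trait on its own has the same amount of genetic variance in both cases. The zero matrix gives no response at all, which is the "selection by itself is not enough" point.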

In any case, much thought has been given to how this G matrix can change, both by natural selection and by other factors such as new mutation. Depending on how G changes, future evolutionary potential might change, which is pretty cool if you think about it! How might G change then? These are important ideas, because while we can estimate what G looks like, and how it might change (in particular due to natural selection), it is much harder to know what it will look like far in the future, making it more difficult to predict long-term evolutionary change.

So what might help us predict G? One idea is that our knowledge of developmental biology will help us understand the effects of mutations, and thus G. If so, developmental biology could be a particularly powerful way of predicting the potential for evolutionary change, or lack thereof (a so-called developmental constraint).

To test this idea, I decided to use a homeotic mutation. Homeosis is the term used for when one structure (like an arm) is transformed (during development) into another (related) structure, like a leg. In fruit flies, homeotic mutations are the stuff of legend (and Nobel Prizes), in particular for the wonderful cases of the poor critters growing legs (instead of antennae) out of their heads, or four-winged flies. You can see wonderful examples of mutations causing such homeotic changes in flies and other critters here.

In our case we used a much weaker and subtler homeotic mutation, Ubx¹, which causes slight, largely quantitative changes. For example, with this mutation the third set of legs on the fly would be expected to resemble (in terms of lengths of the different parts of the leg) the second set of legs (flies, like all insects, have 3 sets of legs as adults). We wanted to know: when we changed the third legs to look like second legs, would the G for the transformed third leg look like that of a normal third leg or a normal second leg? Thus we were trying to predict changes in G based on what we know (a priori) of development and genetics in the fruit fly.

So what did we find? The most important points are summarized in figure 2 and table 3 (if you want to check out the paper, that is). The TL;DR version is this: Yes, the legs were homeotically transformed as we expected, but G of the mutant legs did not really change very much from that of a normal third leg. In other words, our knowledge of development did not really help us much in understanding changes in G. There are a few reasons why (which we explain in the paper), but I think that it is an interesting punchline, and I will leave it up to you to decide what it means (and if our experiment, analysis and interpretation are reasonable and logically consistent).

I also really want to give a shout out to one of the co-authors (JH) who developed the particular statistical model that we ended up using. He developed a set of explicit models that really helped us test our specific hypotheses directly with the data and experimental design at hand. This is sadly rarely done with statistics, so it is worth reading just for that! I really think (hope?) that this combination of approaches can be very useful for evolutionary genetics. Let me know what you think!

Thursday, January 24, 2013

How and what to teach in undergraduate genetics

Here at Michigan State University, we are considering how to "fix" the primary undergraduate Genetics class. Why does it need to be fixed? Many reasons. For instance, it has for many years been taught with little "institutional memory" from semester to semester, so what concepts are covered (and how) may depend heavily on when the students have taken it. This class is taught each semester (fall, spring and summer) with enrollments exceeding 300 students, and is required for practically every life sciences undergraduate major across many departments and colleges at the University. Indeed in my college alone (Natural Sciences) ~75% of the 4800 UGs in the college are in biological disciplines. Thus there is an extremely wide diversity of backgrounds, in particular with respect to basic quantitative skills. It is also generally a poorly regarded course from the perspective of students, and is seemingly considered a "weeder" course where the hopes of many pre-med students are crushed (the course currently does not require calculus or physics as prerequisites, which, at least when I was a UG, represented the sieve courses).

While we are just at the beginning of this process (and we are just starting to collect information), I already have a number of questions that I am trying to make sense of, and I would really appreciate feedback from everyone, especially people who have already been involved with a similar process at other schools.

I will probably write about all of these questions (and what I am thinking on each one) in the future, but for now I will just get them down.

So here are my questions for the moment (let me know if you have any others I should be thinking about).

What sorts of background/ pre-requisites are reasonable for a "fundamentals of genetics"? Just 1st year biology? chemistry? physics? calculus? stats?

There has been a lot of recent discussion on the concepts (and the order in which they should be taught), most notably the recent paper by Rosemary Redfield, as well as her blog about teaching genetics. This has also generated a lot of useful discussion (here and here for example). I have reviewed several proposals for genetics textbooks, so I know that many other organizing principles are also being used. Once I have organized my own thoughts I will write about this. I am curious what has worked (or has not worked) as well. Thoughts?

Who is the target audience for a genetics course? Unlike introductory level courses (biology, physics, calculus), genetics is often taught as a second or third year course (here at MSU it is a 300 level course). Usually such upper-level courses are taught from a disciplinary perspective. However, the audience for Genetics seems far broader, and in particular includes many students hoping to be involved in the medical sciences. To whom do we teach? Those fundamentally interested in biology in general, or genetics in particular? Or the much broader audience, which includes many who have no desire to be in the class (but have to take it to fulfill their degree requirements while trying to get into medical school)? Is there a happy medium? Are two different classes (one for each audience) a better idea?

Thoughts?

Wednesday, January 23, 2013

Some further thoughts on "risky" research and the culture of science

These are just some further thoughts on an old post regarding the New York Times article "Grant system leads cancer researchers to play it safe". In that post I mulled over the idea that the mentoring process for young scientists trains us (as a community) to be hyper-critical and skeptical. Now of course scientists are individuals, and we vary a lot. Indeed there are lots of scientists who tend to be optimists, and take their (and other people's) data at face value, while others spend their careers taking apart the ideas of others. There is of course room for all of these approaches. We need the creative spark of people who generate new models and the data to test them, and other scientists who test the logic or validity of these ideas and models. This is part of what makes the scientific process work so well. But how might this affect the potential funding of "risky" science? Given that there are limited resources available to fund science research, if one reviewer of a proposal is highly skeptical of the ideas, while all of the other reviewers like them, will that be enough to have the proposal rejected for funding?

I am certain that if I "polled" many of my fellow scientists, they would all point to at least one proposal they submitted that failed to be funded based on one review, while all of the other reviewers loved it. It is not so different from going onto Rotten Tomatoes. There are some movies that many reviewers love, while others hate them. Indeed I have never seen a movie reviewed where there is complete agreement. Not surprisingly, the same is true for the scientific review process (although I would hope for different reasons).

However, this has all made me think about the differences in the way countries provide public funds for scientific research. In particular, in the U.S., the funding system tends to have strong "boom-bust" cycles, which are naturally tied to the economy as a whole, but also strongly tied to fads in scientific research (sometimes called "sexy science"). Now, we are only human, and nerdy as it may be, scientists can become enamoured of new and very interesting findings. Naturally this leads many other scientists to want to join this new area, and when grant proposals on this research are reviewed, the reviewers may themselves be entranced by the ideas, and pin their own hopes for future research successes on these new ideas or methods or approaches.

Indeed in my own field of Genetics, I have watched such transformations occur numerous times in my relatively short experience working in the field. This has happened both due to changes in technology as well as statistical methodology (more on this in a future posting). In each instance, the same basic pattern emerged. First there was almost unanimous excitement and hope that these new approaches would solve all sorts of persistent problems in the field (for instance finding the set of genes that contribute to disease X). Shortly after, there were a few dissenting voices (largely ignored) that pointed out some of the shortcomings of the approach or method. Then in the next 2-3 years, as more and more people used these approaches or methods (or tested these new ideas), more and more issues were uncovered. And just then, when hope was beginning to fade, a new idea/method/technology was discovered, and so the cycle continued....

So how does all of this affect the funding for "risky research"? Honestly I do not know. But I think it is worth considering. Any thoughts?