genomics

And by hot, I mean employable. I'll get to that in a bit, but I first want to relate some history. Back when I was a wee lil' Mad Biologist, and molecular population genetics was in its infancy, there was a brief period where people had to be convinced that this stuff was useful (it was). Then it became fashionable, and the 'early adopters'--people who were regularly using PCR and clone-based sequencing (followed by S35 sequencing)--became hot intellectual commodities for about five years. Then the field became crowded, but 'good' molecular population geneticists (whatever 'good' means)…
The exciting thing about the recent technological advances in genomics is that we have a massive amount of data. The terrifying thing about the recent technological advances in genomics is that we have a massive amount of data. A while ago, I brought this up in the context of bacterial genomics: Most of the time, when you read articles about sequencing, they focus on the actual production of raw sequence data (i.e., 'reads'). But that's not the rate-limiting step. That is, we have now reached the point where working with the data we generate is far more time-consuming... So, from a…
SEQUENCE GENOMES!!! Proflikesubstance has a very good post about PR announcements in science, which stems from the duplicated sequencing of the cacao and Tasmanian Devil genomes. What struck me is this bit: What also seems ridiculous to me is that there are TWO groups sequencing either of these genomes. I can understand the race for the human genome and maybe even things like fruit fly and Arabidopsis, but since when did the Tasmanian devil fan club go all cut throat? And I like chocolate as much as the next person, but two genome sequences*? It's hard to tell whether this is competition or…
A couple of weeks ago I attended the Human Microbiome Research Conference. At that meeting, one talk by Bruce Birren (also covered by Jonathan Eisen) mentioned something that was completely overlooked by the attendees. Now, I don't blame them, since what Birren mentioned was about bacterial genomics, not the human microbiome. But here's what I tweeted about Birren's talk (TWEET!): B. Birren-E. coli K-12 can be assembled into 1 scaffold for hundreds of $s with Illumina seq & new jumps. Let's unpack this below the fold. When we sequence a genome, we actually sequence small pieces (with the…
One of science's saving graces is that a fair number of scientists will publicly admit that they are wrong (and then there's Marc Hauser*...). Last week, at the Human Microbiome Project meeting, Jonathan Eisen gave a talk about the GEBA project which is an effort to sequence the genomes of a diverse group of bacteria to create a bacterial genomic encyclopedia. At one point during his talk, Eisen mentioned that originally all of the genomes in the project were to be finished, although that standard has been relaxed. Eisen then noted that with the new sequencing technologies, it's feasible…
So, Nature Reviews Genetics has an article, "Computational solutions to large-scale data management and analysis", which claims the following in the abstract (italics mine): Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale,…
One of the mysteries of genome-wide association studies ("GWAS") is the problem of 'missing heritability': quantitative genetics indicates that a trait (e.g., height, heart disease) has a significant genetic component, but the genetic variation we can link to that trait only explains a small amount of the suggested heritability. Christophe Lambert describes why he thinks GWAS hasn't had that much success so far: One major limitation is that the microarrays used in most major GWAS efforts to date employ common genetic variants originally identified in a rather small number of presumably…
Last week, I wrote about the problems facing genomics and the concept of ownership of data. While I am sympathetic to researchers' career needs under the current system, I don't think we can, in good conscience, let that get in the way of rapid data release, especially in applied areas. I, and others, cast this as a conflict between individual researchers and the larger community, but there was a third part that was missing: universities. To the extent that universities care about--and desperately need--grants, altering how funders define a successful outcome is critical. That's…
At least if we're talking about microbes. Nick Loman has a good piece summarizing the state of genome sequencing technologies. It's pretty accurate, although I'm always skeptical about capabilities until I've seen it function. But from my point of view, which is focused on microbial genomics, the actual sequencing--determining the nucleotides on a piece of DNA--is already incredibly cheap. In that context, there are two models that seem relevant: 1) Really rapid sequencing. Here, you could have a bacterial genome in a matter of hours, even if it's not that efficient in terms of cost…
Well, someone at ScienceBlogs had to draw down on Scientopia, and it might as well be the Mad Biologist. I was going to respond to this post by proflikesubstance about genomics and data release in a calm, serious, and respectful manner, and, then, I thought, "Fuck that. I'm the Mad Biologist. I have a reputation to uphold." Anyway, on to genomics and data release. Proflikesubstance writes: I learned something interesting that I didn't know about the sharing of genomic data: almost all major genomics centers are going to a zero-embargo data release policy. Essentially, once the sequencing is…
So, in some quarters, there's been wailing and gnashing of teeth over the Congressional hearings about the direct-to-customer ('DTC') genetic testing industry. I've discussed why I don't think regulation is a disaster before, but I'll add one more issue to the mix: maintaining subject confidentiality in NIH genomic studies. If someone related to a person in a study publicly releases his or her genome, it could be possible to identify a 'deidentified' anonymous subject. I'm curious to see how that issue shakes out. But that's not what this post is about. Instead, I propose that the DTC…
One of my hobbies lately has been to get either RNA seq or microarray data from GEO and do quick analyses. Not only is this fun, I can find good examples to use for teaching biology. One of these fun examples comes from some Arabidopsis data. In this experiment, some poor little seedlings were taken out of their happy semi-liquid culture tubes and allowed to dry out. This simulated drought situation isn't exactly dust bowls and hollow-eyed farmers, but the plants don't know that and most likely respond in a similar way. We can get a quick idea of how the plants feel about their situation…
John Hawks, in his paleodreams. I mean that in the best way. John Hawks bumps into a prescient estimate of the total gene number in humans: While doing some other research, I ran across a remarkable short paper by James Spuhler, "On the number of genes in man," printed in Science in 1948. We've been hearing for the last ten years how the low gene count in humans -- only 20,000 or so genes -- is "surprising" to scientists who had previously imagined that humans would have many more genes than this. So here's the next to the last line of Spuhler's article: On the basis of these speculations…
This post might put me at odds with much of the science bloggysphere, but I think a lot of the concern over congressional and FDA hearings over whether 'over-the-counter' genetic screening and genome sequencing should be regulated is overblown. Maybe my feelings are influenced by my concerns about the misuse of antibiotics--in the case of cefquinome, the FDA did the right thing (and I only wish they had acted sooner). Consider this article which was recommended by a ScienceBlogling, the crux of which is: The controversy here rests on a single assumption: That the typical Walgreen's customer,…
Yes, you read the title correctly--I'll get to that in a bit. Nicholas Wade's article about the Human Genome Project (HGP), "A Decade Later, Genetic Map Yields Few New Cures" has been getting a lot of play. Thankfully, ScienceBlogling Orac summed up perfectly my thoughts about both the science and hype surrounding the HGP, so I don't have to. But this post by Mike Mandel has been getting some play: My nomination for the most significant economic event of the past decade: The failure of the Human Genome Project to thus far deliver medically significant results. I dunno: the collapse of…
Biosafety has been on everyone's mind this week after the announcement of the J. Craig Venter Institute's successful transplantation of a synthetic genome. What horrible pathogen will future bioengineers be able to design? What unforeseeable environmental catastrophe will befall us upon the release of genetically engineered bacteria? These are hugely important questions as research in synthetic biology moves forward, being discussed in congressional hearings and as an integral part of every new synthetic biology design. As the major proposed goal of a great deal of synthetic biology research…
I had the pleasure of chatting with John Hawks about the two big science news stories of the past few months, the synthetic genome and the Neandertal genome, for Science Saturday at bloggingheads.tv. John is a professor of anthropology at the University of Wisconsin who studies population genetics of ancient humans, as well as a terrific teacher. I learned a lot of really fascinating things about how people study fossils and trace human evolution and it was interesting to find some connections between the two stories! As he mentions on his blog, we didn't once mention synthetic Neandertals…
...assembly and analysis. From the depths of the Mad Biologist's Archives comes this post. The Wellcome Trust has a very good (and mostly accurate) article about the 'next-gen' sequencing technologies. I'm going to focus on bacterial genomics because humans are boring (seriously, compared to two bacteria in the same species, once you've seen one human genome, you've seen them all). Most of the time, when you read articles about sequencing, they focus on the actual production of raw sequence data (i.e., 'reads'). But that's not the rate-limiting step. That is, we have now reached the point…
Selling a work of fiction is difficult; publishing in Nature is a long shot; yet somehow writer and genomeboy Misha Angrist managed to publish fiction in Nature. The only way I was ever going to get a first-author publication in Nature [Angrist explains] was if I just made it all up. So that's what I did. Hat tip to David Dobbs for providing the scientific inspiration. The short story/fantasy Angrist publishes actually pulls little, it seems to me, from my story about the orchid/plasticity/differential susceptibility hypothesis, though it does work ground seeded by both genetics and…
Genome Biology recently published a review, "The Case for Cloud Computing in Genome Informatics." What is cloud computing? Well: This is a general term for computation-as-a-service. There are various different types of cloud computing, but the one that is closest to the way that computational biologists currently work depends on the concept of a 'virtual machine'. In the traditional economic model of computation, customers purchase server, storage and networking hardware, configure it the way they need, and run software on it. In computation-as-a-service, customers essentially rent the…