What's encoded in your genome

So in previous posts I've written:

"How to think about biology", "Life is full of machines" and "Life and information". I guess I'm on some kind of philosophy-of-biology kick.

Now I'll put the pieces of the puzzle together and talk about what the genome encodes in the typical mammalian organism. This will go a long way toward explaining how these machines promote what has been called evolvability. But what is evolvability? Here I am using the term to mean the ease with which a system can evolve phenotypically in response to natural selection. Going back to my first essay, I had emphasized the idea that the connection between genotype and phenotype is not straightforward. Here is a flowchart that illustrates the problem:

INFORMATION (Genotype, DNA) => MACHINES (Proteins, ribozymes, etc.) => NETWORK of machines => PHENOTYPE

So what types of machines are encoded by the human genome? And how does this impact evolvability?

You might think that the majority of our genes specify proteins that mostly direct how nerve connections are made. After all, the brain is the most complicated thing ever made. But if you did think this, you would be wrong. Are there genes for intelligence? Beauty? Wit? Climbing trees? Obesity? Cystic fibrosis? Not a chance. One way of thinking about this problem is to expand the last part of that flowchart.

MACHINES => NETWORK of machines => Cellular behavior => tissues => organism

Sure, there are genes whose function may impact these phenotypes, but the primary role of each gene is to create a machine that acts within the cell. Even the "Cystic fibrosis gene", or CFTR, has a particular cellular function - it's a channel that lets chloride ions (charged chlorine atoms) pass across the plasma membrane. In the end each protein or functional RNA is part of a network of machines that generates a cellular behavior. These networks dictate how the cell interacts with its neighbors and with the environment.

The main goal of modern molecular biology (and biochemistry, cell biology ...) is to understand how the numerous molecular machines found in the typical cell fit into these networks and how these networks generate cellular behaviors (i.e. phenotypes).

Now let's dive into the genome ...

First we'll perform a simple exercise. We'll begin by looking up the number of metabolic enzymes in the mouse genome using a Gene Ontology (GO) browser. According to this rough measure there are about 7,000 genes that are responsible for making and breaking down molecules. This includes sugars, lipids, amino acids, nucleic acids and even proteins. If we estimate that the mammalian genome has 20,000 genes, that's already about a third of all genes dedicated to these basic activities. Let's take one of these metabolic processes, the translation of mRNAs into proteins by the ribosome (i.e. protein synthesis). Excluding tRNAs, there are 432 genes involved in this activity alone (roughly 2% of the genome). Lump in the tRNAs (about 400 extra genes) and the number goes up to ~4%. So out of the entire genome, close to 1 out of every 25 genes codes for a machine that helps to convert mRNA information into protein. And this doesn't include all the other genes that code for proteins that modulate the activity of the ribosome!
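The percentages above are easy to check. Here is a minimal Python sketch of the arithmetic, using only the rough gene counts quoted in this post. The names `TOTAL_GENES`, `counts` and `share` are just illustrative, and dividing everything by the same 20,000-gene estimate is the simplification made in the text, not a precise genome statistic.

```python
# Rough gene counts quoted in the post (taken from a GO lookup).
# TOTAL_GENES is the post's own estimate for a mammalian genome.
TOTAL_GENES = 20_000

counts = {
    "metabolism": 7_000,
    "translation (excluding tRNAs)": 432,
    "translation (plus ~400 tRNA genes)": 432 + 400,
}


def share(n, total=TOTAL_GENES):
    """Return n as a percentage of the total gene count."""
    return 100 * n / total


for label, n in counts.items():
    # Report each category as a percentage and as a "1 in N" ratio.
    print(f"{label}: {n} genes, {share(n):.1f}% of the genome, "
          f"about 1 in {TOTAL_GENES / n:.0f}")
```

Running this gives 35% for metabolism, about 2.2% for translation without tRNAs and about 4.2% with them, which is where the "roughly 1 in 25" figure comes from.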

This newly made protein doesn't just hang out. It needs to be properly folded by another set of machines called chaperones (131 genes for protein folding) and then transported to the right place (721 genes). If the protein is misshapen due to a mistake in production, or because of misfolding, it has to be destroyed (634 genes). Sometimes extra modifications, like a methyl, phosphate, or lipid group, have to be added to this protein (1314 genes). The list goes on ...

You can imagine that just getting a membrane protein from the point of synthesis to its destination on the cell surface is hard. Incidentally, this is known as the secretion pathway.


First the protein is made and pumped into the endoplasmic reticulum (ER), then folded, then packaged into a vesicle, then transported to the Golgi, then modified multiple times by acquiring, then losing, and then again acquiring sugars, then transported to the endosomes, then sorted to vesicles destined for the plasma membrane, then transported to the surface. Each transport step requires the protein to be loaded into a vesicle. This vesicle must bud off of one organelle, be transported by motors along certain cytoskeletal filaments and then fuse with the membrane of the appropriate target organelle. This entire secretion pathway, from ER to plasma membrane, takes an incredible number of different proteins. Each step is robust. And quality control mechanisms exist to make sure that the whole secretion system works efficiently.

These basic cellular processes are composed of ancient protein networks. Most of the core of each network is fairly well conserved across all eukaryotes. This is also why most of your genes are found in all other vertebrates, and many are found in every nucleated cell. Deep down you are nothing but a yeast cell with many embellishments.

So what does all this have to do with evolvability? We're almost there, I promise.

So we've established that a large chunk of your genome is dedicated to creating these sophisticated intracellular networks. Many of these networks can prop themselves up and work under a variety of different conditions. In a nutshell, they are self-organizing and robust. But that's not all. These core processes are propped up by backup networks. For example, the cell has many ways to deal with unfolded proteins - disable one network and another takes over. There are many ways to transport cargoes - disassemble one transport filament and another is used. In addition, these core processes are supported by specialized networks that function in quality control. These include the networks involved in buffering, repair and degradation. They monitor that things are OK and help fix or get rid of macromolecules that might disrupt the core processes. These buffering systems help ensure that homeostasis is maintained. Heat the cell, and a plethora of proteins act to keep most proteins folded and the cell working. Dump the cell in an oxidative environment and other processes work to fight the negative effects of reactive oxygen. Most of the time your cells are being tugged toward various extremes of environment, but fortunately the cell has been loaded with robust networks that are backed up by these special buffering networks of proteins.

A freakin' large part of the genome is involved in these buffering capabilities. There are buffers that get rid of bad RNA and bad proteins. Buffers that help fold proteins. Buffers that keep the cellular organization intact in the presence of "stress". Buffers for free radicals, buffers for DNA-damaging agents and yes, buffers against acids and bases. And these not only buffer changes in the environment but also CHANGES IN THE MAKEUP OF THE NETWORKS THEMSELVES. Thus these special buffering networks are the key to understanding evolvability. That's why a cell can express a foreign protein and all is fine. That's why you can sometimes make a mutation in a "critical" gene, but the cell keeps chugging along as if nothing is wrong. It's really astonishing what you can do to the typical cell.

So how is change effected? The key thing to think about is the core set of processes. They are robust and supported by backups and by buffering networks. Generally the central components of the networks are the most highly conserved - if you alter these players too much the system won't work. In addition there exist peripheral proteins at the edges of each network. These components can be changed quite easily without the network falling apart. They tug and pull at the center of the network, but their activity is moderated by the robustness of the network and by the buffering processes I described above. So by changing the level or the properties of the peripheral proteins you can ever so slightly move the center without worrying that the whole network will collapse. The robustness of the entire network, coupled to the supporting buffering networks, acts to decrease the negative effects that the peripheral actors may impose on a particular cellular system. For example, let's say you want to change the cell shape by altering a gene. Do you need to mutate actin, the major cytoskeletal protein responsible for cellular morphology? No. You just have to change some gene that modulates actin. You might want to alter its sequence and slightly change its activity, or you may want to just tweak its level. The cytoskeleton is so stable that it can absorb and integrate these alterations without the whole network collapsing.

Let's repeat that - the core set of processes that lie at the heart of a network are so stable that they allow any of the peripheral components of the networks to be altered quite drastically in terms of sequence or in terms of the amount of that protein/RNA at any given time. The potential negative effects of such changes are repressed by the robustness of the network and by the buffering mechanisms found in every cell.

Next post I will tell you how the robustness of the core networks, in conjunction with the buffering processes, helps to guide the changes of the peripheral actors towards relevant and potentially useful cellular behaviors. Yes, change can be guided! But strictly at the phenotypic level.



You've been reading Kirschner.

My one problem -- how accurate are the GO annotations?

It would seem to me that multicellular evolution relies on the alteration of inter-cellular communication.

Yes, there is a massive problem with GO annotations. I've been told that the annotations for the yeast genome are more reliable.

As for the distinction between genes that affect multicellular activity, you are right. But those types of genes really affect the behavior of cells and how they relate to each other. Hopefully I'll integrate that point in my next essay. I'll try to give you an example of how these networks can buffer and enhance the viability (i.e. guide evolution) at the multicellular level.

I don't understand the steps by which a protein is made in the cell.
Is it about DNA and RNA, or does the whole process begin in the nucleus?