Protein Synthesis: Transcription and Translation

Here is the third BIO101 lecture (from May 08, 2006). Again, I'd appreciate comments on the correctness as well as suggestions for improvement.

--------------------------------------------------
BIO101 - Bora Zivkovic - Lecture 1 - Part 3

The DNA code

DNA is a long double-stranded molecule residing inside the nucleus of every cell. It is usually tightly coiled forming chromosomes in which it is protected by proteins.

Each of the two strands of the DNA molecule is a chain of smaller molecules. Each link in the chain is called a nucleotide and is composed of one sugar molecule, one phosphate molecule and one base. There are four types of bases in DNA: adenine (A), thymine (T), guanine (G) and cytosine (C). The two strands of DNA are structured in such a way that an adenine on one strand is always paired with a thymine on the other strand, and a guanine on one strand is always paired with a cytosine on the other strand. Thus, the two strands of the DNA molecule are complementary to each other: the sequence of one strand determines the sequence of the other.
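The pairing rule can be written down as a small lookup table. The following is only a minimal sketch in Python: the function name and the example sequence are made up for illustration, and the sketch ignores the fact that the two strands run in opposite directions.

```python
# Pairing rules for DNA: A pairs with T, and G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(strand):
    """Return the base that sits opposite each base of `strand`, position by position."""
    return "".join(PAIRING[base] for base in strand)

example = "ATGCCGTA"  # made-up sequence, for illustration only
partner = complementary_strand(example)
print(example)  # ATGCCGTA
print(partner)  # TACGGCAT
```

Applying the same rule to the partner strand gives back the original sequence, which is why knowing one strand is enough to rebuild the other.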

The exact sequence of nucleotides on a DNA strand is the genetic code. The total genetic code of all of the DNA on all the chromosomes is the genome. Each cell in the body has exactly the same chromosomes and exactly the same genome (with some exceptions we will cover later).

A gene is a small portion of the genome - a sequence of nucleotides that is expressed together and usually codes for a protein (polypeptide) molecule.

The cell uses genes to synthesize proteins. This is a two-step process. The first step is transcription, in which the sequence of one gene is copied into an RNA molecule. The second step is translation, in which the RNA molecule serves as a code for the formation of an amino-acid chain (a polypeptide).

[Figure: DNA, RNA and ribosome]

Transcription

For a gene to be expressed, i.e., transcribed into RNA, that portion of the DNA has to be uncoiled and freed of the protective proteins. An enzyme called RNA polymerase reads the DNA code (the sequence of bases on one of the two strands of the DNA molecule) and builds a single-stranded chain of the RNA molecule. Again, where there is a G in the DNA, there will be a C in the RNA and vice versa. Instead of thymine, RNA has uracil (U). Wherever in the DNA strand there is an A, there will be a U in the RNA, and wherever there is a T on the DNA molecule, there will be an A in the RNA.
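The pairing rules for transcription can also be written as a small lookup table. This is a minimal sketch in Python, not a model of the real enzyme: the function name and the nine-base example are invented for illustration, and the strand passed in is assumed to be the DNA strand that is being read.

```python
# DNA base on the strand being read -> RNA base that is built opposite it.
DNA_TO_RNA = {"G": "C", "C": "G", "A": "U", "T": "A"}

def transcribe(dna_strand):
    """Build the RNA sequence that pairs with the given DNA strand."""
    return "".join(DNA_TO_RNA[base] for base in dna_strand)

dna = "TACGGCATT"  # made-up sequence, for illustration only
rna = transcribe(dna)
print(rna)  # AUGCCGUAA
```

Note that the RNA contains no T at all: every A in the DNA has become a U.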

Once the whole gene (100s to 10,000s of bases in a row) is transcribed, the RNA molecule detaches. The RNA (called messenger RNA or mRNA) may be further modified by the addition of more A bases at its tail, by the addition of other small molecules to some of the nucleotides and by the excision of some portions (introns) out of the chain. Removing the introns (the non-coding regions) and joining the remaining segments (exons) into a single chain again is called RNA splicing. RNA splicing allows one gene to code for multiple related kinds of proteins, as alternative patterns of splicing may be controlled by various factors in the cell.
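Splicing can be pictured as cutting pieces out of a string and gluing the rest back together. The sketch below is a toy illustration in Python under stated assumptions: the sequence is invented, the exon positions are simply given as (start, end) pairs, and introns are written in lowercase only to make them easy to spot. A real cell recognizes intron boundaries from signals in the RNA itself.

```python
def splice(pre_mrna, exons):
    """Join the exon segments, given as (start, end) positions, and drop everything between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Made-up transcript: three exons (uppercase) separated by two introns (lowercase).
pre_mrna = "AUGGCC" + "guaaguuuag" + "UUUGGA" + "guccccacag" + "CACUAA"

all_exons = [(0, 6), (16, 22), (32, 38)]
skip_middle = [(0, 6), (32, 38)]  # an alternative splicing pattern

print(splice(pre_mrna, all_exons))    # AUGGCCUUUGGACACUAA
print(splice(pre_mrna, skip_middle))  # AUGGCCCACUAA
```

The same transcript yields two different mRNAs depending on which exons are kept, which is how one gene can give rise to more than one related protein.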

Unlike DNA, the mRNA molecule is capable of exiting the nucleus through the pores in the nuclear membrane. Once in the cytoplasm, it is read by ribosomes, some of which float free and some of which become attached to the membranes of the rough endoplasmic reticulum (ER).

[Figure: DNA, RNA and ribosome, part 2]

Translation

Three types of RNA are involved in the translation process: mRNA, which carries the code for the gene; rRNA, which is a building block of the ribosome; and tRNA, which brings individual amino-acids to the ribosome. Translation is controlled by various enzymes that recognize specific nucleotide sequences.

The genetic code (nucleotide sequence of a gene) translates into a polypeptide (amino-acid sequence of a protein) in a 3-to-1 fashion. Three nucleotides in a row, called a codon, code for one amino-acid. There are a total of 20 amino-acids used to build all proteins in our bodies. Some amino-acids are coded by a single codon. Other amino-acids may be coded by several different codons. There is also a START codon (AUG, which codes for the amino-acid methionine) and three STOP codons that do not code for any amino-acid. The genetic code is (almost) universal. Except for a few microorganisms, all of life uses the same genetic code.
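The 3-to-1 reading can be sketched as a table lookup. The Python below is a simplified illustration, not a description of how the ribosome actually finds its starting point: the table lists only a handful of the 64 codons, the mRNA sequence is made up, and reading is assumed to begin at the first AUG in the string.

```python
# A small part of the standard codon table (the full table has 64 entries).
CODON_TABLE = {
    "AUG": "Met",                # methionine; also the START codon
    "UGG": "Trp",                # tryptophan has only this one codon
    "UUU": "Phe", "UUC": "Phe",  # two codons for phenylalanine
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",  # four for glycine
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def translate(mrna):
    """Read codons three bases at a time from the first AUG until a STOP codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no START codon, nothing is made
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # fails on codons missing from this partial table
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide

print(translate("GCAUGUUUGGAUGGUAAGC"))  # ['Met', 'Phe', 'Gly', 'Trp']
```

Fifteen bases (five codons, the last one a STOP) give a chain of four amino-acids. The bases before the START codon and after the STOP codon are not translated.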

When the ribosome is assembled around a molecule of mRNA, translation begins with the reading of the first triplet. Small tRNA molecules bring in the individual amino-acids, each tRNA pairing with its matching triplet on the mRNA, and the amino-acids are attached to each other, forming a chain. When a stop signal is reached, the entire complex dissociates. The ribosome, the mRNA, the tRNAs and the enzymes are then either degraded or re-used for another translational event.

Protein synthesis - post-translational modifications

Translation of the DNA/RNA code into a sequence of amino-acids is just the beginning of the process of protein synthesis.

The exact sequence of amino-acids in a polypeptide chain is the primary structure of the protein.

As different amino-acids are molecules of somewhat different shapes, sizes and electrical polarities, they interact with each other. The attractive and repulsive forces between amino-acids cause the chain to fold in various ways. Segments of the chain often assume recognizable local shapes, such as a helix or a sheet. These local shapes, which arise from the chemical properties of the chain and its component amino-acids, are called the secondary structure of the protein.

Proteins called chaperonins further shape the three-dimensional structure of the protein by helping it fold in particular ways. The 3D structure of a protein is its most important property, as the functionality of a protein depends on its shape - it can react with other molecules only if the two molecules fit into each other like a key and a lock. The overall 3D structure of the fully folded protein is its tertiary structure.

Prions, the causes of such diseases as Mad Cow Disease, Scrapie and Creutzfeldt-Jakob disease, are proteins. The primary structure of the prion is almost identical to that of a protein normally expressed in our brain cells, but the three-dimensional structure is different - the two are folded into different shapes. When a prion meets the normal protein in a healthy brain cell, it is capable of denaturing (unwinding) the native protein and then reshaping it into the same shape as the prion. Thus one prion molecule makes two - those two go on and make four, those four make eight, and so on, until the whole brain is just one liquefied spongy mass.
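The "one makes two, two make four" pattern is repeated doubling. The few lines of Python below only illustrate that arithmetic under an idealized assumption: every prion converts exactly one normal protein per round, which real prions need not do. The function name is made up for this sketch.

```python
def prions_after(rounds, starting_molecules=1):
    """Number of prion molecules after a given number of doubling rounds."""
    return starting_molecules * 2 ** rounds

for rounds in range(5):
    print(rounds, prions_after(rounds))  # 0 1, 1 2, 2 4, 3 8, 4 16
```

After only ten rounds a single molecule has become 1,024, which is why this kind of growth gets out of hand so quickly.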

Another aspect of the tertiary structure of the protein is the addition of small molecules to the chain. For instance, phosphate groups may be attached to the protein, which can switch its activity on or off. Also, short chains of sugars are often bound to the protein. These sugar chains serve as "ZIP-code tags" for the protein, informing carrier molecules exactly where in the cell this protein needs to go (usually within vesicles that bud off the RER or the Golgi apparatus). The elements of the cytoskeleton are used as conduits ("elevators and escalators") to shuttle proteins to where in the cell they are needed.

Many proteins are composed of more than one polypeptide chain. For instance, hemoglobin is formed by binding together four subunits. Each subunit also has a heme molecule attached to it, and an ion of iron attached to the heme (this iron is where oxygen binds to hemoglobin). This larger, more complex structure of the protein is its quaternary structure.

See animations:
Transcription
Translation

References:

Peter H. Raven, George B. Johnson, Jonathan B. Losos, and Susan R. Singer, Biology (7th edition), McGraw-Hill Co. NY, Chapters 3, 14 and 15.

Previously in this series:
Biology and the Scientific Method
Cell Structure


Two quick suggestions:

the two strands of the DNA molecule are mirror-images of each other

They are complementary, but not mirror-images (which brings to mind issues of chirality).

A gene is a small portion of the genome - a sequence of nucleotides that is expressed together and codes for a single protein (polypeptide) molecule

Not all genes encode protein, and there are multiple mechanisms by which a single gene can encode multiple proteins (in addition to alternative splicing, which you talk about, there is also trans-splicing, reading frame shifts, probably more I don't know about).

Mirror-images - you are right, I will correct this.

The rest - far over the heads of my students, the science-fearing adults in non-science majors with, roughly, zero background.

OK time for me to be finicky!
1) "For a gene to be expressed, i.e., translated into RNA,"
Change "translated" to "transcribed".

2) RNA processing (capping, which you forgot to mention, and splicing) mostly occurs while the RNA is being transcribed. In fact, RNA polymerase helps to recruit both the capping and splicing machinery to the emerging RNA molecule.

3) "Unlike DNA, the mRNA molecule is capable of exiting the nucleus through the pores in the nuclear membrane. It enters the endoplasmatic reticulum and attaches itself to one of the membranes in the rough ER." There are many problems with this. The nuclear envelope (NE) is part of the ER that wraps around the nucleus. mRNA traverses the NE through the nuclear pore complex. Once the mRNA is in the cytoplasm, it is not attached to the ER. Certain mRNAs encode a signal sequence. These short polypeptide chains bind to the signal recognition particle (SRP) as soon as they emerge from the ribosome. SRP then pauses further synthesis of the protein and targets the nascent polypeptide chain, the ribosome that is manufacturing the new protein, and the associated mRNA to the surface of the ER. This ribosome-studded ER is called rough ER. The signal sequence is then threaded through a pore in the ER (called the translocon) and protein synthesis resumes. The rest of the newly synthesized protein is pumped through this pore and thus into the inside (or lumen) of the ER to generate a new secreted or transmembrane protein.

(just being picky)

My own nitpicks. Suggested corrections are in bold.

Each of the two strands of the DNA molecule is a chain of smaller molecules. Each link in the chain is called a nucleotide. A nucleotide is composed of one sugar molecule, one phosphate molecule and one base. There are four types of bases in the DNA: adenine (A), thymine (T), guanine (G) and cytosine (C).

A nucleotide, by definition, is base + sugar + phosphate(s).

A gene is a small portion of the genome - a sequence of nucleotides that is expressed together and usually codes for a protein (polypeptide) molecule.

I wouldn't say a gene codes for a single protein molecule, especially since you mention alternate splicing later on.

Except for a few microorganisms, all of life uses the same genetic code. (And even the exceptions use a genetic code that's only slightly modified.)

Parenthetical suggested so students don't think the exceptions use a totally different code.

Another suggestion:

As different amino-acids are molecules of somewhat different shapes, sizes and electrical polarities, they react with each other. The attractive and repulsive forces between amino-acids cause the chain to fold in various ways. Certain segments of a protein often assume certain recognizable conformations, such as a helix or a turn or a sheet. These local conformations of the polypeptide chain form due to the chemical properties of the component amino-acids, and are called the secondary structure of the protein.

And:

The overall 3D structure of the fully folded protein is its tertiary structure.

I am a student at the [illegible] school in Hue. I would like to start [illegible] the amino acids of the protein [illegible].

This material is useful both in preparing my teaching material and in improving my knowledge.

By JOHAN KAWATU (not verified) on 15 Oct 2007

What is the process of transcription and translation?????