The process of translation in biology is the decoding an mRNA message into a polypeptide product. Put another way, a message written in the chemical language of nucleotides is "translated" into the chemical language of amino acids. Amino acids are linearly strung together via covalent bonds (called peptide bonds) between amino and carboxyl termini of adjacent amino acids. The decoding and "linking" process is catalyzed by a ribonucleoprotein complex called the ribosomes and can result in chains of amino acids of lengths ranging from tens to more than 1,000.
The resulting proteins are so important to the cell that their synthesis consumes more of a cell’s energy than any other metabolic process. Like DNA replication and transcription, translation is a complex molecular process that we can approach using both the Energy Story and Design Challenge rubrics. Describing the overall process, or steps in the process, requires the accounting of the matter and energy before the process and after the process and a description of how that matter is transformed and energy transferred during the process. From a Design Challenge standpoint, we can - even before digging any further into what is or is not understood about translation - try to infer some of the basic questions that we will need to answer regarding this process.
Let us start by considering the basic problem. We have a strand of RNA (called mRNA) and a bunch of amino acids and we need to somehow design a machine that will:
(a) decode the chemical language of nucleotides into the language of amino acids,
(b) join amino acids in a very specific manner,
(c) complete this process with reasonable accuracy, and
(d) do this at a reasonable speed. Reasonable, is of course defined by natural selection.
As before, we can identify subproblems
(a) How does our molecular machine determine where and when to start working?
(b) How does the molecular machine coordinate decoding and bond formations?
(c) where does the energy for this process come from and how much?
(d) how does the machine know where to stop?
Other questions and functional problems/challenges will certainly arise as we dig deeper.
The point, as always, is that even without knowing any specifics about translation we can use our imaginations, curiosity and common sense to imagine some requirements for the process that we will need to learn more about. Understanding these questions as the context for what follows is key.
A peptide bond links the carboxyl end of one amino acid with the amino end of another, expelling one water molecule. The R1 and R2 designation refer to side chain of amino acid the two amino acids.
Attribution: Marc T. Facciotti (original work).
Protein Synthesis Machinery
The components that go into the process
Many different molecules and macromolecules contribute to the process of translation. While the exact composition of "the players" in the process may vary from species to species - for instance, ribosomes may consist of different numbers of rRNAs (ribosomal RNAs) and polypeptides depending on the organism - the general functions of the protein synthesis machinery are comparable from bacteria to human cells. We focus on these similarities. At a minimum, translation requires an mRNA template, amino acids, ribosomes, tRNAs, an energy source, and various additional accessory enzymes and small molecules.
Reminder: Amino acids
Let us simply recall that the basic structure of amino acids is composed of a backbone composed of an amino group, a central carbon (called the α-carbon), and a carboxyl group. Attached to the α-carbon is a variable group that helps determine some of the chemical properties and reactivity of the amino acid.
The 20 common amino acids.
Attribution: Marc T. Facciotti (own work)
A ribosome is a complex macromolecule composed of structural and catalytic rRNAs, and many distinct polypeptides. As we start to try thinking about energy accounting in the cell it is worth noting that ribosomes do not come "free". Even before an mRNA is translated, a cell must invest energy to build each of its ribosomes. In E. coli, there are between 10,000 and 70,000 ribosomes present in each cell at any given time.
Ribosomes exist in the cytoplasm in bacteria and archaea and in the cytoplasm and on the rough endoplasmic reticulum in eukaryotes. Mitochondria and chloroplasts also have their own ribosomes in the matrix and stroma, which look more similar to bacterial ribosomes (and have similar drug sensitivities), than the ribosomes just outside their outer membranes in the cytoplasm. Ribosomes dissociate into large and small subunits when they are not synthesizing proteins and reassociate during the initiation of translation. coli, the small subunit is described as 30S, and the large subunit is 50S. Mammalian ribosomes have a small 40S subunit and a large 60S subunit. The small subunit is responsible for binding the mRNA template, whereas the large subunit sequentially binds tRNAs. Each mRNA molecule is simultaneously translated by many ribosomes, all synthesizing protein in the same direction: reading the mRNA from 5' to 3' and synthesizing the polypeptide from the N terminus to the C terminus. The complete mRNA/poly-ribosome structure is called a polysome.
tRNAs are structural RNA molecules that were transcribed from genes. Depending on the species, 40 to 60 types of tRNAs exist in the cytoplasm. Serving as adaptors, specific tRNAs bind to sequences on the mRNA template and add the corresponding amino acid to the polypeptide chain. Therefore, tRNAs are the molecules that actually “translate” the language of RNA into the language of proteins.
Of the 64 possible mRNA codons—or triplet combinations of A, U, G, and C, three specify the termination of protein synthesis and 61 specify the addition of amino acids to the polypeptide chain. Of these 61, one codon (AUG) also encodes the initiation of translation. Each tRNA anticodon can base pair with one of the mRNA codons and add an amino acid or terminate translation, according to the genetic code. For instance, if the sequence CUA occurred on an mRNA template in the proper reading frame, it would bind a tRNA expressing the complementary sequence, GAU, which would be linked to the amino acid leucine.
Aminoacyl tRNA Synthetases
The process of pre-tRNA synthesis by RNA polymerase III only creates the RNA portion of the adaptor molecule. The corresponding amino acid must be added later, once the tRNA is processed and exported to the cytoplasm. Through the process of tRNA “charging,” each tRNA molecule is linked to its correct amino acid by a group of enzymes called aminoacyl tRNA synthetases. At least one type of aminoacyl tRNA synthetase exists for each of the 20 amino acids; the exact number of aminoacyl tRNA synthetases varies by species. These enzymes first bind and hydrolyze ATP to catalyze a high-energy bond between an amino acid and adenosine monophosphate (AMP); a pyrophosphate molecule is expelled in this reaction. The activated amino acid is then transferred to the tRNA, and AMP is released.
The Mechanism of Protein Synthesis
Just as with mRNA synthesis, protein synthesis can be divided into three phases: initiation, elongation, and termination. The process of translation is similar in bacteria, archaea and eukaryotes.
In general, protein synthesis begins with the formation of an initiation complex. The small ribosomal subunit will bind to the mRNA at the ribosomal binding site. Soon after, the methionine-tRNA will bind to the AUG start codon (through complementary binding with its anticodon). This complex is then joined by large ribosomal subunit. This initiation complex then recruits the second tRNA and thus translation begins.
Bacterial vs Eukaryotic initiation
In E. coli mRNA, a sequence upstream of the first AUG codon, called the Shine-Dalgarno sequence (AGGAGG), interacts with a rRNA molecule. This interaction anchors the 30S ribosomal subunit at the correct location on the mRNA template. Stop for a moment to appreciate the repetition of a mechanism you've encountered before. In this case, getting a protein complex to associate - in proper register - with a nucleic acid polymer is accomplished by aligning two antiparallel strands of complementary nucleotides with one another. We also saw this in the function of telomerase.
Instead of binding at the Shine-Dalgarno sequence, the eukaryotic initiation complex recognizes the 7-methylguanosine cap at the 5' end of the mRNA. A cap-binding protein (CBP) assists the movement of the ribosome to the 5' cap. Once at the cap, the initiation complex tracks along the mRNA in the 5' to 3' direction, searching for the AUG start codon. Many eukaryotic mRNAs are translated from the first AUG, but this is not always the case. According to Kozak’s rules, the nucleotides around the AUG indicate whether it is the correct start codon. Kozak’s rules state that the following consensus sequence must appear around the AUG of vertebrate genes: 5'-gccRccAUGG-3'. The R (for purine) indicates a site that can be either A or G, but cannot be C or U. Essentially, the closer the sequence is to this consensus, the higher the efficiency of translation.
During translation elongation, the mRNA template provides specificity. As the ribosome moves along the mRNA, each mRNA codon comes into 'view', and specific binding with the corresponding charged tRNA anticodon is ensured. If mRNA were not present in the elongation complex, the ribosome would bind tRNAs nonspecifically. Note again the use of base pairing between two antiparallel strands of complementary nucleotides to bring and keep our molecular machine in register and in this case also to accomplish the job of "translating" between the language of nucleotides and amino acids.
The large ribosomal subunit consists of three compartments: the A site binds incoming charged tRNAs (tRNAs with their attached specific amino acids), the P site binds charged tRNAs carrying amino acids that have formed bonds with the growing polypeptide chain but have not yet dissociated from their corresponding tRNA, and the E site which releases dissociated tRNAs so they can be recharged with another free amino acid.
Elongation proceeds with charged tRNAs entering the A site and then shifting to the P site followed by the E site with each single-codon “step” of the ribosome. Ribosomal steps are induced by conformational changes that advance the ribosome by three bases in the 3' direction. The energy for each step of the ribosome is donated by an elongation factor that hydrolyzes GTP. Peptide bonds form between the amino group of the amino acid attached to the A-site tRNA and the carboxyl group of the amino acid attached to the P-site tRNA. The formation of each peptide bond is catalyzed by peptidyl transferase, an RNA-based enzyme that is integrated into the 50S ribosomal subunit. The energy for each peptide bond formation is derived from GTP hydrolysis, which is catalyzed by a separate elongation factor. The amino acid bound to the P-site tRNA is also linked to the growing polypeptide chain. As the ribosome steps across the mRNA, the former P-site tRNA enters the E site, detaches from the amino acid, and is expelled. The ribosome moves along the mRNA, one codon at a time, catalyzing each process that occurs in the three sites. With each step, a charged tRNA enters the complex, the polypeptide becomes one amino acid longer, and an uncharged tRNA departs. Amazingly, this process occurs rapidly in the cell, the E. coli translation apparatus takes only 0.05 seconds to add each amino acid, meaning that a 200-amino acid polypeptide could be translated in just 10 seconds.
Many antibiotics inhibit bacterial protein synthesis. For example, tetracycline blocks the A site on the bacterial ribosome, and chloramphenicol blocks peptidyl transfer. What specific effect would you expect each of these antibiotics to have on protein synthesis?
The Genetic Code
To summarize what we know to this point, the cellular process of transcription generates messenger RNA (mRNA), a mobile molecular copy of one or more genes with an alphabet of A, C, G, and uracil (U). Translation of the mRNA template converts nucleotide-based genetic information into a protein product. Protein sequences consist of 20 commonly occurring amino acids; therefore, it can be said that the protein alphabet consists of 20 letters. Each amino acid is defined by a three-nucleotide sequence called the triplet codon. The relationship between a nucleotide codon and its corresponding amino acid is called the genetic code. Given the different numbers of “letters” in the mRNA and protein “alphabets,” means that there are a total of 64 (4 × 4 × 4) possible codons; therefore, a given amino acid (20 total) must be encoded for by more than one codon.
Three of the 64 codons terminate protein synthesis and release the polypeptide from the translation machinery. These triplets are called stop codons. Another codon, AUG, also has a special function. In addition to specifying the amino acid methionine, it also serves as the start codon to initiate translation. The reading frame for translation is set by the AUG start codon near the 5' end of the mRNA. The genetic code is universal. With a few exceptions, virtually all species use the same genetic code for protein synthesis, which is powerful evidence that all life on Earth shares a common origin.
Redundant, not Ambiguous
The information in the genetic code is redundant. Multiple codons code for the same amino acid. For example, using the chart above, you can find 4 different codons that code for Valine, likewise, there are two codons that code for Leucine, etc. But the code is not ambiguous, meaning, that if you were given a codon you would know definitively which amino acid it is coding for, a codon will only code for a specific amino acid. For example, GUU will always code for Valine, and AUG will always code for Methionine. This is important, you will be asked to translate an mRNA into a protein using a codon chart like the one shown above.
Termination of translation occurs when a stop codon (UAA, UAG, or UGA) is encountered. When the ribosome encounters the stop codon no tRNA enters into the A site. Instead a protein know as a release factor binds to the complex. This interaction destabilizes the translation machinery, causing the release of the polypeptide and the dissociation of the ribosome subunits from the mRNA. After many ribosomes have completed translation, the mRNA is degraded so the nucleotides can be reused in another transcription reaction.
What are the benefits and drawbacks to translating a single mRNA multiple times?
Coupling between Transcription and Translation
As discussed previously, bacteria and archaea do not need to transport their RNA transcripts between a membrane bound nucleous and the cytoplasm. The RNA polymerase is therefore transcribing RNA directly into the cytoplasm. Here ribosomes can bind to the RNA and begin the process of translation, in some instances while transciption is still occurring. The coupling of these two processes, and even mRNA degradation, is facilitated not only because transcription and translation happen in the same compartment but also because both of the processes happen in the same direction - synthesis of the RNA transcript happens in the 5' to 3' direction and translation reads the transcript in the 5' to 3' direction. This "coupling" of transcription with translation occurs in both bacteria and archaea and is, in fact, essential for proper gene expression in some instances.
In context of a protein synthesis Design Challenge we can also raise the question/problem of how proteins get to where they are supposed to go. We know that some proteins are destined for the plasma membrane, others in eukaryotic cells need to be directed to various organelles, some proteins, like hormones or nutrient scavenging proteins, are intended to be secreted by cells while others may need to be directed to parts of the cytosol to serve structural roles. How does this happen?
Since various mechanisms have been uncovered, the details of this process are not easily summarized in a brief paragraph or two. However, some key common elements of all mechanisms can be mentioned. First, is the need for a specific "tag" that can provide some molecular information about where the protein of interest is destined. This tag usually takes the form of a short string of amino acids - a so called signal peptide - that can encode information about where the protein is intended to end up. The second required component of the protein sorting machinery must be a system to actually read and sort the proteins. In bacterial and archaeal systems this usually consists of proteins that can identify the signal peptide during translation, bind to it, and direct the synthesis of the nascent protein to the plasma membrane. In eukaryotic systems, the sorting is by necessity more complex, and involves a rather elaborate set of mechanisms of signal recognition, protein modification, and trafficking of vesicles between organelles or the membrane. These biochemical steps are initiated in the endoplasmic reticulum and further "refined" in the Golgi apparatus where proteins are modified and packaged into vesicles bound for various parts of the cell.
Some of the various specific mechanisms may be discussed by your instructor in class. The key for all students it so appreciate the problem and to have a general idea of the high-level requirements that cells have adopted to solve them.
Post-translational Protein Modification
After translation individual amino acids may be chemically modified. These modifications add chemical variation and new properties that are rooted in the chemistries of the functional groups that are being added. Common modifications include phosphate groups, methyl, acetate, and amide groups. Some proteins, typically targeted to membranes will be lipidated - a lipid will be added. Other proteins will be glycosylated - a sugar will be added. Another common post-translational modification is cleavage or linking of parts of the protein itself. Signal-peptides may be cleaved, parts may be excised from the middle of the protein, or new covalent linkages may be made between cysteine or other amino acid side chains. Nearly all modifications will be catalyzed by enzymes and all change the functional behavior of the protein.
mRNA is used to synthesize proteins by the process of translation. The genetic code is the correspondence between the three-nucleotide mRNA codon and an amino acid. The genetic code is “translated” by the tRNA molecules, which associate a specific codon with a specific amino acid. The genetic code is degenerate because 64 triplet codons in mRNA specify only 20 amino acids and three stop codons. This means that more than one codon corresponds to an amino acid. Almost every species on the planet uses the same genetic code.
The players in translation include the mRNA template, ribosomes, tRNAs, and various enzymatic factors. The small ribosomal subunit binds to the mRNA template. Translation begins at the initiating AUG on the mRNA. The formation of bonds occurs between sequential amino acids specified by the mRNA template according to the genetic code. The ribosome accepts charged tRNAs, and as it steps along the mRNA, it catalyzes bonding between the new amino acid and the end of the growing polypeptide. The entire mRNA is translated in three-nucleotide “steps” of the ribosome. When a stop codon is encountered, a release factor binds and dissociates the components and frees the new protein.