L06v02 Translation Part1

OK, this is the first of two videos on translation, as we continue our discussion of gene expression, which is turning specific genes on. Translation, specifically, is the act of making protein out of RNA. We'll learn that when we read the mRNA, the reading frame is crucial, and that codons of three nucleotides specify amino acids. We'll be introduced to tRNA molecules, which are the adapter molecules between the nucleic acid code and the protein code. We'll emphasize that proteins are synthesized in the amino to carboxy direction, also written as N to C. In the second video we'll be introduced to the fact that the ribosome is a molecular machine made of many proteins and RNA molecules, that it's responsible for making proteins, and that it's quite a remarkable machine. And we will learn the four-step process of making proteins from amino acids.
In this slide we want to emphasize a major difference between eukaryotes and prokaryotes. For making proteins in eukaryotes, here in the nucleus, in the white circle, DNA is transcribed into RNA. The RNA is spliced and capped, and it's exported out of the nucleus into the cytosol. And here is where the protein is made from the mRNA. So there's a separation between where the RNA is made and where the protein is made. Prokaryotes don't have a nucleus, so RNA is made from DNA, and protein is made from RNA, in the same compartment, and it can even happen at the same time. While the RNA polymerase is making part of the mRNA molecule, the ribosomes may have already hopped on the other end and started to make protein.
Now on this slide we see the beginning of mRNA being converted into protein. And the first thing you'll notice is that there are three nucleotides, C-U-C in this case, coding for a single amino acid, leucine. Now you might ask, why should that be? So I want you to just pause the video and think about it for a few seconds. Why do you need three nucleotides to code for one amino acid? OK, I hope you gave that some thought.
The answer is, that's the smallest number of nucleotides which can be used. If you only used one nucleotide, well, there are four different nucleotides that could be there, so you could only code for four different amino acids. If you used two nucleotides, you have four that could go here and four that could go here, and that's 16 different combinations. You can only code for 16 amino acids then. It's only by using the third, in which case you have four times four times four, or 64 different possibilities, that you have enough coding potential to code uniquely for 20 amino acids.
The second important thing to notice is that the way you group the three nucleotides is crucial. You can see different groupings of the same nucleotide sequence. I'll say the first five bases, C-U-C-A-G, C-U-C-A-G, C-U-C-A-G, so this is the same sequence, just grouped differently. In this case we're grouping the first three, in this case we're grouping two, three, and four, and in this case we're grouping nucleotides three, four, and five. They code for different amino acids: leucine, serine, glutamine. Normally the cell, and you, when presented with a sequence, would have to figure out what the reading frame is. We'll see later in the course some of the tricks for doing that, or computational approaches for doing that. For this class, for now, when you're given an mRNA sequence and you have to convert it to protein, you can always assume the first three is a codon, and then four, five, and six is a codon, and seven, eight, nine, and so on.
Now here we see the 64 different codons, the 64 ways of taking three nucleotides at a time. They're organized by the 20 amino acids, and there are three special codons that are stop codons. This is the signal to the ribosome to stop making the protein. I'll just mention that these three codons have special names; sometimes you'll see them in the literature as amber, ochre, and opal. I don't know which is which, and you don't need to know that either.
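If you want to check the counting argument and the reading-frame example for yourself, here is a minimal Python sketch. The names `coding_capacity`, `codons_in_frame`, and `EXAMPLE_CODONS` are just illustrative choices, not anything from the slides, and the little lookup only holds the three codons from the C-U-C-A-G example rather than the full table.

```python
BASES = "UCAG"  # the four RNA nucleotides


def coding_capacity(codon_length):
    """Number of distinct codons of the given length: 4 ** codon_length."""
    return len(BASES) ** codon_length


def codons_in_frame(mrna, frame):
    """Split mrna into complete triplets, starting at offset frame (0, 1, or 2)."""
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


# Only the three codons needed for the slide's example, not the full table.
EXAMPLE_CODONS = {"CUC": "Leu", "UCA": "Ser", "CAG": "Gln"}

# One or two nucleotides give 4 or 16 combinations, fewer than 20 amino acids.
for n in (1, 2, 3):
    print(n, "nucleotide(s):", coding_capacity(n), "possible codons")

# The same five bases read in each of the three possible frames.
for frame in range(3):
    triplets = codons_in_frame("CUCAG", frame)
    print("frame", frame, triplets, [EXAMPLE_CODONS[c] for c in triplets])
```

Shifting the starting offset by one base changes every triplet, which is why the three frames of the same five bases give leucine, serine, and glutamine.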
We can also see here that this is the three-letter code for the various amino acids. Underneath we have the one-letter code for amino acids. If you plan on going on in biological or biomedical research, you might as well learn the single-letter codes, but you're not required to for this class. You do need to know the three-letter codes and, as we mentioned earlier, what class of amino acid they are. Alanine is a hydrophobic amino acid. Arginine is a positively charged amino acid. Aspartic acid is a negatively charged amino acid, and so on.
Now let's focus in on alanine. There are four different ways to code for alanine: G-C-A, G-C-C, G-C-G, and G-C-U. You'll notice that in all cases the first two bases are G-C. This is generally typical of the variation in the coding system. It's the third position that is tolerant of any base. For this reason, the third position is called the wobble position, and it's generally less important for specifying an amino acid. The first two being the same doesn't always hold. For instance, there are six different ways of coding for serine. The first two can be either A-G or U-C.
Let's talk about the six different ways of coding for leucine. In principle, all six of these are interchangeable. The cell can use any of these at any time, and they will always specify leucine. Now in practice, the cell uses different codons at different times. The most prominent example is that some of these codons are used in highly abundant proteins. They have highly abundant tRNAs, and they are incorporated quickly into proteins. Some of them are used primarily for low-abundance proteins, for which the tRNAs are a low percentage of the total, and they get incorporated slowly. And that's one way that the system is optimized. We'll also see later in the course that once we know which ones are used for high-abundance proteins in general, we can use this and actually make a fairly decent estimate of a protein's abundance based on its particular codon usage.
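To see the redundancy in the code for yourself, here is a small Python sketch that builds the standard genetic code and groups the codons by amino acid. The compact `AMINO` string (one-letter codes, with `*` for stop, in U-C-A-G order) and the names `CODON_TABLE` and `synonyms` are my own conventions for this illustration, not something from the slides.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code, one letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ...
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(codons, AMINO))

# Group synonymous codons under the amino acid they specify.
synonyms = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    synonyms[amino_acid].append(codon)

for letter, name in (("A", "alanine"), ("S", "serine"), ("L", "leucine")):
    first_two = sorted({codon[:2] for codon in synonyms[letter]})
    print(name, synonyms[letter], "first two bases:", first_two)

print("stop codons:", synonyms["*"])
print("codons that specify an amino acid:", len(CODON_TABLE) - len(synonyms["*"]))
```

Alanine's four codons all share the same first two bases, while serine's and leucine's six codons each split across two different two-base prefixes.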
I think this is the clearest way of showing the 64 different combinations, but it's not the most common presentation. So we will look at that on the next slide.
Let's focus in on the one way to make methionine, which is A-U-G. So here we see the first base in the codon, that is A, and that indicates that the position is somewhere in here. U is the second base, so that means it is somewhere in this column. And G is the third base, and that means it's here, and there is methionine. You don't have to memorize this, of course. This is the way you'll be given the information on tests, and this is the way you will have to use it when decoding RNA sequences into protein.
OK, on this slide we're introduced to the tRNA molecule, which is the adapter: it base pairs with the mRNA sequence, and it connects to the protein because each tRNA carries an amino acid on this end. The tRNA reads the mRNA code, where the group of three is called the codon, and the tRNA is itself an RNA molecule. So it is able to base pair with the mRNA, just like any two RNAs would. There are three specific bases on the tRNA, called the anticodon, and that base pairs to the codon. These red lines correspond to hydrogen bonds between bases. But the fact that there's two, three, or one here is not of any particular significance. Although I suppose they might be trying to show that the wobble position is the weakest, or the least important, but who knows? The final point is that, since there are 61 different codons that code for amino acids, there are 61 different tRNA molecules that read the different codons. Only 20 amino acids, but 61 different tRNA molecules.
So in this slide we see a more realistic depiction of tRNA structure. You can see it has sort of a cloverleaf type structure here, when unfolded. And that's the purpose of this little picture. Here is the anticodon, G-A-A. This is the anticodon for the amino acid, which is located here, phenylalanine.
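Here is a minimal Python sketch of how an anticodon lines up with its codon. It assumes strict Watson-Crick pairing only (A with U, G with C), so it ignores wobble pairing and modified bases, and the function name `codon_for_anticodon` is just an illustrative choice. Both sequences are written five prime to three prime, which is why the anticodon has to be reversed before pairing.

```python
# Strict Watson-Crick RNA pairing; wobble and modified bases are ignored here.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon):
    """Return the codon (5'->3') that pairs with an anticodon written 5'->3'.

    The two strands run antiparallel, so the anticodon is read backwards
    and each base is swapped for its pairing partner.
    """
    return "".join(PAIR[base] for base in reversed(anticodon))


# The phenylalanine tRNA on the slide has the anticodon G-A-A.
print("anticodon GAA pairs with codon", codon_for_anticodon("GAA"))
```

For the G-A-A anticodon this gives U-U-C, one of the two codons for phenylalanine.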
There's a D loop in red, and a T loop in yellow. And here is the actual structure. It looks pretty crazy, but you can see there's the blue loop, the yellow and the red loops, and the amino acid in green, and you can see some regions where you can imagine an approximately helical type of structure starting to emerge. But it's a far cry from the very regular, by comparison, structure of DNA. And remember, it's just that one hydroxyl group on the sugar that's the molecular difference. Yet look at what an important consequence it can have, at least for nucleic acid structure. That's always amazing to me. And then here at the bottom we have a partial sequence, starting here, which you can look at in red, A-G-D, A-G-D. You'll notice that there's a funny symbol here. A few RNA molecules, such as tRNAs and ribosomal RNAs, have a special base; psi, it's referred to. It's called pseudouridine. I just point that out, but you don't need to know that for the rest of this course.
Now on the next slide I just want to show you how it reads the mRNA. So here's the mRNA, five prime to three prime. We're reading the mRNA in this direction. I had to flip it over and invert it. And then you can see that in the anticodon, A base pairs with U, A base pairs with U, and G base pairs with C. So this is the way that the anticodon is base pairing with the mRNA molecule. And U-U-C does in fact code for phenylalanine, which we can see on our handy look-up table, U-U-C.
The process of connecting an amino acid to a tRNA molecule, a pure tRNA molecule, is done by enzymes, which we'll see on the next slide. It utilizes ATP to make an activated, adenylated amino acid. And this portion is then cleaved off as the amino acid is connected to the tRNA molecule. And this energetically pays for this otherwise energetically unfavorable reaction. You can imagine that it's very important for these enzymes to be accurate at connecting only the proper amino acid to the tRNA.
Otherwise, when you're reading the mRNA, you would get incorrect amino acid incorporation into the growing polypeptide chain. Here we see the type of enzyme responsible for this reaction; it's called a tRNA synthetase. Since there are 61 different tRNA molecules, there are 61 different enzymes. Even though they're only connecting 20 different amino acids, you have to have one for the specificity of each tRNA molecule. In this case, the amino acid tryptophan is being connected to a tRNA molecule. Tryptophan is coded for by U-G-G.
Now on this slide, we see the important principle about how proteins are synthesized from the amino end to the carboxy end. In this case there are three amino acids, one, two, three, connected to a tRNA. A fourth amino acid is going to come in, and it's going to attach to the carboxy end. In so doing, it's going to cleave the connection to the tRNA molecule. tRNA molecule number three will be released, and now you have the four amino acids connected to the fourth tRNA molecule. This is all happening inside the ribosome, but this is the fundamental picture. Proteins are synthesized in the amino to carboxy direction. And in the next video, we will meet the ribosome and see how it gets this to occur. Thanks.
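To tie the pieces of this video together, here is a minimal Python sketch of decoding an mRNA into a protein, written N to C. It follows the simplification used in this class, that the first three bases are the first codon; it does not search for a start codon, and the example sequence is made up for illustration. The names `translate`, `CODON_TABLE`, and `THREE_LETTER` are my own, not from the slides.

```python
BASES = "UCAG"
# Standard genetic code in U-C-A-G order, one letter per codon; '*' is a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO)
)

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(mrna):
    """Read mrna 5'->3' in frame 0 and return the peptide written N to C."""
    chain = []  # chain[0] is the amino (N) end
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon ends the protein
        # Each new amino acid is added at the carboxy (C) end of the chain.
        chain.append(THREE_LETTER[amino_acid])
    return "-".join(chain)


# Made-up example: AUG CUC UUC UGG UAA
print(translate("AUGCUCUUCUGGUAA"))
```

The list only ever grows at its end, which mirrors the picture on the last slide: the first amino acid stays at the amino end, and every incoming one attaches at the carboxy end.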