and the Howard Hughes Medical Institute and in this lecture
I'm going to tell you about some work from our own lab
on spliceosome structure and dynamics. So as we saw in
the first lecture, eukaryotic genes are split, in that they
have expressed regions or exons, up here, and introns.
And the introns have to be removed and in my previous
cartoon I just used the scissors and tape. So in this lecture
we're going to be talking about "What is the nature of those
scissors and tape, and how do they actually do the reaction?"
So first let's talk about how the splicing actually happens.
And it happens in two chemical steps. In the first step
of splicing, the branch site--so we talked about that, that's a
conserved adenosine in the intron near the 3 prime splice site.
The 2 prime hydroxyl of the branch site attacks the phosphate
at the 5 prime splice site, and that generates a lariat intermediate,
which has a 2 prime, a 5 prime, and a 3 prime phosphate
all coming off that adenosine. And it releases the 5 prime
exon. Now that 5 prime exon doesn't float away, it's going to be
held onto by the splicing machinery, which we'll be talking about
in a moment. The second step of splicing--so here are intermediates
in the reaction. This 3 prime hydroxyl that was generated
in the first step on the 5 prime exon--it now attacks the
phosphate at the 3 prime splice site, and now it kicks out
the intron and ligates the two exons together. Now we
know this is the chemistry and that these occur as single
transesterification reactions. The two steps of splicing
are catalyzed by a large complex in the cell called the spliceosome.
The spliceosome is arguably the most complicated macromolecular machine
in the cell, as we'll see in a moment. And the spliceosome consists
of four major chunks, large pieces, subunits that need to
come together with each round of splicing in this complicated
dance called the spliceosome cycle that we'll talk about
in a few slides. But what I want to show you here is that
the pieces of the spliceosome, the major pieces are these
components called U1, U2, the triple snRNP U4/5/6, and the NineTeen complex,
or the NTC.
Now what are these U1, U2 things that I showed you in the last slide?
Well, they are so-called snRNPs, for small nuclear ribonucleoprotein complexes.
And so each of these snRNPs consists of a small nuclear RNA
somewhere between a hundred and a couple hundred nucleotides long.
It's a stable RNA complexed with a set of proteins. So for
example, U1 snRNP has U1 snRNA in it and a set of core proteins
called the Sm proteins. There are seven proteins. They make a
ring-like structure, and you can see they are common to
a lot of the snRNPs. And then there are some proteins
that are specific to U1 snRNP: in U1's case, 70K, A and C.
U2 is more complicated, and these are all of the proteins that are
associated with U2 snRNP. And then with the triple snRNP,
a so-called triple snRNP because it has three small nuclear
RNAs in it, and it has an even larger set of proteins.
In addition to the snRNPs, there's the NineTeen complex, so
named because it contains a protein called Prp19 and
its associated factors. So it is just like a snRNP except it
does not contain an RNA component. And then in addition
to these main stable components, there are also things called
splicing factors and these are proteins that come and go,
but are not stably associated with any single snRNP.
And included in this class are RNA helicases that change the
structure of RNAs or can change the structure of RNA protein complexes.
Certainly there are RNA binding proteins, we talked about
two of the classes of those in the last lecture--the SR proteins
and the hnRNP proteins. And there are unexpected proteins
like cis/trans prolyl isomerases and ubiquitin ligases.
Altogether the complete spliceosome parts list as we now understand it
consists of 5 snRNAs (U1, U2, U4, U5 and U6) and a hundred
proteins in yeast, a hundred different proteins, and about
three hundred different proteins in humans. And the reason that the
human splicing machinery is so much more complicated
than the yeast splicing machinery, you can imagine
why because we have so many different introns, and we do all
this alternative splicing that the yeast don't. And so most
of these proteins, the extra ones, are involved in alternative splicing.
Now let me tell you a little bit about the snRNAs. They're called
"U" snRNAs because they're uridine-rich RNAs, and their
numbering came from--if one just purifies all of the stable RNAs
out of the nucleus and runs them on a gel, the most abundant
one is U1, the second most abundant is U2, and so forth.
And it turns out that U3 does exist, but it is involved in ribosome biogenesis
as is U7. But the other five of the six most abundant
are all involved in pre-mRNA splicing, or the spliceosome.
Here we are at the spliceosome, back at the spliceosome cycle,
and we're going to look at this a little more closely because
I'm going to be telling you about some experiments we've recently
done to test this cycle. So in the first part of the cycle,
at the beginning of spliceosome assembly, U1 snRNP interacts with
the 5 prime splice site and U1 snRNA actually base pairs
and recognizes the 5 prime splice site. Similarly, U2 snRNA
base pairs with and recognizes the branch site consensus sequence,
and U2 snRNA, when it joins the U1 snRNA, forms A complex.
E complex stands for "early" complex and then A, B, and C, as
we'll see in a minute, were just named by where they ran
on a gel. So A complex comes first, and then the next big chunk
of the spliceosome that comes in is the triple snRNP--
U4/5/6. Once the triple snRNP is there, we have B complex.
Then there's a structural rearrangement, where U1 snRNP is actually
ejected. It's kicked out, and somehow things are majorly rearranged such
that U6 snRNA is interacting with the 5 prime splice site. In another rearrangement
U4 snRNP is kicked out, and then the NineTeen complex comes in,
and now we have the catalytically active spliceosome, where the first step
and then the second step of splicing occur. Once the second step of splicing
is complete, the splice product is released, and the spliced out intron
leaves with the rest of the spliceosome. That splicing machinery has to disassemble,
and then it's reassembled anew on each new round of splicing, so ergo
the spliceosome cycle. So how do we know a lot of these details of the mechanism
that I've been telling you about? Well one of the ways that we know is by
performing in vitro splicing reactions. So in an in vitro splicing reaction,
we take a piece of RNA, usually a piece of RNA that would be a couple
hundred nucleotides long and would consist of an exon with one intron
followed by a downstream exon. And the little asterisk here, the red asterisks,
are to indicate that this RNA would be radioactively labeled, so we would
transcribe it in vitro and put radioactive nucleotides all throughout.
We then mix this RNA with either whole cell extract if we're working on the yeast spliceosome
or nuclear extract if we're working on the human spliceosome. And for example, we
might get that from HeLa cells, which is a very common tissue culture cell for humans.
And then we also add ATP, because ATP is essential for splicing: many of the
spliceosome transitions that I was showing you here, all of the ones
after E complex formation, require ATP, as do the steps
going around the backside here. Now why do we use whole cell extract
or nuclear extract? Well I've just told you that the splicing machinery is
incredibly complicated. It has in yeast a hundred different polypeptides;
in humans, three hundred different polypeptides. There's really just no
way at present that we can purify each one of those proteins, have that
in a test tube and fully reconstitute the machinery. So right now, the
best way to study the spliceosome is simply almost in its native
environment, and that is in unfractionated cell extract. So if we then take
those splicing reactions and take out time points and then purify
the radioactive RNA and run it on a denaturing gel--and in this case a
pretty high percentage denaturing gel--what we can see is that over
time (and this time course goes from about zero to sixty minutes
in vitro) we can see the substrate gradually disappear. And then at early
time points, the two intermediates of splicing appear, the lariat intermediate
and the 5 prime exon. And then you can see at later time points the
lariat product, the intron product, and the spliced exon product
appear. And the reason that the lariats run high in the gel, even though
they're smaller than the pre-messenger RNA, is that they have this
circular structure, and because of that unusual structure, they're retarded
in the gel more than a linear RNA and so they actually run higher than
you would expect. But now suppose instead we take that same splicing reaction
but don't purify the RNA and just run it on a native gel, so now we are
looking at the complexes that contain the RNA. Here we can see those
complexes I told you about before. So here's E complex, the early complex,
A, B and C and they build up and go away over time, as you might expect.
And so that's where the names of these different complexes came from,
simply by their migration on the gel. Alright, now these different
complexes can be purified. They're stable. They're stable enough to
survive a native gel, and there have been many different ways devised now
to purify the different complexes. Here's one example from my laboratory.
What we did was we took a splicing substrate, where we mutated
the 3 prime splice site, so the splicing machinery could build up on
the RNA. It could do the first step, but it couldn't do the second step
because the 3 prime splice site was mutated. And into that intron, we built
stem loops that were recognized by the MS2 protein, which is a viral coat
protein that binds very tightly to its recognition sequence. That viral coat
protein called MS2 we linked to maltose binding protein. Maltose binding
protein likes to bind amylose resin, so we could use that as an affinity
tag to pull down these spliceosomes and purify them. And you can see here
an EM image of those spliceosomes and the spliceosomes are all
around 20-30 nanometers in size. Now from those EM images, you can do
single particle reconstruction to start to get structures of the splicing machinery.
And at this point, in comparison to the ribosome, for example, the
structural information we have for the spliceosome is rather limited.
We do have crystal structures now--two different crystal structures of U1 snRNP,
which is the most common of the snRNPs. And those are at 5.5 and 4.4
Angstrom resolution, so enough to see the RNA and the proteins.
But for the bigger complexes, we are still at the electron microscopy
stage, and so for example, here are some images from Reinhard Lührmann's lab
of the yeast splicing complexes--the B complex, the activated B complex
and then the C complex, which is the spliceosome that has intermediates
in it. And so in the coming years, the splicing community is
really struggling hard but looking forward to having high-resolution structures
because we would really like to see where all of these parts bind
and how they all fit together to form these just really remarkable machines.
Now in the meantime, that's where we are on the structural front. In
addition to structure for any biochemical complex or machine, you
really need to know something about their dynamics, so I've gone over this
whole splicing cycle with you but as I explained before, the splicing cycle is
based on the complexes that are stable enough to be resolvable
on a gel or you can affinity purify them. But it doesn't tell you about
the kinetics of things coming and going. So for example, is it going to be true
that on every intron U1 has to come before U2, and do these two have to come before
the triple snRNP? And all of these arrows that we're showing here are one-way
arrows, but most biochemical reactions and chemical reactions really are
two-way. So are these arrows really one-way: is it a one-way street?
Or is this process in any way reversible? So to get at that information
my laboratory has recently been collaborating with Jeff Gelles's laboratory
at Brandeis University and Virginia Cornish at Columbia University
as well as some co-workers at New England Biolabs to develop new
tools in order to look at the dynamics of the spliceosome. The main
method that we have been utilizing is called total internal reflectance.
Now this is an experiment you can try at home. So this is simply a laser
pointer that's going into an aquarium of water. And if you correctly position
the laser pointer at the critical angle, when there's a change in
refractive index, in this case between water and air, then all of the laser
light will be completely reflected, that's called total internal reflectance.
Except right at the point where the laser contacts there is a little bit of
energy that goes through the other side, called the evanescent wave.
So the evanescent wave--in this case now we're going to be having the lasers
come through the air to a microscope slide. So here the change
in refractive index is in going from the microscope slide to the aqueous
layer above the microscope slide. The evanescent wave will go about
a hundred nanometers into the solution above the microscope slide.
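To put a rough number on that hundred nanometers, here is a minimal sketch using the standard textbook expressions for the critical angle and the evanescent-wave penetration depth. The refractive indices, the 532 nm wavelength, and the 65 degree incidence angle are illustrative assumptions, not values from our instrument.

```python
import math

def critical_angle_deg(n_glass, n_water):
    """Angle of incidence (degrees) above which total internal reflection occurs."""
    return math.degrees(math.asin(n_water / n_glass))

def penetration_depth_nm(wavelength_nm, n_glass, n_water, incidence_deg):
    """1/e decay depth of the evanescent intensity beyond the interface."""
    theta = math.radians(incidence_deg)
    term = (n_glass * math.sin(theta)) ** 2 - n_water ** 2
    if term <= 0:
        raise ValueError("incidence angle is not above the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(term))

# Assumed, illustrative values (not taken from the lecture).
N_GLASS = 1.52         # typical glass slide
N_WATER = 1.33         # aqueous layer
WAVELENGTH_NM = 532.0  # a green laser line
INCIDENCE_DEG = 65.0   # a few degrees beyond the critical angle

theta_c = critical_angle_deg(N_GLASS, N_WATER)
depth = penetration_depth_nm(WAVELENGTH_NM, N_GLASS, N_WATER, INCIDENCE_DEG)
print(f"critical angle: {theta_c:.1f} degrees")
print(f"penetration depth: {depth:.0f} nm")
```

With these assumed numbers the depth comes out on the order of a hundred nanometers, and it gets shallower as the laser comes in at a steeper angle past the critical angle.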
So imagine having not just one laser but let's say three different colors
of lasers. It turns out we can now do five lasers. I'm only going to show you
three today. But imagine having three different colors of lasers all going
in at their critical angles, and having something tethered to the surface
within this 100 nanometers and having molecules that are fluorescent
in the colors that are excited by your three different lasers. So molecules
that are in solution above the evanescent wave are not fluorescent
because they're outside the area of where the light energy is. And so
only the molecules that are tethered to the surface are going to be fluorescent.
So we can use this to then ask, for anything that's tethered to the surface
what different colored molecules at any one time are associated with the
molecule that's tethered to the surface. So let's just see how this looks.
So imagine you're looking down at this surface and we're going to be
looking at the molecules on the surface. So we call this technique colocalization
single-molecule spectroscopy, or CoSMoS. And this technique was
pioneered by Jeff Gelles and his co-worker Larry Friedman at Brandeis
University. So here we are looking at the fluorescence, and each one of
these spots is a single molecule on a glass coverslip that has different colored
things on it. In this case, the molecules are a strand of DNA, and the
colored things are different oligos that are complementary to that strand of DNA,
but they have different fluorophores on them. And so you can see for
example that this molecule of DNA had all three of the oligos bound to it
but this molecule of DNA only had this one--only had the green and the
blue bound to it. And you can see, here's one that only had the blue
molecule bound to it. So it's very simple because all we're doing is
we're looking at these, say, different constellations of spots
and we're going to learn something about our biological system.
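The idea of reading off a constellation for each tethered molecule can be sketched in a few lines. This is only an illustration: the coordinates, the color names, and the 1.5 pixel colocalization radius are made-up assumptions, and the function name is hypothetical rather than part of any real analysis package.

```python
import math

def colocalized_colors(anchor_xy, spots_by_color, radius_px=1.5):
    """Return the sorted colors that have a spot within radius_px of the anchor."""
    ax, ay = anchor_xy
    found = []
    for color, spots in spots_by_color.items():
        if any(math.hypot(x - ax, y - ay) <= radius_px for x, y in spots):
            found.append(color)
    return sorted(found)

# Made-up positions (in pixels) of three tethered DNA molecules.
dna_positions = [(10.0, 10.0), (40.0, 12.0), (25.0, 30.0)]

# Made-up spot positions detected in each fluorescence channel.
oligo_spots = {
    "red":   [(10.3, 9.8)],
    "green": [(10.1, 10.2), (39.6, 12.4)],
    "blue":  [(9.9, 10.1), (40.2, 11.7), (25.2, 29.9)],
}

constellations = [colocalized_colors(p, oligo_spots) for p in dna_positions]
for position, colors in zip(dna_positions, constellations):
    print(position, colors)
```

In this toy field the first DNA has all three oligos, the second has only green and blue, and the third has only blue, which is the kind of pattern you see in the image.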
And in particular if we can look at how these spots change over time,
we can learn about the dynamics of the system. Now in order to
use this to study the spliceosome, we had to develop a number of
different or new technologies to enable us to label parts of the spliceosome
so that we could see them. And so one of the things that we had to do
was to create fluorescently tagged pre-mRNAs because we need to know
where the pre-mRNAs are on the surface. Also our pre-mRNAs have to
have some way to be tethered to the surface. The way we do that is to
put a biotin molecule on one end, and then we also have biotin on the
glass surface. We have biotinylated PEG--polyethylene glycol.
And then we make a sandwich, where we have streptavidin. Streptavidin
can bind four molecules of biotin, so you can use that to make a
sandwich and bind your RNA there. Now the other thing that we had to
develop were other ways of tagging the snRNPs because what we really
wanted to do was look at the snRNPs coming and going in real time.
So the way that we're tagging the snRNPs is using two protein tags.
One is the SNAP tag that was developed by Kai Johnsson and is now
available through New England Biolabs. SNAP is based on a protein that
is a suicide enzyme that removes alkyl groups from guanine nucleotides
of DNA. And so it transfers those alkyl groups to itself. So in this case
if you have benzyl guanine--so here's guanine and then there's a benzyl group
on it and if you attach to that a fluorescent dye, the SNAP tag protein
will transfer that dye to itself, and if you've made a fusion protein
between the SNAP tag and your protein of interest--in this case a
snRNP protein--then you can specifically label your snRNP protein.
Here the other tag that we've been using is the E. coli DHFR,
dihydrofolate reductase tag, and bacterial dihydrofolate reductase binds
very tightly to trimethoprim--this molecule down here. It's a non-covalent
interaction. Trimethoprim is an inhibitor of E. coli DHFR but this molecule
does not bind to eukaryotic DHFR. But this is a very tight interaction and
again if we tether a dye with that, this dye will interact with our DHFR
tag and allow us to label that protein. And this technology was
developed by Virginia Cornish and her coworkers at Columbia University.
So how do we get these tags on our snRNPs? The way that we do this is
we're using the yeast system and we're using homologous recombination.
So we make versions of different protein genes that we want to
tag--in this case two U1 proteins and a U2 protein. We then place the tag
of interest at the C-terminus of that protein, or the gene for that protein
and then we have a selectable marker. And we use homologous
recombination to put these modified genes into haploid yeast. And
that means the only gene that is encoding that protein in the yeast is
our protein of interest--our tagged protein. So then from those yeast
strains, we can make a whole cell extract. And in this case we have U1
having two DHFR tags on two different proteins, or U1 having two tags
and U2 having a SNAP tag on it, so a triple-tagged strain. We then
take those extracts and we can either simply add the TMP to label
the DHFR or, to label the SNAP tag, take our benzylguanine that has a
fluorescent tag on it. We react that with the whole cell extract. We remove
the excess dye by gel filtration and then we can add our TMP. And so
the really great thing about this system is first of all we know that the
proteins we're tagging are active because 1) they're the only copy of
the protein in the cell and 2) we're only tagging essential proteins.
Many of the proteins in the spliceosome are essential, and so we know
that if the cells grow, because splicing is essential, then that protein
must be active. Secondly, there's absolutely no protein reconstitution
required, so we're not making any recombinant proteins, purifying them
and putting them back in. We're using the endogenous proteins. We just
added a small protein tag to the thing. So let's look now at how these
experiments are going to go. So we're going to have our pre-mRNA
that's attached to the surface via this biotin-streptavidin sandwich.
It has a fluorophore in it so we can keep track of where the pre-mRNA
molecules are. And this is actually a view through the microscope of
what a field of these pre-mRNAs looks like, where each one of these spots
is a single molecule of pre-mRNA. And in the movies I'm going to show
you we're going to be looking at U1 snRNP binding to those pre-mRNAs
over time in splicing reactions. One of the things about single molecule
reactions is that you're really seeing everything that's going on--
anything that's fluorescent, any kind of dust or anything you can see,
so you really need to do a lot of controls to make sure that you know what
you're looking at. So the first thing I'm going to show you is a movie
where we're doing some controls, where either we have left off the
fluorescent RNA, or we don't have the tags on U1 snRNP (and so we
wouldn't expect signal) or we have the complete reaction where we
have the fluorescently tagged RNA and the fluorescently tagged snRNP.
So let's watch that movie. This movie shows two control fields of view
and then one experimental field of view on the right. The field of view
all the way to the left, we have the wild type pre-mRNA present.
We can't see that in this field of view because we're not looking in that
channel. We're looking at the Cy3 channel, which is the TMP channel.
And we also have the Cy3-TMP in the extract, but there is no tagged
protein. So you can see that we do have a little bit of background with
material binding non-specifically to the slide and so this is why it's important
to do those controls, to make sure your background is not too high.
In the middle panel, we now have the tagged U1 and the Cy3-TMP
but we don't have any pre-mRNA on the slide, so again we only see
background binding. And then in the rightmost panel, which is the one
with all the blinking lights, we have all three components. So we have
tagged U1. We have the pre-mRNA on the surface, and we have Cy3-TMP.
Now one thing that you can see immediately from this is that U1 interaction
with the RNAs is highly dynamic. So even in the absence of ATP, U1
is binding and releasing multiple times from each pre-mRNA. Now that we
know our system's working, let's really do some experiments. And the
really cool thing about these experiments is that you can just see
the answer with your eyes. So I'm going to show you some movies
next, where we've put two fluorescent tags on each of the major subcomplexes.
So in one extract, in one quadrant, you're going to see extract that
has labels on U1 snRNP, as you've already seen on two different proteins.
Then we have another extract that has labels on U2 snRNP, on the U5 component
of the triple snRNP, and also on the NineTeen complex. And in this first set of
movies, we're going to not have ATP present, so in the absence of ATP
we've known from the studies on gels that the only complex that should
form is this E complex, so only U1 should be able to stably interact with
the RNA. So now let's look and see if that's the case.
Here's a movie, showing four different extracts with a different snRNP
labeled in each extract--either U1 in the top left-hand corner, the triple snRNP
in the bottom left-hand corner, U2 in the top right-hand corner or the
NTC in the bottom right-hand corner. And in the absence of ATP, what
you can see, as we saw in the previous movies, is that U1 is coming and binding
reversibly, but for all the other snRNPs, we do not see any significant binding
over background. If we take the data from each of those fields of view
and simply count the number of spots over time--the total number of spots
over time--what you can see is that in the absence of ATP only U1
builds up. And none of the other snRNPs or the NTC really have much
occupancy at any one time in the absence of ATP. So now let's run the
whole spliceosome cycle, so now we're going to add ATP and see what happens.
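The spot counting I just described for the no-ATP movies can be sketched like this. It is a minimal illustration that assumes each pre-mRNA location has already been reduced to a bound/unbound call per frame; the toy traces are invented, not real data.

```python
def occupancy_per_frame(traces):
    """Fraction of pre-mRNA spots with a labeled complex bound, frame by frame.

    traces: one list per pre-mRNA spot, holding 1 (bound) or 0 (unbound)
    for each movie frame. All lists must have the same length.
    """
    if not traces:
        return []
    n_frames = len(traces[0])
    if any(len(t) != n_frames for t in traces):
        raise ValueError("all traces must cover the same number of frames")
    return [sum(t[i] for t in traces) / len(traces) for i in range(n_frames)]

# Invented toy traces for four pre-mRNA spots over five frames, no ATP.
u1_traces = [[0, 1, 1, 0, 1],
             [0, 0, 1, 1, 1],
             [0, 1, 0, 1, 1],
             [0, 0, 1, 1, 0]]
u2_traces = [[0, 0, 0, 0, 0],
             [0, 0, 0, 1, 0],
             [0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0]]

u1_occupancy = occupancy_per_frame(u1_traces)
u2_occupancy = occupancy_per_frame(u2_traces)
print("U1:", u1_occupancy)
print("U2:", u2_occupancy)
```

In the toy numbers U1 builds up even though individual molecules keep coming and going, while U2 stays near background, which is the qualitative picture from the plot.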
This movie is now in the same order as before but now we've added ATP
to the reaction. And if you watch very carefully, you can see the
apparent order of addition of the snRNPs. So early in the movie,
and the movie is continually looping, you can see that U1 is binding
and coming and going. The next snRNP to build up is U2 and then
after U2, we start to see U4, 5 and 6 come up. And then the NTC, we
see less of it, but it accumulates much later in the reaction.
So what you can see from this movie is that we can see in real time
all four of these snRNPs binding to the surface that's covered with
pre-mRNA molecules. And as I will show you in the next slide, we can see
that all of the snRNPs are binding dynamically--that is, they're coming
and going, that none of them are coming and staying permanently.
One of the things that you can see from those movies is not only can we
see that all of the snRNPs are binding in the presence of ATP, but unlike
the spliceosome cycle that I showed you before with all the one-way arrows,
all of the snRNPs are binding reversibly. And we can see this by--here's looking at
one individual RNA molecule, and we're just looking at the intensity over time for that
one RNA molecule and you can see for U1 it bound twice. Here's an RNA
molecule where two molecules of U2 bound, U5, and the NTC. We
don't just see two binding events; oftentimes, especially for U1, we'll
see three, sometimes up to ten binding events. The reason we're showing
these particular traces is it shows you that this binding is due to reversible
binding and not due to photobleaching. So photobleaching is always a problem
in these single molecule reactions because under the intense laser light
the dyes can often photobleach and then they go blank and then when
the signal disappears you don't know if it's because your complex has
gone out of the evanescent wave or your dye has photobleached. So this is
why we attached two different fluorophores to each snRNP because
when we see the stepping behavior, this is either due to dye release
because we're using the DHFR tag or it's due to photobleaching. But that
means that this molecule which went away in one step really had to be a
molecule that went away. The whole complex went away. Because it's very
unlikely that you'd have two simultaneous photobleaching steps or two
simultaneous dye steps. And also you can see it went away and then
another one came back. So again this is another one of the controls that
you need to do when you're doing single molecule experiments. Alright
so thinking back to those movies, if we count up the total number of
spots in each frame and just plot that number here (this would be the
number of dyes per pre-mRNA molecule), you can see now in the
presence of ATP U1 builds up first, then U2, then U5, and then the NTC
after that. So this gives us an apparent ordered process for spliceosome
assembly, and it's consistent with that. But it doesn't actually tell us for any
one molecule that the spliceosome assembly was ordered. But we can
test that directly with our single molecule methods simply by following
two snRNPs at once. So now we're going to do three-color experiments.
So one color on the pre-mRNA, one color on--in this case--U2 snRNP,
and another color on U1. And in the same experiment, by watching these snRNPs
simultaneously, we can see does U1 come first or does U2 come first?
And so here's what these data look like. So here again is one of these
individual single molecule traces, but this is one pre-mRNA molecule where
U1 and U2 came to the same pre-mRNA in the same extract. And you can see
very clearly that U1 came first and then U2 came. But we can quantify this by
measuring the on time for both U2 and U1 and taking this difference,
the time of arrival of U2 minus the time of arrival of U1, and that's the
delay time between the two. So if that number's positive, that means
U2 came after U1. If that number is negative, that means U2 came before U1.
And then we can look at this number over many different pre-mRNA molecules.
So this is a so-called probability density plot. It's a bar graph where the probability
density is the bin height divided by the bin width. The important thing to see
is that here we're looking at 82 different molecules and that almost all of them
had a positive number for this tU2 minus tU1. So that means that on almost
all of them, U2 came after U1. Now there were a few here where U2
apparently came first, but that would be consistent with the amount of
labeling of our extracts because we can't get completely to 100% labeling.
It is impossible without our extracts dying. So we have about 90%
labeling efficiency in our extracts, and this level would be consistent with
a dark U1 coming before U2. So it's not inconsistent with an ordered model.
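The delay-time analysis can be sketched as follows. The arrival times and bin edges here are invented for illustration, and the function names are hypothetical; the only things taken from the talk are the definition of the delay, the time of arrival of U2 minus the time of arrival of U1, and the idea that the density is the fraction in a bin divided by the bin width.

```python
def delay_times(arrivals):
    """arrivals: (t_u1, t_u2) first-arrival times in seconds on one pre-mRNA."""
    return [t_u2 - t_u1 for t_u1, t_u2 in arrivals]

def probability_density(values, bin_edges):
    """Fraction of values in each bin [lo, hi) divided by that bin's width.

    Values outside the edges are not counted in any bin, so choose edges
    that cover all of the data.
    """
    n = len(values)
    densities = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        count = sum(1 for v in values if lo <= v < hi)
        densities.append(count / n / (hi - lo))
    return densities

# Invented arrival times (seconds) for eight pre-mRNA molecules.
arrivals = [(12, 40), (5, 65), (30, 45), (8, 120),
            (50, 35), (20, 200), (15, 90), (10, 60)]
bin_edges = [-50, 0, 50, 100, 200]  # unequal widths are fine for a density

delays = delay_times(arrivals)
density = probability_density(delays, bin_edges)
fraction_u1_first = sum(1 for d in delays if d > 0) / len(delays)

print("delays:", delays)
print("density per second:", density)
print("fraction with U2 after U1:", fraction_u1_first)
```

Because each bar is a fraction divided by a width, the bars times their widths add up to one, so bins of different widths can be compared on the same plot.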
So we've done this for all of the pairs of the complexes. Here's U2 versus
U1. This is another set of data now with 111 molecules. Here's U5
versus U2, and here's the NTC versus U5. And you can see in all of these,
most of the events give you a positive number, so therefore the
second complex came after the first complex. And so that leads us to
conclude that for this particular pre-mRNA that we've been working with
(and this is RP51A--it's a model splicing substrate in yeast), it is a highly
ordered assembly pathway with U1 almost always preceding U2 and
then the triple snRNP, and then the NTC comes after that. But what we now know
that's new that we didn't know before is that every step in this pathway
is reversible. So that means that the pre-mRNA is not necessarily
committed to splicing at the very first step with U1 addition but that
it increasingly gets committed as you go through the spliceosome
assembly pathway. Also, in terms of alternative splicing in the human system
if the spliceosome can be disassembled, for example here at later points,
then you can imagine you could inhibit splicing at particular splice sites
anywhere along this pathway because it could go backward along the
pathway if it's inhibited. So this has important implications for our understanding
of alternative splicing. Finally, I need to thank the people who actually did
the work. And obviously anything this complicated took the input of many
different people. And so from my laboratory I showed you data today from Melissa,
Aaron, Danny, Eric, Jing, and Nick contributed. Also all of this work was done in
collaboration with Jeff Gelles's lab, which developed the CoSMoS
technology, in particular Larry and Alex. And then also the Cornish
Laboratory, who developed the DHFR labeling technique, and finally the
New England Biolabs for their help with the SNAP tag. And this work was
funded by HHMI and the National Institutes of Health. Thank you very much.