Good morning. I’m speaking on behalf of my colleagues at the Genome Institute, Rick
Wilson, Elaine Mardis and my colleagues in Genomics of AML PPG, but I also will give
you a brief update at the end about kind of the score card of the AML project at TCGA.
So with AML -- let’s see if this works. Not advancing. Okay, a few brief words about
this disease and its genomics. So we don’t know much about it in terms of initiating
mutations except for patients who have canonical mutations that I’ll show you in just a minute.
For most patients with this disease, when we started this project very, very little
was known about initiating mutations. One very nice feature of the disease is that the
oncologists control the samples. The tumor tissue is very easy to access and access repeatedly
and most of the samples are relatively free of contaminating normal cells without any
additional purification.
Another feature that we loved as we started sequencing whole genomes was the fact that
many of these genomes are diploid. And also, riffing on lessons of the past, low-resolution
genomic screening, that is, cytogenetics, has been used for 30 years to classify these patients
and to make treatment decisions. And all of us as clinicians who take care of these patients
use this idea, the very important idea that favorable risk cases who have these canonical
mutations, at least three of them, can be treated lightly up front and they’re going
to do relatively well, by that I mean five year survivals of 50 percent. That some patients
with complex cytogenetics have adverse risk, those patients need to be transplanted in
first remission or they will die. But unfortunately about two thirds of our cases have intermediate
risk. And these patients we don’t know what to do with, we don’t necessarily need new
drugs for these cases but we need to know what to do with them. So we need better classification
markers, biomarkers if you will, to separate these people into good and poor risk.
In the first AML genome we sequenced, we found the major classifier of intermediate risk
which is mutations in a gene called DNMT3A. So I’m not going to talk about that because
the data has now been validated by many, many centers and it is the most important classifier
as a single gene of intermediate risk and it tells us, I think, who to transplant. But
I will tell you about some of the conundrums we faced as we’ve sequenced these genomes.
In the first couple of genomes that were done, we encountered what I will call the founding
conundrum. And that is there are hundreds of mutations per AML genome. And because we
validated all of them with deep digital sequencing, we found out that all the mutations are in
all the cells. That’s a problem. It suggests that they may have all arisen simultaneously
or that, if they’re important for the tumor, if they arose because they are all needed,
that you would have to have hundreds of relevant mutations per case. Both of those seem impossible
so how does it happen?
This is the model that we prefer and it’s experimentally tractable and I think it’s
now been proven. So the idea is that hematopoietic stem cells, the cell of origin for this tumor
is a cell that lives for your entire life. But it spends most of your life in G0, divides
perhaps once a week, once every few weeks, you only have about a hundred thousand of
these in your body. These cells accumulate about 14 mutations per year. So that as you
age a number of random innocuous mutations accumulate in these cells, until one fateful
day when a true initiating mutation causes a cell to have an advantage, probably a very
small one. That cell then begins to experiment with other mutations, progression mutations
and once in a while unfortunately a progression mutation cooperates with this initiating mutation
to cause the disease. This actually explains most of our data. It explains why all the
mutations have the same read count frequency, why they’re all in the same cells, because they’re
simply captured by the act of cloning. The initiating event clones that critical cell
and then all the mutations come forward. So when you sequence this genome, not only do
you sequence the two relevant mutations, four and five, but the hundreds of mutations that
are previously accumulated in that cell prior to its transformation.
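The arithmetic behind this model can be sketched in a few lines. This is only an illustration: the rate of 14 mutations per year is the figure quoted in the talk, while the function name, the assumption that mutations accumulate linearly with age, and the example ages are assumptions made for the sketch.

```python
# Illustrative sketch: background mutations captured when a stem cell is cloned.
# Assumes mutations accumulate linearly with age (a simplification).

MUTATIONS_PER_YEAR = 14  # rate quoted in the talk


def expected_background_mutations(age_years: float,
                                  rate_per_year: float = MUTATIONS_PER_YEAR) -> float:
    """Expected number of pre-existing (background) mutations in a stem cell of the given age."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return rate_per_year * age_years


if __name__ == "__main__":
    for age in (20, 40, 60):  # hypothetical patient ages
        print(age, expected_background_mutations(age))
```

Under this assumption the total count depends on the age of the cell when it is cloned, not on which initiating mutation later occurs.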
So, there’s a central question of course in this disease and most cancers. How many
mutations does it take to cause the disease? So our approach to this was to take 24 genomes
and sequence them completely but we selected these cases to be of two different types.
One is the M3 subtype of AML, which is initiated by a very well-known fusion protein created
by a translocation called PML-RARalpha versus cases with undifferentiated AML, or M1 AML,
that have normal karyotype. We know little or nothing about these cases. So the idea
is very simple. We have one that is caused by a very well-defined initiating mutation
and we know it’s initiating because you can put it in a mouse and it causes a phenocopy
of the disease and one where you know nothing.
So based on this prediction that I just made about how the mutations would arise, we predicted
that the total mutations per genome would be the same because most of them arose before
the initiating event. They are simply background, benign mutations that are present in the stem
cell. That most of the mutations would be random and irrelevant, that the M1 cases would
have novel mutations that would never be seen in M3 and that the cases would have common
mutations which would be relevant for progression. Make sense?
So what do we find? The key, how many recurring mutations per genome, that should give us
the answer about how many mutations are needed to initiate M1 and how many are needed to
progress. You can think about your predictions in the next few minutes and I’ll show you
the answer at the end.
Twenty-four genome pairs completely sequenced. Ten thousand mutations total found, average
of 421 per genome, 308 mutations in 286 unique genes. About 10 per genome with translational
consequences, about 10 in the exome in each case. We looked at these mutations in 66
additional M1 and 43 additional M3 cases. There are 21 recurrently mutated genes. In
M3 there was only one, PML-RARalpha. In M1 we found 10 recurrent mutations and I’ll
show you what they are in a minute. And 11 mutations were common to the two subsets.
The total number of mutations by tier fit the prediction. They’re exactly the same
in tier one, the coding region. Exactly the same for the two subtypes in tier two, the
conserved region of the genome with potential regulatory function. And exactly the same
in tier three and exactly the same for total numbers of mutations fits the predictions
that they had to arise prior to initiation. If you plot them by genome space, they fit
exactly as random events that occurred in genome space in tier one, tier two and tier
three. The r-value for the M1 and M3 cases are both exactly one. These are random mutations
that occurred prior to transformation in the stem cells.
One thing we learned with deep digital sequencing is that this disease is clonal. Every AML
case -- so I’ll show you more about this at the very end -- have founding clones where
all the mutations occur in every cell. And many cases have subclones that are derived
from the founding clone. And this is very, very important as we begin to think about
studying relapsed AML, and the number of clones in each of the cases is basically identical.
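One way to picture how read counts reveal clones is the sketch below. It assumes heterozygous mutations in a diploid genome and a sample with no normal-cell contamination, so that the fraction of cells carrying a mutation is roughly twice its variant allele frequency. The function names, the 0.8 cutoff, and the read counts are all hypothetical; this is not the analysis pipeline used for the data in the talk.

```python
from typing import Dict, Tuple


def cell_fraction(variant_reads: int, total_reads: int) -> float:
    """Estimate the fraction of cells carrying a heterozygous mutation.

    Assumes a diploid locus and a pure tumor sample, so the cell fraction
    is about 2 * variant allele frequency (capped at 1.0).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    vaf = variant_reads / total_reads
    return min(1.0, 2.0 * vaf)


def assign_clones(read_counts: Dict[str, Tuple[int, int]],
                  founding_cutoff: float = 0.8) -> Dict[str, str]:
    """Label each mutation 'founding' or 'subclone' by estimated cell fraction.

    The cutoff is an arbitrary illustrative threshold, not a published value.
    """
    labels = {}
    for name, (variant_reads, total_reads) in read_counts.items():
        fraction = cell_fraction(variant_reads, total_reads)
        labels[name] = "founding" if fraction >= founding_cutoff else "subclone"
    return labels


if __name__ == "__main__":
    # Hypothetical deep-sequencing read counts: (variant reads, total reads)
    example = {
        "mutation_A": (480, 1000),  # VAF 0.48
        "mutation_B": (510, 1000),  # VAF 0.51, cell fraction capped at 1.0
        "mutation_C": (150, 1000),  # VAF 0.15
    }
    print(assign_clones(example))
```

Mutations whose estimated cell fraction is near 100 percent behave like the founding clone; those at clearly lower fractions are candidates for subclones.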
Here are the recurrently mutated genes; as you all are accustomed to seeing, this is the bookkeeping.
These are the M3 cases, that’s PML-RARalpha on top. Here it is cooperating with FLT3 mutations,
these are the M1 cases. So you can see that there are a large number of mutations, these
are the ones that I already spoke of, the 10 that occur only in the M1 cases or very,
very rarely in M3. And then these are the mutations that occur basically in both subsets,
the so-called progression mutations that could cooperate with the founding or initiating
mutations. One interesting hit that we got from this analysis was finding that all four
members of the cohesin complex are recurrently mutated in AML. This complex is important
for holding sister chromatids together and is organized during
S phase. And every gene that’s a member of this complex is recurrently mutated in
AML, only in the M1 variety.
So, how many mutations? Well, as I told you there are the same number of tier one mutations
in these two kinds of genomes. If you look at the recurrent mutations with translational
consequences in the 24 fully sequenced cases, the number is zero to six for M1 and one or
two for PML-RARalpha. And when we extend this to an additional 107 cases, the number stays
zero to seven and one to three. That’s how many mutations it takes to cause these diseases.
So in summary, for this part of the talk PML-RARalpha is the initiating mutation for all of these
M3 cases that are sequenced. There are a cluster of mutations that tend to occur together,
NPM1, DNMT3A, this classifying mutation I told you about and IDH1. And then these are
the other mutations that appear to contribute to initiation of M1. There are 10 mutations
that are held in common in these two subtypes. And these are clearly mutations that are important,
not for defining the subtype, but important for progression.
So I just want to give a brief scorecard on the AML project that is being done by TCGA.
This is just bookkeeping but there are some very interesting things that are about to
come forward and there’s an incredibly rich database that is about to be explored experimentally.
So we have 50 whole genome sequenced cases that are complete and completely validated,
the ones I just told you about and another 26 with normal karyotype AML from any FAB
subtype of the disease. Another 150 cases had exome pairs sequenced at the Broad; transcriptomes
were sequenced from these same cases in British Columbia by Marco Marra’s group, and 192
methylation arrays have been done from this set by Peter Laird and Tim Triche at USC.
We’ve also just finished sequencing from among this set the 50 cases that have primary
refractory or early relapse disease which will add richly to our understanding of this
worst of the worst subset of the patients.
The cases that we chose represent AML as a disease. As I told you, about half the cases
have these intermediate risk findings, about 20 percent have translocations associated
with good risk and about 20 percent have these poor risk cytogenetic abnormalities. So it’s
a fair sampling of this disease as it occurs in the real world. These are just the data
on tier one, two and three mutations in the patients with normal karyotype AML versus
the APL or the M3 subtype. You can see as I told you, the numbers of mutations are about
the same. The thing that determines the total number of mutations per genome, you should
be able to predict?
Male Speaker: Age.
Tim Ley: Age. There are a number of new, recurrently
mutated genes; this is a partial list of the recurrent mutations that are present in up
to three percent of cases. Many of the names of these genes are new, not all of them are
completely validated -- this work will be done within about a month. And then you can
see that there are patterns that begin to develop in terms of mutual exclusivity that
are being explored by Ben Raphael, I’ll put a plug in for his talk this afternoon,
who will be showing you data about patterns of exclusivity. There are beautiful data that
have come from Vancouver where they’ve used RNA-seq to find a number of well-known translocations
for this disease. And remarkably, a number of private translocations that create novel
fusion proteins in many genes that are well-known to us but have not been previously identified
as translocation partners in AML. Many of these are in-frame fusions that create novel
proteins with novel functions and many of them are in genes that have never been seen
before mutated in AML, like for example, DNMT3B.
Finally, in work that Tim Triche and Peter Laird have done, they have done a beautiful
job of assembling the methylome data for these cases on the Illumina 450K array. With 192 cases
you can see these gorgeous patterns that seem to privatize individual groups of AML cases.
I think all of us predicted that there would be very, very significant mutations that would
predict these individual patterns of methylation. And in fact, that DNMT3A would be the primary
predictor, along with the mutations Peter spoke about in IDH1 and 2 and TET2, which
occur in about a quarter of cases of AML. As we went and looked at the classifying mutations,
basically none of them classify these methylation phenotypes. There is one cluster of mutations
that occurs together with DNMT3A, FLT3, IDH1 and 2 and TET2, the common mutations here
with the hypomethylation phenotype, but if you look up this line, all of these mutations
occur in combinations. And none of them predicts this phenotype, so there’s much to learn.
The last thing I want to say is that there’s much more to the digital bookkeeping that
exists when you look at this kind of data. Deep digital sequencing is a clinical tool.
It tells us a great amount about the biology of this disease, by looking at deep digital
data. We’ve been able to deduce the clonal evolution of AML at relapse. As I told you,
many of these mutations occur in the cell that is transformed. Mutations in many genes
contribute to the initiation and progression and then subclones arise from these founding
clones in most cases of AML. These subclones have different mutations and different behaviors
after therapy. Some of them completely disappear with therapy. Others come forward achieving
quantal bursts of mutations that clearly contribute to relapse. Understanding this clonal behavior
at relapse will be extremely important in terms of predicting responses to drugs in
patients and of course defining new therapeutic approaches, because what we have to do in this
disease is remove the founding clone to cure the patients. Because every time we look,
the founding clone re-emerges.
I just want to thank our patients, without them there is no study. Our funders, including
Al Siteman who sequenced the -- who funded the sequencing of the first cancer genome
while no one else was very interested. And finally Rick, Elaine and Li Ding who lead
the work at the Genome Institute at Wash U. And my colleagues on the Genomics of AML program
project grant, most particularly John DiPersio who leads our oncology group. Thank you.
[applause]
Male Speaker: Thank you, Tim. We have time for a few questions,
Eric.
Male Speaker: Great, great talk. Just fantastic, I had two
specific technical questions.
Tim Ley: [affirmative]
Male Speaker: When you say you know how many mutations there
are, you mean you have a lower bound?
Tim Ley: Yes.
Male Speaker: Right.
Tim Ley: Exactly.
Male Speaker: Because there could be more --
Tim Ley: There could be more --
Male Speaker: -- than the number that had been looked at
so far.
Tim Ley: But this sets the floor.
Male Speaker: This sets a floor, good --
Tim Ley: Okay --
Male Speaker: -- I wanted to check. And the other was just
a tiny thing, I couldn’t help but notice on the list the CUB and the sushi domain protein
and the mucins [spelled phonetically] come up --
Tim Ley: That’s right --
Male Speaker: -- and both of those are in this class of
late replicating probably --
Tim Ley: Absolutely --
Male Speaker: -- [unintelligible] genes.
Tim Ley: Yep, absolutely correct. But there are many
more that aren’t.
Male Speaker: [inaudible]
Tim Ley: Right.
Male Speaker: [inaudible]
Tim Ley: Yeah, absolutely.
Male Speaker: [inaudible]
Tim Ley: Yeah.
[laughter]
Tim Ley: They’re all over the place and so is titin
[spelled phonetically], right. What is odd, though, is that if you simply apply a significantly
mutated gene test for these genes, which is very rigorous in terms of taking size
into account --
Male Speaker: They don’t go away --
Tim Ley: -- they don’t go away. So, I think the reason
I leave them on these slides even though my informatics colleagues tell me, “Take them
off, they’re not significant,” is I have an open mind. They don’t occur in every
case. They should, based on their size. They sure as hell don’t. A lot of small genes
are recurrently mutated and they show up again and again and a lot of big genes like olfactory
receptors [spelled phonetically] never show up in these cases. So it’s hard to know
what it all means. Yeah. On big families, that’s what I meant to say. Big, big gene
families that you might expect to be recurrently mutated don’t show up at all.
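To make the idea of a size-adjusted test concrete, here is a minimal sketch. It assumes a uniform background mutation rate per base and a Poisson model, which is a simplification and not necessarily the test the speaker's group applied; the function names, the rate, the gene lengths, and the counts are all hypothetical.

```python
import math


def poisson_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # 1 - P(X <= observed - 1), summing the Poisson pmf term by term
    term = math.exp(-expected)
    cumulative = term
    for k in range(1, observed):
        term *= expected / k
        cumulative += term
    return max(0.0, 1.0 - cumulative)


def gene_p_value(mutation_count: int, gene_length_bp: int,
                 n_cases: int, rate_per_bp: float) -> float:
    """Chance of seeing at least `mutation_count` mutations in a gene of this
    size if mutations fell at random at `rate_per_bp` per base per case."""
    expected = gene_length_bp * n_cases * rate_per_bp
    return poisson_tail(mutation_count, expected)


if __name__ == "__main__":
    # Hypothetical numbers: 200 cases, 1e-6 mutations per base per case
    big_gene = gene_p_value(5, gene_length_bp=100_000, n_cases=200, rate_per_bp=1e-6)
    small_gene = gene_p_value(5, gene_length_bp=1_500, n_cases=200, rate_per_bp=1e-6)
    print("large gene:", big_gene)
    print("small gene:", small_gene)
```

With the same number of observed mutations, the large gene is unremarkable under this model while the small gene stands out, which is why gene size has to be taken into account before calling a gene recurrently mutated.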
Male Speaker: Matthew.
Male Speaker: Tim, great talk. I was struck by the PML-RARalpha
co-mutation with FLT3 --
Tim Ley: [affirmative]
Male Speaker: -- and wondering about the FLT3 wild type
set. Whether from gene expression or other evidence you can get any clue for what might
be a parallel driver mutation to FLT3.
Tim Ley: Critical question. We’ve looked very carefully
thinking that there must be other tyrosine kinases that substitute for FLT3 in these
cases, since the combination is so common. That does not exist. There are clearly other cooperating
mutations in these cases that aren’t members of this class. So there’s distinct heterogeneity
even among PML-RARalpha but it does not explain outcomes. We don’t have the answer, but
it’s an important question.
Male Speaker: Okay, thank you Tim. We have 15 short minutes
for coffee break, so --