TCGA - The Somatic Genomic Landscape of Glioblastoma Multiforme - Roel Verhaak

Roel Verhaak: All right. Thank you, Chuck. I'd like to thank the organizers for allowing me the opportunity to speak here today, and I want to specifically mention to Kenna that I did just sign the release form, so she can tweet and film and Facebook all she wants. Let's see, how does this work? Okay. So -- and, of course, I'm representing a large group of people here. So, you may think, because today I'm talking about the somatic landscape of GBM, and you may think -- I'm sure there's an easier way to do this -- oh, there. You may think, a GBM marker study? Didn't we already have one of those? And that's, of course, correct, because we started in 2008 by publishing on the comprehensive genomic characterization of GBM, and that was sort of the kickoff publication to the entire TCGA. And we followed that in 2010 with two papers, one describing gene expression subtypes, and a second one describing a hypermethylator phenotype. So, the dataset that we used for those three papers is sort of represented here. We had DNA sequencing of about 600 genes in about 91 cases. At that time, of course, an enormous amount of sequencing, whereas nowadays, you can do that in a day. And we had molecular profiles and copy number profiles on a sample cohort of about 200 cases. So, fast forward to 2012, now we have exome sequencing on 300 cases, or close to 300 cases; we have copy number on almost 600; and we have some form of molecular data on about 600 samples. So this is really quite a large dataset. Importantly, we also have whole genome sequencing of 17 samples, and RNA sequencing of 164 samples. First, we looked for significantly mutated genes using the 300 -- or close to 300 -- exomes. You'll see that the top genes are the ones that we saw in our 2008 paper as well: P53, EGFR, PTEN; sort of the usual suspects. But we also find a bunch of novel genes that have previously been unassociated with cancer, and glioblastoma in specific.
So, for instance, there is SPTA1, which is mutated in 10 percent of the cases, ATRX, TCHH, and so on, and these genes are all mutated at frequencies above 3 percent. So this slide summarizes the 26 most significantly mutated genes, with the 300 samples on the x-axis and the genes on the y-axis. First, I'd like to point out that we see no significant gene mutation in about 10 percent of samples. That's sort of remarkable in my mind, although we see that in most tumor types. We see interesting patterns of mutual exclusivity, such as there's never a mutation in both PIK3R1 and PIK3CA, which obviously makes sense as these work in a complex. And we also see interesting patterns of co-occurrence. For instance, all the cases with an IDH1 mutation also harbor a mutation in P53, and a majority of the IDH1-mutant cases harbor a mutation in ATRX. And it has recently been reported by Vogelstein and colleagues that these also co-occur in lower grade oligodendrogliomas. I'd also like to point out that we found five BRAF V600E mutations. These are, of course, of interest because they are very frequent in melanoma and respond to vemurafenib in that disease. In other diseases, such as colon cancer, these have also been described, but there they were not sensitive. So it does not automatically translate to a treatment. As John Weinstein showed this morning, we also looked for patterns of mutation in chromatin modifiers, as these have recently gained great interest in different diseases. We compiled a gene set of 167 genes thought to relate to chromatin modification, and we then plotted the mutations in these 167 genes, as you can see in this slide. And we find that about 40 percent of GBMs have a mutation in at least one of those, and these occur in a strikingly mutually exclusive fashion.
Now, we tested this for significance by doing 10,000 permutations of similar sized gene sets, and we found that in 97 percent of these permutations, we find a lower number of cases to harbor a mutation, thereby suggesting that this finding is quite significant. As I mentioned, we have a very large number of copy number profiles, and in 2008 we reported on about 200 GBMs, and we found the number of significant amplifications that is shown in this GISTIC figure, where you have all the chromosomes on the y-axis and each of the peaks represents a focal region of copy number gain. Now fast-forward again to 2012, we find more peaks, also because we have better methodology. But we also importantly find that most of these peaks harbor only a single gene. Similarly, we find this for focal copy number loss, and I want to highlight QKI, at 6q26, which is now the only gene in this region; this deletion was previously thought to target PARK2. So we looked at genomic rearrangements using whole genome sequencing data. This is a specific copy number locus, chromosome 12q15. This is a locus that harbors MDM2, and each of these lines here represents a genomic rearrangement between two chromosomal segments. So, as you can see, in this specific sample, there are many genomic rearrangements, and we could actually assemble all those genomic rearrangements into an extrachromosomal ring structure, also known as a double minute, and this was confirmed by FISH. As Siyuan showed you this morning, we looked extensively for fusion transcripts, and we found 84 in-frame fusion transcripts. We also found a number of out-of-frame fusions, in 164 GBMs. I want to highlight FGFR3-TACC3, as this was recently published. Just like the one Siyuan talked about this morning, TFG, this is a local -- a small inversion, and we have two cases that harbor this fusion event, and in both cases, this is the copy number profile, we see that this tags along with a focal amplification.
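The permutation test mentioned above for the chromatin-modifier gene set can be illustrated with a minimal sketch. This is not the analysis code used in the project: the function names, the toy data, and the choice to draw random gene sets of exactly equal size uniformly from all genes are assumptions made for illustration (the talk only says "similar sized gene sets"). The idea is to count the samples that carry at least one mutation in the gene set of interest, then ask in what fraction of random gene sets that count is lower.

```python
import random


def count_mutated_samples(gene_set, mutations):
    """Count distinct samples with at least one mutation in any gene of gene_set.

    mutations maps a gene name to the set of sample IDs mutated in that gene.
    """
    samples = set()
    for gene in gene_set:
        samples |= mutations.get(gene, set())
    return len(samples)


def permutation_test(gene_set, mutations, n_perm=10_000, seed=0):
    """Return (observed count, fraction of random gene sets with a lower count).

    Assumption: random gene sets have the same size as gene_set and are drawn
    uniformly without replacement from all genes present in mutations.
    """
    rng = random.Random(seed)
    all_genes = sorted(mutations)
    observed = count_mutated_samples(gene_set, mutations)
    lower = 0
    for _ in range(n_perm):
        random_set = rng.sample(all_genes, len(gene_set))
        if count_mutated_samples(random_set, mutations) < observed:
            lower += 1
    return observed, lower / n_perm


# Hypothetical toy data: six genes, four samples.
toy_mutations = {
    "G1": {"s1", "s2"},
    "G2": {"s3", "s4"},
    "G3": {"s1"},
    "G4": set(),
    "G5": {"s2"},
    "G6": set(),
}
observed, fraction_lower = permutation_test(["G1", "G2"], toy_mutations, n_perm=2000)
print(observed, fraction_lower)
```

In the talk, the corresponding fraction was 97 percent for the 167-gene chromatin-modifier set. A real analysis would likely also need to account for gene length and background mutation rate when choosing the random gene sets, which this sketch does not attempt.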
We found a number of EGFR-targeting fusions. They do not necessarily involve EGFR as a fusion partner, but all of these occur in the area of an EGFR amplification. So these are 11 samples that have an EGFR-associated fusion, so this is EGFR in the middle, and the red indicates the area of focal copy number gain. So this suggests that there's a more complex event going on in this locus. We looked for intragenic rearrangements, starting with the vIII deletion that has commonly been reported in GBM, and the vIII deletion targets exons 2 to 7, and falls in the extracellular domain of EGFR, and this is also the domain where the majority of point mutations occur. We then searched for C-terminal deletions, we found three different C-terminal deletion variants, and we are likely undercalling the true number of C-terminal deletions, as those would not result in a fusion transcript with reads on both sides of the deletion, so the 6.4 percent that we report is likely an underestimate. And then, as was also shown this morning, we also find two novel variants, or at least relatively unknown ones. There are a few incidental reports on these: they target exons 12 and 13, or exons 14 and 15. If you now combine all the data on EGFR using all our samples, we find that 45 percent of GBMs harbor an EGFR-associated point mutation or genomic alteration, aside from the focal amplifications that we see. So EGFR is clearly one of the most critical genes for gliomagenesis. And efforts to devise an EGFR therapy would still be very worthwhile. As I mentioned in the introduction, we have previously looked for molecular subtypes. We found a G-CIMP or hypermethylator phenotype that fell entirely within the proneural expression group, and we also described a neural, a classical, and a mesenchymal expression group.
This slide shows you the data for 330 cases, sorted by molecular subtype, and each of the rows indicates a genomic change, and it highlights associations between genomic abnormalities and molecular subtypes. For instance, there's the association between IDH1 mutations and G-CIMP, which we, of course, have previously found. And we now also see that most of these have a MYC amplification. Similarly, we find EGFR amplifications in classical, but we also see cyclin E1 in the classical group, and so on and so forth. Importantly, we confirm that the G-CIMPs do a whole lot better in terms of outcome than the four non-G-CIMP expression groups. And we challenge a bit of a paradigm here, because if you look specifically at the non-G-CIMP samples, you'll see that the proneurals without G-CIMP do worse than other groups, whereas this was previously thought to be one of the better-surviving groups. Lastly, we have our RPPA-based protein expression profiles, so I have to disappoint Dr. Levine because we also looked at this in the context of gene expression groups. For instance, of course, phospho-EGFR is highly expressed in the classical group, which also had all the EGFR amplifications. And we also see, although it's not as visual, a small decrease in the apoptosis module, so a number of proteins that combine to form an apoptosis module, we see a decrease in expression in that module in the classical group. So, in summary, I described comprehensive genomic profiling of about 600 samples, and we detected novel significantly mutated genes, such as SPTA1, LZTR1, and so on. We used whole genome and RNA sequencing to detect genomic rearrangements, most notably involving EGFR. And lastly, I want to again point out that the proneural class may actually perform worse than other subtypes.
And I want to thank Lynda Chin, Cameron Brennan, and Aaron McKenna, with whom I have co-led this project, and I want to thank the people in my lab, most notably Siyuan and Rahul, who performed a lot of the analysis that you saw today. And lastly, the TCGA GDAC at MD Anderson Cancer Center. Thank you.
Charles Perou: Any questions for Roel? I have one, which is, so, is EGFR amplification, is that occurring in double minutes? Or is it in the chromosomes, or what?
Roel Verhaak: So the double minutes that we have seen, we have seen two different double minutes, so two individual whole genome samples had one. They both are -- contained MDM2, but one of the two additionally had portions of chromosome 7, so it was a chromosome 7/chromosome 12 double minute, and indeed, it also contained EGFR.
Male Speaker: So, I was going to ask, the significantly mutated genes that are new, do you think they are biologically significant?
Roel Verhaak: Well, of course, we don't have any functional data. In terms of the statistics, I would argue that they are; the methods that we have to identify significantly mutated genes, I think, have improved, and based on those methods we would argue that these are biologically significant, but we don't have the functional data to support that notion.
Matthew Meyerson: Following Chuck's question, I noticed, and David says there's some significant mutation of PDGFRA, and I think Eric Holland's group at MSKCC had reported intragenic rearrangements in PDGFRA similar to those in EGFR. Did you observe a significant number of those rearrangements as well?
Roel Verhaak: Thanks, that's a great question. So, PDGFRA indeed has small exon 8/9 deletions, and we see those, but at much lower frequency. Very interesting, maybe 1 or 2 percent, I would say.
Male Speaker: How does MGMT promoter methylation play into your survival analysis?
Roel Verhaak: Also a good question. We do have MGMT methylation status. If I remember correctly, it doesn't specifically track with one of the molecular subtypes, but we haven't corrected for it in the survival figures that I showed you.
Charles Perou: Thank you, Roel.