Adam Ewing: Okay. So I'd like to start by thanking the
organizers for giving me this excellent opportunity to present our call for participation in the
mutation calling benchmark exercise we've put together, Mutation Calling: Benchmark
4. This is the fourth benchmarking exercise that TCGA has carried out. So I'm going to
tell you a bit about how we've gone about setting this up, what the motivation is, and
how to get involved.
So, just briefly to go over the Mutation Calling Benchmark process, in case anyone is unfamiliar,
we start out by selecting pairs of tumor and normal BAMs. BAMs contain short read alignments,
and these BAMs are then distributed to the participants in the benchmarking exercise,
and they call mutations and return them as VCF files. VCF stands for Variant Call Format; it's a standard, widely-used format for expressing all varieties of mutations in a unified way, and it's the format we really want to push people to express their mutations in. The VCFs are then collected and compared for concordance and discordance across somatic calls,
and we want to encourage people to submit germline calls as well. And so at the end
of the day what we get is a picture of sort of where the field of mutation calling in
cancer stands, and that's a really valuable thing.
Just really briefly to give you a little bit of background and make sure we're all on the
same page with the kinds of mutations I'm going to be talking about. I'm sure this is
fundamental, but SNVs, or single nucleotide variants, are single base changes at defined nucleotide positions, and INDELs are short insertions and deletions of less than 100 base pairs. Larger rearrangements like insertions, deletions, duplications, inversions, and transductions are referred to collectively as structural variants, or SVs. And regions where the genome departs from diploid copy number, that is, from an absolute allele count of two, are referred to as copy number variants, or CNVs.
And so since this is Benchmark 4, clearly there are three other benchmarks which occurred
prior to this, and just to go through briefly the history of benchmarking efforts in TCGA.
So Benchmark 1 was single nucleotide variant calling on six pairs of whole genomes. Benchmark
2 was single nucleotide variant calling on 14 tumor-normal pairs of exomes. Similarly,
Benchmark 3, again, single nucleotide variants on 25 pairs of exomes, this time with associated
validation data, so deep sequencing data over selected regions to validate the presence
of mutations.
And so what I'm calling for participation for today is Benchmark 4, so in addition to
single nucleotide variants we're going to take INDELs, SVs, and CNVs into account, and
we're going to do this on whole genomes from -- derived from cell lines.
So why is it important that we do another benchmark? So we've done three, why is it
important to do another? Well, if we're going to accomplish the goal of comprehensively
characterizing cancer genomes, TCGA has to get together and measure and set standards
for the accuracy of mutation calls. And sort of toward this end in this benchmark, we're
being more comprehensive about the variety of mutations that we're considering. So, like
I said, in addition to single nucleotide variants, we want to extend this to INDELs, structural
variants, and copy number variants to get the full spectrum of variation and evaluate
how different mutation calling algorithms are performing across these different types
of somatic variants.
So as I'll talk about on the subsequent slides, Benchmark 4 really is a controlled experiment.
So we have these cell lines. We can take advantage of their clonality to do things like simulate
normal contamination, which -- so Gaddy's talk and Chris Miller's talk were great lead-ins
for this. We can simulate subclonal expansions by using spike-in mutations. Spike-in mutations
also give us the opportunity to evaluate false negative rates. So they give us sort of the
ground truth and that hasn't been possible in previous benchmarking efforts. And since
the cell line genome data is publicly distributable, we can encourage wide participation, both within TCGA and outside of TCGA. So, for instance, we're reaching out to ICGC, and they're participating in this benchmark, and others outside of the cancer genome consortia who have an interest in mutation calling in this sort of tumor-normal context are encouraged to participate.
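To make the ground-truth point concrete: because the spiked-in variants are known in advance, a false negative rate can be estimated by intersecting a participant's calls with the truth set. Below is a minimal sketch of that idea; the file names and the exact matching on chromosome, position, and alleles are illustrative assumptions, not the benchmark's official evaluation.

```python
# Minimal sketch: estimate sensitivity / false negative rate against a
# spike-in truth set. File names and the exact matching on
# (chrom, pos, ref, alt) are illustrative assumptions.

def load_calls(path):
    """Return a set of (chrom, pos, ref, alt) tuples from a simple VCF."""
    calls = set()
    with open(path) as fh:
        for line in fh:
            if line.startswith('#'):
                continue
            fields = line.rstrip('\n').split('\t')
            chrom, pos, _vid, ref, alt = fields[:5]
            calls.add((chrom, int(pos), ref, alt))
    return calls

truth  = load_calls('spiked_in_truth.vcf')     # hypothetical truth VCF
called = load_calls('participant_calls.vcf')   # hypothetical submission

sensitivity = len(truth & called) / float(len(truth))
print('sensitivity: %.3f  false negative rate: %.3f' % (sensitivity, 1.0 - sensitivity))
```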
So further -- let's see, on this theme of why are we doing another benchmark? Well,
so there's still a lot of discordance in the mutation calls that we get. So this is a -- sort
of a representative example from a previous benchmark exercise, and what is shown here on this Venn diagram are the calls made on the same tumor-normal BAM pair by the Broad Institute, by Wash U, and by UCSC. And you can see sort of the concordance and
discordance here in this Venn diagram. So it's sort of a majority of mutations are concordant
between at least two of the centers, but there's still a lot of discordance happening, and
this is important to take into consideration since sort of mutation calling is fundamental
to cancer genomics. Cancer genomics depends on the sort of fidelity of mutation calling
algorithms.
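As a rough illustration of the comparison behind a Venn diagram like this, the per-center call sets can be keyed on chromosome, position, and alleles and intersected; the sketch below is a simplification, and real comparisons, especially for indels and breakends, need fuzzier matching.

```python
# Illustrative concordance count for a Venn diagram of calls from several
# centers. Calls are keyed on (chrom, pos, ref, alt); this is a simplification
# of what a dedicated comparison tool does.

def concordance_counts(center_calls):
    """center_calls: dict mapping center name -> set of variant keys.
    Returns a dict mapping each combination of supporting centers to a count."""
    counts = {}
    for key in set().union(*center_calls.values()):
        supporters = frozenset(name for name, calls in center_calls.items()
                               if key in calls)
        counts[supporters] = counts.get(supporters, 0) + 1
    return counts

# e.g. concordance_counts({'Broad': broad_calls, 'WashU': washu_calls,
#                          'UCSC': ucsc_calls})
```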
So the samples that we are using to derive all of the BAM files that we're distributing
for Benchmark 4 are based on these two pairs of cell lines, HCC1143 and HCC1954. 1143 and 1954 are both derived from breast tumors, and they each have a paired normal sample, which is a cell line derived from blood from the same patient. All of these lines are available through ATCC, and the sequences that we have for this benchmark are at between 50x and 71x coverage, sequenced at the Broad Institute. And that's about all I'll say about the samples themselves. As I mentioned, all of this data is publicly distributed through CGHub.
So this is sort of what we want participants to do. So there's three parts to this mutation
calling exercise. So the first part is pretty straightforward. We just want participants
to compare the tumor cell line full genome BAM to the corresponding normal full genome
BAM for both pairs of cell lines. This will establish a baseline under sort of ideal conditions: you've got sort of higher coverage genomes here, and they're cell lines, so they're presumably clonal.
And so from there we can use sort of this clonal property of the cell lines to do interesting
things, and so these are -- we can simulate normal contamination, so sort of in this row
A here what I'm showing is samples. Each one of these pie charts represents a BAM file
that we've generated for the benchmarking exercise, and so what I'm showing here is
we have mixed the normal and tumor BAMs to yield a 30x coverage BAM file in various proportions.
So over here it's 5 percent, simulates 5 percent normal contamination, and over here we're
simulating 95 percent normal contamination. And as has been alluded to in previous talks,
normal contamination is an important factor in mutation calling fidelity.
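A rough sketch of how such a mixture could be produced by downsampling and merging BAMs with samtools is below; the coverages, seed, and file names are my own illustrative assumptions, not the exact pipeline used to build the benchmark BAMs.

```python
# Sketch: simulate normal contamination by downsampling the tumor and normal
# BAMs and merging them into a ~30x mixture. All numbers and file names are
# illustrative assumptions.
import subprocess

def subsample(in_bam, fraction, out_bam, seed=42):
    # samtools view -s takes SEED.FRACTION, e.g. "-s 42.25" keeps ~25% of reads
    subprocess.check_call(['samtools', 'view', '-b', '-s', str(seed + fraction),
                           '-o', out_bam, in_bam])

def mix(tumor_bam, normal_bam, normal_fraction,
        tumor_cov=60.0, normal_cov=60.0, target_cov=30.0, out_bam='mixed.bam'):
    """Build a ~target_cov BAM in which normal_fraction of the reads are normal."""
    subsample(tumor_bam, (1.0 - normal_fraction) * target_cov / tumor_cov, 'tumor_part.bam')
    subsample(normal_bam, normal_fraction * target_cov / normal_cov, 'normal_part.bam')
    subprocess.check_call(['samtools', 'merge', '-f', out_bam,
                           'tumor_part.bam', 'normal_part.bam'])

# 5 percent simulated normal contamination (placeholder file names):
# mix('HCC1143_tumor.bam', 'HCC1143_normal.bam', normal_fraction=0.05)
```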
And so in addition to simulating normal contamination, we can simulate subclone expansion. And the
way we do this is by taking the original tumor BAM file and spiking in single nucleotide variants and structural variants into a single allele, and we can spike in to a single allele by using results from Scott Carter and Gaddy Getz's group's ABSOLUTE algorithm. So we can selectively spike in to one allele, and by spiking in we get a genetically distinct tumor BAM; we can then mix that back in with some amount of normal contamination and some amount of the original tumor to simulate the presence of a subclone in the tumors. And we've scaled that from a 1 percent subclone, at which it will be difficult, if not impossible, to detect the spiked-in mutations, up to 40 percent, which should be feasible.
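For intuition about why a 1 percent subclone is so hard to detect, here is a back-of-the-envelope calculation of the expected variant allele fraction for a variant spiked into one allele; the 20 percent normal contamination used in the example is an arbitrary assumption.

```python
# Back-of-the-envelope expected variant allele fraction (VAF) for a variant
# spiked into a single allele of a diploid region, present only in the
# simulated subclone. A simplification for intuition only.

def expected_vaf(subclone_fraction, normal_contamination, copies=2):
    """Expected fraction of reads at the locus carrying the spiked-in variant."""
    tumor_fraction = 1.0 - normal_contamination
    return subclone_fraction * tumor_fraction / copies

print(expected_vaf(0.01, 0.20))   # ~0.004: essentially invisible at 30x
print(expected_vaf(0.40, 0.20))   # ~0.16:  should be detectable
```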
And so this sort of normal contamination model scheme and subclone expansion scheme were
generated for both pairs of cell lines, and so, in total, there's -- we're doing six comparisons
here. So this BAM versus the normal, this BAM versus the normal, et cetera. So, in total,
we end up with 28 BAM files, which are distributed publicly via CGHub. So if you navigate to this URL here you can download a public key, and you can use that public key with GeneTorrent
to grab the BAM files for the benchmarking exercise.
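For reference, a download with GeneTorrent looks roughly like the sketch below; the key file name and analysis UUID are placeholders, and the gtdownload flags should be double-checked against the CGHub documentation.

```python
# Sketch of fetching a benchmark BAM from CGHub with GeneTorrent's gtdownload.
# The credential file and analysis UUID are placeholders; confirm the flags
# against the CGHub documentation before use.
import subprocess

def fetch_bam(analysis_uuid, keyfile='cghub_public.key'):
    subprocess.check_call(['gtdownload',
                           '-c', keyfile,         # public key from CGHub
                           '-d', analysis_uuid])  # analysis ID of the BAM

# fetch_bam('00000000-0000-0000-0000-000000000000')  # placeholder UUID
```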
So if -- those of you who attended the CGHub workshop yesterday evening should be familiar
a bit with this process, and many thanks to Chris Wilks and the CGHub team for helping
us to get these BAMs up and dealing with our requests to replace them and so on.
So in addition to providing data whereby we can evaluate the performance of mutation callers
comparatively, Benchmark 4 has also been stimulating the creation of new evaluation tools for VCF
files and BAM files, and so I'm listing some of these here. Just to point out that VCF
is a successful standard for expressing mutation calls. There's a whole bunch of tools out
there for it, including VCFtools, the Genome Analysis Toolkit at the Broad, PyVCF, et cetera,
et cetera.
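As a small example of working with VCFs programmatically, here is a sketch using PyVCF; the file name and the SOMATIC INFO flag are assumptions, since different callers mark somatic status differently.

```python
# Minimal PyVCF sketch: count somatic SNVs and indels in a VCF. The file name
# and the 'SOMATIC' INFO flag are assumptions.
import vcf

reader = vcf.Reader(filename='participant_calls.vcf')

snv_count = indel_count = 0
for record in reader:
    if not record.INFO.get('SOMATIC', False):
        continue
    if record.is_snp:
        snv_count += 1
    elif record.is_indel:
        indel_count += 1

print('somatic SNVs: %d, somatic indels: %d' % (snv_count, indel_count))
```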
So here are some of the tools that we've created, either specifically for Benchmark 4 or related
tools. So BAMSurgeon is the method I'm using for spiking in single nucleotide variants and structural variants into pre-existing BAM files. So if you're interested in that
kind of thing, definitely come talk to me. I'd be happy to have you use it.
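To give a feel for what a spike-in run looks like, here is a hedged sketch of calling BAMSurgeon's addsnv.py; the file names are placeholders, and the variant-file columns and flags should be checked against the BAMSurgeon documentation rather than taken from this example.

```python
# Hedged sketch of a BAMSurgeon spike-in (addsnv.py). File names are
# placeholders; confirm the varfile format and flags against the BAMSurgeon
# documentation.
import subprocess

# One variant per line: chromosome, start, end, target allele fraction
# (my reading of the varfile format).
with open('spikein_snvs.txt', 'w') as fh:
    fh.write('chr1\t1234567\t1234567\t0.5\n')

subprocess.check_call(['addsnv.py',
                       '-v', 'spikein_snvs.txt',           # variants to add
                       '-f', 'HCC1143_tumor.bam',          # input tumor BAM (placeholder)
                       '-r', 'GRCh37.fa',                  # reference FASTA (placeholder)
                       '-o', 'HCC1143_tumor_spiked.bam'])  # BAM with spiked-in SNVs
```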
VCFcomparator is the comparison engine that we're using to evaluate the outcome of the
-- this benchmarking exercise. LeftShiftBreakends is a bit esoteric, but if you're really into accurate calling of structural variant breakends, it's something to consider and something
to talk to me about. And I also want to plug the upcoming VCF to MAF converter. Many thanks --
Audience Member: Has there been discussion among the benchmarking group about the purpose of different pipelines?
Adam Ewing: Yes, we'll be evaluating both sensitivity and specificity. So, the sort of large scale effort you mentioned
would really benefit from improved sensitivity, whereas sort of the clinical efforts would
benefit from improved specificity. So we'll evaluate it sort of both ways and give recommendations
for how to appropriately tune mutation calling pipelines both for sensitivity and for specificity,
depending on what your goals are.
Charles Perou: I have a quick question, which is, when you
make the mixed BAM file between the tumor and normal --
-- see where you want to have it more specific or more sensitive, and I'm sure there will be different criteria that you might think of -- someone would argue the opposite of what you say.
-- clinical settings. Our experience is that we want to open up the calls pretty loose --
-- is Giovanni Ciriello from Memorial Sloan-Kettering. He'll be talking about --