Laura Rasmussen-Torvik: Thank you so much. I'm Laura Rasmussen-Torvik.
I'm an assistant professor at Northwestern University, and I've been involved with eMERGE
for about four years. And I'm presenting today I think as my role as one of the co-chairs
of the PGx working group along with Dan Roden and Josh Denny. Can I have the next slide,
please?
So, I'm reporting on eMERGE progress in this area of genetic testing today, and my presentation's
going to be nearly entirely about PGx because for eMERGE-II, the vast majority of genomic
testing that has gone on has been in PGx. So, next slide, please.
So, I know we've talked a little bit about PGx in several of the earlier talks, but I
just wanted to give everyone a quick overview again and catch people up who may not have
been on earlier. There are three primary aims for eMERGE PGx. In the first aim, across the
network we're recruiting almost 9,000 people, and the goal is to recruit people we believe
will be prescribed one of the drugs that have a CPIC guideline in the relatively near future.
Then, on all of those nearly 9,000 people, we are doing deep sequencing using a capture
agent I'll talk about in a little bit, and obtaining results that way.
The second aim, the sort of actionable variants aim that you see on screen, has several parts.
One of those parts is that at each of the eMERGE sites, they need to pick variants that
they would like to return to individuals in the electronic health record. Then we need
to generate clinical-grade genotypes so we can do this return. Then we need to figure
out how to get those clinical genotypes into the EHR, and then we also have to develop
clinical decision support to help our providers interpret that EHR information.
And then in the third aim, we're taking all of this genotyping information that's coming
off the PGRNSeq, which is our next-generation sequencing platform, and we're creating a repository
of variants of unknown significance -- again, mainly the rare variants -- and also pairing
that with phenotype information that we've extracted from electronic health records across
all the sites, so that we hope we can initiate studies of function in genotype-phenotype
relationships.
So, those are the official overall aims of PGx.
The reality is that PGx does vary from site to site. We've had to -- every site has had
to tailor it depending on what clinicians at that site are interested in returning,
what the IRB at that site is interested in letting us do. So, you know, at times, I've
wanted to pull my hair out when we're trying to summarize eMERGE -- or, excuse me, PGx
-- because it can look a little bit different from site to site. But in some ways I think
that's really an opportunity for PGx, because since this is a diverse project and it's being
implemented differently, we're having a variety of experiences, and I think it's important
to report on those. Next slide, please.
So, here's the progress of PGx as of mid-January 2014. We've accrued almost 4,000 people with
samples to use on our next-generation sequencing platform, and this includes a mixture of sites,
again, like -- it's going to be a recurring theme, that PGx is implemented differently
across sites. So, some sites are recruiting de novo, other sites are recruiting from their
existing biobank, and some sites actually had clinical samples in their existing biobank
and therefore didn't need to re-recruit people for the sample. Twenty-four hundred of these
people have been sequenced, and you'll notice that the denominators are slightly different
because there are a couple sites that are sequencing more people than they are returning
clinical results to. And then almost 1,400 people have had clinical genotypes obtained
that we can put in the electronic health record. Next slide, please.
So, some details about the PGx platform which we call the PGRNSeq. It's a next-generation
sequencing capture agent, and it was developed by our partners at the PGRN. Eighty-four genes were
selected by a vote of the PGRN community, and the sequence capture included the complete
coding regions and some sequence upstream and downstream. The platform also includes
some known variants that are present on other commercially-available platforms to make meta-analysis
easier. Next slide, please.
Batches of 24 or 48 samples are processed through Illumina flow-cell lanes, and there
have been really, really fabulous results from this platform to date. Thirty-two diverse
HapMap trios were sequenced, and, on average, the depth of coverage per sample was 496x.
And then when you compare those genotypes that were derived from the PGRNSeq data, they
were 99.9 percent concordant with existing SNP data from these
samples in the 1000 Genomes Project. Next slide, please.
So, again, the implementation of this platform across the PGx sites has been diverse. Because
of the way this supplement was funded, there are seven sites that are running samples at
CIDR. Two of those sites are running samples only at CIDR, while the other sites are running
some samples at CIDR and some at other locations. This is complicated, but again,
it also provides opportunities to really understand what it's like to try to implement an NGS
platform across lots of sites. We have some diversity in the machinery being used: one
site is using Ion Torrent and the other ones are using Illumina. Next slide, please.
And here, you can see which sites are running at least some of their samples onsite,
as opposed to others that are sending all of their samples offsite to run PGx. And then, at the
bottom part of the screen, you'll see that two groups, Mayo and Mt. Sinai, are hoping
to return some results directly from PGRNSeq, and that would mean they are actually going
to try to obtain clinical-grade results from PGRNSeq, and I'll talk a little bit more about
this in a second. Can I have the next slide, please?
So, again, when I was talking about the different specific aims for PGx, one of the most important
aims for aim two was to get clinical-grade genotypes so that we could implement things
in the electronic health record. Next slide, please.
And here, I'm going to try to clarify my language, because it can get a little complicated when
we're talking about PGx. PGRNSeq is generally run as a research-grade assay, so in eMERGE,
when we're talking about PGRNSeq, we're generally talking about research-grade
sequencing results. Of course, to return results to the electronic health record, we need
CLIA-validated, clinical-grade results. Often in PGx we refer to these as genotyping,
because in most of the sites they are using more traditional genotyping techniques to
generate these clinical results; however, there is this exception of some sites that
are trying to generate clinical-grade results from PGRNSeq. So, I'll try to be careful about
my language going forward so we all understand what I'm talking about. Next slide, please.
And here's another sort of view of how it's being implemented across sites in terms of
which specific variants are being validated, so that -- or which ones we're generating
clinical-grade results on so we can put them in the electronic health record. These genes
vary across sites. Several of the sites are all genotyping CYP2C19, VKORC1, CYP2C9, and
SLCO1B1, but not all. So again, we have a fair number of sites that are at least doing
three pairs [unintelligible]. Next slide, please.
Then, of course, there's diversity in the way we are clinically validating PGRNSeq.
Six sites are validating some samples at the Johns Hopkins diagnostic laboratory,
using a Sequenom panel. Most of the sites that are doing that are not validating all their
samples this way. Other sites are using Sanger, Illumina ADME, Sequenom ADME, and, again,
many sites are validating at more than one location using more than one method. It is
complicated, but it also provides opportunities to compare across these different measures
and even within the same sites. Next slide, please.
Okay, so what are some things that PGx is really lending to the conversation in genomic
testing? Next slide. And for these next couple slides I really must thank Marylyn Ritchie.
She gave them to me, and she's the one -- she's very actively involved at the Coordinating
Center. So we have several calling pipelines; again, these are required because we're generating
the sequence data at many different sites. So, in order to make sure that we have comparable
information as a group, we're doing several of those cross-site comparisons. So each site
is performing sequencing on 32 HapMap trios along with the eMERGE study samples, and the
Coordinating Center is calculating the concordance for these trios. And so that's the first of
the two concordance checks mentioned at the bottom; the other is that the Coordinating
Center is comparing the VCFs for eMERGE study samples generated by the sequencing facility
with the VCFs generated by the eMERGE Coordinating Center pipeline. Next slide, please.
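[Editor's note: the concordance check she describes -- comparing genotype calls for the same samples from two pipelines -- can be sketched as below. The sample IDs, rsIDs, and genotypes are hypothetical, and real checks would parse full VCFs rather than small dictionaries.]

```python
# Minimal sketch of a cross-pipeline genotype concordance check.
# calls_a / calls_b map (sample_id, variant_id) -> genotype string,
# e.g. ("NA12878", "rs4244285") -> "G/A". All data here is illustrative.

def genotype_concordance(calls_a, calls_b):
    """Fraction of shared (sample, variant) pairs with identical calls."""
    shared = set(calls_a) & set(calls_b)
    if not shared:
        return 0.0
    matches = sum(1 for key in shared if calls_a[key] == calls_b[key])
    return matches / len(shared)

# Hypothetical example: calls from a sequencing-facility VCF versus the
# Coordinating Center pipeline for one HapMap sample.
facility = {("NA12878", "rs4244285"): "G/A", ("NA12878", "rs4986893"): "G/G"}
cc_pipeline = {("NA12878", "rs4244285"): "G/A", ("NA12878", "rs4986893"): "G/A"}
print(genotype_concordance(facility, cc_pipeline))  # 0.5
```

In practice the same function applies to both checks mentioned here: facility calls versus Coordinating Center calls, and parent-offspring consistency within the HapMap trios.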
So here's a cross-site comparison of these HapMap trios across different sites running
PGRNSeq, and as you can see, we have excellent concordance. Next slide, please.
This is, quickly, an overview of the eMERGE Variant Calling Pipeline that has been implemented
at the Coordinating Center; they've used GATK. You can see the different filters that are
used, and then there's two variant calling runs at two different time points: multi-sample
calling is run on the batch sent from the Sequencing Center for each site independently,
and then quarterly, they're running a multi-sample calling run on the entire batch. Next slide.
So, here is the multi-sample calling by site at the CC compared to the single-sample calling
by the Sequencing Center. And as you can see, we have very good data to start with, and
it gets even better with multi-center calling. Next slide, please.
And similarly, here is the multi-sample calling of the entire eMERGE set compared with
single-sample calling by site. Next slide, please.
Another type of QC analysis that we're doing is that we're comparing the research -- and
this is generally, of course, the sequencing results from the PGRNSeq -- and the clinical
pharmacogenetic results, which are generally genotyping on orthogonal platforms. The idea
of this was to evaluate the PGRNSeq research platform. It's complicated by different report
formats. From PGRNSeq we get data in VCF format; VCF is not a file format meant to be read
by humans, but it can be easily manipulated to extract data that's easier to understand.
And then, typically, the clinical-grade results are often coming in the form of star alleles,
particularly for the CYP genes. So there you have to take the star allele, translate it
to a haplotype, then translate it to individual genotypes, and even then when you go to compare
genotypes from that to those pulled out of VCF files, you can have strand orientation
issues. So --
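[Editor's note: the translation chain she describes -- star allele to haplotype to per-SNP genotype, then comparison against VCF-derived calls with possible strand flips -- can be sketched as follows. The star-allele definitions here are hypothetical placeholders, not authoritative PharmVar tables.]

```python
# Sketch of comparing clinical star-allele results to VCF-derived
# genotypes, allowing for strand-orientation differences.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical haplotype definitions: star allele -> {rsid: defining base}.
STAR_DEFS = {
    "*1": {"rs4244285": "G"},  # reference allele
    "*2": {"rs4244285": "A"},  # variant allele
}

def diplotype_to_genotypes(diplotype):
    """Translate e.g. '*1/*2' into per-SNP unordered genotypes."""
    a1, a2 = diplotype.split("/")
    return {rsid: tuple(sorted((STAR_DEFS[a1][rsid], STAR_DEFS[a2][rsid])))
            for rsid in STAR_DEFS[a1]}

def same_call(geno_a, geno_b):
    """True if two genotypes match directly or after a strand flip."""
    direct = tuple(sorted(geno_b))
    flipped = tuple(sorted(COMPLEMENT[base] for base in geno_b))
    return geno_a == direct or geno_a == flipped

clinical = diplotype_to_genotypes("*1/*2")   # {"rs4244285": ("A", "G")}
vcf_call = ("C", "T")  # same genotype reported on the opposite strand
print(same_call(clinical["rs4244285"], vcf_call))  # True
```

The strand-flip branch is what resolves the orientation issues she mentions; without it, the two reports would look discordant even though they describe the same underlying genotype.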
Male Speaker: You're right at about 11 minutes now; we'd
like you to wrap it up.
Laura Rasmussen-Torvik: Okay, so standardization of reports would
benefit the wider community, and this work is also forcing sites to develop policies about
research results that are non-concordant with clinical genotyping. We know our research results are really good
-- obviously, we have to report the clinical results, but it's an interesting situation.
Next slide, please.
And just my final points. There has been a lot of development of systems to integrate
genotypes as computed results; I'll let the EHRI group give you the details. But as someone
mentioned earlier, typically genotype results are imported as PDFs or mentioned in the notes;
they're not computed results. So how do we integrate these results and also document
clinical interpretation as part of these systems? I think that the documentation of clinical
interpretation can be complicated with the computed results, and this is particularly
complicated when you're receiving results from multiple outside laboratories.
And finally, what do we do if this interpretation changes? The CPIC guidelines are fabulous,
but particularly for some of the rare star alleles for some of the CYP genes, the interpretation
is changing, and how do we handle that in the Electronic Health Record and how do we
document it? Next slide. So, this is just a summary of everything I've talked about
today, and I will hand it over to my other panel members.