Heidi Rehm: Thank you for having me.
So I was asked to give some of the laboratory perspective. My background is as a board-certified
clinical laboratory geneticist, and I direct the Laboratory for Molecular Medicine at the
Partners Center for Personalized Genetic Medicine under Scott Weiss’s leadership. And we are
in the process of developing a genomic sequencing clinical service through our CLIA lab. We’re
working closely with our geneticists’ clinics -- and Mike Murray from the Brigham is here -- in terms of the up-front and return-of-results environment. Our main focus, in terms of the
expertise and strength we have, is really the data analysis and interpretation piece, and initially we will outsource the technical component for the whole genome sequencing,
although we are developing exome sequencing capability in our clinical lab as well.
So most of our effort has gone into the enormous computational challenges of variant annotation and filtration. And although we’ve done evidence-based variant
assessment for 10 years in the context of clinical sequencing, scaling that to
address the genome is, of course, another order of magnitude. We’re also working closely,
using our GeneInsight software that we’ve developed over the last eight years, to support
clinical reporting of sequenced data, and expanding that to support genomic interpretation.
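One of the simplest filtration steps alluded to here -- removing variants too common in population databases to cause a rare disease -- can be sketched as follows. This is a hypothetical illustration: the allele-frequency threshold, the dict schema, and the example variants are assumptions, not the lab’s actual pipeline.

```python
# Hypothetical sketch of a population-frequency filtration step.
# The 0.1% threshold and field names are illustrative assumptions.

def filter_by_frequency(variants, max_af=0.001):
    """Keep variants whose population allele frequency is below max_af."""
    return [v for v in variants if v.get("pop_af", 0.0) < max_af]

calls = [
    {"id": "MYH7:c.2389G>A", "pop_af": 0.00002},  # rare: retained
    {"id": "MYBPC3:c.1504C>T", "pop_af": 0.012},  # common: filtered out
]
print([v["id"] for v in filter_by_frequency(calls)])  # -> ['MYH7:c.2389G>A']
```

A real pipeline layers many such filters (quality, inheritance model, gene lists) before any variant reaches a human reviewer.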
I’m working on a U01-funded grant, called MedSeq, that Robert Green is the PI of,
where we will sequence 100 whole genomes through that effort, half from patients with cardiomyopathy
and half from healthy patients, and are working on the development of the
general genome report in terms of general information from every patient, as well as
disease-specific reports looking at primary indication and using orthogonal confirmation
of those results in the CLIA setting, and also working closely in terms of integration
of this infrastructure into our EHR. And I’ll talk about a number of these pieces.
And I really want to broaden this, not just to the things that we’re doing but issues
that are common across many groups that are working in the space. And I sort of put my
punch list of some of the key challenges in the clinical implementation of whole exome
and genome sequencing on this list. I think we’re all aware that sequencing technologies
are changing very rapidly, and so picking a time point to implement a certain technology
is challenging. And that was one of the reasons we chose to initially outsource that component,
work on the interpretive piece, which is really the most challenging part, and then bring
the technology in later at whatever point it had evolved to. The computation requirements
are obviously unprecedented in terms of the large data sets we’re working with. In the
clinical implementation of this, it’s not yet ready to just use the data as is. There
is still a need to confirm the results, and unfortunately, they can’t all be confirmed
by one method, depending on whether you’re talking about point mutations, copy number
variants, et cetera, or low-level variants and somatic variation. So we need different approaches
to the confirmation process, hoping that at some point we won’t have
to do it, but it’s still necessary now.
And also as we implement these technologies in the clinical setting, we need to both address
the fact that there are existing very high-quality tests that are targeted, yet we also want
to take advantage of the rest of the genome, and so how do we balance that in terms of
the fact that the genomic approaches don’t have the level of quality that our targeted
tests do. We want both components of that, and I’ll talk a little bit more
about that.
In the CSER meeting that some of us were in, we spent a lot of time thinking about
secondary findings, which are appropriate to return and which are not, and that’s
a big area under discussion. Updating results over time and, you know, patients, ideally,
may only get their genome sequenced once, but the knowledge changes. How do we address
that? And I’ll talk a little bit, some of the strategies we worked on. And probably
for me the biggest challenge is that human variation is enormous and rare, and understanding
the phenotypes associated with it will be challenging and will require great structure
and data sharing, I think. And I’ll talk at length about some of the strategies we’re
thinking about.
I do want to point out that the ACMG came out with a policy statement surrounding the
use of genomic sequencing that’s on their website. And I’m not going to go into all
of the detail here, but I will say that the board has been fairly forward-thinking about
embracing this in the clinical arena, not only for diagnostic environments but also
even in screening, recognizing the use in pre-conception screening, and even in healthy
individuals if there’s a high threshold for what you return back to the patient, though
not, obviously, supporting it yet in a prenatal or first-tier newborn screening environment.
There are a number of recommendations in that just four-page statement. One thing that will
be a theme for me is this last statement here, “labs should share genomic data into public
databases,” and I’ll talk a little bit more about efforts there. There are two work
groups that are diving deeper into some of the aspects of these topics. One is a secondary
findings workgroup that’s chaired by Robert Green and Les Biesecker, and we’ve been
working on trying to define recommendations for what to return to patients. I’m also
chairing, along with Pinar Bayrak-Toydemir, a standards and guidelines work group addressing
the laboratory standards for how we implement both targeted and next-gen sequencing, whole
exome sequencing, and whole genome sequencing into a CLIA lab environment. And we have a
draft now that we’re still working through the last details before we start disseminating
it to the community for feedback.
One of the challenges in the clinical lab is we’d like to develop -- and we have to,
for CLIA standards -- clear SOPs so that every test is run the same way. In genomic sequencing,
you really don’t want to run every test the same. Each family will have a different
assumption about inheritance pattern, different approaches, different family members available,
and different strategies, and that makes it challenging to implement this in a clinical
lab where you really want to have very defined workflows. And as I mentioned earlier, the
technology still isn’t perfect, so how do we get the technology to a level that is appropriate
for clinical efforts? For genome sequencing, we can maybe supplement with an
exome to add better depth in the exonic regions that are most examined from an interpretation
standpoint. And some of us are thinking about supplementing the whole exome with a clinical
exome -- those genes already known to be associated with disease -- to have higher-quality data
for those critical regions, or even using multiple technologies, because each platform has its own
platform-specific errors. These are all strategies for how to most effectively incorporate
this into a clinical environment so that the technical quality is appropriate.
In our lab, we do a lot of targeted next-gen sequencing, and for every test we run, we
fill in every last base with Sanger if anything’s missed, so that at the end, whatever the gene
content of that test is, we have been able to cover it at 100 percent. Of course, that
adds significant labor and cost to that process as we iteratively fill in with Sanger as well
as confirm all of the variants by Sanger sequencing. And although this is not a reasonable approach
to fill in every last base for a whole exome or genome, many of us using this approach,
even in a disease-targeted context, still realize there’s critical content that has to be
there for every patient. So these types of targeted strategies are also being
implemented, even in the whole exome and genome approaches that groups are taking.
And, of course, adding custom design for confirmatory testing is yet another challenge to
this workflow that is difficult in the clinical lab, but something we’re working on for
our whole genome approaches.
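The fill-in logic described above -- finding the bases a targeted NGS run missed so they can be covered by Sanger -- can be sketched roughly like this. The depth values and the 20x threshold are illustrative assumptions, not the lab’s actual cutoffs.

```python
# Sketch: scan per-base coverage across a targeted region and report the
# intervals that fall below a minimum depth, so those gaps can be filled
# in by Sanger sequencing. Depths and the 20x threshold are illustrative.

def low_coverage_intervals(depths, min_depth=20):
    """Return (start, end) pairs (0-based, end-exclusive) of contiguous
    runs where depth < min_depth."""
    intervals, start = [], None
    for i, d in enumerate(depths):
        if d < min_depth and start is None:
            start = i                       # a low-coverage run begins
        elif d >= min_depth and start is not None:
            intervals.append((start, i))    # the run ends at position i
            start = None
    if start is not None:
        intervals.append((start, len(depths)))
    return intervals

depths = [35, 40, 12, 8, 25, 30, 5, 50]
print(low_coverage_intervals(depths))  # -> [(2, 4), (6, 7)]
```

Each reported interval becomes a candidate region for a custom Sanger assay, which is where the extra labor and cost described above come from.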
The other challenge that we struggle with and have for many years is assessing the evidence
of variants when there’s often very little evidence there. We time our fellows, who are
the first tier of variant assessment, on how long it takes per variant that we report out
in a clinical context. If there’s no data out there at all, it takes, on average, 20
minutes to search every database and look at in silico data, and up to an average of
two hours per variant when there are publications. This is obviously dealing with rare variation,
not the level of complexity you’d face evaluating a GWAS-type variant that has lots
of literature on it. And we do about 300 of these a month in our clinical lab, in terms
of reporting out the clinical significance of rare variation that may be clinically relevant.
That data is then reviewed by the counselors as they draft reports, by the geneticists who
sign them out, and, for our somatic cancer testing, by pathologists as well. There’s a lot of labor that
goes into evaluating the evidence for a variant before it goes on a clinical
report.
And to illustrate the challenge here, I’ve been collecting some data on certain tests
that we run. This is for hypertrophic cardiomyopathy, one of our gene panels. We’ve tested 3,000
cases to date, found over 500 clinically significant mutations. Two-thirds of them have been unique
to a family, so in many cases we get one shot at this. You can’t just wait around for
the literature to come out on your variant, because often it’s not going to be there
ever. So this is a huge challenge in the clinical laboratory environment. You know, people say,
you know, don’t put it in a clinical lab until it’s well-established. Well, sequencing
breaks all of those rules, because we find novel variation every day. In hearing loss,
it’s even worse. Eighty percent of the variants are unique to a family with the data we have
to date.
So this presents a challenge -- my variant data problem. We do have public data
on variants, largely in databases like dbSNP and the ESP cohort that’s now available, and
that’s very useful for general population frequency data, but it’s largely unannotated
with respect to clinical relevance. The data that is annotated with respect to pathogenicity
usually comes from initial research studies and the locus-specific databases that are out
there. Unfortunately, a lot of that data is in error; large enough control sets were not tested.
There are publications coming out showing that upwards of a quarter of that data
is just wrong in terms of its assumptions.
So we really need larger data sets to develop an effective understanding of variation.
And a lot of that data -- actually, the best data in existence today -- is in clinical labs
and is not well-published or available to all of us in the public domain.
So we have been working to try and come up with ways
to solve this. It’s a little awkward to talk about this because it’s a grant under
review, but I think the principles are still common to all of us, and so I want to talk
about them as things we should all think about, whether it’s our grant or other approaches,
or ideally both.
So, you know, we need to come up with community standards: the terms we use to
assess variation, the rules to evaluate evidence for those variants, and
how we think about them. And we need to bring data together. When this project started there
was a group that thought there should be a separate clinical-grade variant database,
separate from the other databases we have. The challenge is most of the variants are
on some continuum of knowledge and constantly evolving, and it’s very difficult to say
a variant is ready for the clinical-grade database versus still in the research
environment. I think everything, at some level, is still in the research environment as we
learn more about it.
So my hope is that all of this data will be in the same place. We, in clinical labs, use
research databases all the time and vice versa, so the goal is to put it all together, get
data out of clinical labs into the public domain, out of locus-specific efforts, as
well as the uncurated population data that’s coming from large studies. Put it in the same
place, and then enable expert groups to, through evidence-based and consensus models, arrive
at what our best guess is for those variants. And then that, you know, has a better opportunity
to be used in the clinical environment than what we have access to today.
And the project that we had proposed, we’re working closely with NCBI, to put this data
in the ClinVar database so that we can ensure that it’s in the public domain, working
with other efforts that are already working in this space but trying to expand it and
consolidate it into one place. We are working closely with the ISCA Consortium that David
Ledbetter initiated several years ago to try and get copy number variation data out of
clinical labs into the public domain, and they’ve been very successful with over 30,000
cases to date in dbVar, accessible to anyone who wants that information. We’ve
learned a lot from their efforts, and we’re now joining forces in a combined grant to
try and expand this to include molecular data. And, you know, I was concerned about the
willingness of clinical labs and even commercial labs to want to put this data in the public
domain. It’s often considered their proprietary data. But I’ve been pleasantly surprised
that many, many labs have agreed to participate in this project. There are only three that
have so far declined -- Myriad, Prevention Genetics, and Medical Neurogenetics -- but
many of these labs, and you’ll see commercial labs here as well, have agreed to participate
in this effort and are willing to put this data in. The challenge is it does require
resources to get that data out of their systems, to have it all structured in the same way,
and be able to put it in the public domain. And so that’s something that we, as a community,
need to figure out how to support, when labs are willing to share this data.
Another challenge that I alluded to earlier is how do we update this data over time? The
clinical labs face this as an enormous challenge today with sequencing in the clinical context,
and there’s no billing mechanism to be able to reinterpret reports. For
the few labs that actually do it, it’s done free of charge, in essence,
and that’s not a sustainable model. We need to figure out how to do this more efficiently;
the guidelines from the American College of Medical Genetics say that we should be doing this,
but most labs are not. So how do we update data over time? I’ve been looking at
our data over the course of a number of years of reporting on many different diseases, in
this case hypertrophic cardiomyopathy, and how we have changed the categories as we’ve
learned new things over time. Over a five-year period we changed classifications 300 times,
across different variants and newly acquired knowledge, in both directions, going
from benign to pathogenic and vice versa. And we just published some of this data
this past month: about 4 percent of a physician’s reports per year
need to be updated, so it’s a pretty significant challenge for us.
We ended up developing a clinic interface to our laboratory software, where the physician
can go in, get access to their patient reports in electronic form, and the variants are structured
and connected to our variant database. So if we update, if I go in and update a variant,
maybe in the context of signing out a new report, then that automatically will update
the variant in the physician’s system. And today they actually get an email alert, without
PHI, but with a link that lands them on this page and tells them what variant information
changed. They can see that information, click on the variant, and read the
evidence that was the basis for that change. And so, you know, I could just approve
a variant and 1,000 reports could get updated in seconds, which makes the process much more
efficient and helps improve patient care.
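The propagation model described here -- one variant reclassification fanning out to every report that references the variant, each generating a PHI-free alert -- can be sketched as follows. This is an illustrative toy, not the actual GeneInsight implementation; all identifiers are hypothetical.

```python
# Toy model of reclassification fan-out: update a variant's class once,
# and every report referencing it yields one PHI-free alert.
# Variant names, report IDs, and schema are hypothetical.

variant_db = {"MYH7:c.2389G>A": "uncertain significance"}
reports = [
    {"report_id": "R001", "variants": ["MYH7:c.2389G>A"]},
    {"report_id": "R002", "variants": ["GJB2:c.35delG"]},
    {"report_id": "R003", "variants": ["MYH7:c.2389G>A"]},
]

def reclassify(variant, new_class):
    """Change a variant's classification and return one alert per
    affected report. Alerts carry only the variant and the change,
    never patient identifiers."""
    old = variant_db[variant]
    variant_db[variant] = new_class
    return [{"report_id": r["report_id"], "variant": variant,
             "change": f"{old} -> {new_class}"}
            for r in reports if variant in r["variants"]]

alerts = reclassify("MYH7:c.2389G>A", "likely pathogenic")
print([a["report_id"] for a in alerts])  # -> ['R001', 'R003']
```

The key design point is that reports store a reference to the variant record rather than a copy of its interpretation, so one approval updates everything downstream.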
And this activity has been the subject of an NIH challenge grant that David Bates is
the PI of, to look at the usability of the system by physicians, particularly physicians
who have never had any training in it, and how easily can they just go in and figure
it out, and get the updates, and know what’s happened, and grading them on certain tasks,
et cetera. And overall -- I was a little concerned that, given
all this very efficient updating, the physicians would come back and say, great,
now can you go amend that report for me and sign it out again? And they largely haven’t
done that. They’ve been satisfied with the system delivery and the updated information.
They can print it out and put it in a patient chart if they want to, and we haven’t really
had to amend reports. So that’s been a great system for us to support this process and
improve patient care.
Now as we think about expanding these types of systems into genomic medicine, where we’re
dealing with the whole genome, obviously, we can’t ping the physician every time the
knowledge about one of those three million variants changes. So there’s got to be more infrastructure
to support who gets what alerts, or do you deliver alerts at all, or do you allow just
real-time engagement of that data in clinical decision support paradigms. And so these are
all things that we’re thinking about and enabling within the Partners Healthcare EMR environment,
discussing how best to do this and with what strategies. We did disable
the alerting mechanism in the oncology domain, because somatic tumor genetics
evolve, and updating years later is not useful. But they
have asked us about using the same environment to deliver clinical trial notification, which
is, of course, very common within the oncology domain. So we’ve been thinking about expanding
this infrastructure to support those types of activities.
You know, when we originally developed this infrastructure, we built a network hub to
enable many labs to communicate to many health care organizations so everybody didn’t have
to build interfaces to every lab, because that’s expensive architecture, to allow
knowledge sources and labs to share data amongst themselves but still maintain their own interpretations
of data. We’ve talked with Illumina about bringing their CLIA whole genome lab onto
this network, being able to robustly share data, and we are rolling out the infrastructure
to support this this month, where labs who agree to share their data can show the variant
interpretations that they have on a variant-specific level. You can click on lab X’s variant,
see what they say about that variant, how many cases they’ve had, what the literature
is that supports that, and labs can be able to sort of import other labs’ interpretations
into their system, enabling a much richer sharing of data.
We also have enabled case history sharing, so if I have all of my cases I’ve ever seen
in my system, I can allow that data to be shared with another lab, and we strip all
the PHI off of that, so that if I go into my system I can see my cases with all PHI
in there, but then I can see de-identified cases with some clinical information and which
variants were found in each case, and be able to see that across all the data sets sort
of being shared in that environment. And so we hope that that will enable a richer understanding
of this, particularly as we try to address the challenges of rare variation interpretation.
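The de-identification step described above -- stripping PHI before a case is shared across the network while keeping clinical context and the variants found -- can be sketched like this. The field names are assumptions, not the actual schema.

```python
# Sketch of case de-identification before network sharing: identifying
# fields are dropped; clinical context and variants are retained.
# Field names here are illustrative assumptions, not the real schema.

PHI_FIELDS = {"name", "mrn", "date_of_birth", "address"}

def deidentify(case):
    """Return a copy of the case record with PHI fields removed."""
    return {k: v for k, v in case.items() if k not in PHI_FIELDS}

case = {
    "name": "Jane Doe",
    "mrn": "123456",
    "date_of_birth": "1970-01-01",
    "phenotype": "hypertrophic cardiomyopathy",
    "variants": ["MYH7:c.2389G>A"],
}
print(sorted(deidentify(case)))  # -> ['phenotype', 'variants']
```

In practice, real de-identification also has to handle quasi-identifiers (rare phenotypes, dates, geography), which a simple field filter like this does not address.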
This GeneInsight system is now being integrated into our EHR environment. Right now
it’s been a stand-alone, web-based interface, which allows us to easily roll
it out to any physician around the world who doesn’t even have an EMR. But we feel like
it’s more powerful if we can truly integrate it into the EHR environment so it’s the
same face that any physician logging into their patient electronic health record will
see. So that, by the end of this month, will be integrated into the Partners’ EHR. It
will be called the Partners’ Clinical Genetic Data Repository.
The same infrastructure and structured data is being pushed into the research patient
data registry so that our RPDR that we use today for a lot of clinical research will
have structured genetic data within it, and that’s obviously supporting many millions
of patients within the Partners’ health care environment.
There’s another effort that Scott is leading in terms of the biorepository: consenting
thousands of patients who walk in the door of Partners Health Care to sign
up for the biorepository and consent to broad use of their data and their
sample. And we would like to engage other groups in thinking about this data-sharing
environment, perhaps for one of the demonstration pilot projects, to be
able to use this infrastructure. If labs or groups are interested in getting onto a network
and sharing data, we’d be happy to talk with others who might be interested in that
and coming together for one of those projects.
And that’s all. I thank some of the groups that have worked on some of the projects I
spoke to, and I’d be happy to entertain any questions. Marc?
Marc Williams: I think it’s really excellent, the work
that’s been done to try and aggregate information. I have one comment, and I want to pose
a question to some of our payer representatives here. The comment is that one of the
benefits touted for whole genome is the fact that you only need to do it once,
and that, you know, then this information can be repurposed. I think you very eloquently
stated that there is, in fact, work involved with that that could lead to cost and reimbursement
issues down the road, and so perhaps we shouldn’t be quite so forthcoming about the idea of
saying this is a one-time cost and then, you know, it’s basically free to use the information.
I think that’s going to generate some very interesting models about how do we actually
reimburse for the updating and maintenance. But that’s probably beyond the scope of
a five-minute discussion.
The specific question that I would like to pose to our payer representatives is this
issue of the labs that won’t play. I’m glad to hear that it’s a very small number,
but clearly for this to work, we have to have everyone willing to put their data in
so that we can all benefit from it. Is there a mechanism
in the reimbursement side where if we set aside those that are sole source providers,
like PRC, where there’s patent issues, but in situations where laboratories say, "Well,
that’s our information, it’s proprietary, we’re going to use that to be competitive,"
the payer could say, "Guess what? We’ll use somebody else that’s going to contribute
data because in the long run that’s best for our patients." Because it’s only that
economic pressure that will ultimately, I think, get them to play. Is that a mechanism
that would be possible to explore?
Female Speaker: Well, that’s a complicated question. You
know, a business relationship with a lab, especially a national lab, is based on lots
of factors, mostly economic, so I think you’re not going to impale yourself on a stake over
that issue. Although the big national labs, I notice, are on the list there, so it’s
the smaller players that, you know, I think, that you’re talking about. And usually the
relationships with the labs are over, sort of, quality and economics, not this level
of data sharing. So I think probably you’d have to spend some time thinking about that.
Now we do actually receive lab data from our labs; again, the molecular pathology codes
are all these stacking codes. When we get more specificity over those things, we
will be getting, you know, specific data, and I’d be curious about what the LabCorps
representative has to say about that. So it can be the payer that’s sharing the data,
as a pass-through.
Female Speaker: I’ll be talking about reimbursement in my
talk, and some of the new changes to reimbursement and what impacts they might have.
Marc Williams: One other thought on this, and particularly
for the Medicare program, there’s tools other than, you know, directly reimbursement,
that, you know, might be explored as ways to encourage data sharing. So, for example,
there’s all the conditions of participation that make providers, you know, eligible for
reimbursement and, you know, there’s all sorts of sort of regulatory policies throughout
Medicare in addition to, you know, the sort of individual reimbursement for tests that,
you know, might be explored as ways of, you know, encouraging that kind of behavior. But
that’s just Medicare, not necessarily private payers, but something to look into, and wasn’t
my area of expertise at Medicare, so other folks would have to be, you know, asked about
it.
Male Speaker: I’m curious --
Female Speaker: Certainly one could imagine the request that’s
made during a contract negotiation that this would be a, sort of, a standard that one was
expected to follow.
Male Speaker: I’m curious -- John Harley, Cincinnati Children’s -- you have 300 cases
in which you’ve already done exome sequencing, and that’s all done outside the
institution. What level of data preservation are you using?
Are you saving the original data, the terabytes, the five terabytes that you get back from
each subject, creating something that’s a petabyte and a half that you have to store?
Or are you storing just the final sequence and ignoring the possibility that someone
would want to refilter that data from the beginning? And then how long can -- do you
plan to keep that available for requerying and that sort of thing?
Heidi Rehm: So just to clarify, I think the 300 up there
was the number of novel variant assessments per month, but we haven’t done that many
exome sequences. But to address your question, which is a very good one, in terms of data
storage. So, we’ve addressed this in the laboratory standards that we’re developing
for ACMG right now, in terms of data storage for next-gen sequencing, and the guideline
that we put out is that labs certainly don’t need to keep the raw image files, that
we recommend labs keep the aligned-read files -- the BAM files -- or the FASTQ files
for a period of time, suggesting between one and two years, and that they keep the VCF
file, the variant call files, indefinitely. At this moment that’s a suggested guideline,
and state and CLIA regulations may trump certain aspects. So those
are some basic pieces of guidance.
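The retention guidance above can be sketched as a simple policy table: raw images are not retained, aligned and raw reads (BAM/FASTQ) are kept for a fixed window, and VCFs are kept indefinitely. The two-year window used here is one point in the suggested one-to-two-year range, chosen for illustration.

```python
# Sketch of the suggested retention policy for NGS file types.
# The two-year window is one point in the one-to-two-year range.

RETENTION_DAYS = {
    "image": 0,          # raw image files: no need to keep
    "fastq": 2 * 365,    # raw reads: one to two years (two used here)
    "bam": 2 * 365,      # aligned reads: one to two years
    "vcf": None,         # variant call files: keep indefinitely
}

def should_retain(file_type, age_days):
    """True if a file of this type and age is still within policy."""
    limit = RETENTION_DAYS[file_type]
    return limit is None or age_days <= limit

print(should_retain("vcf", 3650))  # -> True
print(should_retain("bam", 900))   # -> False (past the two-year window)
```

A lab would layer state and CLIA requirements on top of a table like this, since those can override the suggested windows.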
We do recognize the ability to realign data with better algorithms, and
so there is a case for keeping the raw reads -- not just the aligned reads
but the unaligned reads as well -- so that they could be realigned. But to keep them
indefinitely, I think, doesn’t make sense: sequencing technologies are improving such that
two years later you’d probably still want to resequence, and the cost to store data
is large enough that you have to balance the cost of storage against the cost of redoing
the test. So there are a lot of things in flux here, and
I don’t think we can easily say this is exactly how you should do it, but we’re
making some guidance so that labs that want guidance can have that. But I still think
it’s a moving target.
Male Speaker: Mary Relling, last question, and short answer.
I guess, okay, and Kelly, really short question and answer, and we’re going to have -- so
I will claim credit or blame for the fact that there are 20-minute talks and five-minute
discussion periods. We’re over, we’re eating into our break time. That’s fine,
because this is a really important part of the discussion. So quick questions, quick
answers.
Mary Relling: Can you help distinguish for me, because I
totally understand sharing these variant data is super important, and we need to come up
with a national or international resource. What’s the distinction between sharing to
GeneInsight versus sharing what you described in your U41/ClinVar?
Heidi Rehm: Yup. So, it’s our hope that -- for instance, I use GeneInsight myself, but
I’m going to be sharing all of that data from my lab into ClinVar. But there are
different data structures and capabilities
that surround those data sets. So, for instance, to be able to submit full genome data sets
into NCBI they have to go into dbGaP, and there are a lot of constraints around access
to that data. In GeneInsight, which is a HIPAA-secure environment because it’s within our Partners’
HIS system, we can allow patient information to be there and be shared a little more easily,
so that’s one distinction. Even so, I do hope that any lab sharing variant-level
information within the GeneInsight network that we’re trying to set up would also
be willing to submit that into ClinVar.
So, at the end of the day, it’s multiple strategies, hoping
that we can succeed in some or all of them. My real goal is to get as much as
possible into the public domain, but as for the full data structure
that some of the more sophisticated IT systems can support, it’s probably not there
yet in ClinVar. Hopefully, over time, we can continue to enhance these systems to really
take the full breadth of data that’s associated with cases, and clinical data as well.
Male Speaker: Okay, so Kelly, Erwin, and then we’re going
to go on.
Female Speaker: You stated that about 66 percent of your variants
had never been seen before. If you suspect that these are involved in the patient’s
condition, how do you return that information for inclusion in patient health care, or do
you?
Heidi Rehm: So, we do return any variant found during targeted testing, even if
it’s a variant of unknown significance, marked as such on the report. And the reason
for that is we often try to encourage family member testing, because for hypertrophic
cardiomyopathy, a dominant disease, segregation is the best way to build evidence --
if we can get a LOD score that’s significant, that can guide us to
calling it pathogenic. So we’ll report those out, and we’ll often do free testing of
family members to support that segregation analysis. So we basically report what we know
about a variant, and then over time we may learn more and increase that knowledge.
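The segregation evidence mentioned here can be roughed out numerically. For a fully penetrant autosomal dominant variant, each informative meiosis in which the variant cosegregates with disease halves the probability of chance cosegregation, so n such meioses give a LOD score of n x log10(2). Real LOD calculations also model penetrance and phenocopies, which this simplified sketch ignores.

```python
# Simplified cosegregation LOD sketch for a fully penetrant dominant
# variant: each informative cosegregating meiosis contributes log10(2).
# Penetrance and phenocopies are deliberately ignored here.

import math

def cosegregation_lod(informative_meioses):
    """LOD score assuming full cosegregation across all meioses."""
    return informative_meioses * math.log10(2)

# A LOD of 3 (1000:1 odds) is a traditional significance threshold;
# roughly ten fully cosegregating meioses reach it.
print(round(cosegregation_lod(10), 2))  # -> 3.01
```

This is why testing additional family members is so valuable: each informative relative adds evidence toward, or against, a pathogenic call.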
Male Speaker: So, I applaud you on the integration with the EMR, and that raises some
questions. One of them: Who actually makes the decision about which variant gets into
the EMR? And then, second, but equally important: the EMR, for any given patient, can be
viewed by many people, from students to nurses to dieticians. So, is there any way that
you limit, control, or regulate access to who is going to actually see the genetic
information within the EMR?
Heidi Rehm: Yeah, so, to answer your first question, anything
that we write on a clinical report would go into the EMR just like any other clinical
report from any other system. As far as access to that data once it’s in there, so we are
actually constrained somewhat by Massachusetts state law, which says that only the ordering
physician can see data from a genetic test if it was done for screening purposes.
But we’ve gotten around that by consenting the patients to allow the entire health care
institution to see that data, and there’s a balance there between restricting access
for very specific situations versus what we believe is the need to have broader access
to really engage this information in the care of a patient. So we are leaning much
more towards the side that everyone should have access to this data, but there’s a long
conversation there that gets into some more subtle things that we don’t have quite enough
time for. I’d get in trouble [laughs].
Male Speaker: I have 25 more questions, but I’m going
to take the chair’s prerogative and tell myself to go away. But that was a really great
introduction to the kinds of things that we want to talk about over the next day and a
half, so thank you, and the next speaker, Debra, is behind me.
[end of transcript]