Heidi Rehm: Thank you for having me.
So I was asked to give some of the laboratory perspective. My background is as a board-certified
clinical laboratory geneticist, and I direct the Laboratory for Molecular Medicine at the
Partners Center for Personalized Genetic Medicine under Scott Weiss’s leadership. And we are
in the process of developing a genomic sequencing clinical service through our CLIA lab. We’re
working closely with our geneticists’ clinics -- and Mike Murray from the Brigham is here -- in terms of the up-front and return-of-results environment. Our main focus, in terms of the
expertise and strength we have, is really the data analysis and interpretation piece, and initially we will outsource the technical component for the whole genome sequencing,
although we are developing exome sequencing capability in our clinical lab as well.
So most of our effort has gone into the enormous computational challenges of variant annotation and filtration. And although we’ve done evidence-based variant
assessment for 10 years in the context of clinical sequencing, scaling that to
address the genome is, of course, another order of magnitude. We’re also working closely,
using our GeneInsight software that we’ve developed over the last eight years, to support
clinical reporting of sequenced data, and expanding that to support genomic interpretation.
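One of the simplest filtration steps alluded to here -- removing variants too common in population databases to cause a rare disease -- can be sketched as follows. This is a hypothetical illustration: the allele-frequency threshold, the dict schema, and the example variants are assumptions, not the lab’s actual pipeline.

```python
# Hypothetical sketch of a population-frequency filtration step.
# The 0.1% threshold and field names are illustrative assumptions.

def filter_by_frequency(variants, max_af=0.001):
    """Keep variants whose population allele frequency is below max_af."""
    return [v for v in variants if v.get("pop_af", 0.0) < max_af]

calls = [
    {"id": "MYH7:c.2389G>A", "pop_af": 0.00002},  # rare: retained
    {"id": "MYBPC3:c.1504C>T", "pop_af": 0.012},  # common: filtered out
]
print([v["id"] for v in filter_by_frequency(calls)])  # -> ['MYH7:c.2389G>A']
```

A real pipeline layers many such filters (quality, inheritance model, gene lists) before any variant reaches a human reviewer.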
I’m working on a U01-funded grant, called MedSeq, that Robert Green is the PI of,
where we will sequence 100 whole genomes through that effort, half from patients with cardiomyopathy
and half from healthy patients, and are working on the development of the
general genome report in terms of general information from every patient, as well as
disease-specific reports looking at primary indication and using orthogonal confirmation
of those results in the CLIA setting, and also working closely in terms of integration
of this infrastructure into our EHR. And I’ll talk about a number of these pieces.
And I really want to broaden this, not just to the things that we’re doing but issues
that are common across many groups that are working in the space. And I sort of put my
punch list of some of the key challenges in the clinical implementation of whole exome
and genome sequencing on this list. I think we’re all aware that sequencing technologies
are changing very rapidly, and so picking a time point to implement a certain technology
is challenging. And that was one of the reasons we chose to initially outsource that component,
work on the interpretive piece, which is really the most challenging part, and then bring
the technology in later at whatever point it had evolved to. The computation requirements
are obviously unprecedented in terms of the large data sets we’re working with. In the
clinical implementation of this, it’s not yet ready to just use the data as is. There
is still a need to confirm the results, and unfortunately, they can’t all be confirmed
by one method, depending on whether you’re talking about point mutations, copy number
variants, et cetera, or low-level variants and somatic variation. So we need different approaches
to the confirmation process, hoping that at some point we won’t have
to do it, but it’s still necessary now.
And also as we implement these technologies in the clinical setting, we need to both address
the fact that there are existing very high-quality tests that are targeted, yet we also want
to take advantage of the rest of the genome, and so how do we balance that in terms of
the fact that the genomic approaches don’t have the level of quality that our targeted
tests do. We want both components of that, and I’ll talk a little bit more
about that.
In the CSER meeting that some of us were in, we spent a lot of time thinking about
secondary findings, which are appropriate to return and which are not, and that’s
a big area under discussion. Updating results over time and, you know, patients, ideally,
may only get their genome sequenced once, but the knowledge changes. How do we address
that? And I’ll talk a little bit, some of the strategies we worked on. And probably
for me the biggest challenge is that human variation is enormous and rare, and understanding
the phenotypes associated with it will be challenging and will require great structure
and data sharing, I think. And I’ll talk at length about some of the strategies we’re
thinking about.
I do want to point out that the ACMG came out with a policy statement surrounding the
use of genomic sequencing that’s on their website. And I’m not going to go into all
of the detail here, but I will say that the board has been fairly forward-thinking about
embracing this in the clinical arena, not only for diagnostic environments but also
even in screening, recognizing the use in pre-conception screening, and even in healthy
individuals if there’s a high threshold for what you return back to the patient, though
not, obviously, supporting it yet in a prenatal or first-tier newborn screening environment.
There are a number of recommendations in that just four-page statement. One thing that will
be a theme for me is this last statement here, “labs should share genomic data into public
databases,” and I’ll talk a little bit more about efforts there. There are two work
groups that are diving deeper into some of the aspects of these topics. One is a secondary
findings workgroup that’s chaired by Robert Green and Les Biesecker, and we’ve been
working on trying to define recommendations for what to return to patients. I’m also
chairing, along with Pinar Bayrak-Toydemir, a standards and guidelines work group addressing
the laboratory standards for how we implement both targeted and next-gen sequencing, whole
exome sequencing, and whole genome sequencing into a CLIA lab environment. And we have a
draft now that we’re still working through the last details before we start disseminating
it to the community for feedback.
One of the challenges in the clinical lab is we’d like to develop -- and we have to,
for CLIA standards -- clear SOPs so that every test is run the same way. In genomic sequencing,
you really don’t want to run every test the same. Each family will have a different
assumption about inheritance pattern, different approaches, different family members available,
and different strategies, and that makes it challenging to implement this in a clinical
lab where you really want to have very defined workflows. And as I mentioned earlier, the
technology still isn’t perfect, so how do we get the technology to a level that is appropriate
for clinical efforts? For genome sequencing, we can maybe supplement with an
exome to add better depth in the exonic regions that are most examined from an interpretation
standpoint. And some of us are thinking about supplementing the whole exome with a clinical
exome -- those genes already known to be associated with disease -- to have higher-quality data
for those critical regions, or even using multiple technologies, because each platform has its own
platform-specific errors. These are all strategies for how to most effectively incorporate
this into a clinical environment so that the technical quality is appropriate.
In our lab, we do a lot of targeted next-gen sequencing, and for every test we run, we
fill in every last base with Sanger if anything’s missed, so that at the end, whatever the gene
content of that test is, we have been able to cover it at 100 percent. Of course, that
adds significant labor and cost to that process as we iteratively fill in with Sanger as well
as confirm all of the variants by Sanger sequencing. And although this is not a reasonable approach
to fill in every last base for a whole exome or genome, many of us using this approach,
even in a disease-targeted context, still realize there’s critical content that has to be
there for every patient. So these types of targeted strategies are also being
implemented, even in the whole exome and genome approaches that groups are taking.
And, of course, adding custom design for confirmatory testing is yet another challenge to
this workflow that is difficult in the clinical lab, but something we’re working on for
our whole genome approaches.
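The fill-in logic described above -- finding the bases a targeted NGS run missed so they can be covered by Sanger -- can be sketched roughly like this. The depth values and the 20x threshold are illustrative assumptions, not the lab’s actual cutoffs.

```python
# Sketch: scan per-base coverage across a targeted region and report the
# intervals that fall below a minimum depth, so those gaps can be filled
# in by Sanger sequencing. Depths and the 20x threshold are illustrative.

def low_coverage_intervals(depths, min_depth=20):
    """Return (start, end) pairs (0-based, end-exclusive) of contiguous
    runs where depth < min_depth."""
    intervals, start = [], None
    for i, d in enumerate(depths):
        if d < min_depth and start is None:
            start = i                       # a low-coverage run begins
        elif d >= min_depth and start is not None:
            intervals.append((start, i))    # the run ends at position i
            start = None
    if start is not None:
        intervals.append((start, len(depths)))
    return intervals

depths = [35, 40, 12, 8, 25, 30, 5, 50]
print(low_coverage_intervals(depths))  # -> [(2, 4), (6, 7)]
```

Each reported interval becomes a candidate region for a custom Sanger assay, which is where the extra labor and cost described above come from.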
The other challenge that we struggle with and have for many years is assessing the evidence
of variants when there’s often very little evidence there. We time our fellows, who are
the first tier of variant assessment, on how long it takes per variant that we report out
in a clinical context. If there’s no data out there at all, it takes, on average, 20
minutes to search every database and look at in silico data, and up to an average of
two hours per variant when there are publications. This is obviously dealing with rare variation,
not the level of complexity you’d face evaluating a GWAS-type variant that has lots
of literature on it. And we do about 300 of these a month in our clinical lab, in terms
of reporting out the clinical significance of rare variation that may be clinically relevant.
That data is then reviewed by the counselors as they draft reports, by the geneticists who
sign them out, and, for our somatic cancer testing, by pathologists as well. There’s a lot of labor that
goes into evaluating the evidence for a variant before it goes on a clinical
report.
And to illustrate the challenge here, I’ve been collecting some data on certain tests
that we run. This is for hypertrophic cardiomyopathy, one of our gene panels. We’ve tested 3,000
cases to date, found over 500 clinically significant mutations. Two-thirds of them have been unique
to a family, so in many cases we get one shot at this. You can’t just wait around for
the literature to come out on your variant, because often it’s not going to be there
ever. So this is a huge challenge in the clinical laboratory environment. You know, people say,
you know, don’t put it in a clinical lab until it’s well-established. Well, sequencing
breaks all of those rules, because we find novel variation every day. In hearing loss,
it’s even worse. Eighty percent of the variants are unique to a family with the data we have
to date.
So this presents a challenge -- my variant data problem. We do have public data
on variants, largely in databases like dbSNP and the ESP cohort that’s now available, and
that’s very useful for general population frequency data, but it’s largely unannotated
with respect to clinical relevance. The data that is annotated with respect to pathogenicity
usually comes from initial research studies and the locus-specific databases that are out
there. Unfortunately, a lot of that data is in error; large enough control sets were not tested.
There are publications coming out showing that upwards of a quarter of that data
is just wrong in terms of its assumptions.
So we really need larger data sets to develop an effective understanding of variation.
And a lot of that data -- actually, the best data in existence today -- is in clinical labs
and is not well-published or available to all of us in the public domain.
So we have been working to try and come up with ways
to solve this. It’s a little awkward to talk about this because it’s a grant under
review, but I think the principles are still common to all of us, and so I want to talk
about them as things we should all think about, whether it’s our grant or other approaches,
or ideally both.
So, you know, we need to come up with community standards: the terms we use to
assess variation, the rules to evaluate evidence for those variants, and
how we think about them. And we need to bring data together. When this project started there
was a group that thought there should be a separate clinical-grade variant database,
separate from the other databases we have. The challenge is most of the variants are
on some continuum of knowledge and constantly evolving, and it’s very difficult to say
a variant is ready for the clinical-grade database versus still in the research
environment. I think everything, at some level, is still in the research environment as we
learn more about it.
So my hope is that all of this data will be in the same place. We, in clinical labs, use
research databases all the time and vice versa, so the goal is to put it all together, get
data out of clinical labs into the public domain, out of locus-specific efforts, as
well as the uncurated population data that’s coming from large studies. Put it in the same
place, and then enable expert groups to, through evidence-based and consensus models, arrive
at what our best guess is for those variants. And then that, you know, has a better opportunity
to be used in the clinical environment than what we have access to today.
And the project that we had proposed, we’re working closely with NCBI, to put this data
in the ClinVar database so that we can ensure that it’s in the public domain, working
with other efforts that are already working in this space but trying to expand it and
consolidate it into one place. We are working closely with the ISCA Consortium that David
Ledbetter initiated several years ago to try and get copy number variation data out of
clinical labs into the public domain, and they’ve been very successful with over 30,000
cases to date in dbVar, accessible to anyone who wants that information. We’ve
learned a lot from their efforts, and we’re now joining forces in a combined grant to
try and expand this to include molecular data. And, you know, I was concerned about the
willingness of clinical labs and even commercial labs to want to put this data in the public
domain. It’s often considered their proprietary data. But I’ve been pleasantly surprised
that many, many labs have agreed to participate in this project. There are only three that
have so far declined -- Myriad, Prevention Genetics, and Medical Neurogenetics -- but
many of these labs, and you’ll see commercial labs here as well, have agreed to participate
in this effort and are willing to put this data in. The challenge is it does require
resources to get that data out of their systems, to have it all structured in the same way,
and be able to put it in the public domain. And so that’s something that we, as a community,
need to figure out how to support, when labs are willing to share this data.
Another challenge that I alluded to earlier is how do we update this data over time? The
clinical labs face this as an enormous challenge today with sequencing in the clinical context,
and there’s no billing mechanism to be able to reinterpret reports. For
the few labs that actually do it, it’s done free of charge, in essence,
and that’s not a sustainable model. We need to figure out how to do this more efficiently;
the guidelines from the American College of Medical Genetics say that we should be doing this,
but most labs are not. So how do we update data over time? I’ve been looking at
our data over the course of a number of years of reporting on many different diseases, in
this case hypertrophic cardiomyopathy, and how we have changed the categories as we’ve
learned new things over time. Over a five-year period we changed classifications 300 times,
across different variants and newly acquired knowledge, in both directions, going
from benign to pathogenic and vice versa. And we just published some of this data
this past month: about 4 percent of a physician’s reports per year
need to be updated, so it’s a pretty significant challenge for us.
We ended up developing a clinic interface to our laboratory software, where the physician
can go in, get access to their patient reports in electronic form, and the variants are structured
and connected to our variant database. So if we update, if I go in and update a variant,
maybe in the context of signing out a new report, then that automatically will update
the variant in the physician’s system. And today they actually get an email alert, without
PHI, but with a link that lands them on this page and tells them what variant information
changed. They can see that information, click on the variant, and read the
evidence that was the basis for that change. And so, you know, I could just approve
a variant and 1,000 reports could get updated in seconds, which makes the process much more
efficient and helps improve patient care.
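The propagation model described here -- one variant reclassification fanning out to every report that references the variant, each generating a PHI-free alert -- can be sketched as follows. This is an illustrative toy, not the actual GeneInsight implementation; all identifiers are hypothetical.

```python
# Toy model of reclassification fan-out: update a variant's class once,
# and every report referencing it yields one PHI-free alert.
# Variant names, report IDs, and schema are hypothetical.

variant_db = {"MYH7:c.2389G>A": "uncertain significance"}
reports = [
    {"report_id": "R001", "variants": ["MYH7:c.2389G>A"]},
    {"report_id": "R002", "variants": ["GJB2:c.35delG"]},
    {"report_id": "R003", "variants": ["MYH7:c.2389G>A"]},
]

def reclassify(variant, new_class):
    """Change a variant's classification and return one alert per
    affected report. Alerts carry only the variant and the change,
    never patient identifiers."""
    old = variant_db[variant]
    variant_db[variant] = new_class
    return [{"report_id": r["report_id"], "variant": variant,
             "change": f"{old} -> {new_class}"}
            for r in reports if variant in r["variants"]]

alerts = reclassify("MYH7:c.2389G>A", "likely pathogenic")
print([a["report_id"] for a in alerts])  # -> ['R001', 'R003']
```

The key design point is that reports store a reference to the variant record rather than a copy of its interpretation, so one approval updates everything downstream.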
And this activity has been the subject of an NIH challenge grant that David Bates is
the PI of, to look at the usability of the system by physicians, particularly physicians
who have never had any training in it, and how easily can they just go in and figure
it out, and get the updates, and know what’s happened, and grading them on certain tasks,
et cetera. And overall -- I was a little concerned that, given
all this very efficient updating, the physicians would come back and say, great,
now can you go amend that report for me and sign it out again? And they largely haven’t
done that. They’ve been satisfied with the system delivery and the updated information.
They can print it out and put it in a patient chart if they want to, and we haven’t really
had to amend reports. So that’s been a great system for us to support this process and
improve patient care.
Now as we think about expanding these types of systems into genomic medicine, where we’re
dealing with the whole genome, obviously, we can’t ping the physician every time the
knowledge about one of those three million variants changes. So there’s got to be more infrastructure
to support who gets what alerts, or do you deliver alerts at all, or do you allow just
real-time engagement of that data in clinical decision support paradigms. And so these are
all things that we’re thinking about and enabling within the Partners Healthcare EMR environment,
discussing how best to do this and with what strategies. We did disable
the alerting mechanism in the oncology domain, because somatic tumor genetics
evolve, and updating years later is not useful. But they
have asked us about using the same environment to deliver clinical trial notification, which
is, of course, very common within the oncology domain. So we’ve been thinking about expanding
this infrastructure to support those types of activities.
You know, when we originally developed this infrastructure, we built a network hub to
enable many labs to communicate to many health care organizations so everybody didn’t have
to build interfaces to every lab, because that’s expensive architecture, to allow
knowledge sources and labs to share data amongst themselves but still maintain their own interpretations
of data. We’ve talked with Illumina about bringing their CLIA whole genome lab onto
this network, being able to robustly share data, and we are rolling out the infrastructure
to support this this month, where labs who agree to share their data can show the variant
interpretations that they have on a variant-specific level. You can click on lab X’s variant,
see what they say about that variant, how many cases they’ve had, what the literature
is that supports that, and labs can be able to sort of import other labs’ interpretations
into their system, enabling a much richer sharing of data.
We also have enabled case history sharing, so if I have all of my cases I’ve ever seen
in my system, I can allow that data to be shared with another lab, and we strip all
the PHI off of that, so that if I go into my system I can see my cases with all PHI
in there, but then I can see de-identified cases with some clinical information and which
variants were found in each case, and be able to see that across all the data sets sort
of being shared in that environment. And so we hope that that will enable a richer understanding
of this, particularly as we try to address the challenges of rare variation interpretation.
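The de-identification step described above -- stripping PHI before a case is shared across the network while keeping clinical context and the variants found -- can be sketched like this. The field names are assumptions, not the actual schema.

```python
# Sketch of case de-identification before network sharing: identifying
# fields are dropped; clinical context and variants are retained.
# Field names here are illustrative assumptions, not the real schema.

PHI_FIELDS = {"name", "mrn", "date_of_birth", "address"}

def deidentify(case):
    """Return a copy of the case record with PHI fields removed."""
    return {k: v for k, v in case.items() if k not in PHI_FIELDS}

case = {
    "name": "Jane Doe",
    "mrn": "123456",
    "date_of_birth": "1970-01-01",
    "phenotype": "hypertrophic cardiomyopathy",
    "variants": ["MYH7:c.2389G>A"],
}
print(sorted(deidentify(case)))  # -> ['phenotype', 'variants']
```

In practice, real de-identification also has to handle quasi-identifiers (rare phenotypes, dates, geography), which a simple field filter like this does not address.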
This GeneInsight system is now being integrated into our EHR environment. Right now
it’s been a stand-alone, web-based interface, which allows us to easily roll
it out to any physician around the world who doesn’t even have an EMR. But we feel like
it’s more powerful if we can truly integrate it into the EHR environment so it’s the
same face that any physician logging into their patient electronic health record will
see. So that, by the end of this month, will be integrated into the Partners’ EHR. It
will be called the Partners’ Clinical Genetic Data Repository.
The same infrastructure and structured data is being pushed into the research patient
data registry so that our RPDR that we use today for a lot of clinical research will
have structured genetic data within it, and that’s obviously supporting many millions
of patients within the Partners’ health care environment.
There’s another effort that Scott is leading in terms of the biorepository: consenting
thousands of patients who walk in the door of Partners Health Care to sign
up for the biorepository and consent to broad use of their data and their
sample. And we would like to engage other groups in thinking about this data-sharing
environment, perhaps for one of the demonstration pilot projects, to be
able to use this infrastructure. If labs or groups are interested in getting onto a network
and sharing data, we’d be happy to talk with others who might be interested in that
and coming together for one of those projects.
And that’s all. I thank some of the groups that have worked on some of the projects I
spoke to, and I’d be happy to entertain any questions. Marc?
Marc Williams: I think it’s really excellent, the work
that’s been done to try and aggregate information. I have one comment, and I want to pose
a question to some of our payer representatives here. The comment is that one of the
benefits touted for whole genome is the fact that you only need to do it once,
and that, you know, then this information can be repurposed. I think you very eloquently
stated that there is, in fact, work involved with that that could lead to cost and reimbursement
issues down the road, and so perhaps we shouldn’t be quite so forthcoming about the idea of
saying this is a one-time cost and then, you know, it’s basically free to use the information.
I think that’s going to generate some very interesting models about how do we actually
reimburse for the updating and maintenance. But that’s probably beyond the scope of
a five-minute discussion.
The specific question that I would like to pose to our payer representatives is this
issue of the labs that won’t play. I’m glad to hear that it’s a very small number,
but clearly for this to work, we have to have everyone willing to put their data in
so that we can all benefit from it. Is there a mechanism
in the reimbursement side where if we set aside those that are sole source providers,
like PRC, where there’s patent issues, but in situations where laboratories say, "Well,
that’s our information, it’s proprietary, we’re going to use that to be competitive,"
the payer could say, "Guess what? We’ll use somebody else that’s going to contribute
data because in the long run that’s best for our patients." Because it’s only that
economic pressure that will ultimately, I think, get them to play. Is that a mechanism
that would be possible to explore?
Female Speaker: Well, that’s a complicated question. You
know, a business relationship with a lab, especially a national lab, is based on lots
of factors, mostly economic, so I think you’re not going to impale yourself on a stake over
that issue. Although the big national labs, I notice, are on the list there, so it’s
the smaller players that, you know, I think, that you’re talking about. And usually the
relationships with the labs are over, sort of, quality and economics, not this level
of data sharing. So I think probably you’d have to spend some time thinking about that.
Now we do actually receive lab data from our labs; again, the molecular pathology codes
are all these stacking codes. When we get more specificity over those things, we
will be getting, you know, specific data, and I’d be curious about what the LabCorps
representative has to say about that. So it can be the payer that’s sharing the data,
as a pass-through.
Female Speaker: I’ll be talking about reimbursement in my
talk, and some of the new changes to reimbursement and what impacts they might have.
Marc Williams: One other thought on this, and particularly
for the Medicare program, there’s tools other than, you know, directly reimbursement,
that, you know, might be explored as ways to encourage data sharing. So, for example,
there’s all the conditions of participation that make providers, you know, eligible for
reimbursement and, you know, there’s all sorts of sort of regulatory policies throughout
Medicare in addition to, you know, the sort of individual reimbursement for tests that,
you know, might be explored as ways of, you know, encouraging that kind of behavior. But
that’s just Medicare, not necessarily private payers, but something to look into, and wasn’t
my area of expertise at Medicare, so other folks would have to be, you know, asked about
it.
Male Speaker: I’m curious --
Female Speaker: Certainly one could imagine the request that’s
made during a contract negotiation that this would be a, sort of, a standard that one was
expected to follow.
Male Speaker: I’m curious -- John Harley, Cincinnati Children’s -- you have 300 cases
in which you’ve already done exome sequencing, and that’s all done outside the
institution. What level of data preservation are you using?
Are you saving the original data, the terabytes, the five terabytes that you get back from
each subject, creating something that’s a petabyte and a half that you have to store?
Or are you storing just the final sequence and ignoring the possibility that someone
would want to refilter that data from the beginning? And then how long can -- do you
plan to keep that available for requerying and that sort of thing?
Heidi Rehm: So just to clarify, I think the 300 up there
was the number of novel variant assessments per month, but we haven’t done that many
exome sequences. But to address your question, which is a very good one, in terms of data
storage. So, we’ve addressed this in the laboratory standards that we’re developing
for ACMG right now, in terms of data storage for next-gen sequencing, and the guideline
that we put out is that labs certainly don’t need to keep the raw image files, that
we recommend labs keep the aligned-read files -- the BAM files -- or the FASTQ files
for a period of time, suggesting between one and two years, and that they keep the VCF
file, the variant call files, indefinitely. At this moment that’s a suggested guideline,
and state and CLIA regulations may trump certain aspects. So those
are some basic pieces of guidance.
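The retention guidance above can be sketched as a simple policy table: raw images are not retained, aligned and raw reads (BAM/FASTQ) are kept for a fixed window, and VCFs are kept indefinitely. The two-year window used here is one point in the suggested one-to-two-year range, chosen for illustration.

```python
# Sketch of the suggested retention policy for NGS file types.
# The two-year window is one point in the one-to-two-year range.

RETENTION_DAYS = {
    "image": 0,          # raw image files: no need to keep
    "fastq": 2 * 365,    # raw reads: one to two years (two used here)
    "bam": 2 * 365,      # aligned reads: one to two years
    "vcf": None,         # variant call files: keep indefinitely
}

def should_retain(file_type, age_days):
    """True if a file of this type and age is still within policy."""
    limit = RETENTION_DAYS[file_type]
    return limit is None or age_days <= limit

print(should_retain("vcf", 3650))  # -> True
print(should_retain("bam", 900))   # -> False (past the two-year window)
```

A lab would layer state and CLIA requirements on top of a table like this, since those can override the suggested windows.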
We do recognize the ability to realign data with better algorithms, and
so there is a case for keeping the raw reads -- not just the aligned reads
but the unaligned reads as well -- so that they could be realigned. But to keep them
indefinitely, I think, doesn’t make sense: sequencing technologies are improving such that
two years later you’d probably still want to resequence, and the cost to store data
is large enough that you have to balance the cost of storage against the cost of redoing
the test. So there are a lot of things in flux here, and
I don’t think we can easily say this is exactly how you should do it, but we’re
making some guidance so that labs that want guidance can have that. But I still think
it’s a moving target.
Male Speaker: Mary Relling, last question, and short answer.
I guess, okay, and Kelly, really short question and answer, and we’re going to have -- so
I will claim credit or blame for the fact that there are 20-minute talks and five-minute
discussion periods. We’re over, we’re eating into our break time. That’s fine,
because this is a really important part of the discussion. So quick questions, quick
answers.
Mary Relling: Can you help distinguish for me, because I
totally understand sharing these variant data is super important, and we need to come up
with a national or international resource. What’s the distinction between sharing to
GeneInsight versus sharing what you described in your U41/ClinVar?
Heidi Rehm: Yup. So, it’s our hope that -- for instance, I use GeneInsight myself, but
I’m going to be sharing all of that data from my lab into ClinVar. But there are
different data structures and capabilities
that surround those data sets. So, for instance, to be able to submit full genome data sets
into NCBI they have to go into dbGaP, and there are a lot of constraints around access
to that data. In GeneInsight, which is a HIPAA-secure environment because it’s within our Partners’
HIS system, we can allow patient information to be there and be shared a little more easily,
so that’s one distinction. Even so, I do hope that any lab sharing variant-level
information within the GeneInsight network that we’re trying to set up would also
be willing to submit that into ClinVar.
So, at the end of the day, it’s multiple strategies, hoping
that we can succeed in some or all of them. My real goal is to get as much as
possible into the public domain, but as for the full data structure
that some of the more sophisticated IT systems can support, it’s probably not there
yet in ClinVar. Hopefully, over time, we can continue to enhance these systems to really
take the full breadth of data that’s associated with cases, and clinical data as well.
Male Speaker: Okay, so Kelly, Erwin, and then we’re going
to go on.
Female Speaker: You stated that about 66 percent of your variants
had never been seen before. If you suspect that these are involved in the patient’s
condition, how do you return that information for inclusion in patient health care, or do
you?
Heidi Rehm: So, we do return any variant found during targeted testing, even if
it’s a variant of unknown significance, marked as such on the report. And the reason
for that is we often try to encourage family member testing, because for hypertrophic
cardiomyopathy, a dominant disease, segregation is the best way to build evidence --
if we can get a LOD score that’s significant, that can guide us to
calling it pathogenic. So we’ll report those out, and we’ll often do free testing of
family members to support that segregation analysis. So we basically report what we know
about a variant, and then over time we may learn more and increase that knowledge.
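The segregation evidence mentioned here can be roughed out numerically. For a fully penetrant autosomal dominant variant, each informative meiosis in which the variant cosegregates with disease halves the probability of chance cosegregation, so n such meioses give a LOD score of n x log10(2). Real LOD calculations also model penetrance and phenocopies, which this simplified sketch ignores.

```python
# Simplified cosegregation LOD sketch for a fully penetrant dominant
# variant: each informative cosegregating meiosis contributes log10(2).
# Penetrance and phenocopies are deliberately ignored here.

import math

def cosegregation_lod(informative_meioses):
    """LOD score assuming full cosegregation across all meioses."""
    return informative_meioses * math.log10(2)

# A LOD of 3 (1000:1 odds) is a traditional significance threshold;
# roughly ten fully cosegregating meioses reach it.
print(round(cosegregation_lod(10), 2))  # -> 3.01
```

This is why testing additional family members is so valuable: each informative relative adds evidence toward, or against, a pathogenic call.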
Male Speaker: So, I applaud you on the integration with the EMR, and that raises some
questions. One of them: Who actually makes the decision about which variant gets into
the EMR? And then, second, but equally important: the EMR, for any given patient, can be
viewed by many people, from students to nurses to dieticians. So, is there any way that
you limit, control, or regulate access to who is going to actually see the genetic
information within the EMR?
Heidi Rehm: Yeah, so, to answer your first question, anything
that we write on a clinical report would go into the EMR just like any other clinical
report from any other system. As far as access to that data once it’s in there, so we are
actually constrained somewhat by Massachusetts state law, which says that only the ordering
physician can see data from a genetic test if it was done for screening purposes.
But we’ve gotten around that by consenting the patients to allow the entire health care
institution to see that data, and there’s a balance there between restricting access
for very specific situations versus what we believe is the need to have broader access
to really engage this information in the care of a patient. So we are leaning much
more towards the side that everyone should have access to this data, but there’s a long
conversation there that gets into some more subtle things that we don’t have quite enough
time for. I’d get in trouble [laughs].
Male Speaker: I have 25 more questions, but I’m going
to take the chair’s prerogative and tell myself to go away. But that was a really great
introduction to the kinds of things that we want to talk about over the next day and a
half, so thank you, and the next speaker, Debra, is behind me.
[end of transcript]