Break-Out Group 1 - IT/Bioinformatics and CDS Standards - Marc Williams

Marc Williams: Some of us were discussing Yogi Berra at lunch, which is an interesting cross-cultural discussion, but Geoff was talking about the theory of how this is going to work, and it reminded me of another Yogi Berra saying, which is, "In theory, theory is better than practice and practice it ain't." [laughter] So, we'll see how this goes. So, we did the IT Bioinformatics Electronic Health Records group. I wanted to talk just briefly about the process that we used to come up with our prioritized list because I suspect that different groups probably found their own way of doing things. So, what we did was to discuss some more general philosophical principles. To start with, I'm going to lead off with those. And then we articulated a number of things that we had heard through the course of the conference or that had been previously listed, and also new ideas that came up. We came together on a shared understanding of what each of these is, which may not necessarily be immediately understandable to everybody in the groups. So, if there are some of these that need to be -- I'll do my best to clarify these, but if there's some that need to be clarified, we'll do so. And then we did a -- we used a voting process to prioritize, and I'll describe that as we get there. And I really want to thank our group. They bought into this and worked extremely well together and it was just a pleasure. It's very difficult to do this in the short amount of time that we had, as I know you've all experienced. So, okay, so the first thing is that we decided that our universe to discuss did not include anything prior to the creation of a VCF file. We know a whole bunch of stuff happens to get to the VCF file, but, since we're on the implementation side, electronic health records, clinical stuff, we figured there are other people that can do that better and we did not want to spend time creating solutions for things that are already underway. So, that's one caveat.
You can agree or disagree with that, but that was a decision that the group made. We also had a -- there were a lot of conversations over the course of the last two days about standards. And so there was a fundamental question that came up, which is, if we, as a group, which represents a substantial proportion of those that are in the genomic implementation space, if we chose to say, "Here's a standard, and we, as a group, are going to use this standard," there's a fairly decent chance that that could become a de facto standard. And so the question that was raised was, is it desirable, where possible, to identify a standard for a given class of data, description, definition, whatever, and as a group, to decide we would like to use -- we are going to use this standard as we go forward with our projects. And overwhelmingly the group said, "This is the desirable tactic," is to identify a single standard. Now we recognize that there are going to be some situations where that's just not going to be possible. We can't say, "Hey, we're all going to use ICD-10, so the United States, time to turn that on and those of you that are going to 11, come on back." So, but we wanted to avoid, you know, creating maps to different standards unless it was absolutely necessary because the problem with mapping is that it can be done but you lose a certain amount of resolution and fidelity. And so that was just a philosophical decision that the group came to that probably doesn't lead to anything actionable, but might be useful in terms of informing how we implement some of the action steps that we're coming up with. So, any questions about that philosophical discussion that took us about 10 minutes? Tim.
Tim Hubbard: So this VCF business, are you saying throw the BAMs away?
Marc Williams: No. No, no, no. We're saying, for our discussion, for the purposes of our discussion only, that we are going to discuss data from the VCF file going forward; nothing about throwing the BAMs away.
Other people --
Tim Hubbard: Other people will solve the -- and so and they may remap things and generate an updated VCF.
Marc Williams: Correct, which we will have to deal with. So, we understand that there's work that's going to be happening over there behind the curtain and that we're going to have to deal with that because even if we decide, for example, on a -- we all agree that we're all going to call this variant, this thing, that when you change the reference, that it may have a different name. And so we just didn't think that was something that this group would -- it would best spend its efforts on. Okay? Is that a fair description? Again, you can agree or disagree with the decision, but that was the decision we made. Okay. So, great. So this is going to be a little bit -- this will go a little bit out of order, so I apologize for this. We just -- in the three minutes we had between the end of the group and the presentation here we didn't get a chance to put these in and reorder them. So, you'll notice here that in the voting there are two scores, okay? So, the first score was of all the things that we listed here, what are the things that you think are most important? And everybody got to vote for three things, okay? So, a problem that we really need to solve, and that's the first score. The second score is how feasible is it? How easy would it be to solve this problem? And the vote was for how easy it was. And we asked people to separate those two problems and vote independently. I can only trust my members of the group that they actually did that, but that's okay. The interesting thing about this was -- and then the complicated algorithm that I came up with is to add those together to get the total score, which gives us the prioritized rank order. So, the one that clearly rose to the top of our group by a large margin was, define the key elements that should be stored in the EHR.
And by -- elements is purposely undefined, but it would include information relating to variants, phenotype, all of the things that would be necessary to actually make a clinical decision using genomic information. And so that was scored extremely important, and also it was thought to be very feasible. Number two, with 11 points, was to try and learn from others. We heard a number of groups from different countries over the course of the last two days, as well as consortia within the United States funded by NHGRI, that are tackling a number of these issues. And so the group thought it would be important to study existing solutions that -- and compare them in sort of an in vitro bake-off, if you will, to say, of all the solutions that people have come up with for the problem of representing genomic data in the EHR, or for building a clinical decision support rule, or for whatever, which ones are more robust and more generalizable? In other words, which ones could be readily -- more readily implemented across a group like this. And then use those to select a sort of best-in-class. And so this could be across a variety of different things, variant databases, meta databases, how we store VCF files, informatics pipelines, all these sorts of things. So, it was felt to be reasonably important, but it was also thought to be quite feasible to do that sort of an aggregation. The third rank at 10 was to develop a global resource for actionable clinical variants. There is -- this was deemed to be very important, perhaps less feasible, but certainly something that people thought was a very important thing to do. We had two at the number four position, collection and aggregation of gene and variant data, so we have examples of this, such as the Exome Variant Server, and the -- this should be HGMD, the Human Gene Mutation Database, and there are others out there, as well.
This is thought to be extremely important, but I think everybody recognized that this is not an easy thing to do. And then also, with eight votes, define necessary federated databases that are needed to implement genomic medicine. So again, things like EVS, ClinVar, ClinGen. Now you recognize here that we did something that you can agree or disagree with, is here we basically said we want to have information about, you know, genomic information, but we also had one about actionable variants. And so I think we poached some of the votes from this into the actionable variant group. And so, in some sense, the group agreed that these two things are separable, and that it is more important for a group like this to focus on aggregation of actionable variants, however that is defined, as opposed to more general aggregation. This was not scored as being tremendously important, although people thought it was quite feasible. And let's see here, as I --
Male Speaker: The ontologies and --
Marc Williams: I'm sorry, I missed it. Oh, yes. So, and then there were two at number six, controlled vocabulary for phenotypes -- for a phenotype ontology, and included within that it would be an inventory of existing ontologies. We've heard about a lot of ontologies and we should move those together. Everybody thinks this is really important, but I think they all recognize this is a very difficult thing to do, to create, and then get everybody to agree to an ontology, although it has certainly been accomplished in some spaces. And then aggregation, clearinghouse of genomic medicine implementation guidelines. So, as different groups actually implement genomic medicine, whether it be for pharmacogenomics or cancer, or whatever, and we define, here's the guideline that we're going to use to provide guidance to our clinicians who do this, if we could aggregate all of those, that would be a useful activity. Again, it didn't score very high on importance, but it was thought to be highly feasible.
So, I won't go through the rest of them. They're all represented. I'd be -- Geoff, do you want to run the -- you have a question.
Geoffrey Ginsburg: No. [inaudible]
Marc Williams: The -- do you want to run the actual voting process at this point or --
Geoffrey Ginsburg: [inaudible] but I'd ask you to ask the audience whether --
Marc Williams: Yes, I'm getting there. I just wanted to -- [laughter] -- say, do you want to take over or do I want to take over? All right. Great. [laughter] You know, just try and get the -- you'll pry this microphone from my cold, dead fingers. [laughter] All right. So, is there anything that we missed, that we completely passed on that you think should be represented on this list that we did not cover?
Teri Manolio: So, maybe more clarification [inaudible]. So, could you just explain -- sorry. Could you explain in the second one that you have there, studying existing solutions to IVs -- so, identify solutions --
Marc Williams: That was -- Jackie [spelled phonetically] was trying to read my writing on the newsprint. So, this is the idea that we're implementing clinical decision support as an example in eMERGE-PGx. But each one of us is finding that we have to find our own solution to do that implementation. So, part of the work that the EHRI group is doing as part of eMERGE, is to compare the different solutions that we've come up with to say, is there a better way to do it that we can all learn from. And, in fact, you've asked us to expand that to CSER, to say, okay, we've got a CSER group doing this and an EHRI group doing this from eMERGE, are there solutions that we all identify as saying, oh, that's a much better way to do it than what we tried to do. And then expand that to the much larger universe of attendees at this meeting to say everybody's trying to solve these problems.
If everybody throws in their solutions around a certain problem that we identify as being important, can we identify what seems to be a best-in-class solution?
Teri Manolio: So the idea being to identify best practices and then disseminate them somehow. I mean, we don't just identify them but --
Marc Williams: Correct. Right. Well, I mean, I think the, you know, the idea would be is that the dissemination might be as simple as a clearinghouse. Here's the different ideas and here's what we think. I wouldn't necessarily presume that we would say, "Hey, this is a great, everybody do it," because we recognize there are local problems with that. Bruce, do you want to come -- no, you didn't have a question.
Bruce Korf: No.
Marc Williams: Tim has a question.
Tim Hubbard: All of this discussion about these kind of, you know, databases so you don't have to order it yourselves, the group in this room is kind of the academic group, or the hospital group. It's not the commercial group that's now offering services, you know. And they're all -- as far as I can see, they're all doing these databases themselves. Are they being left out of the loop? Are they happy to do it themselves? Do you -- does anybody know what their state of mind is?
Marc Williams: I think it's fair to say, and I don't speak as one that's phenomenally informed, and actually, Heidi -- yeah, Heidi, why don't you talk to this because you've had way more experience with ICCG related to that.
Heidi Rehm: Yeah, so I think -- I mean, those of us who are academic are also offering services, you know, so they span both commercial and academic environment. But if you talk to the commercial labs which we interact all the time with, they're in the same boat we're in. And we all need better resources. And a lot of the -- even the commercial groups are willing to share their data and they want desperately a resource to draw from as well.
Tim Hubbard: I'm not talking about the commercial diagnostic groups.
I'm talking about the companies that are now offering genomic interpretation services.
Heidi Rehm: Right, they also want these databases. And the ones that are -- you know, there are some companies that I know of that are outsourcing curation projects and hiring people to go through literature and building massive databases. The problem is that there's not an evidence-based approach to the evaluation of that data. It's more of a collection process. And so the quality of what comes out of those sort of pipeline processes is unfortunately not that high. But it is the only solution that the heavily bioinformatic-based companies have today. And absolutely, if they have a, you know, a curated, clinically-oriented database to draw from, there's no doubt they will be delighted, at least in my mind. I don't know if others disagree.
Marc Williams: Yeah. So, I think that that's good. I also am being given the frantic time signal here, that we -- it is time to vote, as they say. I am going to do this perhaps not the way Geoff would do it, but there was a clear winner from the work group. And so, what I would like to do is to ask the group as a whole, if you endorse the work group's conclusion that this is the one thing that we should take forward from the work group. And I'll just get a straw poll, yea or nay on that. And if there's a strong number of nays, then we'll go through and take a look at the others.
Male Speaker: I just have a question on that. [inaudible] when I look at that I just -- is that like a one-month intense effort and you have your list and you're done? Or is this the kind of thing that would take, you know --
Marc Williams: I was not given any specific instructions --
Male Speaker: -- a span of months. Right.
Marc Williams: -- but what I told the group was, six to 12 months, that this is something that would be achievable within a six- to 12-month period of time.
Male Speaker: Okay.
Marc Williams: So that -- but that was only me, so.
Geoffrey Ginsburg: So, Marc, I'm letting you have your way with the way you're carrying out the voting, but I insist that you have a second choice.
Marc Williams: Okay, fair enough. So, you said we were going to come away with one from this group. I'm trying to make it easy on you, Geoff. [laughter] Now you're making it -- so, if you don't like the result, it's your own darn fault. So, for number two here, how many would endorse this as -- that we got it right, how many would say yes to that. Just show of hands.
Male Speaker: Can we vote again?
Marc Williams: Hang on. How -- yeah, you can't vote yes and no on this question, all right? [laughter] So, hands down, you can -- now you see what I was dealing with in the group, all right? [laughter] So, how many would say, "No, this is not the most important thing"? How many would say that? Okay, so there are four people, five people who would say that this is not the most -- and most importantly, two of them are the people that are actually funding us. [laughter] So, taking that into account, we will go through the others. Welcome to democracy, friends. [laughter] Okay. So what we're going to do is very quickly go through the others that are on the list. And a quick show of hands, I just say, yes, you only get to vote once, so I'm going to show the whole list. You get one chance to look at it. Then we're going to go through, and I'll just select the one that has the most hands by my determination the second time around. So, again, we're not voting on number two, so what's the connection between genotype and phenotype, determination and location of clinical decision support. To expand that a bit, should it be located in electronic health records, should it be located in the cloud? Should it be located somewhere else? Archiving and aggregation of clinical decisions, so, decisions made using the information --
Female Speaker: Should we start raising our hands?
Marc Williams: No, because you don't know all the choices. Jeez.
[laughter] I didn't go through all of them, so I just wanted to make sure that everybody -- I'm trying to do this so that everybody understands. I know I am. So -- despite the problems. Controlled vocabulary for -- you know we'd be done by now if you -- [laughter] Controlled vocabulary for clinical activities, controlled vocabulary for phenotype ontologies, including an inventory of existing ontologies, what information should face the patient and how should this be organized. That was the only patient-facing one to emerge. The federated databases we talked about. Define different needs for germline versus somatic variation, collection and aggregation of patient-level data, and automated family history from electronic health record analyzed and pushed to clinicians. Okay, so, votes for what is the connection between the genotype and phenotype? Okay. Determination of the location of clinical decision support.
Teri Manolio: So, since not everybody can see what -- the zero, say they're none, or they're three --
Marc Williams: Well, I can't count that fast. There was one for the first one. There are -- I was just going to say here's the most, but that's okay. Jackie, count, quick, real quick, all right? Archiving and aggregation of clinical decisions? Zero. Controlled vocabulary for clinical activities? Zero. Controlled vocabulary for phenotype ontologies, including inventory of existing ontologies? Two. What information should face the patient? How should this be organized? Three, four. Define necessary federated databases needed to implement genomic medicine, all these listed here.
Male Speaker: Can we vote twice?
Marc Williams: Once.
Male Speaker: So now we've already said the number two.
Marc Williams: Yeah, so this is your second vote, right. Right. One, two, three, four. Four. Define different needs for germline versus somatic variation. Zero. Study existing solutions that are more robust and generalizable around -- we sort of defined that a little bit more.
Two, three, four, five, six, seven, eight. Eight. Collection aggregation of patient-level data. Collection aggregation of gene-variant data, Exome Variant Server. Nine-ish. Aggregation clearinghouse of genomic medicine implementation guidelines. One? You voted twice. [laughs] Automated family history from EHR pushed to clinicians? One. And global resource for actionable clinical variants. That one has about 10.
Male Speaker: Yeah.
Marc Williams: So, but I -- there were a bit more multiple votes here, but -- so there were --
Male Speaker: Well, that sort of overlaps with some of the others.
Marc Williams: Yeah, yeah. So, the two that I saw was this one, the global resource for actionable clinical variants, which actually maps to an existing activity. And then the ontology. I think those were the two that I saw that had the most --
Male Speaker: It just seems to me that seven and 10. One is a proximal of the other. So, first you identify the data resources, and then -- and then the last thing was [inaudible] --
Marc Williams: Right. So, I'm just telling you that in the group -- I don't disagree with you, but the group said we should look at these separately, not together.
Male Speaker: [inaudible]
Marc Williams: Right. So --
Male Speaker: I want to change my mind, though. [laughter] I just want to [inaudible] -- [laughter]
Marc Williams: Okay. I understand. I'm just reporting what happened. I'm happy to stop any time. [laughter]
Male Speaker: We are happy to stop you.
Marc Williams: Great. Thank you. Thank you all so much. [applause] I know that this was a tough crowd here.