Group 1 - Understanding the genetic architecture of health and disease at scale

Male Speaker: So, thank you, Adam. I’m reporting for the Genetic Architecture of Health and Disease Working Group. Can you put the slides up, please? And the co-chairs for the group are Mike [unintelligible] and Adam Felsenfeld. And if you want to know the membership, go into the blue folder. The members of the group are listed there. I want to thank them for their input. So, the group focused a lot on goals and less on tactics by design. And here is an outline of what I'll present. Really stepping back, the most important thing is: what are we doing? What is the overall goal of the effort? Second, why are we doing it? Because this is strategic planning for NHGRI, we looked at some short-to-medium-term steps that we can take to achieve that goal. We did evolve a little bit into tactics. And I'll make some comments on a few tactics that we thought were important to achieve the goal. And then finally, as we were reminded, and we all know, NHGRI has a long tradition of creating community resources. And I'll say something about both the clinical and basic sciences community resources that will be generated. First, what is the overarching goal? And it's a series of steps, really: to define the relationship of germline, somatic, and epigenetic variation to human health and disease-related traits as a means to elucidate pathophysiology. That is what we see as the overarching goal. We'll develop and deploy assays that faithfully report disease-relevant function of mutations and genes to guide variant interpretation and therapeutic discovery. Third, we'll create and make widely available the knowledge base needed to interpret genome sequence variation in the life sciences, drug discovery, prediction, and clinical diagnosis. And finally, fourth, do so across the range of human diseases and conditions, including health by the way, health and disease, and a range of populations to expand discovery, define the genetic architecture, and broaden access.
And we'll do so as a matter of social justice. We want to make this a matter of access to care. We want to make this diverse -- have diverse representation. Why are we doing this? I think it's obvious. But it's important, particularly as we often get too far down in the weeds. Why we are doing this is really to fundamentally understand the biology of health and disease, to promote drug discovery, to improve the diagnosis of disease, and to enable and improve prediction. As for intermediate goals, we would identify variants underlying exemplar conditions that represent the spectrum of health and disease-related phenotypes. NHGRI is not going to be able to do all the diseases. So, we thought it was important that we form partnerships with other ICs and the private sector and identify exemplar conditions that will serve to expand the effort. The phenotypes would span a spectrum of Mendelian and complex diseases. And we assumed from the beginning that Mendelian diseases, of course, would be one of them and would form an important exemplar. It would be pediatric and adult diseases also. It would span psychiatric, metabolic, developmental, and infectious disease. We're not excluding infectious disease. We're also not excluding cancer in this initiative. We'd look at the molecular and clinical intermediate phenotypes and also, importantly, modifier genes. As we're doing this, it's an opportunity, again because we're looking at exemplars, in forming a new paradigm really for disease research. We can look at differences in study design. We can ask ourselves what's the role of other omics in disease gene discovery. Importantly, and we mentioned this, I believe, yesterday several times, the -- both the scientific aspects and the ELSI aspects of re-contacting individuals, bringing them in, and doing even deeper phenotyping as a way of expanding the research program. It's also an opportunity to weigh differences in exomes and genomes.
It's very popular today to say that costs are coming down, so of course we're going to do genomes. We need to remember the costs are coming down for exomes and there's always that ratio that we need to take into account. We need to move beyond the rhetoric of bridging Mendelian disease and complex disease and actually design the study from the beginning, both in size and in scope, so that the Mendelian disease and the complex disease research inform one another, and move away from the two -- the siloed approach and think of the full spectrum of genetic models. There was also a lot of enthusiasm about the reality that clinical and personal sequencing is going to increase. It was kind of fun in the group to look at people. Some people -- we all agreed it was going to increase. There were some in the room that thought probably it would double, and there were some in the room that thought it would go up by thousands-fold, literally thousands-fold. And so -- but the bottom line is it's going to increase. And so there is a need for the research that we're all doing to drive discovery first. And that will create the tools to increase diagnostic rates and improve translation. And then on the flip side, there is a need to create an infrastructure so that the clinical data can be used to drive novel discovery. So, if we do this right as a community, both as a research community and a clinical/translational community, the two can feed off of one another. The research will help improve diagnoses, and creating larger and larger databases of patients, that are appropriately consented by the way, will improve the chances for discovery. As we slide into tactics, there was sort of an -- a tactical elephant in the room. I actually don't know how to spell “biggie” by the way. I don't know if that's how you do it or not. We need to create a database.
Really that should be maybe a knowledge base infrastructure that promotes a federated model of data and information sharing, including sequence, existing phenotypic data, and ongoing and live clinical data. And the NIH broadly and NHGRI in particular should consider convening a meeting to define multiple models of such an alliance or such a commons. And so at the moment, I think it's dangerous early on to think of one particular model. This is a great opportunity to look at multiple models and think about how this could best be done. The other tactic is, of course, we need to select our exemplar phenotypes. I already mentioned that NHGRI should look for opportunities for partnering with other ICs and the private community as -- that loops back to how we'd select. We'd be a bit selfish, or Eric would be a bit selfish, and maybe select some of the phenotypes depending on the partnerships that could be formed. The sample sizes need to be large. And this is maybe putting on a -- changing hats to the sort of geeky statistical hat. We often write sentences that the power is adequate. And in this particular case, the power needs to be more than adequate. The power needs to be high. And I'll -- with that sentence, I'll jump down to the last bullet: that we don't want to compromise at this stage with these exemplar phenotypes. That it's best to maybe do fewer phenotypes, or do less in a very comprehensive way for NHGRI, than it is to do more phenotypes and have more partnerships to make our constituents maybe happy, but where more compromises would need to be made on the way. I've already mentioned this, but we'll need to have this as a multi-ethnic initiative, not for political correctness reasons, but for sound scientific reasons. Because it's multi-ethnic we'll be able to capture a fuller spectrum of genetic information. And in that way we can drive discoveries in one group that would've been frankly impossible in another group.
Looping back then to the concept of creating resources, which I think is very important as we think about the large-scale programs in developing resources for the scientific community. I divided this into two components, one being the translational/clinical community. If we can -- if we do the previous job and achieve the previous goal, it will define the context in which genome-scale sequencing improves patient outcomes. Second, it will improve the understanding of the translational utility and validity of genomic variants by multiple approaches. It will also, again, if we do it right, and this is probably -- I don't know if NHGRI has done a lot of this or not. We need to bring cost-effectiveness and comparative effectiveness research into our realm and look at the comparative effectiveness of genome sequencing, including the indirect costs of downstream outcomes, both positive and negative. Finally, we'll develop and apply this in diverse populations to ensure maximal application and also access to care. The group had a particular fondness for creating resources for the basic science community. It's important, you know, the pendulum is swinging at NHGRI, and probably NIH-wide, toward being more translational. And it's obvious, and I think it's the right time to do that. But we can't forget that by doing this translational research we'll create resources for the basic sciences. And in this particular context, we'll define the molecular, cellular, and organismal function of genome sequences so as to inform basic biology as well as the interpretation of sequence variation.
And then below that, if we accomplish this upper goal in this community resource, it will just open many, many doors for the basic sciences: tools for modulating and measuring sequences at scale; large-scale functional characterization of variants, and computational models that accurately predict molecular consequences of that variation; we can better understand the mechanisms of cis and trans gene regulation; and also the functions of non-coding genomic sequences, which really is our next frontier. So, with that I'll thank again the committee and open up for questions and criticisms. Questions? Has everybody had their coffee?
Male Speaker: I think it would be great if you could start piloting some of the other omics now. I think other omics -- why wait five years? I think we're already seeing the value in many other situations. I think what will happen is you'll go back and wish you had done it five years earlier if you wait five years from now.
Male Speaker: I don't think we're saying for this plan we would wait five years until we would initiate the plan. This is a five-year plan. So, starting pilots now, I would more than agree with you, yes. Eric?
Eric Green: Yes, I wanted to make a couple of -- just make a comment about the discussion about a commons. This need for sort of the shared data resources -- this is being discussed a lot. I don't think Phil Bourne is here today, but clearly a big emphasis now in his role at NIH more broadly, and I think you referred to that more broadly, is a creation of these data commons across multiple different disciplines. Hearing what you said, I think your point was, which is probably fair, that NHGRI should show a major leadership role in what it needs for the genomics initiatives that it wants to pursue, and whether that's done on our own or done cooperatively, et cetera, et cetera. That's to be -- those are details to be determined.
But one thing that hasn't been mentioned -- I don't want to put David [unintelligible] on the spot, but I don't actually mind doing that -- is that some of those sorts of ideas have come up around this new global alliance, which -- that discussion about global hasn't come up at all, at least in the preliminary part of this, and I didn't know if David wanted to -- I'm sure lots of people are very interested in lots of things associated with the global alliance. But since this was the first time that was sort of mentioned at the platform, I just wondered if you want to say some things.
Male Speaker: So, it was a big part of our discussion. We purposefully did not put or label it “the global alliance” because we wanted to have multiple models. Yesterday in our discussion the global alliance was mentioned.
Eric Green: In your breakout group?
Male Speaker: Absolutely.
Eric Green: But maybe for everybody else who is here who wasn't part of that breakout group, and I know are very interested. I don't know, David, did you just want to say a few things? And part of what NHGRI needs guidance for is really -- what's so hard for us is trying to figure out what are different groups doing, and where do we need to be participants and where do we need to be funders and where do we need to be activists, and so on and so forth.
Male Speaker: Let me make a point, then I'll turn it over to David to make a couple of comments about the global alliance. I think what the group is saying, though, at least for these exemplar phenotypes and this particular domain, and I don't mean this as a criticism of anyone, is that we need to move beyond discussions. We need to move beyond, you know, basically, you know, more and more meetings. We -- this -- for a few of these exemplars, I think we need to put models into place and start bringing these data together, at least for these exemplar phenotypes. And I think this is a place where NHGRI could form a major leadership role.
David, do you want to say a few words just generally on the global alliance?
David Haussler: No. Well, I think it's not the right time or place to distract with the global alliance. We can circulate this. There is a website, and a lot of people in this room have been at meetings of it and all that. So, I don't want to do that. The only thing I'd say is, I think -- I'd just highlight the point that Eric just made, which is I do think there is a question of whether what we need is another meeting, you know, to discuss what we might do. I agree strongly with the idea we shouldn't pick a winner and that we should try different models, but I think there have actually been many, many meetings. And you said it's time to get on with it. I think the global alliance and many people in this room are actually getting on with it. It's not just the talking. There are actually APIs and there are data-sharing projects and all these things. But I strongly always endorse innovation, diversity, even friendly competition among different models. And I think that the idea that NHGRI shouldn't pick a winner but should encourage that is certainly, certainly right. But Maynard Olson made the point, which I do agree with, which is we've been talking about this for a long time. We should not say, oh, we're definitely making progress, because we may not be making as much as we need. And urgency around that -- the sense of commitment is very important.
Male Speaker: I think that the only point I'll make is it probably won't be one winner. There may be different models that work best under different conditions. Debbie Nickerson and Jim, then you, and then Richard.
Male Speaker: I just want to comment on the previous point for a moment. You know, there are many different approaches and ways that we can fulfill some of the goals. Some of them are much harder to achieve than others. From the perspective of the Mendelian Project, putting out complete exomes with phenotypes has been extremely difficult.
On the other hand, putting out variant allele frequencies would be of enormous value to the broad community, both for discovery purposes (how often has this allele been seen?) and for clinical purposes when we talk about variants of unknown significance. So knowing what the frequency of an allele is in the general population, and whether it's been seen in the homozygous state, would be enormously valuable. This is not as complicated as many of the other things that we're talking about, and in terms of priority, this is low-hanging fruit: NHGRI, with all the exomes that have been produced, could really play a leadership role, and just getting variant frequencies out in public would be an enormous contribution.
Male Speaker: I just have to note to Rick -- even though I know I'm jumping in -- that we've done some of this. And some of the DACs have disallowed it in the last two weeks. So, actually, before we get to meetings and grand proclamations, just NIH getting its sort of house together on what's allowed and what's not, because literally the work is done. And it's been pulled back.
Deborah Nickerson: I was going to agree with David's point, but I do think that we've talked about this, talked about this, talked about this. We need to have and try different models, and NHGRI should lead the way. We continually -- I could go back -- I could redact four years. I could go back another four years, right, and it was always a message that we needed to learn how to integrate and share in different ways. And I think we need to test some of those models. It’s more important now than ever. And I think saying, you know, I would worry less about other ICs. I would worry less about different groups. But I do think there is a -- there is a big interest and everyone is interested in this. And I agree with David: the NIH needs to determine what is valuable and what is really sharable.
And I think sometimes they caution -- too much caution on the part of the ICs prevents sharing in novel ways.
Male Speaker: Jim.
Jim Evans: I think we all agree these ideas are very important, but we have to move on. And the fact of the matter is, with copy number variation, this has been done for 10 years by DECIPHER in 100 countries. The legalities have been worked out, the consentings have been worked out. Why not see how that model worked?
Male Speaker: Agree. Ewan, I think you were next, then Richard.
Ewan Birney: Yes, so this deoxy [spelled phonetically] has quite a lot of history. And the one thing I would encourage, which I said also in the breakout meeting, is that I do feel in this science we don't put enough into the implementation teams. We don't put enough. I know this is self-serving for me myself and for EBI, but we don't put enough depth into the engineering teams behind this. There are a few places that do. If you go out to other sciences that do this kind of data [unintelligible] physics, astronomy, oceanography, they are much happier about 10- or 20-man engineering teams that back this implementation than we are in this area. We've got to get our head around the right size for these sorts of data-sharing aspects, get that into the right place, and not try and do it with two or three postdocs.
Male Speaker: I was going to say I think that comment is particularly important when doing it at scale. Richard? Then over here, then --
Male Speaker: So, I'm moving to a new point is that --
Male Speaker: Please. I think it's probably time, yes.
Male Speaker: So, my question is about whether you discussed the issue of the relationship between the [unintelligible] discovery and function. And boldly -- I know we've got to bring this up later -- I want to preempt it.
Broadly speaking, you can find a genetic signal to locate an allele and then go figure out what its function is, or else you measure the function of all possible alleles and correlate that with all possible genetic signals. And I wondered if you'd discussed that point and how much you're going to drive discovery just by genetic signals given the state of the art.
Male Speaker: So, others could -- should correct me. No, we didn't discuss that specifically. The way this group is structured and modeled, we would do disease discovery first and function second. And maybe as a complementary approach -- I don't know what you're going to talk about later. The other group is probably doing function first and discovery second. And it's going to be interesting to contrast those two approaches, you know. It could be, for example, that it's going to be very difficult to make disease discovery without having knowledge of function outside of genic regions, you know, that the signal-to-noise ratio may not be there. So, wait just a second -- yes? Go ahead.
Female Speaker: I think it's also time to think about different paradigms to not only get consent, but perhaps operate on data that you would not necessarily need consent for, if you leave the data where they are and import the computation, which we can do. And take advantage of that data without necessarily hoarding it in one place, which is what some patients are very concerned with. So, we're doing that for VA data. Even, I think, with genetic VA data the approach could be similar.
Male Speaker: Right. And we're not -- again, that's why it's important to have other models. We're not limiting ourselves to bringing all the data into some centralized database or information base. We're going beyond that. Sharon, then Maynard.
Female Speaker: Just a quick follow-up on Richard's comment. I mean, discovery is always going to lead function.
No one cared about the function of triplet repeats until we discovered that triplet repeat size affected disease. No one cared about gene dosage effects until people said the duplications caused disease. So, I think we have to push new ways of discovery so we find new mechanisms of genomic alterations that cause disease. And then we can do comprehensive functional analysis of that type.
Male Speaker: I personally agree with that. But my guess is there are others who reversed the order. Maynard?
Maynard Olson: So, just on this issue of alternate models for handling the data, I think I was the one who kind of pushed the discussion most strongly in that direction. It's kind of a cause of mine. I've been heavily involved in these discussions for longer than I'd like to recollect, and I actually don't share the view that we're making progress. Certainly not making progress relative to the baseline -- the rapidly shifting baseline about how information is being kind of handled more broadly in the world. But in any event, if I were to make one comment about why we actually might need more meetings, it is that I think the problem with the dynamic of this discussion has really been a push toward premature consensus. One meeting after another tries to define kind of what an optimum path forward would be, when I think we're clearly headed toward multiple paths and we need to start to think. It's not a long list. There are only three or four reasonable possibilities. But I believe that there exists no document that defines these, tries to flesh them out, and looks at what their strengths and weaknesses are -- really, what their predictable strengths and weaknesses will be. That is what I was kind of pushing for, and I think the NHGRI has a lot of leverage in this area. This is something that cuts across all disease areas, which makes it a good NHGRI project.
You know, for people interested in this, I would recommend an article in The New Yorker, just in the last couple of issues, on a “diagnostic odyssey” case of a fairly typical sort. And I would simply argue that that article shows what failure looks like. This -- there has been institutional failure, and, I think, a kind of community failure. We simply didn't meet the entirely predictable needs of the physicians, the family, and so forth involved in this case. And the case that is described there is just the leading edge of what's going to be really a vast amount of activity. And I don't see the plan on the table, or the set of plans, that is going to assure that we do better going forward. So, that's where I was coming from.
Male Speaker: Understood. But I think I still would like to quote David though: I think we need to get on with it. I think there are models that are in place in particular domains and some of those are ready to be launched; otherwise I'm afraid we're going to have this endless cycle of discussion and theoretical debate.
Maynard Olson: So, just one quick comment on that is that I don't actually -- sure, we will get on with things and we must. But I don't agree with it as a policy position because that's one model. One model is business as usual projected forward in some chaotic way. And I can write a scenario, I think most of us could write a scenario, of where that's going to lead us. And I don't think that's where we want to go.
Male Speaker: Other comments? We're probably winding down. Adam? Mike? Things that the committee talked about that this discussion has missed in terms of emphasis. Do you have a comment? No. All right, thank you.
[end of transcript]