Broad Overview of The Current State of Genomics - Eric Green

Eric Green: Thank you, Larry, and thank all of you for coming. I was asked and happy to serve the function of really just trying to set the stage, if you will, because we realize that some of the folks coming to this event might be not closely familiar with genomics in general. Some of you do cover genomics quite a bit, but we expected somewhat of a heterogeneous group, and I also wanted to help sort of create a foundation, sort of a landscape view of genomics, which would allow the subsequent speakers that you're going to hear to build upon as they drill down into more specific areas. And so that's what I'll try to do for about 20 minutes. Let me just say in starting, and this is both relevant for the opening of this exhibition, and I use this same title for this talk as we use for the exhibition, and also in my responsibilities as the director of National Human Genome Research Institute, is to really reflect back a bit on how different things are now than where they were even as recently as 10 and 20 years ago, having now been involved in genomics for about 25 years. And really, the relevance of genomics has just changed substantially in that time. And I think upon, you know, the time when the Genome Project was just starting, and people like Rick Wilson, who you're going to hear from later, and I got involved really on day one of the Genome Project. You know, genomics was really just about a bunch of biomedical researchers toiling away in laboratories mapping and sequencing DNA. It was really a research endeavor. You know, when the Genome Project ended, certainly there was discussion that, increasingly, this would be relevant for human health, and that certainly got the interest, over the last decade, and involvement of healthcare professionals thinking about the kinds of things that would be needed to make genomics relevant for the care of their patients. 
But I think the reason why this exhibition is so valuable to do now, and why there is so much excitement around genomics, especially from a health applications point of view, is that it's no longer the distant future but rather really is on the near horizon that genomics is going to be relevant to patients, and friends and relatives of patients, which means all of us. And I now routinely say that if it's not already the case, certainly by the end of this decade, genomics is going to be relevant in multiple ways with respect to specific areas of clinical medicine, and as a result of that, it makes it very timely to be doing this Smithsonian Exhibition, and to be cognizant of the importance of raising genomic literacy for the general public because of the relevance genomics will have as part of medical care in the coming years and decades. Now, this didn't all come by accident. And the thing that really catalyzed this as much as anything, and certainly drew many people into the field, was this audacious project, the Human Genome Project, which we celebrate, this year, its 10th anniversary since its completion, and recognize what a remarkable endeavor this truly was, this large international effort that captivated the scientific community; although there were critics originally, I think, in the long run, it really got everybody to be quite enthusiastic about it. Similarly, and our analogy is to the moon shot, it really did create a situation where a single, focused goal was accomplished by a highly-collaborative team of scientists, around the world in this case, but that really set into motion many products and byproducts, some anticipated, many not anticipated, all in a very positive trajectory. Now, in fact, our Institute was created to lead NIH's effort in the Human Genome Project, and indeed, there was an idea that eventually, of course, this would be very relevant for human health. 
And, in fact, in thinking about the priorities for genomics, they have substantially changed in the last 10 years since the end of the Genome Project. With NIH's focus on turning discovery into health, the great majority of dollars that have been spent in research related to genomics, in fact, are sprinkled all across the NIH because genomics is finding its way into all disease areas and all disease research areas represented by different components of NIH. Our focus at National Human Genome Research Institute is to try to make sure we have a research agenda that specifically focuses on genomics in a fashion that we can advance human health, and we tackle this in ways that are much more generic with respect to a disease focus, focusing on technologies, focusing on basic knowledge of genome structure and function, and thinking about commonalities of things that have to be done in order for genomics to find its way into clinical medicine. And so what I can tell you is while genomics is everywhere, having an institute like ours focused on common issues has been, and I think will continue to be, critically important. And now, for the last three and a half years, being director of this institute, I can tell you that I have a laser focus on trying to make sure that we are accomplishing these goals and enabling the genomics research being done across the NIH and really across the world. And in thinking about the health applications in particular, our institute has increasingly had its eye on an enabling and emerging medical discipline called "genomic medicine," which is the phrase we use, which is largely synonymous with personalized medicine, individualized medicine, precision medicine. There are other words and phrases; we tend to use genomic medicine, which basically involves using an individual's genomic information as part of their clinical care. 
Rather than treating patients generically, increasingly, we are going to be able to treat patients based on getting genomic information about them, and using that to tailor their medical care. So where once upon a time, we were focused on the goals of the Human Genome Project, and even then, 10 years ago, with the Genome Project completed, were focused on the many opportunities immediately apparent, having had, in hand, the first sequenced human genome, a couple of years ago, we realized it was important to rearticulate a vision, this time one that had incorporated a much more specific and sophisticated view of how we were going to go from knowledge of the genome to actually applying it for clinical medicine. And described in this strategic plan, which we published now a little over two years ago, and if you haven't seen, I would welcome you to go to this URL and download, and read all about the process that led up to the publication of this strategic vision, has really defined our mission in terms of a research agenda that we found, by talking to a number of people, could be described by five domains of research activity. And I want to just introduce them to you, and then I want to briefly tell you about some of things that have been going on in these arenas. The first research domain is understanding the structure of genomes, familiarity coming out of our experience with the Human Genome Project, understanding how genomes are put together; recognizing that it doesn't end there, of course, that you really need to understand the information encoded in those genomes, and therefore getting to the biology of genomes. Using that knowledge, of course, provides the ability to start to dissect the underpinnings of human disease, with virtually all human diseases having underlying genomic changes associated with them, so understanding the biology of disease through genomics research. 
And with that knowledge, of course, comes opportunities to then apply that information to advance how you practice medicine by advancing the science of medicine or medical science, but also having a responsibility to say that you could have the greatest advance in the world, but when you throw it into the complicated ecosystem of healthcare delivery, that doesn't necessarily mean you improve healthcare effectiveness, that you have a responsibility to also do research that demonstrates that genomic medicine can be used to improve the effectiveness of health care. So these five domains really represent a framework that we take very seriously in thinking about how we spend our research dollars. Let me just briefly tell you about where we are on some of these, but then I want to spend the bulk of the rest of my talk describing to you the future, and thinking about stories that I know you are going to want to be covering or already are covering. You know, starting with the far left domain, just understanding the structure of genomes, and the next one, in understanding how these genomes work, what's the underlying biology, do keep in mind that 10 years ago, or 10 years and a couple of months ago, this is what the Genome Project produced, or at least this is 0.001 percent of what the Genome Project produced, was just the ordering of the 3 billion letters that make up the human genomic blueprint. And we have spent considerable effort in our first 10 years of having the letters ordered in front of us interpreting those by figuring out where are all the relevant parts, if you will, through efforts like the Encyclopedia of DNA Elements, a major project we have, but many other projects going on, through evolutionary studies, whereby and very much resonating with the research going on in this museum, sequencing many genomes of other animals, comparing those sequences to the human genome, and trying to find the most conserved sequences across all mammalian genomes. 
And in fact, we sort of now have identified the most highly-conserved 5 to 10 percent of the human genome that is conserved quite at a high level across virtually all mammals. But of that, we now can start to identify those parts of the genomes among the most highly-conserved sequences turn out to be the genes that you just heard about with respect to gene patenting issues. And we now have a pretty good catalogue of the roughly 20,000 human genes, and we know what sequences correspond to them, but it turns out that only about a third of that roughly 5 to 10 percent of our genome that's highly conserved across all mammals, and therefore is functionally important, only a third of it or so, or even less, are genes. That, in fact, the great majority of the most highly-conserved functional sequences in the human genome are -- evolution holds on to them. They are functionally important but they don't directly code for proteins. And the whole world of non-coding functional sequences are sort of coming to the fore. We know that many of these are involved in regulating genes and figuring out where genes are turned on and off, and when genes are expressed, under what conditions, and so forth, but there is a whole code here that I would say, even 10 years after the Genome Project, we've barely cracked, and will represent a significant continued effort in terms of basic science research in the arena of genomics as we go to interpret more comprehensively in the decades ahead of us how the human genome actually works. So that's the first area, but the second area of accomplishments have really come not so much thinking about how a hypothetical human genome sequence works, but thinking about how each of our genomes work a little differently, and how each of our patient's genomes work a little bit differently. And the reason why they all work a little bit differently is because we're all a little different in our genome sequences. 
And, in fact, across the 6 billion letters that constitute our genome -- we got 3 billion from mom, we got 3 billion from dad, we got 6 billion letters -- any one of us compared to any other of us in this room differ by about one in 1,000 bases roughly. And if you do the arithmetic, that means, compared to the person sitting next to you across your six billion letters, about 3 to 5 million places where your letter is different in the same position in the genome. But it turns out that the great, great, great, great, great majority of these variants, shown here as V, that are out there actually are out there at a reasonable frequency. In fact, we have already seen it before. We recognize, we have databases now, and, in fact, that's what a big part of the last decade has been through projects like the 1,000 Genomes Project, or prior to that, the HapMap Project, or prior to that the SNP Consortium, these are all things some of you have covered, is to develop these catalogues. To give you flavor, when the Genome Project started, we knew about 4,000 places in our genome that might be different among people. When the Genome Project ended, we knew about 3 to 4 million of these variants, places that are different, letters that are different at a given position. But now, based on these efforts and the ability to share all this data and make it publicly available, we know about over 50 million variants that are out there in the human population from different parts of the globe because of samplings that we've done in the 1,000 Genomes Project. And that catalogue is publicly available, and that provides us now the ability with having knowledge about the most common variants that are out there across the globe in human populations to now start studying those variants to figure out which ones are biologically relevant. It turns out that the great majority of these variants are silent in terms of biological consequence. 
They have no phenotypic consequences, as we say as geneticists, but a subset of them are relevant. A subset of these variants might confer risks for getting a disease, or perhaps another subset of these might actually be good variants to have. They may protect you from a disease, or they may lead to some attribute that you would regard as positive. And so our knowledge over the past 10 years of studying many thousands of genomes through these various efforts starts to give us a flavor of what an individual's genome looks like: your genome, a patient's genome, a family member's genome, and so forth. Just for fun, we are starting to get a feel for how different we are. Let me just show you some of these numbers. So, each of you, each person has about 6 billion nucleotides, like I said. And I told you about three to five million single nucleotide variants, single bases different at a given position. But, by the way, you're not so lucky. It's not like you have a whole bunch of private variants all to yourself. In fact, out of your 3 to 5 million variants, as of today, only about 150,000 of them would be variants that nobody has ever seen in any other human being. So, the great, great, great majority are ones we already know about, and are now available for studying. But, by the way, you have 60 variants in you that neither of your parents had. Those are unique. How do those get there? Well, in the process of copying that DNA to get it into you from each parent, there are a couple of little oopses, little typos that got introduced, and there are 60 of them, on average, in a given person. And that's how new variants sort of arise. Oh, by the way, in addition to these single nucleotides that might be different, there are also places where you might be structurally variant -- you might have a variant compared to a person sitting next to you: an insertion of some DNA, a deletion of some DNA, maybe some DNA that's rearranged. 
There's about 10,000 to 20,000 of those, and we're slowly learning what those mean. Among these single nucleotide variants, though, how many of those break things? Well, the things we understand the best are the genes; we're still learning how to understand when you break a non-coding functional element. But just on average, about 100 of your 3 to 5 million variants, about 100, at least by a computer analysis, would seem to disrupt a gene. Of course, you have to break it. Of course, you have two copies of almost all genes, one from mom, one from dad, so you just may have one broken copy but still have one functional copy. But, by the way, on average, based on these analyses, about 20 of your 20,000 genes are broken, both copies. So each of you basically has no functional copy of 20 out of your 20,000 genes. Some of that might be medically and biologically consequential, and some of it might be completely inconsequential because you just don't need those genes, or you have other genes that compensate for that one gene being broken. So that starts to give you a flavor for what this is all about and what an individual is, and, of course, the great interest is now really starting to study among these various variants which of these truly has an impact on rare genetic diseases and common genetic diseases, and that leads me to this third area I want to briefly tell you about because this is really now taking off, especially in the last 10 years, and that's understanding the genomic architecture of genetic diseases. And, again, this is why we are in a position to start thinking about not just discovering the genomic basis of disease but using that information for possibly changing the practice of medicine. 
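For readers who want to check the arithmetic behind these per-genome figures, the following is a small Python sketch. It uses only the round numbers quoted in the talk; the constant and function names are illustrative and do not come from any genomics library. Note that the simple one-in-1,000 calculation gives roughly 6 million differences, the same order of magnitude as the 3 to 5 million the speaker quotes.

```python
# Round figures quoted in the talk; all names here are illustrative only.
GENOME_LETTERS = 6_000_000_000            # 3 billion from each parent
BASES_PER_DIFFERENCE = 1_000              # "about one in 1,000 bases roughly"
VARIANTS_PER_PERSON = (3_000_000, 5_000_000)
PRIVATE_VARIANTS = 150_000                # never seen in any other person
GENES = 20_000
GENES_BOTH_COPIES_BROKEN = 20


def naive_difference_count(letters, bases_per_difference):
    """Differences expected if exactly one base in every N differs."""
    return letters // bases_per_difference


def fraction_range(part, whole_range):
    """Fraction that `part` represents at each end of a (low, high) range."""
    low, high = whole_range
    return part / high, part / low


naive = naive_difference_count(GENOME_LETTERS, BASES_PER_DIFFERENCE)
private_low, private_high = fraction_range(PRIVATE_VARIANTS, VARIANTS_PER_PERSON)
broken_fraction = GENES_BOTH_COPIES_BROKEN / GENES

print(f"Naive one-in-1,000 difference count: {naive:,}")
print(f"Private variants: {private_low:.0%} to {private_high:.0%} of a person's variants")
print(f"Genes with both copies broken: {broken_fraction:.1%} of genes")
```

In other words, by the talk's own figures, only a few percent of one person's variants are private to them, and about one gene in a thousand has no working copy.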
So, genetic, for those that don't think about genetic diseases all the time, they can be -- basically all diseases have an underlying genetic component or a genetic influence, but they are very different depending upon whether we're talking about rare diseases or common diseases. So rare diseases are diseases like sickle cell anemia, cystic fibrosis, Huntington's disease. These are diseases that are rare in the population, but they're genetically and genomically simple because it basically is a change, a mutation in a single gene, or monogenic disorder, that is the major risk factor for getting the disease. There might be some other variants floating around that influence the severity of disease, or maybe some environmental influence, but by and large, if you have that gene -- a mutation in that gene, you are pretty much likely to get that disease. These are also known as Mendelian disorders because of the famous geneticist Gregor Mendel. Interestingly, what's been the progress? Well, when the Genome Project began, 1990, we knew about the genomic basis for about 61 rare disorders. Today, that number is over 4,800. Can't argue that mapping and sequencing the genome and what's taken place in the last 10 years has greatly accelerated what's going on in that arena. But those are rare diseases. They're devastating to patients and families that are affected by them, but they don't fill hospitals and clinics around the world because this is what fills hospitals and clinics around the world. These are common diseases. These are common diseases like hypertension, cardiovascular disease, Alzheimer's disease, autism, cancer, mental illness, so forth. 
The problem with these, besides the fact they're common, is, of course, that they're genetically complex, because it's not a single genetic change or a genetic variant that results in getting the disease, but rather it is a series of these variants that often conspire with what is together a greater contribution of the environment to give some risk for getting the disease. And it's oftentimes not an absolute. But there's been substantial progress on this front, mainly accelerated by knowledge of the most common variants that are out there by doing all those studies I told you about earlier. And so, once upon a time, when the Genome Project started, or even when the Genome Project ended, we really had very few clues for most of these common diseases, where to look in the genome for these variants. Through a series of studies, which the aficionados in the room will understand are called genome-wide association studies, and I'm happy to answer questions about it, but basically studies that do the kind of detective work to sort of limit the scope of the whole genome down to small intervals that might contain these variants to decipher them. We basically now have publications where there were zero, zero, five, six, seven -- or seven, eight years ago. Now, there's over 1,500 such publications reporting successful studies where the detective work has brought it down to very small discrete regions of the human genome whereby we can now search even further to identify which variants are conferring the risk for the disease. And so while we have a lot to learn to really unpack the complexities of complex diseases, no one can argue that we've accelerated the progress on that journey over the last 10 years. 
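To make the "detective work" of a genome-wide association study a little more concrete, here is a minimal Python sketch of the core idea: for each variant, compare how often it appears in people with a disease versus people without it, and rank variants by how strongly the two groups differ. The variant names and counts below are hypothetical toy values made up purely for illustration, and the Pearson chi-square test on a 2x2 table is one simple choice of statistic, not necessarily what any particular study used. Real studies test millions of variants and must correct for multiple testing and other confounders.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical toy allele counts, invented for illustration only:
# (variant allele in cases, other allele in cases,
#  variant allele in controls, other allele in controls)
toy_counts = {
    "variant_A": (300, 1700, 200, 1800),
    "variant_B": (410, 1590, 400, 1600),
    "variant_C": (150, 1850, 155, 1845),
}

# Rank variants from strongest to weakest case/control difference.
ranked = sorted(
    ((name, chi_square_2x2(*counts)) for name, counts in toy_counts.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, statistic in ranked:
    print(f"{name}: chi-square = {statistic:.2f}")
```

In this toy data, the variant that is noticeably more frequent in cases than in controls rises to the top, which is the sense in which such studies narrow the search down to small regions of the genome for follow-up.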
What's going to be needed for taking the next phase to really get at and understand the genomic basis of complex diseases, as well as to further understand the remaining rare diseases for which we don't yet know the causative gene, are going to be technological advances that allow us to efficiently sequence not a handful of people's genomes but thousands and thousands of people's genomes: a thousand people with hypertension, a thousand people without hypertension; probably many thousands of people with Alzheimer's disease, many thousands of people without Alzheimer's disease; and so forth. And here, the great news has been this remarkable, remarkable advance in technological capabilities with respect to sequencing DNA that we've seen over the last 10 years. Let me make one other point before I move on to that because I forgot to make this. The other thing that is both challenging and intimidating at the same time is the realization from these studies that the architecture not only is reflected by these pie charts, the differences between these two different disorders, but -- of these two different classes of disorders, but also where those mutations are residing in the genome because what we've learned more and more is that the great, great, great, great majority of rare diseases are caused by mutations in the protein-coding parts of the genome, that yellow stuff that I highlighted that we actually understand really well. What's a little intimidating, to be honest with you, is that we've also learned really, over the last few years, that it seems that the great majority of mutations or variants that are conferring risks for these incredibly important common diseases are sitting out in the non-coding parts of the genome, that purple stuff, and the reason that's intimidating is we just don't know that language as well now. 
So we're both trying to interpret how it works, and then try to figure out how changes in those sequences influence how it works, and then confers risk for getting that disease. So there's a good reason why we continue to do basic research in genomics to understand the non-coding parts of the genome. It's because these are turning out to be medically incredibly important. And, again, the way we're going to be able to crack the nuts that remain in terms of understanding the basis for these complex diseases and remaining rare diseases is going to be through the use of these powerful new sequencing technologies. And many of you have seen these iconic graphs that depict the rapidly falling costs of sequencing a genome, where, once upon a time, really just 10 years ago when the Genome Project ended, when we thought the first sequence was in hand but came at a high price tag, we recognized we needed better and better technologies to be able to effectively deploy genomic sequencing for disease studies, and then eventually as part of routine clinical care, those costs were going to have to come down tremendously. And this is data we collected just from the groups that we fund for sequencing, and you can see, in logarithmic fashion, the dropping costs of DNA sequencing. Let me put you some numbers on top of that. Just for sequencing a human genome, let me remind you that when the Human Genome Project sequenced its first human genome and delivered it 10 years and a few months ago, there was six to eight years of active sequencing involving thousands of researchers around the world at a price tag of something like $1 billion. And while it was money well spent, it was clearly not practical for the kinds of studies, the types of clinical applications I've described to you. 
You know, the day the Genome Project ended, when we asked our genome centers what it would cost to sequence a second human genome, their back-of-the-envelope calculation said they could do it in about three to four months, but it would still cost $10 million to $50 million. And while that was a significant reduction, it still wasn't quite the practical clinical test that we had envisioned. But through a remarkable partnership of the government investing money in grants and contracts, and the private sector investing and seeing this as an incredibly important need, many new technologies have now come to the fore, and so today, as you may even hear about from some of our future speakers, you can sequence a genome in something like two to three days, and I hear maybe by the end of this year it'll be down to a day, and something like $4,000, or $5,000, $6,000. And, in fact, if you just want to sequence just the coding regions of the genome, that's well under $1,000. This is a remarkable story. I might argue it's the most significant technology development effort NIH has ever invested in and delivered as quickly as it did, and so I think -- but, again, this was done in partnership with the private sector as well. So those are sort of the three major accomplishments I wanted to sort of -- or four major accomplishments I wanted to really emphasize. I will, and while I'm extremely enthusiastic and optimistic, I also am very realistic to point out that this isn't all sort of just like cutting butter with a hot knife. There are some big challenges here, and those big challenges really reflect a broader set of issues that biomedical research now faces in the arena of big data. 
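As a rough illustration of how steep the cost drop just described is, the following Python sketch computes the fold reduction relative to the first genome, using only the dollar figures quoted in the talk. The names are illustrative, and the figures are the speaker's round numbers rather than precise accounting.

```python
import math

# Dollar figures as quoted in the talk; labels and names are illustrative.
HGP_FIRST_GENOME_COST = 1_000_000_000
LATER_COSTS = {
    "second genome, estimated at the project's end": (10_000_000, 50_000_000),
    "whole genome at the time of the talk": (4_000, 6_000),
}


def fold_reduction(baseline, cost_range):
    """Return (smallest, largest) fold reduction relative to baseline."""
    low_cost, high_cost = cost_range
    return baseline / high_cost, baseline / low_cost


for label, cost_range in LATER_COSTS.items():
    smallest, largest = fold_reduction(HGP_FIRST_GENOME_COST, cost_range)
    print(
        f"{label}: {smallest:,.0f}x to {largest:,.0f}x cheaper "
        f"(about {math.log10(smallest):.1f} to {math.log10(largest):.1f} "
        f"orders of magnitude)"
    )
```

Expressing the change in orders of magnitude is what makes a logarithmic cost graph the natural way to show it.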
Basically, we created this situation for ourselves because we were so successful as a community of developing new technologies in genomics but also in other areas of biomedicine, in imaging technologies, and phenotype technologies, and so forth, and beyond genomics, thinking about proteomics, glycomics, metabolomics, thinking about the wealth of clinical data available in electronic health records. Really, in the course of about a decade, we have become a big data enterprise. It is now at a point that generating the data is not the bottleneck, it's analyzing the data, and I will tell you that this is becoming a big focus of attention at the NIH, recognizing that we need to ready the biomedical research community for a new reality, that it's not transient, this is going to be it forever. Biomedical research is going to be a data-intensive enterprise in a way it has never been before, and there's things we need to do to ready the biomedical research community for this new reality. So that's why I wanted to tell you about the past. I'm about to take you into the future, but I did just want to insert one other news -- I thought this was going to be the hottest news item until today when the Supreme Court decided to hand down their decision, but there was a little announcement that came out yesterday that I just wanted to point out, and some of you will be familiar with this, but previously, the Battelle Technology Partnership Practice, funded by efforts of the United for Medical Research Group, had done a study to look at the impact of genomics on the U.S. economy and it was really impressive. What they found, and, in fact, in summary, what they found was that -- well, actually, what they found was a remarkable return on investment, but they updated the study now, and that's what they announced yesterday. 
And some of the findings that they updated, they're shown here, so, for example, from a couple years before the Genome Project up and through last year, you know, genomic activities generated something like $965 billion in economic output at an investment that was a small fraction of that total. And if you just look at 2012 alone, genomic activities generated $65 billion in economic output, again, at a small fraction of what the federal government, in particular, invested in genomics. And the bottom line is, and there are various numbers you can look at and discuss, but the bottom line is this update of this report once again demonstrated how genomics has had a remarkable demonstrated return on investment. And that is, I think, just emblematic of not only the importance of the scientific aspect of genomics, which is what I've been emphasizing, but also the implications that everyone has seen it as having, both in science and research and development, and eventually for clinical practice. So, again, that's just something that happened to come out yesterday. There were news stories about it, as there should be, and we're very excited to see this continued valuable contribution genomics is making to the U.S. economy. So let me just shift gears now because I was asked in particular to sort of tee up for any of you writing stories now or any of you thinking about stories in the future, you know, what to look for on the near-term horizon. What's coming? What are going to be the stories you're going to be writing, or headlines that we're going to want to see based on things coming out of genomics in the future? And one way to start, to sort of set the stage for it, is to just think a little bit about these five domains I introduced you to, and reflect on where we've been and where we're going, because I think where we're going will foreshadow what you can anticipate to be writing about, and stories to be telling. 
I mean, one way I think of it, which we did in that strategic plan, is to sort of take a view of what's gone on in the last 20 years or so, and what do we see happening over the next 20 years. And we represented this through these hypothetical density plots of research accomplishments. Again, this is totally made-up data, but it sort of is meant to emphasize sort of a snapshot view of what went on. For example, during the Human Genome Project, where there was a huge amount of accomplishments, but mostly within that first domain of activities, understanding the structure of genomes, and maybe just a couple things out in the biology of genomes. But then the intervening time after the Genome Project, you could see the center of gravity shifting right where we continue to explore genomes, the wow [spelled phonetically] through efforts that have allowed us to develop catalogues of functional elements of the genome. We have significant progress in understanding the biology of genomes, and with that came some early opportunities to study disease and maybe even make a few medical advances. The real question is what's going to happen between now and the end of the decade. And even though we published this two years ago, I still stand by this graphical view of what's going to happen, and what's going to happen is we're not going to change the face of medicine between now and 2020. It's coming, but it's not going to happen that quickly, but I think we're going to see continued steady progress on all fronts. We'll continue to explore how genomes are put together, but, in particular, we will continue to invest heavily and see major accomplishments in understanding the biology of genomes. But, in particular, we'll see a surge in our knowledge about the genomic bases of disease through the kinds of efforts that now become available and approachable because of some of the discoveries I described earlier. 
And with that, we'll see substantial new advances in medical science, and maybe even some new home runs out there to demonstrate the improving the effectiveness of health care. But again, we're not -- it's just going to see a steady progress of the center of gravity. Beyond 2020, I think that will continue. I don't think, again, I think for decades to come, we'll be trying to figure out the biology of the genome, and for decades to come, we'll still be untangling the complexities of disease. So we will always be doing that for the foreseeable future, but I do believe we will see effectiveness of health care demonstrated through genomic approaches. So let me just give you briefly six examples, what I would call hot areas in genomic medicine. Some, probably a little hotter than others. Some are sort of lower-hanging fruit than others. But I believe these are going to be stories worth covering in the coming months. I guess I say that because this is "follow our dollars." This is where NHGRI is investing in terms of research efforts, believing that there's significant advances that could be made. I always would lead with this area because I think it's absolutely the most compelling one, and that's in the arena of cancer. I think cancer really represents some of the lowest-hanging fruit in terms of genomic applications to medicine. Needless to say, our investments at the NIH are heavily in using genome sequencing for exploring cancer genomes through an effort called The Cancer Genome Atlas, a joint effort between our institute and the National Cancer Institute. And I will strongly suspect that Rick Wilson is going to have some cancer stories for us as one of the real pioneers in this field. This is going to be an area that I'm convinced, by the end of this -- we're already seeing it, but by the end of this decade story after story after story where the practice of cancer care will be changed in a positive way because of genomic applications. 
So we'll wait and have Rick tell us more about it, and we can also talk about it during the question-and-answer period. Another low-hanging fruit, mostly because it's already here and it's just going to grow more, is pharmacogenomics, the merging of genomics and pharmacology. We, in so many ways, and I still remember learning in medical school, are so amateurish in how we decide what medications to give people because, too often, we treat people generically because we have no other way to do it. We just have to try a medicine, even though we know that medicine may not work in some subset of people. We didn't have a tool to figure out what is the best way to select those patients who are going to be good responders or bad responders. Well, guess what? We are beginning to learn that most of that response is genetically determined, and if we could only, up front, read out a patient's genome or parts of it, and figure out what are the best patients for the right medications, we will be in a position to be able to take groups of patients all with the same disease, and then find out those that will not respond, or, worse yet, those that will respond poorly, and just simply don't give them that medication, only giving the medication to the individuals where the response is going to be good. Is this science fiction? No. Before the Genome Project started, there were only a handful of medications, which -- in the United States, where the FDA required labeling on those medications to say, "Genomic information might guide" -- actually, they didn't even use the word back then, so it was, "Genetic information might guide whether or not this is a good medication for your patient." Now that number is over 100, and there's full expectation that number will continue to grow. 
Standard of care for some medications now, pharmacogenomic studies done before deciding the best medication, but I think this is a low-hanging fruit you will see more, and more, and more of this in the coming years. The third area, which is probably not low-hanging fruit, and I'm about to introduce to you is much more in the research discovery mode than I think it is in the clinical applications quite yet, with an exception I'm going to tell you about in a second, just relate to these rare uncommon diseases, those pie charts I showed you earlier. It is time to press the accelerator on this, and that's exactly what we're doing. We just started a major program to really try to take this pie chart, which refers to those 4,800 diseases for which we now know the genomic basis; that's the good news. Sort of the not-so-good news is there's still another couple thousand that we think a single gene's involved, and maybe another couple thousand on top of that. We want to fill in the rest of this pie chart, and we think now's the time to do this in a very accelerated fashion. We have a major program to try to industrialize the process of taking those remaining thousands of disorders that a single gene is likely involved, figuring out what that gene is. So that's a research study, but I would also point out that there -- in the exhibition downstairs, there's a couple examples of it, and just follow the news. You see it all the time, of these patients with these devastating ultra-rare genomic disorders whereby it's almost for certain a single gene involved, patient gets workup after workup after workup at lots of major medical centers, throw up their hands, people can't figure out what's wrong, and it just seems like a bargain to then just sequence those individuals' genomes, and in something like half the time, the culprit is found just by genome sequencing. 
And so it is not at all out of the realm of routine medical care of individuals with ultra-rare genomic disorders, genetic disorders, to simply sequence their genomes as a means of trying to see if you can get that clue that otherwise seemed to be elusive. So those are rare diseases. The case of common diseases shown here, sort of in a cartoon-like, is just an example of some of the data that just illustrates all the different regions of the human genome that have now been statistically associated with having a variant that likely confers a risk for an important disorder, like Alzheimer's disease, or autism, or asthma, or cardiovascular disease, and so forth. And we are asking our biggest sequencing centers, in particular, and you'll hear from Rick Wilson who directs one of those, to really take on the challenge of, okay, how do you go from these subregions of the genome and take on diseases like Alzheimer's disease, which, actually, President Obama asked us to do, asked NIH to please accelerate the pace of discovery to understand genetic bases of Alzheimer's disease, but also other disorders like autism and diabetes, and try to figure out how you now make that progression. This is not easy discovery work, but we recognize that we have the same creative minds who were involved in the Genome Project, and all the major advances we've seen in the last decade, asking them to tackle this to see if we can try to now figure out how to do this really hard work with these complex genetic disorders. The fourth area I would point out, and again, some of which we're directly investing in and some of which we're just watching in great admiration, just relates to both prenatal and newborn genomic analysis. Here, there's actually a couple stories to be told. 
The first story, I guess, is on the prenatal side, only because I've just seen remarkable things published, in some cases, by some of our grantees in the last six months, which, if you are not familiar with, you should be, because it's way cool, the idea that in the case of prenatal genetic testing or genomic analyses, that they're demonstrating that simply drawing blood of a pregnant woman, and finding and easily being able to detect the free-floating DNA from that unborn baby in the maternal blood as a means to now be able to study the DNA of the unborn baby. And both for being able to do routine diagnostic work, such as now done typically by samples obtained by amniocentesis or chorionic villus sampling, or doing whole genome sequencing, and there has been at least a report where a complete genome sequence of an unborn baby was determined just simply by identifying and analyzing DNA floating around in maternal blood. And I think there's a lot of interest, and companies, in fact, forming that are looking into the possibility that amniocentesis and chorionic villus sampling may go by the wayside because we have a much more noninvasive way of doing prenatal genetic testing just simply by analyzing the DNA in maternal blood. And then, of course, there's a whole arena of newborn individuals whereby in all states in the United States, and most developed countries, there is newborn screening that goes on to look at several dozen genetic disorders typically, but in the world where genome sequencing becomes affordable and more informative, one might imagine a day where it might be a value to sequence the genome of a newborn as part of the routine medical care, and then have that information available for their lifetime. But there's a lot of questions that arise both logistical, practical, technical questions, but also many ethical, legal, and social questions to consider with such a scenario. 
And so we've partnered with the Child Health Institute at NIH, and actually very shortly, we'll be announcing a set of grants that will begin a first set of studies to start to imagine what that new reality will look like. Very much a research area right now, but there's a lot of questions to answer in order to pave a way for what we might be considering for newborn sequencing, perhaps eventually newborn screening in the future. One of the other areas, though, to make this a reality relates to the fact that we are generating a tremendous amount of data, and there's an explosion of literature out there that really could overwhelm most practicing healthcare providers. It's one of the things I hear about all the time, in that while we can generate a lot of information, how is it that healthcare professionals, physicians, nurses, pharmacists, physicians' assistants, and so forth, how are they going to deal with this onslaught of information as tons of genomic data might find their way into electronic health records? Many, many, many variants, and some of them sort of have scientific literature associated with it, but, you know, their encounter with a patient's going to last a few minutes, and they need something to quickly be able to look up, and be able to say whether it's relevant for that patient or not. And this is a brave new world that if we don't get this right, we will not see the realization of genomic medicine in a way that we want to see it. And so, once again, we are pursuing this through a series of new grants we'll be announcing in the coming weeks and months to try to start to see what this world might look like, what could be developed through a community effort eventually, but needs to be first studied to see what it might look like to develop such information systems to see this be routed. And this will be a challenge, but one that absolutely has to be faced. 
And finally, and it really is the last thing I want to tell you about, I'm just going to tell very briefly is, I hope what you realize is whether you're talking about low-hanging fruit, such as in cancer and pharmacogenomics, or more challenging discovery work, especially around rare and common diseases, or whether it's about prenatal or newborn genomic analysis, or, in particular, around all this -- getting this information right for clinical practice, you know, what it illustrates is a lot of moving parts here, and a lot of things that have to be worked out. And so we also realize we just need to understand what it really looks like when you put this thing together and you take it out for a test ride. You have to figure out what that test ride is going to feel and look like. You need to research what that test ride. And so I guess the last area I think you should watch for is I just labeled, unofficially, "genomic medicine test drive" programs because there's a couple flavors. They have different names. One's called our Clinical Sequencing Exploratory Research Program. Another one that we're going to be announcing in the next few weeks called our Genomic Medicine Demonstration Projects, and they just represent different parts of a continuum of how mature something is, but it literally is, in both cases, taking out either genome sequencing in clinical care out for a real test ride; you'll hear from Jim Evans in a little while as one of the speakers, and he's one of our grantees in that program, to really see what does this look like and what can we learn from it. And then, similarly, in cases where genomic medicine is being applied in other ways, not necessarily through genome sequencing, but for a variety of ways, when -- how do we replicate models that work, and get them effectively disseminated through the community? 
We want to learn from those test drive experiences, and so, again, we're just trying to capture these things and figure out what is the best practices, and then share those with the broader community. So that's what I wanted to tell you. Just in retrospect, let me just -- let me just summarize by saying, you know, when I got involved in the Genome Project, you know, the idea that some -- you know, we said we should sequence the human genome because one day this will be relevant for human health, and it will change medicine. We weren't very clear on what that was going to look like, and even when the Genome Project ended in 2003, we once again said, "We've got to apply this knowledge to human health." I still think it was a pretty blurry concept. I think two years ago, when we published our latest strategic plan, it wasn't crisp and sharp, but I think we had a much better idea what this was going to look like. But I firmly believe by the end of this decade, at least for some disorders and probably many applications, it's going to be very clear, and I think reaching this reality is both realistic, and still challenging, but it's really on the near horizon. So let me just close, as I often do, with a quote, because I want to just remind you that this progression from the most fundamental basic science of genomes to really applying this for health care, you know, while you sit and make beautiful graphics like this, and I don't mean to imply that any of this is easy, you know, the Genome Project was not easy. The $1,000 genome, reducing the cost of genome sequencing, has not been easy, and trust me, when you try to understand disease, that ain't easy, and you try to do anything in the healthcare delivery system, that's really very murky and very difficult. So we realize we have a huge challenge ahead. It's a marathon. We've just barely gotten into this, but we're -- really have great momentum and a lot of wind to our back. 
But I like what Winston Churchill said because I think it embodies the spirit of the community I get to represent, and that's the genomics community. He said, "A pessimist sees the difficulty in every opportunity." Trust me, if the genomics community was filled with pessimists, we would've never sequenced the genome in the Human Genome Project, and we would never be anywhere near $1,000 genome. I represent a community of optimists. They know that every one of these domains has huge difficulty associated with it, but they just say, "Great, that's an incredible opportunity we want to be a part of." And that's why genomicists are fundamentally optimists. So I will stop there, and with any remaining time, I'm happy to take questions. Thank you.
Larry Thompson: Thank you, Dr. Green. So do we have any burning questions we would like to ask? We'll take just a few questions --
Eric Green: That's fine.
Larry Thompson: -- if that's okay. Does anyone have anything they'd like to ask Eric? Dan?
Dan Vergano: Could you talk a little bit about how you press ahead -- this is Dan Vergano from USA Today. Could you talk a little bit about how you press ahead doing this in the era of deficit politics? I mean, the reality is you have a core of, you know, scientists who are very -- maybe they are pessimists, more pessimistic now than they were in the past because of circumstances, and, you know, you're talking about pressing ahead, and a lot of them are worried about losing the basic research, you know, that they're doing now. I mean, how does it complicate things for you? What is your thinking about this?
Eric Green: So I think it's a tragedy, what's going on. I'm incredibly biased, but I think this is some of the hottest science going on, certainly in biomedical research, and the idea that at a time where we should be pushing the accelerator incredibly hard with a gas tank that's incredibly full, and that -- but rather, we're actually siphoning our gas off and keep hitting our brakes. 
It's a tragedy. I mean, it's -- and I will tell you that you look around the world, and I'm dealing a lot now talking to folks around the world doing genomics research, many other countries are pushing the accelerator and are fueling up big time in genomics. So I think we risk losing, you know, huge, huge, huge competitive advantage in this, and I think that would be a tragedy. To answer your question on a practical level, we're trying to weather the storm. It's hard. It's very, very difficult. Actually, ironically, two of the speakers you're about to hear from, both Rick Wilson and Jim Evans, are both members of our advisory council, which is the highest level advisory group that I interact with. And they're responsible for trying to help set priorities, including individual -- decisions about individual grants, and it's hard. It's very hard, because what it means is you can't fund all the good research that's out there. You've got to fund, in this case, 5.8 percent less this year than last year at a time where we should be doing things substantially more. You know, we'll forge ahead. It just means you have to do all the best ideas first, and anything that's -- you're not quite sure is the -- you know, you just can't do it all. The other thing I will tell you that helps us a little, although, just a little, is if you -- it is time to be incredibly collaborative, and if you noticed in several examples, I talked about how we're doing things with other institutes, because we're a very small institute, high impact but small. We're only 1.7 percent of the NIH budget, but, you know, the cancer project is because we can partner with the Cancer Institute. The newborn sequencing project because we can partner with the Child Health Institute. The rare diseases effort we have, is we're partnering with the National Heart, Lung, and Blood Institute. 
So one of the ways is to sort of help develop partnerships so that our dollars go a little bit further by having partners who pursue this with us.
Larry Thompson: All right, one more question. Mary [spelled phonetically].
Female Speaker: Hi, yeah, you know, we all learned in high school biology, and it was so satisfying that those three bases would code for a particular amino acid, so what are they doing in the non-coding region if they're not making proteins?
Eric Green: So they're not necessarily hanging together in groups of three. We know that. So what do we know, and what do we don't know? Well, we don't know what we don't know. And, I mean, I always quickly point out that even if I made my list of what's known today, it would absolutely not be comprehensive. I'm convinced that my children or my children's children will discover new ways that DNA encodes function. But if we look back, speaking about what we learned in our high school biology or whatever example classes you had, you know, for example, what I learned, even in college, actually, I even learned it in medical school, was that DNA makes RNA, or encodes information for making RNA, and then with minor exceptions, that RNA goes on and directly encodes information for making proteins. And that was -- the RNA world was mostly geared towards that. Wow, a lot's happened since I went to medical school. We now know that there's a lot of RNA that gets made, and as a molecule RNA, it runs off and does all sorts of other things, including regulating other genes and how they function. So there's an example of non-coding, because it's non-protein coding, activity of DNA. Another example, and I'm sure a lot of the non-coding DNA's involved in this is, you know, there is a reason why every cell in our body that has -- carries a genome, which is virtually all cells in the body, they all have the same genome, but they use it differently. Only pancreatic islet cells make insulin. 
Only a subset of blood cells make hemoglobin, and so forth, and yet they have all the code there to use any which way they want, but they're smart. The reason they're smart is there's all the circuitry that says, "Hey, I'm a pancreatic islet cell. I only need to turn on these genes, and some other ones I'll ignore." And so that circuitry I liken to a dimmer switch on the wall, that it determines when to turn something on, and it determines how high to turn it on. And that circuitry is non-coding DNA. So it's all these regulatory elements, and there's words like promoters, and enhancers, and insulators, and all that. Those are all words, and we're learning them. In fact, that'll be another area our institute will be continuing to invest in, is understanding that choreography. So we know now where some of these dimmer switches are. We don't even know how they interact with one another, and how it is that a pancreatic islet cell knows to turn on certain dimmer switches. That's sort of -- there's all this complexity in there, and I could keep going on. There's other functional sequences in non-coding DNA, but those are some of the classic ones that we're really trying to characterize.
Larry Thompson: All right, Dr. Green, thank you very much. If I could get Dr. Wilson to come up here.