Well, good evening ladies and gentlemen, especially if this is your first visit to the Royal Society,
and a particular welcome to this evening's Francis Crick award lecture. I should introduce
myself, I'm Jean Thomas, the biological secretary of the Royal Society and I have one duty to
do, which is to ask you to please switch off your mobile phones because the lecture is
being webcast. The Francis Crick lecture is given annually. Preference is given to genetics,
molecular biology and neurobiology and also to fundamental theoretical work which is the hallmark of Crick's science. The lectureship
was endowed by Francis Crick's friend, Sydney Brenner, also a friend of the Royal Society
and the first lecture was given in 2003.
The recipient this year is Dr Matthew Hurles, group leader at the Wellcome Trust Sanger Institute
at Hinxton, just outside Cambridge. Matthew was an undergraduate in Oxford then went to Leicester
to do his PhD with Mark Jobling on the population genetics of human Y chromosome polymorphisms,
and for those who don't know, the Y chromosome is the male specific chromosome. During his
subsequent work in Cambridge on population genetics and molecular evolution he established
the molecular mechanism underlying a recurrent deletion of part of the Y chromosome which
causes male infertility.
Ten years ago Matt joined the Sanger Institute in Cambridge where he is currently leading
efforts to apply genome – and to understand the factors influencing the rates of mutation.
He was chosen to be the recipient of the 2013 Francis Crick award for his outstanding contributions
to understanding structural variation in the human genome, the mechanisms that caused this
variation and its medical and evolutionary consequences. So ladies and gentlemen, I'm
very happy to present Dr Matthew Hurles to deliver his lecture on the interesting title,
‘Mutations: Great and small’.
[applause]
Well, thank you very much Jean. It is a very great honour to give this talk, and it was
also a very great honour to meet Francis Crick as I did as an undergraduate when he came
to give a talk at the Oxford Union and I was writing some very poor articles for the student
newspaper at the time on science, and had the pleasure of speaking with him. So before
this talk I tried to look through pictures of Francis Crick, and most of them involve him
looking at a double helical structure. But the photo I remember was this one here, drink
in hand, very convivial, being somewhat gossipy about other scientists in the room, and so
it was a great pleasure to speak with him, and I was very pleased when I saw that the
Oxford newspaper doesn't have any online archives, so I can't show you and you cannot see the
rubbish that I wrote about what he spoke.
So there are two things I really want to talk to you about today: first, the work that we've
been doing on gains and losses of DNA that we all have in our genomes, and secondly, the
mutations that arise as we pass on our DNA from one generation to the next, from parent to
child.
The human genome can be seen in a bird's eye view down a microscope and what you see here
are the 23 pairs of chromosomes that all of us have and there are two very special chromosomes
down the bottom here, the X chromosome and the rather stubby portion of the genome that
I did my PhD on. There are a few things that one can tell just from this view of the human
genome. The first is we can tell it is human. There are 46 chromosomes here; our closest relatives,
the chimps and gorillas, have 48 chromosomes, and the banding pattern of the chromosomes
is recognisably human. The second thing that we can tell is that this is a male, from the
presence of the Y chromosome which determines maleness. All of the men in this room have
an X chromosome and a Y chromosome and all of the women have two X chromosomes. The third
thing that we can tell about this individual is really from what is not there. This individual
does not have Down's syndrome or Edwards syndrome or Patau syndrome, the three human diseases
that are the largest mutations that we know about where there is a whole extra copy of
a chromosome, which we would be able to see at this kind of bird's eye view. Those are
pretty much the largest mutations that we see, and all that we can really glean about
this genome from the bird's eye view. Historically we've had two views, the bird's eye view and the worm's
eye view. This is what the 3 billion letters of DNA in the human genome look like close
up. There are four of them, A, C, T and G, and it is the order of these that is the code
for life. This particular snippet here is not actually a random set of A, C, T and Gs,
this is actually a portion of the gene FOXP2, which was described by a previous Crick
lecturer, and actually, one thing that's notable about this gene is even though you are looking
at a small portion of the gene you can tell that this comes from a human as well, because
there are two key spelling differences of single letters that are
specific to humans, that other mammalian species don't have. The question exists, what are
we missing by looking at the genome from just these two views, the top down bird's eye
view and the worm's eye view. By looking at the analogy of geography we can look at what
we might be missing. These two pictures here are pictures of two cities in the UK. One
is the city Cambridge, where I now live, where I did my post doctoral training, and the other
is the city Leicester where I did my PhD studies. Cambridge is renowned for its architecture,
it is a beautiful city. As for Leicester, when I first went to Leicester I read the ‘Let's
Go’ guide to Britain and its description of Leicester's architecture was that it unfortunately
suffered – and industrial decline. But that's not immediately obvious from this view here.
Close up, on the left is a snapshot of the building that I did my PhD in and on
the right is the one I did my postdoc in. One can hazard a guess from this worm's eye view of geography
which of these is the city with the beautiful architecture and which is not. But you may
not be very confident about doing that. That's because between these two scales we're missing
all of the architecture, how these components are put together, and with these two views we're
probably missing the architecture of the genome as well.
What I introduced to you in the beginning were the two opposite ends of the scale on which
a genome, all the DNA in a cell, can differ between individuals. So on the left-hand side
you can see hopefully in red, there is a single base change that differentiates the top sequence
from the bottom sequence. That's the smallest type of change we can have. And on the right-hand
side, we've got an addition of an entire chromosome. That's the biggest kind of change, as we see
in Down's syndrome, but in between there is actually a continuous distribution
of variation between these scales. So at the bottom end, there are losses of a few bases,
gains of a few bases, and with the worm's eye view that I showed you before, we can
see this. What we cannot really see at either of the scales is the kind of variation where
there are large segments of DNA that are lost, in this example, or gained, in this example.
And this is the type of architecture of DNA that we're missing when we look at it down
a microscope, or if we look at the individual sequence of the bases close up. So if we think
about any type of variation, we need to think about what is its likely impact? Do we really
care if we've lost segments or gained segments of DNA? Well of course our chromosomes contain
a linear combination of genes, and those genes encode the proteins which are actually
the molecules in the cells that do all the work, and this flow of information, unidirectionally
from genes to proteins, is what Francis Crick called the central dogma of molecular biology.
But these genes do not occupy most of the genome. Actually, there are about 20,000 genes
that we're aware of in the human genome that code for proteins and they only occupy between
1 and 2% of the genome. 98 to 99% of the genome does not code for proteins. We're still
in the very earliest stages of creeping out of our ignorance of what these sequences that
are not genes, what they do. What we do know is not all genes are turned on in every cell
of the body and that's why the cells are different. Each cell expresses a different set of genes,
and the sequences that lie between the genes are clearly involved in the regulation of
those genes, deciding which genes to turn off when in what cells. So if we think a little
bit about how this form of variation that we haven't really been able to ascertain previously,
these large segments that are gained and lost, if we think about what kind of effect they
might have on the genes, well, it is reasonably obvious that if we took a cartoon of a chromosome
on the left with three genes in here, then we might have variants that removed an entire
copy of a gene, the loss of gene B in this example, or we might have variants that gave
us an additional copy of gene B, as in this example. There are other forms of variation
which I'm not going to be talking about, which don't change the numbers of genes, but change
the orientation of genes, and I'm going to be focusing on the gains and losses of DNA
that collectively are called copy number variation. Now, the mechanisms that generate this copy
number variation, they are actually blind to where the genes are in the genome, so I
have shown you examples of removal or gains of a copy of a gene, but equally, there could
be removals and gains of other portions of the genome, and it is not simply that one
can lose or gain existing functions in the genome, but one could potentially generate
new genes, for example, here, the bottom is the deletion which is making a new gene, it
is the hybrid of gene B and gene C, so there is potential not just to toggle on and off
existing forms of function, but actually generate new forms of function, and we can see in the
evolutionary history of our genome that many of the genes in our genome have undergone
this kind of process. So in thinking about this talk, I was very much reminded of a quote
from Sydney Brenner who you heard about earlier, and ever since I learned this quote,
it has really stuck with me: ‘Progress in science depends on new techniques, new discoveries,
and new ideas, probably in that order.’ And the key thing is that last segment, because
science is often portrayed as a very hypothesis testing kind of approach. You think of an
idea and you go out and do the experiment, but that's to underplay the value of exploring
the new vistas on nature that new technologies give you. I think the examples that I'm going
to tell you about today very much exemplify what Sydney was talking about.
Because we were very fortunate that new techniques came along about ten years ago which enabled
us to probe the genome to look for the segments of DNA that might be gained or lost. This
technology, shown in this picture on the left, is what is known as a microarray, so it is
a glass slide, a microscope glass slide which has small portions of our DNA spotted on it,
and the colour of those spots is telling us whether there is more or less of that segment
of DNA in a given individual. And most of them are yellow, which tells you that most
of us, for most of our genome, have exactly the same amount of DNA, but there are a few
spots in green or red which are regions which have been gained or lost. By applying these
technologies and applying them to looking at normal individuals, normal populations,
we could make new discoveries. This picture is a map of the gains and losses of DNA that
we've found in normal individuals, so you will hopefully recognise the chromosomes here,
and every blue line tagged on to those chromosomes is a gain or loss of DNA in the first map
that we generated of these losses. This was in collaboration with my long-time collaborator
Nigel Carter who has since retired. He really drove this new technology, and together we
made these new discoveries.
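As a rough illustration of how spot colour can be turned into a gain or loss call, here is a minimal sketch. It assumes, as is common for two-colour arrays but is not stated in the talk, that each spot is summarised as the log2 ratio of a test sample's signal to a reference sample's signal. The function names and the 0.3 threshold are hypothetical choices for illustration only.

```python
import math

def expected_log2_ratio(test_copies, reference_copies=2):
    """Idealised log2 ratio for a segment, given copy numbers in each sample."""
    return math.log2(test_copies / reference_copies)

def call_spot(test_intensity, reference_intensity, threshold=0.3):
    """Classify one spot from its two signal intensities.

    threshold is a hypothetical cut-off; real analyses tune it to the noise.
    """
    ratio = math.log2(test_intensity / reference_intensity)
    if ratio >= threshold:
        return "gain"
    if ratio <= -threshold:
        return "loss"
    return "no change"

# One copy instead of two gives a ratio of -1; three copies gives about +0.58.
print(expected_log2_ratio(1), round(expected_log2_ratio(3), 2))
print(call_spot(100, 100), call_spot(50, 100), call_spot(150, 100))
```

Most spots fall in the "no change" band, which corresponds to the mostly yellow slide described above.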
So the important thing about this is it suddenly gave us an insight into the fact that actually,
everyone in this room does not have the same genome. We don't even have the same number
of genes. We probably vary by several hundred genes between us, each of us not having some,
and having others. And this was much more extensive than we previously thought. Some
of these variants are very common in the population and some are very rare, but there are hundreds
of genes that are affected. So when we discovered this, we were somewhat disturbed to see the
headline in the newspaper, ‘The book of life is rewritten’. We were slightly disturbed about this
because this suggested that we had done some kind of genome engineering. The book of life
had done nothing at all, it just sat there, we had just re-read it and our understanding
of it had got better. So it was striking that there were all these differences, that we're
somewhat more different from each other than we thought, but the question remains, should
we really care about them, are they biologically important? Now one can investigate this biological
importance at different scales of biology. One can look at cells, one can look at individuals,
and one can look at populations. So by looking at cells, I don't have time to go into all
of it, but one can essentially say I know this cell does not have a copy of this gene,
or only has one copy of the gene rather than two copies, because we should have two copies
of every gene, one inherited from either parent. And one could go in and say well, do we see
less of that protein that that gene encodes in that cell? Typically we do but not always.
If we have extra copies of that gene do we see more copies of that protein? And typically
we do but not all the time. So there is clearly a molecular change that results in the cell
from that change in the DNA. But we still don't know whether the cell, or even the organism
cares about that molecular change. There are molecular differences between all of us. What
we really care about as individuals is, are these going to affect our health, and so we
can – I will talk a little bit later about investigating whether this type of variation
plays an important role in disease. But first, I will talk a little bit about populations. So, if
a type of variation is important, then we should see it under natural selection, under
Darwinian selection, winnowing out harmful mutations and potentially increasing the frequency
of beneficial mutations. So what happens if we look at CNVs in this way?
So, we can think about purifying selection, which is the selection that's winnowing out
the harmful mutations and we can think about positive selection that's amplifying the beneficial
mutations. In this context it is interesting to think about those copy number variants,
deletions and duplications that affect genes on the right, and those that fall outside
of genes on the left. Do we see a difference between the two? Because we know genes are important.
If we don't see a difference in the distribution of these variants between the genes and the
non-gene regions then we're probably thinking that biology doesn't really care about this
form of variation.
But when we look at the effect of this purifying selection, we can see that it is acting very
strongly, because remember, the mechanisms that generate this variation are blind to
where the genes are, so these variants are being generated all the time but the variants
that fall within genes, they are getting removed from the population by natural selection.
And we see there are many fewer: although there are many hundreds of variants that affect
genes, there are many fewer than we would expect and the ones that we do see are typically
less common in the population because they are being kept down by natural selection.
If we look on the other side, positive selection, well, most variants in the human genome, if
they change function, change it in a negative sense. That's because we have a highly evolved
DNA. It is pretty good at doing its job, evolution over billions of years has managed to achieve
that, so most changes that you make are not going to be beneficial, so purifying selection
dominates but we do find examples for both copy number variants that affect genes and
copy number variants that fall outside of genes which show clear signatures of being
selected positively by natural selection. Now, we can see those signatures but we don't
necessarily know why that is. There is one really nice example which our collaborator,
Charles Lee, worked up, for a particular type of genic copy number variant, so the gene
that digests starch, and that exists in multiple copies. What Charles showed was that populations
that have prehistorically been eating high starch diets have more copies of that gene,
and populations that historically have had low starch diets have fewer copies of that
gene, and presumably, there has been positive selection for those additional copies to drive
the digestion of starch in saliva.
So, moving on, then, to diseases. Well, we can broadly break down diseases where genetics
plays a fundamental role into common diseases, such as diabetes, coronary artery disease,
and – or rare diseases such as cystic fibrosis. I will take common diseases in turn first.
Common diseases, we know, genetics plays a role, but it plays a role in concert with
the environment. So the genome and the hamburger. And what we know about the genetics of common
disease is that the genetic variants that influence our risk of those common diseases
are spread throughout the genome, for any one disease that we might think about. There
are many tens, probably hundreds of variants that influence our risk of getting Type 2
diabetes, for example, but each one of those variants only has a very modest effect on
our risk. Maybe increasing it by 10% or 20%. Certainly not causing diabetes, and those
modest effects, those subtle effects, we can only see if we look at very large
numbers of individuals, and we can only really detect these effects if these variants
are common in the population. So there is a fairly standard way of trying to identify
if a variant affects the risk of a common disease, and that's just to take a set of individuals
with the disease, and a set of individuals without the disease, and just look at the
frequency of that variant between those two groups.
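The comparison of variant frequency between patients and controls can be sketched in a few lines. This is only an illustration, not the analysis used in the studies described here: the counts are invented, and the use of an odds ratio with a Pearson chi-square test on a 2x2 table is an assumption made for this example.

```python
import math

def case_control_test(case_carriers, case_total, control_carriers, control_total):
    """Compare how often a variant is carried by patients versus controls.

    Returns (odds_ratio, chi_square, p_value) for the 2x2 table.
    Assumes every cell of the table is non-zero.
    """
    a = case_carriers                      # patients with the variant
    b = case_total - case_carriers         # patients without it
    c = control_carriers                   # controls with the variant
    d = control_total - control_carriers   # controls without it
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value

# Hypothetical counts: identical frequency, then a modest difference (33% vs 30%)
# in a small study and in a study ten times larger.
no_effect = case_control_test(300, 1000, 300, 1000)
modest_small = case_control_test(330, 1000, 300, 1000)
modest_large = case_control_test(3300, 10000, 3000, 10000)

for label, result in [("no effect", no_effect),
                      ("modest, 1,000 per group", modest_small),
                      ("modest, 10,000 per group", modest_large)]:
    print(label, "OR=%.2f chi2=%.2f p=%.2g" % result)
```

With these made-up numbers the same modest difference is not distinguishable from chance in the small study but is clear in the large one, which is why subtle effects need very large numbers of individuals.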
And so, one can simply, in this case, label the individuals who have this variant in both
the patients, and the controls, and in this situation, as in most times we do this kind
of experiment, we see there is absolutely no difference. This particular variant doesn't
influence the risk of diabetes. So as a large-scale UK-wide consortium, we investigated whether
thousands of copy number variants from the maps that we made in normal populations, whether
they influenced the risk of eight different common diseases, and we found very little.
We identified four copy number variants that influenced one of those eight different
diseases. And all four of those were already known previously. So we were left with a rather
surprising conclusion that despite the fact that these variants are very large, and they
remove large segments of DNA, very, very few of them actually influence our common disease
risk. But it is a different story when we come to the genetics of rare diseases. So
the genetics of rare diseases, there is no hamburger on this picture, because the genetics
of rare diseases is driven by very strong mutations that in and of themselves are sufficient
to cause those disorders. Broadly they can be classified into two types. Those where
just a mutation in a single copy of the gene is sufficient to cause the disease, and those
where, like cystic fibrosis, you need both copies of the gene to be damaged before the
disease occurs.
So there are many different rare diseases, there are thousands of rare
diseases. It has been estimated, though, that although each one is individually rare, about
one in fifteen, one in twenty of us has a rare disorder. So it is cumulatively quite
a big impact. We focused just on one study in collaboration with Sadaf Farooqi in Cambridge
who has been working on the genetics of early-onset obesity in children. Obesity is a common trait,
but the extremeness and the early onset of the patients that she works with is actually
a much rarer trait. We applied exactly the same kind of technologies that I showed you
to generate those maps to patients that she works with and we found that there was one
particular region of a chromosome, of chromosome 16 in this case, where we found
two different types of deletion, losses of DNA in these families. We found a small one,
where about 200,000 of these letters of DNA were lost, and we found a big one where about
1.7 million of these letters were lost. If we looked at the families of those individuals,
then that small deletion that we observed, we observed in children, in families, where
there were multiple individuals with extreme obesity, they often have siblings, and they
had at least one parent who was morbidly obese, whereas the very large event that we observed
that overlapped that, we only found in individuals where they were the only person in their family
to be morbidly obese.
So we think that there are particular genes within this interval that these two deletions
share in common that are very important for the sensing of when we're hungry, and when
we've had enough. And so we looked at these CNVs and we tracked how they passed down through
the families. What we observed, and that is shown with these green lines here, is that
in the families where there were multiple individuals who were morbidly obese, every
individual who was morbidly obese carried this deletion, but in the families where
the individual was the only one affected, these were new mutations that
the child had that neither parent had, and that explained why these individuals were
the only members of the family that were obese.
Now, we cannot necessarily treat these particular diseases at the moment, but a diagnosis is
very powerful in these families, and in this particular example it was extremely powerful
because the families at the top, several members of those families, several children in those
families, have been taken into social care because they have been regarded as cases of
parental neglect. You've got a morbidly obese parent, often a morbidly obese sibling and you've
got another child becoming morbidly obese. These kids were then given back to the families,
so it could be treated but it obviously had a major impact on those families' lives.
This is just one example of one rare disorder where we found that these copy number variants
were important. But the types of copy number variants we're talking about here are not
common. They are very rare. In fact, extremely rare. Fewer than one in a thousand individuals
will have one of these rearrangements.
And that's what we see in common with many different rare disorders: CNVs are not
necessarily the only cause, but they account for an appreciable portion of many of the
rare diseases.
So, why do we see this difference? Why are rare gains and losses of DNA important in
rare disorders, but pretty unimportant in common diseases?
So I think the reason is probably just down to the sheer weight of numbers. So if we think
about common diseases, so remember these are – the genetics of common disease, we can
only look at the common variants, those that have a subtle effect on the risk of disease,
and if we look at the – all the common variants in the people in this room, we’ll see that
the vast majority of them are not CNVs, for every common CNV we've got a thousand other
common variants of other types. And all of those have been through the filter of natural
selection. They have only become common in the population because they are not highly
harmful to us. And so, it is perhaps unsurprising that when we look for common variants that
influence common disease, CNVs do not play a major role, but if we look in rare disorders,
if we look at rare variants that essentially knock out an entire gene, then actually, copy
number variants probably account for one in ten of those. And that underscores why probably
10% of rare disorders are caused by copy number variants. So we see this interesting dichotomy,
CNVs are clearly biologically important but it is really the rare ones that are playing
the major role in human disease.
So I want to move on to the second part of my talk now, which is about mutation rates.
So I alluded initially in that obesity study to how new mutations can be important for
disease, but new mutations occur in all of us. All of us have mutations in our DNA that
our parents did not pass on to us. And we need to think a little bit about the journey
that DNA takes, as it goes from one generation to the next. So, this, you may not recognise
it, is you, at a very early stage in your life, when you were just a single cell, after
the fertilised egg. And we know that as you pass on your genes to the next generation,
you pass it on through your sperm, or through your eggs. How does the DNA get from that
original single cell down to the sperm and the eggs? What happens is you get early development,
you get division of these cells, copying of the DNA each time the cell divides because
each cell has the same DNA complement as each other, until we reach the primordial germ
cell, and here the paths of men and women split. So to create eggs, there is an additional
set of genome copies, but essentially all the eggs that a woman has in her life are
in her ovaries when she is born. And all that happens during the menstrual cycle is that
one of those just matures with – and so that means that every single egg that a woman
produces, the DNA in that egg has been copied the same number of times, roughly about
30 times. But it is a different story with men. Men produce sperm throughout the course
of their life, post puberty. And so what you see here is, at puberty, a particular special
form of cells called [inaudible term], and these turn over every sixteen days in the
testes, so 23 times each year, the DNA is getting copied. That means that the sperm
of a 40 year old man has DNA that has been copied probably twice as much,
maybe three times as much, as the sperm of a 20 year old man. So if we think that this
copying process, like all copying processes, is prone to error and maybe that's where mutations
come from, then it suggests not only that most new mutations might come from dads rather
than mums, but also that the number of new mutations might increase as a dad gets older.
And that hypothesis was made a long time ago, because you can make the hypothesis as soon as
you understand this journey of cells from one generation to the next, but we can use
the tools that we have to measure mutation rates and see if this is really correct. So
there are a number of different strategies one could take to measure mutation rate.
In concept, they are all very simple. All you are really doing is comparing the DNA
of one generation to previous generations. So on the left-hand side, you have comparing
the genomes in the sperm of a man to the rest of his DNA, in the middle
you have the obvious example of sequencing the DNA of a child and comparing to their
parents, but equally one could look at the mutations that have happened over evolutionary
time by taking two species that have had a common ancestor. That's a relatively straightforward
experiment to do and it highlighted, when the chimpanzee genome was sequenced and compared
to the human genome, some very intriguing observations. Focusing on this evolutionary
approach first, we look specifically at the single letter changes
and the differences between humans and chimps. What was observed was that the number of differences
you see between humans and chimps depends on which chromosome you are looking at. So
at the top here, you've got the X chromosome, in the middle, you've got all the other chromosomes
bar the Y chromosome, which is down the bottom. And if you compare the human and chimp genomes,
what you see is on the X chromosome there is a difference between humans and chimps
about one every hundred letters of DNA, but at the bottom, the Y chromosome, there is
a difference about one every 50 bases, so this suggests the Y chromosome has been mutating
twice as fast as the X chromosome. And the other chromosomes are in between.
Now, if we think a little bit about how these chromosomes are passed on from one generation
to the next, this fits with the hypothesis that I mentioned before, that most mutations
come through the male line. Because mothers pass their X chromosome on to their daughters,
and to their sons. Fathers only pass their X chromosomes on to their daughters, and that
means the X chromosome's journey through evolution spends two-thirds of its time in the female
germline and one-third in the male germline. If we compare that to other chromosomes,
the non-sex chromosomes here, they spend an equal amount of time in the male and female
line, but the Y is only ever passed on from father to son and it has the greatest divergence
between humans and chimps, so this very much fits with the idea of males being more mutagenic
than females.
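The argument above can be turned into a small worked calculation. This is a sketch under a deliberately simple assumption that is not spelled out in the talk: the divergence on each chromosome is a time-weighted average of a male rate and a female rate, with nothing else differing between chromosomes. The input figures are the rounded ones quoted above (one difference per 100 letters on the X, one per 50 on the Y), and the function names are illustrative.

```python
from fractions import Fraction

# Share of evolutionary time each chromosome type spends in the male line.
MALE_FRACTION = {
    "X": Fraction(1, 3),
    "autosome": Fraction(1, 2),
    "Y": Fraction(1),
}

def solve_rates(x_divergence, y_divergence):
    """Infer male and female contributions from X and Y divergence."""
    male = y_divergence  # the Y is only ever carried by males
    x_male = MALE_FRACTION["X"]
    # X divergence = (1/3) * male + (2/3) * female, solved for female.
    female = (x_divergence - x_male * male) / (1 - x_male)
    return male, female

def expected_divergence(chromosome, male, female):
    """Time-weighted average of the male and female contributions."""
    share = MALE_FRACTION[chromosome]
    return share * male + (1 - share) * female

male, female = solve_rates(Fraction(1, 100), Fraction(1, 50))
male_to_female_ratio = male / female
autosomal = expected_divergence("autosome", male, female)

print("male:", male, "female:", female, "ratio:", male_to_female_ratio)
print("predicted autosomal divergence:", autosomal)
```

Under these assumptions the male line comes out several times more mutagenic than the female line, and the predicted value for the other chromosomes falls between the X and the Y, matching the pattern described above.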
So through this kind of study, and others, we have the picture of what the average genome
might look like, so your genome here, were we to sequence it now, we would find there
would be about three-and-a-half million variants that differentiate your genome from the person
next to you, and indeed from the reference human genome. If we focus just on the genes,
we would find a much smaller number, we would see about ten thousand variants in those genes
that would affect the protein that those genes produce. So a much smaller number, and if
we focused specifically on how many of those three-and-a-half million variants are new
mutations, we would find that each one of us has somewhere in the region of 50 to a
hundred new mutations in our genome that our parents didn't have. Now most of these mutations
are the single spelling errors, single base changes. And what we know from the
studies comparing humans and chimps and others is that the mutation rate for a given
letter of DNA is about one in a hundred million per generation. Now that's pretty good, given
the number of times that DNA has been copied going from one generation to the next,
that only one in a hundred million bases is being mutated. A hundred million is a
big number, but it is a lot smaller than the number of humans on the planet today, 7 billion.
And what that means is actually every single base in the human genome, in the reference
sequence, has mutated tens of times in the humans that are currently living on the planet
today.
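The arithmetic behind these figures is short enough to write out. The sketch below uses the rounded numbers from the talk (a rate of one in a hundred million per letter per generation, 3 billion letters, 7 billion people). Counting two genome copies per person, one from each parent, is an assumption added here, as are the variable names.

```python
from fractions import Fraction

MUTATION_RATE = Fraction(1, 100_000_000)  # per letter, per generation
GENOME_LETTERS = 3_000_000_000            # letters in one copy of the genome
COPIES_PER_PERSON = 2                     # assumption: one copy from each parent
POPULATION = 7_000_000_000                # people alive today

# New mutations expected in one person's genome.
new_mutations_per_person = MUTATION_RATE * GENOME_LETTERS * COPIES_PER_PERSON

# Times any one position is expected to be newly mutated across everyone alive.
hits_per_position = MUTATION_RATE * COPIES_PER_PERSON * POPULATION

print("new mutations per person:", new_mutations_per_person)
print("new mutations per position, whole population:", hits_per_position)
```

The first number lands inside the range of 50 to a hundred new mutations per person, and the second shows that every position has been hit many tens of times among people living today.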
Now, if we think about other forms of variation, so the gains and losses that I mentioned to
you before, then actually these occur at a much lower rate, so it is probably only one
in 20 of us in this room will have a deletion or a duplication of a segment of DNA that's
longer than maybe, say, a thousand bases, that's new that our parents didn't have. And
we can understand a bit more about that process by taking a different strategy for measuring
mutation rates that I mentioned before: comparing the sperm of a man to the rest of his DNA.
And what we need to do this, because each mutation process is vanishingly rare (the
types of deletions that I'm going to describe to you mutate a bit more rapidly than
a single base, it is more like one in a hundred thousand), is to design assays that
are essentially capable of picking out the one sperm that has a deletion against a background
of 99,999 sperm that have no deletion whatsoever. A really fantastic postdoc who
worked in the lab, Dan Turner, actually designed eight of these different assays to look at
different portions of the genome, and I'm going to describe to you what he found. He
looked at four regions of the genome where we know that there are deletions or duplications
that cause genetic disorders. And he designed assays for each one of these to work out how
fast these are occurring in a man's sperm.
What he found was that the most rapid one was occurring in about one in 24,000 sperm, this
deletion here, and the slowest one, this duplication here, in about one in a million sperm. Now,
the individual sperm donors that Dan looked at were just normal men, drawn from the population.
They didn't have a genetic disorder. That means that all of the men in this room have
sperm in their testes now swimming around with these deletions and duplications
in them. But we could also determine something of medical relevance from this, because we
could compare the frequency with which we see these mutations in sperm with the frequency
of the disorders that they cause. If we looked at two disorders that we think
are pretty well diagnosed, where we have a good sense of how frequently
they occur in the population, then they actually agree very well with the mutation rates that
we identified from sperm. But if we looked at two other disorders, here shown in green,
actually, what we observed was that the mutation rate in sperm was considerably
higher than the frequency with which these disorders are being diagnosed, so we hypothesised
that actually these are being underdiagnosed, and subsequent studies have shown that to
be the case. We also investigated this very rare event here, which had never previously
been reported in humans, but we thought probably, from our understanding of the mutation process,
would exist, and because this duplication is very similar to other duplications that
we know cause disorders, we predicted it would cause a particular type of developmental disorder.
And subsequent to us publishing this, Jim Lupski's group have gone on to show that's
the case. What it shows is if you have a good understanding of the biology of mutation,
you can predict diseases that you haven't observed yet.
We also wanted to think about mutation as a biological process itself. A bit like height.
We know that height is influenced by environment, we know that height is influenced by genetics.
Is the mutation rate also influenced by environment and genetics? We
looked at multiple different sperm donors for one particular rearrangement, this fairly
rapid deletion that I mentioned before. And we saw quite a lot of variation. But these
are all quite rare events. You have to sift through millions of sperm to find a few tens
of events, so is this variation just random or not? So we thought one way to answer this
is by looking at sperm from twins. So if we look at identical twins, and they have very
similar mutation rates, that suggests there is something systematic that's influencing
those mutation rates, but if they have dissimilar mutation rates, that suggests that genes and
possibly a shared environment are not so important.
So we measured the rate in the first twin and the second twin of an identical pair and
if there is no relationship between the two we would expect to see just a flat line. No
obvious relationship, if the first twin has a high mutation rate the second twin could
have a low mutation rate. What we actually observed was the opposite. If one twin had
a high mutation rate the other twin had a high mutation rate.
Now, it is quite hard getting sperm from twins. Harder than you would think. Quite often because
one of them has had a vasectomy. And it is impossible to get sperm from twins that are
all the same age. So we've got pairs of twins that are different ages. So if age is an important
determinant of mutation rate, then maybe that's just explaining why some of these twins have
very similar mutation rates. So we looked at that, but actually, what we observed, was
there is absolutely no relationship between the age of a sperm donor and the rate of deletions
that occur. So this is a different mutation process, from the one I described previously
of single base changes, and this one doesn't appear to have a relationship with age. But
if we want to understand the mutation process that's most prevalent, the one that's generating
the single base mutations that I mentioned before, then we need to do the very obvious
experiment of looking at children and comparing them to their parents.
So, we've done this, and it is a quite simple experiment, what you essentially do is you
have mother, father and child, you have a DNA sequencing machine, hopefully
one with a nice neon light on, and you just look for the new mutations that the child
has that the mother or father doesn't have, and then you need to use a few genetic tricks
to try and work out which ones came from dad and which ones came from mum. And the first
two children that we looked at in this way, as part of a large-scale international consortium
called the 1000 Genomes Project, these cartoons represent an awful lot of work, but
essentially in the first child that we looked at, we could, indeed, see most of the mutations,
and each one of these dashed lines indicates a mutation, most came from the dad, and a
small number came from mum. And then we looked at the second child, and we actually found
that more of them came from mum, and fewer from dad, so there is quite a lot of variation
between both of the mothers and how many mutations, new mutations arose on the chromosomes that
they passed on, and between the fathers. So we thought we need to try and explore this
variation. Why is there this much variation? Well, some of it could occur just purely by
chance. Because we're looking at relatively low numbers of events, 50, so if the average
number of mutations in my sperm at this moment in time was 50, not every single one of those
sperm would have 50, some would have 40, some would have 60, and if you think about the
mutation process and how it works, broadly speaking, it is quite reasonable to expect
quite a lot of variation between the sperm, even within one man at one point in time.
So what we wanted to do, to try and get around this variation, was to look at families,
where the same mum and dad had had kids over quite a period of time. This is all published
data. So we looked at three families that had had four children over more than a decade's
worth of time, and the simple idea was we wanted to compare the mutations that each
child got from mum and dad and see whether it changes between the oldest and the youngest
child. And what we observed from these three families, so each one of these lines is the
number of mutations we saw in the children in each one of those three families, as the
father got older. So you can see in each one of these families, there is a clear relationship
between the age of the father, and the number of new single base changes that we see in
the children. But strikingly, it’s not the same pattern between the families. So these
two families here, for every year that the father gets older, his sperm appear to be acquiring
three mutations, whereas in this family here, it is fewer than one-and-a-half mutations
per year. And other people have estimated this in large numbers of samples, and they
have estimated that a population average of about two new mutations every year is what
we expect to see, but this evidence suggests that it may not actually be the same between
families or between individuals, so we need to do more work to understand whether genetic
or environmental factors are influencing these mutations.
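The earlier point about chance variation around an average of 50 mutations can be sketched numerically. This assumes, purely for illustration, that the number of new mutations per sperm follows a Poisson distribution; the lecture does not name a statistical model, and the function names here are hypothetical.

```python
import math

MEAN_MUTATIONS = 50  # average number of new mutations used as the example in the lecture


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson model (computed in log space)."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def prob_between(low: int, high: int, mean: float) -> float:
    """Probability that the count falls between low and high, inclusive."""
    return sum(poisson_pmf(k, mean) for k in range(low, high + 1))


# For a Poisson distribution the standard deviation is the square root of the mean
spread = math.sqrt(MEAN_MUTATIONS)
share_40_to_60 = prob_between(40, 60, MEAN_MUTATIONS)

print(f"standard deviation: about {spread:.1f} mutations")
print(f"share of sperm with 40 to 60 mutations: {share_40_to_60:.2f}")
```

Under this assumed model the typical spread is about seven mutations either side of the average, so counts of 40 or 60 would be unremarkable. That is why a single child per family cannot separate chance from a real difference between parents, and why following the same parents across several children is more informative.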
So, we also have been able to look at what fraction of mutations was from the mum and
what fraction was from the dad, and as we expected most are from the dad, so that hypothesis
that more mutations would be from dad, and that there would be more mutations as dad
got older, really seems to hold true for these single base changes here, but it doesn't hold
true for the particular mechanism of gains and losses of DNA that I showed you
before. So why do we see this different effect of the father's age between these two mutation
processes? Well, to understand that, we have to go back to the journey that DNA takes
as it moves through a generation. So, if we look at this, this process here, these small
changes of a single letter, then these can actually occur at any stage in this, because
these can occur at any time that DNA is copied as it goes from one cellular generation to
the next. Whereas what I didn't tell you about the mechanism that generates this type of
variation, is that it only occurs in a very specific cell division that only occurs once
during the maturation of an egg, and once during the maturation of a sperm, and so this
type of variation doesn't have the same paternal age effect, and it also doesn't necessarily
come more from dad. So there are actually different mutation processes, each one with
its own properties that we need to understand, rather than a one size fits all.
So just in the last few moments I just want to talk a little bit about the disease impact
of these new mutations. I gave you a couple of examples earlier of those families that
had children with extreme early onset obesity, but new mutations are increasingly being recognised
as a cause of rare disorders, especially rare developmental disorders, and I and my colleagues,
some of whom are here, Caroline Wright and Helen Firth, have been working on a project
called the Deciphering Developmental Disorders project, which is a collaboration
with the entirety of the NHS, and the genetic services within Ireland, to try and see if
we can use the kind of technologies that we have access to, those arrays that I showed
you, the sequencing machines that we have access to and the NHS doesn't, and try and
use them to diagnose children that the NHS cannot currently diagnose. So the nature of
this kind of clinical problem is that most of the time the child comes in to a clinical
genetic centre with a severe developmental disorder, the parents are perfectly healthy,
and so we can ask the question, are new mutations one of the reasons why most of the
time the child is the only one affected? And if we understood the genetic architecture
of these different disorders, because there is a whole set of different disorders involved
here, then we could better inform the NHS what kind of technologies they ought to implement
to cost-effectively diagnose these children. Now, recognising that for many of these children
there won't be cures available for these disorders, a diagnosis can still have a massive impact on families,
in terms of informing them about what the risk is of having a second child with
the same disease, and potentially offering them pre-implantation genetic diagnosis, to
avoid having that disease in the second child, but also many of these children are misdiagnosed
and they are on inappropriate treatments, so the families are really desperate for a
diagnosis, and we're working very hard to try and use these new technologies to provide
them. So the ultimate aim of this project is to recruit into the study 12,000 families,
each with a child with a rare developmental disorder. And thus far, having analysed the
first thousand of those families, we can diagnose about 20% of these children, just with our
current understanding of what kinds of mutations can cause disease, and we're finding that
most of those diagnoses are new mutations, as we might predict, and the reason we can
pick those up is because in this study we're looking at the DNA of the children, alongside
the DNA of their parents. So most of these mutations are new and most of them are the
small, spelling kind of errors of single base changes that I mentioned before. So, the question
then arises, we've had these new technologies, we've made new discoveries, what new ideas
stem from this? Well, and this is somewhat provocative, we can think now with our current
understanding, what would it take to minimize the morbidity caused by these new mutations?
What kind of things could we do as a society, or should we consider doing as a society that
might enable us to minimize them? Well, the first thing that we understand, many of these
disorders are caused by new single base changes, and we know that older men have more single
base changes than younger men. So actually, one simple thing we could do would be to all
donate sperm to ourselves aged 18 or 20, freeze it, and only then use that sperm to conceive
children, and of course, the problem with this is we cannot just target this at individuals
at risk, because all of us have new mutations and all of our children have new mutations,
so it is not possible to identify who is going to be at risk, it has to be a population wide
strategy. Because the number that I mentioned before, a hundred million, one in a hundred
million per generation, that's a large number, but it is smaller than the number of people on the
planet, and it is also smaller than the number of sperm in the testes of every man in
the room. What that means is that among the sperm of every man in the room are sperm carrying
every single disease mutation that we know about. And that means it is purely a matter
of chance whether we go on to have children with developmental disorders caused by these
new mutations, so it is not possible to identify those that are at higher risk. Other than,
of course, parental age, but of course one doesn't have a time machine to go back and
get sperm from one's 18-year-old self once one has decided to have children aged 40. So the
other approach that one could potentially take is using prenatal screening. Now, we
already screen prenatally for developmental disorders, and we do this using
ultrasound, and at 20 weeks there is a scan that is offered to parents, most parents take this
up, but this will only pick up developmental disorders that manifest themselves as some
kind of large-scale structural change within the fetus that can be picked up by ultrasound.
It won't pick up for example whether that individual might have seizures for the rest
of their lives or never be able to speak. It won't capture those functional deficits.
It is also very hard to counsel parents about what to do in that situation because when
there is a structural problem it could be that that structural problem comes along with
all kinds of other intellectual disabilities, or it could be that that structural problem
is just in isolation and it could be easily repaired during the first few years of the
child's life who would then go on to have a happy life and currently parents who have
these scans that reveal that there is a structural problem with the fetus, have a very difficult
decision to make, and potentially, we can make that decision more accurate or give them
more precise information, if we added genetic screening into this, but this clearly is something
not everyone will be comfortable to do, and it is not the job of scientists to tell society
what society should do, but it is the responsibility of scientists to say what society could do,
based on our current understanding of what is causing these sometimes devastating disorders.
So these are relatively new ideas that need discussion and debate, and that kind of squares
the circle of what Sydney Brenner was talking about, which was that new technologies beget new
discoveries, beget new ideas. And recognising that there are many things that we do today
that historically or prehistorically would have been regarded as being absolutely abhorrent,
and we take completely for granted, and this gives me a shameless opportunity to show my
favourite Raymond Briggs cartoon as he proposes to his father that perhaps floppy trousers
rather than these chiseled stone pants that his father is wearing might be a more appropriate
way of dressing. So I think Sydney was very much right, there is a lot of value in scientific
progress to exploring what new technologies enable you to see about the world that you
didn't previously appreciate, and then deriving from those new ideas that then beget further
progress. And with that, I would like to thank all of you for listening, I would especially
like to thank many of the people I didn't have time to thank in the talk that we've
worked with over the course of the years, all of what I have described to you is very
much teamwork between different people, between clinicians, between researchers, between
computer scientists, and it is only because of the hard work of all of those people
that I'm standing here in front of you. I mentioned a few of them, as I went along, but there are
many others, some in this room, to whom I apologise in advance. And it is also, I think, extremely
important that we thank the families who contribute to these studies, both the families who are
perfectly healthy and volunteer for the research that we do on understanding mutation rates,
and the families that are desperate for a diagnosis and volunteer to be part of studies
like the DDD study that I mentioned before. So, last month we tried to give back to these
families, in some small way through charity bike rides, we organised nine around the country,
with the clinicians, and the researchers involved in the project, and between us, we cycled
about 4,000 miles that weekend, which is pretty much the equivalent of Land's End to John O'Groats,
back to Land's End, back to John O'Groats and back again. So these two family support groups,
SWAN UK, that's Syndromes Without A Name, and Unique, do a completely invaluable job
of working with these families to help them negotiate this tricky path as they try
to find a diagnosis for their children, and the prize money for this lecture that Jean
will hopefully give me in a moment will be going to this fund and I recognise that not
all of you can read that, but I have a whole set of leaflets that I will be putting outside,
that contain that URL, and I would like to encourage you to support these charities because
they are very worthwhile, what they are doing for these families, and so with that I would
just like to say thank you to you again and be happy to take any questions or comments.
[applause]
We can thank him properly later but Matt said he would take questions so if you have a question, put your hand up.
So here. Yes?
Thank you very much for a wonderful talk. Chromosome abnormalities: you didn't really
talk about them. As far as Down's syndrome is concerned, I believe it is the mother's
age that is more important. I was wondering if you could comment on that kind of abnormality.
Yes. So it's very well recognised that Down's syndrome and the other [Inaudible] do
increase with mother's age, and they don't increase in this kind of linear way, as with father's age,
but they occur in a kind of S shape so it ramps up dramatically after 35. That's extremely
well known and it is interesting to reflect on the fact that the way in which we orchestrate
prenatal screening in this country takes account of the mother's age. It recognises that epidemiological
relationship. There is no similar equivalent of taking account of father's age, for example.
And yet it is an open question as to whether a father's age and new mutations of single
base changes is actually more detrimental than maternal age in causing these
chromosomal abnormalities, but that's hopefully something we'll be able to answer in the next
couple of years.
Are there any environmental factors that can affect the rate of mutation?
That's a very good question. So, people have looked very hard, and found precisely there
are no recognised factors, environmental factors that increase germline mutation rates as they
pass on, but it is quite difficult work to do, because the mutations are quite rare.
There are several tens of factors that are known in mouse studies that increase mutation
rates, and so we would assume the same applies in humans, but they are the kinds of experiments that one cannot do
on humans, certainly not ethically, anyway. And so it is highly likely that there are
environmental factors, but we cannot do those experiments, so we just have to rely on natural
experiments of, often of our quote-unquote, "ingenuity", things like the Chernobyl accident
or nuclear test sites or exposures to other types of environmental mutagens accidentally.
The nature of those experiments, it is often very difficult to do them because we don't
really know what dose people received because it is not a controlled experiment. So there
are likely to be environmental mutagens, and we don't know what they are.
Any more questions? Yes? Here. Thanks.
Where you are doing the prenatal screening, presumably you will only suggest rejection
of a fetus or an embryo which has a recognised lesion, which is going to cause a disease.
You won't just ask for an embryo that has got a lot of changes which don't go to particular
areas that you recognise to be thrown away? Because if you do that, aren't you reducing,
in the long-term, the genetic variation of the human population, and its ability to evolve?
So, I mean, firstly, I guess I would say that the way in which, and I think this is the
right way in which prenatal screening is done now, is that it is non-directive. Parents
get to choose, it is up to the parent to choose what to do with the results, and scientists, doctors,
no-one should really be telling them what to do, and all we can be doing is giving them
the most accurate information to make those choices. I think the point that you made,
I think given the sensitivity of prenatal diagnosis, if it were to be implemented, it
would have to be absolutely rock solid evidence that we knew that that variant was really
going to cause a completely deleterious change, but that's my view. And we already know from
prenatal screening that there will be, you know, fetuses with heart defects that parents
will decide to keep, and fetuses with heart defects that parents will decide to terminate,
and that's their choice. I think that non-directive view of this is very important. And I guess
as to your second question about evolution, I'm not too worried about that. I think there
are plenty of new mutations being produced all the time, every single base is being mutated
in the genome as we speak. So there is plenty of fertile material for evolution.
Is there one? Yes.
I have a question about sequencing. When you say something is something like CCT, can
it also be read as GGA because that's the other side?
Yes. So typically, we're only showing one of the two strands of DNA.
So which ones do you show first? When you give them out and publish them, which comes
first?
It depends exactly on the nature of the publication, so it depends if your gene goes that way along
the chromosome or that way along the chromosome. But typically, if you don't have an imposed
direction we tend to go from left to right.
Well, which of the two halves do you take? A or T?
We take the one that is – so you can imagine, one DNA strand going in that direction and
the other going in that direction, we tend to take that top strand and report it.
Okay. I think we're past our finishing time, so we should stop there. But I want to thank
Matt for a lecture that was packed full of information, bang up-to-date, and thought
provoking, and I think that is reflected in the sort of reflective attitude of the audience
now on the sort of questions that we've had. I think it has been a splendid lecture, and
in order to recognise that, he is going to get a cheque, but first of all, he is going
to get a nice certificate, so thank you very much. [applause] And a very nice medal, the Francis
Crick medal. [applause] and as he said, he is going to donate to charity, which I think
is wonderful. Thanks.