In our last lecture we introduced the concept of a zero of a nonlinear function. Let us briefly review what we have done on this topic and then proceed from there.
We are considering an equation in one variable, f(x) = 0. The function is either a polynomial or a transcendental function, that is, some combination of polynomials, exponential functions, trigonometric functions and logarithmic functions. We are interested in finding the zeros of such a polynomial or transcendental function.
We have defined a zero or a root of a function f(x) as any value x = ξ which satisfies the equation identically, so that f(ξ) = 0. Now I would like to distinguish between a simple root and a multiple root of a given polynomial or transcendental function. If x = ξ is a simple root, then f(ξ) = 0 but the derivative f'(ξ) is not zero. For example, x = 1 and x = 2 are simple roots of the polynomial equation which can be factorized as (x - 1)(x - 2) = 0.
Similarly, if ξ is a root of multiplicity m, then f and its derivatives up to order m - 1 vanish at ξ while the m-th derivative at ξ is not zero, and we say that ξ is a root of multiplicity m. If you set m = 1 you go back to the simple root: f(ξ) = 0, f'(ξ) ≠ 0. Now
if ξ is a root of multiplicity m, theoretically it is possible for us to take the factor out, which means I can write f(x) = (x - ξ)^m g(x), with g(ξ) ≠ 0. For a transcendental function it may not always be possible to actually extract this common factor, but for a polynomial, yes, we can do it; and theoretically f(x) can always be written in this form. We also gave an alternative characterization of a multiple root, so that if you have a method to obtain a simple root, the same method can be used once you construct a new function which has ξ as a simple root. To use that idea, note that if f(x) has a root ξ of multiplicity m, then its derivative f'(x) has the same root ξ, but of multiplicity m - 1.
So if I now consider the ratio of these two functions, h(x) = f(x)/f'(x), then h(x) has a simple root at ξ. Now as I said earlier, if I am able to construct methods which find a simple root easily, then I can use the same methods to find multiple roots as well: I apply the method to the function h(x), obtain the simple root of h(x), and hence the multiple root of f(x) = 0. As an example we took a root of multiplicity three, where we are able to factorize and take the factor (x - 2)^3 out. Now what is it that we are really looking for? We are talking of a zero or a root of an equation, and what we mean is this: if I draw the graph of the function y = f(x), the graph cuts the x axis in one way or another, and we are interested in finding the point of intersection of the x axis with the graph of y = f(x).
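As a quick numerical check of the multiple-root device above — a sketch, with f(x) = (x - 2)^3 assumed for illustration — we can verify that f and f' both vanish at ξ = 2, while h(x) = f(x)/f'(x) reduces to (x - 2)/3, which has only a simple root there:

```python
def f(x):
    # Root of multiplicity three at xi = 2
    return (x - 2) ** 3

def f_prime(x):
    # The derivative has the same root xi = 2, but of multiplicity two
    return 3 * (x - 2) ** 2

def h(x):
    # h(x) = f(x) / f'(x) simplifies to (x - 2) / 3: a simple root at 2
    return f(x) / f_prime(x)
```

Any method that handles simple roots can then be applied to h instead of f.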
Now if you want to construct methods for this, there are two types of methods: direct methods and iterative methods. In a direct method we can produce the solution in a finite number of steps. It means we will be able to count the total number of operations in a particular method and say you need this many multiplications, this many divisions, this many additions and subtractions. Usually, when giving an operation count, we give the major operations, which are the multiplications and divisions; the minor operations are the additions and subtractions. We know that additions and subtractions take much less time than multiplications and divisions. Sometimes we include all four, but very often the operation count means the major operations only.
Now in an iterative method we use the idea of successive approximation to a root. We start with an initial approximation, obtained in some other way, and starting from it we generate a sequence of approximations x1, x2, x3, ..., which we hope will converge to the exact root. We will have a look later at when it will converge and when it will not converge.
Now in the example that we have just given — the polynomial equation a0 x^2 + a1 x + a2 = 0 — if I straight away solve it and write the roots in closed form, this will be called a direct method: I am able to write down the solution of the problem in just one step, and if I want both roots I can write it in two steps. Now, as I said last time, we will be able to count the total number of multiplications. In the discriminant a1^2 - 4 a0 a2 there are three multiplications (a1^2 can be taken as a1 times a1); there is one division, there are two additions, and there is one square root. So I am able to give the total operation count, and hence tell what the total amount of compute time will be: if I am given the time factor for a multiplication and a division, I can multiply suitably and say this is the total amount of compute time that this method will take for solving this particular problem.
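As a sketch of this direct method (assuming a0 ≠ 0 and a non-negative discriminant), the operations being counted can be read off directly from the code:

```python
import math

def quadratic_roots(a0, a1, a2):
    """Direct method for a0*x**2 + a1*x + a2 = 0: the solution comes out
    in a fixed, countable number of operations (assumes a0 != 0 and a
    real discriminant)."""
    disc = a1 * a1 - 4 * a0 * a2       # three multiplications
    s = math.sqrt(disc)                # one square root
    return (-a1 + s) / (2 * a0), (-a1 - s) / (2 * a0)
```

For example, quadratic_roots(1, -3, 2) returns the two simple roots 2 and 1 of (x - 1)(x - 2) = 0.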
Now if I take the same example, a0 x^2 + a1 x + a2 = 0, I can write down a number of iterative methods. Which of them would converge is not known in advance; we have to do the convergence analysis and see which would converge and which would diverge. For example, in the first case I take the two terms a1 x + a2 to the right hand side and write a0 x^2 as (a0 x) times x; dividing through by a0 x then gives the iteration x_{k+1} = -(a1 x_k + a2)/(a0 x_k), with the new approximation on the left and the old approximation on the right. Similarly, in the second case we keep the middle term on the left hand side, take the first and third terms to the right hand side, and write the iterative method x_{k+1} = -(a0 x_k^2 + a2)/a1. In the third case we take x common from the first two terms and take the remaining quantity to the denominator, giving x_{k+1} = -a2/(a0 x_k + a1). So this describes three iterative methods for solving the same problem, and all of them are of the form x_{k+1} = g(x_k): the new approximation here and the old approximation over there. We will see later which of these will converge.
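As a sketch of the fixed-point idea — the rearrangement below, x = -a2/(a0 x + a1) applied to x^2 + x - 1 = 0 so that g(x) = 1/(x + 1), is my illustrative reading of the third case — one of the three iterations does converge:

```python
def fixed_point(g, x0, n):
    # Iterate x_{k+1} = g(x_k) a fixed number of times.
    x = x0
    for _ in range(n):
        x = g(x)
    return x

# Third rearrangement of x**2 + x - 1 = 0:  x = 1 / (x + 1)
g3 = lambda x: 1.0 / (x + 1.0)
root = fixed_point(g3, 0.5, 50)   # approaches (-1 + sqrt(5)) / 2 = 0.6180...
```

The same driver with the other two rearrangements as g would diverge from this starting point, which is exactly why the convergence analysis is needed.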
We have just now mentioned that an iterative method gives us an infinite sequence of approximations. Now it is very important for us to know: does the sequence converge at all? What is the convergence criterion, and how are we going to impose it? Secondly, since it is an infinite sequence of approximations, when do we stop the iteration? Theoretically we are doing infinite cycles of computation, but we will have to terminate at some stage. The termination criterion could be a particular accuracy, or something else that is given, so that the iteration can be terminated at some stage.
Now let us define convergence. By convergence we mean that the sequence of iterates x_k converges to the exact root ξ: the limit of |x_k - ξ| as k tends to infinity is zero. I can define x_k - ξ as the error at any iteration, because x_k is the value we obtain at that particular iteration and ξ is the exact root. During the iteration I can find the magnitude of this error, and if the error goes to zero as k tends to infinity, we get limit x_k = ξ, which means we are achieving convergence. Even though this is the test, we shall see how we are going to apply it.
Now the second point is when we should stop the iteration. The stopping criterion — the criterion that we have achieved a particular accuracy — is this: let an accuracy ε be prescribed. We can also use the word tolerance for accuracy. Once you prescribe the accuracy that you want in a problem, say five-place or six-place accuracy of the solution, any one of the following criteria can be used. First, we can find the difference between two successive iterates. Starting with x0 we got x1, then x2, then x3; after a few iterations we can start testing |x3 - x2|, |x4 - x3|, and so on. If the magnitude of the difference between two successive iterates is less than the tolerance that is given to us, I can stop the iteration and give the required output. Alternatively, we can stop the iteration procedure when the equation itself is satisfied to a certain accuracy: our equation is f(x) = 0, so at a particular iterate x_{k+1} I substitute and test whether |f(x_{k+1})| is less than the prescribed accuracy, say 10^-6. Once the accuracy is prescribed, I can look at the equation itself and see whether it has been satisfied to that accuracy or not. Therefore I can have this particular criterion also.
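The two stopping tests can be sketched together as a small helper (illustrative, not from the lecture; using both at once guards against the failure cases discussed below):

```python
def converged(x_new, x_old, f, eps):
    # Test 1: two successive iterates agree to within eps (abscissa test).
    # Test 2: the equation itself is satisfied to within eps (ordinate test).
    return abs(x_new - x_old) < eps and abs(f(x_new)) < eps
```

An iteration loop would call converged(x_new, x_old, f, eps) after each step and stop as soon as it returns True.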
Now let us just have a look at what this really means graphically. In the first case I have taken the x axis with the graph of f(x); ξ is the exact root, and x_k is a particular approximation. The successive approximations move towards the exact root, so I have x_{k+1} as a new approximation and I am testing |x_{k+1} - x_k|. As we proceed through successive iterations, x_{k+1} keeps moving towards ξ, and the difference between two successive iterates eventually becomes less than ε. In the second case we are finding the value of f(x_{k+1}), which is nothing but an ordinate of the graph; if I take a particular approximation x_{k+1}, I am testing the value of this ordinate. As we move towards the root, f(x_{k+1}) goes to zero, and finally f(ξ) is exactly zero. So in the second case we are testing the ordinate, whereas in the first case we are testing the abscissa, i.e. whether the difference between two successive approximations is less than ε or not. Now, sometimes you may have to use both, the reason being that either of these criteria may fail.
Let us look at this carefully. The first criterion may fail when the graph of f(x) cuts the x axis almost vertically. Let us see what happens: draw the graph so that it crosses almost vertically, and take two successive points, x_k on the left and x_{k+1} on the right. Even though they are very close to each other, since the graph is almost vertical, f(x_k) and f(x_{k+1}) will be very large; that means the equation f(x) = 0 has not been satisfied to the required accuracy. So when the graph is almost vertical, the first condition, |x_{k+1} - x_k| < ε, can hold while we are still far from satisfying the equation, and as a stopping criterion it fails. Similarly, the second criterion would fail if the graph cuts the x axis almost horizontally — the graph comes in almost tangentially and then crosses. You can see the reason why: if I take two successive approximations x_k and x_{k+1} when the graph is cutting horizontally, then f(x_{k+1}) is almost zero, but the difference between x_k and x_{k+1} can still be large. So the value of the ordinate is less than ε, but we have not got the root, because we may still be far away from it. Of course, if you test both of them together there is no question of failure in either case. In most cases we may not use both, hoping the problem is not one of these two typical cases; but failure is possible, which means that when you are solving a particular problem and you have difficulty, you may have to suspect that the graph of the function is coming in either almost horizontally or almost vertically. Therefore we can use both conditions together to avoid this particular pitfall.
Now for the next step: I want an iterative method, but I also need a starting approximation. Obtaining a starting approximation is very important. Suppose an equation has a root at -10, and I start with an approximation at, say, +5 or +10, which is far away from the root; it may not be possible for the iterative method to converge to that root, because the start is too far away. Therefore the starting approximation should be in the neighbourhood of the root. Neighbourhood does not mean a small ε-neighbourhood; it can be reasonably large. Say the root lies at -10: you may say the root lies between -15 and -8, or between -15 and -5. A reasonably large interval is allowed, but we do need a suitable initial approximation. In fact, in many cases the situation becomes much worse if you have a system of nonlinear equations. Here we have talked of a one-variable problem, f(x) = 0; if I take a two-variable problem, f(x, y) = 0, g(x, y) = 0, then it is a nonlinear system in two variables, and I need a very good approximation for both variables x and y. So this situation becomes even more important for systems of nonlinear equations, which are very commonly encountered in almost all applications.
Now there are two ways of obtaining the initial approximation. One is the graphical method; the other is the use of the intermediate value theorem. Let us see what we mean by the graphical method. If I am able to draw the graph of y = f(x) approximately (we are not talking of drawing the graph exactly), I can use it to get an initial approximation: let the graph cut the x axis at a particular point, and take that point as the required initial approximation. So what we are saying is: take your axes, and since f(x) could be a very complicated function, substitute a few values of x, get the function values, plot a rough graph, and wherever it intersects the x axis take that point as your approximation x0. We shall take that as the initial approximation for starting the iterative method. Once I have an initial approximation I do not need anything else; the method itself will give me the successive approximations.
Now if we are not able to plot the graph of y = f(x) directly, we can think of breaking the equation f(x) = 0 into two parts, writing it in a convenient form as f1(x) = f2(x). For example, if I want an approximation to the root of x - sin x = 0, I can write this as x = sin x and take f1(x) = x, f2(x) = sin x. I know the graph of y = x, and I know the graph of y = sin x; wherever they approximately intersect, I take that as my initial approximation. So this is the idea: write f(x) = 0 in the form f1(x) = f2(x), draw the graph of y = f1(x) and the graph of y = f2(x); the point of intersection is a root, and any value close to this point is an approximation. So we draw the graphs approximately, get the point of intersection, and take that as the approximation. But of course this is not a very practical method; we are not going to use it much in our applications, because it is difficult to draw the graph and read off the initial approximation.
Now let us illustrate how we are going to get the initial approximation. I take the simple example f(x) = x^2 + x - 1 = 0, and I want an initial approximation for it. It is a quadratic equation; I know it represents a parabola, and I will try to draw the graph of the parabola approximately. I write y = x^2 + x - 1 and complete the square: add and subtract 1/4 to get y = (x + 1/2)^2 - 5/4, and taking the 5/4 to the left hand side, y + 5/4 = (x + 1/2)^2. From this I can recognize that it is a parabola with vertex at (-1/2, -5/4), opening upwards, since the coefficient of x^2 is positive. Now I take the x axis and y axis, mark the vertex, take one or two more points, draw the graph approximately, and I get the two crossing points a and b. Depending on whether the positive root or the negative root is required, I can take the approximation accordingly: one value for the positive root and the other value for the negative root. So I can draw the graph of y = f(x) directly in this form, or, as in the example given earlier, take the graphs of y = f1(x) and y = f2(x).
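For reference — these exact values are not worked out in the lecture — the crossing points a and b of this parabola follow from the quadratic formula:

```python
import math

# Crossings of y = x**2 + x - 1 with the x axis: x = (-1 +/- sqrt(5)) / 2
pos_root = (-1 + math.sqrt(5)) / 2   # approximately  0.618, the positive root
neg_root = (-1 - math.sqrt(5)) / 2   # approximately -1.618, the negative root
```

Any rough reading of the sketch near these two points would serve as the initial approximation x0.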
Now let us discuss the other way of finding the initial approximation, the one which is most useful and which we shall use often in our analysis: the intermediate value theorem. We assume, of course, that f(x) is a continuous function on (a, b) and that f(a) f(b) < 0. Then f(x) has at least one real root — in fact an odd number of real roots — in (a, b). What we are really saying here is this: f(a) f(b) negative means one of the ordinates is negative and the other is positive. Either f(a) is negative and f(b) is positive, or f(a) is positive and f(b) is negative; it does not matter which way round the graph cuts the axis.
When the product of these two ordinates is negative, the graph has to cross the x axis, and since the graph crosses the x axis we have located a root in that particular interval. Obviously I have drawn a nice graph with a single crossing, but it is possible that the graph cuts the x axis more than once, in which case we will have located more than one real root; therefore f(x) has one real root or an odd number of real roots in (a, b). This happens particularly when the roots are very close to each other. Say 1, 1.1 and 1.4 are the roots of a particular equation and you have picked the interval 0 to 2; the sign test will only tell you that there is at least one root — one or an odd number of roots — in the interval. So in this way I can locate a root, and once I have located it the initial approximation will be good, that is, close to the exact solution.
Therefore, if necessary, I can refine the interval. Suppose you find that a root lies between 0 and 50; this interval is too large. I can reduce the interval further by testing at 25: does the root lie between 0 and 25, or between 25 and 50? By using the intermediate value theorem repeatedly it is possible to reduce the length of the interval to any length I want, and then I can go for a numerical method and construct the numerical method.
Let us take the same simple example as earlier, f(x) = x^2 + x - 1 = 0, and test a few values. Substituting x = 0 I get f(0) = -1; substituting x = 1 I get f(1) = 1; substituting x = -1 I get f(-1) = 1 - 1 - 1 = -1; substituting x = -2 I get f(-2) = 4 - 2 - 1 = 1.
Now I can see that f(0) is negative and f(1) is positive, so f(0) f(1) is negative; again, f(-2) is positive and f(-1) is negative, so their product is negative. Therefore I have located both the roots of this equation: one root lies between -2 and -1, and the other root lies between 0 and 1. Once I have this, I can immediately go to any iterative method and start the iteration procedure from there. Usually we form a table of values to find the intervals in which the roots lie. This is what we do when we need more than one root: if, say, a polynomial of degree five is given and you want to find out where the roots are and whether all five roots are real, all that you would be able to find by constructing a table of values.
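The table-of-values idea can be sketched as a small grid scan (an illustrative helper, not the lecture's code; it assumes the step is small enough to separate neighbouring roots, and it will miss roots of even multiplicity, where the sign does not change):

```python
def locate_roots(f, a, b, step):
    # Tabulate f on a grid and keep every subinterval across which f
    # changes sign; by the intermediate value theorem each such
    # subinterval brackets at least one root.
    intervals = []
    x = a
    while x < b:
        if f(x) * f(x + step) < 0:
            intervals.append((x, x + step))
        x += step
    return intervals
```

For the example above, locate_roots(lambda x: x*x + x - 1, -3, 3, 1) reports the two bracketing intervals (-2, -1) and (0, 1).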
Now once we locate that a root lies in a particular interval (a, b), I would go for iterating it further. See, there is a root between -2 and -1 and a root between 0 and 1. You could also say a root lies between -2 and 0; trivially that is true, but what is the length of the interval -2 to 0? The length is two, whereas the length of the interval -2 to -1 is one. Therefore (-2, -1) is a more refined interval than (-2, 0), in which the root lies. If you want to iterate, it is also okay to take -2 to 0 and proceed, except that it may take one more iteration, an extra iteration, to get the required accuracy.
Now the simplest and most trivial method is the bisection method. The bisection method is trivial because we apply the intermediate value theorem repeatedly, each time bisecting the interval. We ask: does the root lie between -2 and 0? Yes, the root lies between -2 and 0. Let us bisect it: the midpoint of (-2, 0) is -1. Now I test whether the root lies between -2 and -1, or between -1 and 0, and I find that the root lies between -2 and -1. So, starting with a particular length of interval, each time I bisect the interval and go on repeatedly applying the intermediate value theorem; we shall call this the bisection method.
Now what we really mean is this: the root lies between a0 and b0, therefore the product of the ordinates f(a0) f(b0) is less than zero. Now I bisect the interval and write the midpoint as a1 = a0 + (b0 - a0)/2. Here I may make a comment: I have written the middle point of (a0, b0) not as (a0 + b0)/2 but as a0 + (b0 - a0)/2. Of course, if I simplify in the ordinary way both are the same; but if you are computing with finite word length, you can take the simple example of four-digit numbers and work out (a0 + b0)/2 and a0 + (b0 - a0)/2: the second value is going to be much more accurate. So when you are doing finite arithmetic where the word length is very small, (a0 + b0)/2 is an inferior value compared to a0 + (b0 - a0)/2. Even though mathematically both are equivalent, for computational purposes I would take the middle point of the interval (a0, b0) by this particular formula.
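Here is a small simulation of that point, assuming a hypothetical machine that rounds every result to four significant decimal digits; on such a machine (a0 + b0)/2 can even land outside the interval, while a0 + (b0 - a0)/2 stays inside:

```python
def fl(x, digits=4):
    # Round x to `digits` significant decimal digits, imitating a
    # short-word-length machine (an illustrative assumption, not the
    # lecture's exact example).
    return float(f"{x:.{digits}g}")

a, b = 0.6666, 0.6668

naive = fl(fl(a + b) / 2)           # fl(1.333 / 2) = 0.6665, outside [a, b]!
safe = fl(a + fl(fl(b - a) / 2))    # fl(0.6666 + 0.0001) = 0.6667, inside
```

In exact arithmetic both formulas give 0.6667, but the rounded sum a + b has already lost the digit that the difference b - a preserves.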
Now once I bisect I get two intervals, (a0, a1) and (a1, b0), provided f(a1) is not equal to zero. Of course, if f(a1) is zero then a1 is a root, by definition. So if f(a1) is not zero, I will determine whether the root lies in the interval (a0, a1) or in (a1, b0). We can test that immediately, again by the intermediate value theorem: I test whether f(a0) f(a1) < 0; if it is so, then the root lies in (a0, a1), and otherwise it lies in (a1, b0). We do not have to test the other product, because there are only two intervals: if the root does not lie in one, then it lies in the other. So I will test only the first one. Now we shall repeat this procedure until the required accuracy is obtained.
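Putting the pieces together, a minimal bisection sketch, assuming f is continuous with f(a) f(b) < 0 on entry:

```python
def bisect(f, a, b, eps):
    # Repeatedly halve [a, b], keeping the half across which f changes
    # sign; the midpoint is written a + (b - a)/2 as in the lecture.
    while b - a > eps:
        m = a + (b - a) / 2
        if f(m) == 0:
            return m, m               # landed exactly on a root
        if f(a) * f(m) < 0:
            b = m
        else:
            a = m
    return a, b
```

Called as bisect(lambda x: x**3 - 5, 1.0, 2.0, 1e-9), it returns a bracket of width at most 1e-9 around the cube root of 5.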
Now there is something very interesting in this bisection method: each bisection reduces the length of the interval by a factor of two, so after n bisections the length of the interval is reduced by a factor of 2^n. That means if I start with the interval 0 to 10, and let us suppose the root is somewhere near zero, then the next interval is 0 to 5 — a factor of two. Dividing by a factor of two again gives 0 to 2.5, so it is now reduced by a factor of 2^2. Each bisection reduces the length by a factor of two, so that after n bisections it is reduced by a factor of 2^n.
Now let us say that an error tolerance is given, i.e. an accuracy is prescribed. If I prescribe a tolerance of, say, six-place or eight-place accuracy, can you tell the total number of bisections that would be required? It is possible for us to do that in this case.
After n bisections, the length of the interval is reduced by the factor 2^n; therefore the current interval has length (b0 - a0)/2^n, and I want this length to be less than ε. Suppose finally I say that the root lies between 0.156 and 0.156525; these two agree to three decimal places. So we are finally going to locate the root in an interval whose endpoints differ by less than the given accuracy ε: the length of the final interval, (b0 - a0)/2^n, is less than ε. Once this is given — b0 is known, a0 is known, and ε is known — I can determine n from here. I take the logarithm on both sides: log(b0 - a0) - n log 2 < log ε. Then I solve for n: bringing log ε to one side and n log 2 to the other, n > (1/log 2)[log(b0 - a0) - log ε]. So n is the number of iterations required; this states that I need a minimum of that many iterations. I round up to the next integer and say that this is the number of bisection iterations that will be required to get the accuracy ε.
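This bound is easy to evaluate; here is a sketch, rounding the right hand side up to the next integer as described (in the rare case where the ratio is exactly an integer, the strict inequality would ask for one more step):

```python
import math

def bisections_needed(a0, b0, eps):
    # Smallest integer n with (b0 - a0) / 2**n < eps, from the bound
    # n > (log(b0 - a0) - log(eps)) / log 2.
    return math.ceil((math.log(b0 - a0) - math.log(eps)) / math.log(2))
```

For the interval (0, 1) this gives 7 bisections for ε = 10^-2, 10 for 10^-3 and 14 for 10^-4, matching the counts worked out below.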
Now there is something interesting here. The first thing is that (on the previous slide) the right hand side gives the minimum number of iterations required, because we are saying n is greater than that quantity. But usually it is the exact number of iterations as well: if you round the right hand side off to the next integer, that will be the number required.
Now the most important observation is this: in the bisection test we are testing the sign of f(a0) f(b0); we are not actually using the value. Say the value of f is -256.4; we take the sign as negative. If we take another function f(x) whose value is -3, again we take the sign as negative. Therefore whatever we have discussed does not depend on the value of f(x); it depends only on whether the sign is positive or negative. All computations use the sign of f(x) at a point and not its value, hence the result is true for every equation f(x) = 0; as long as the sign pattern is the same, the result is the same, independent of the given function. Therefore two things happen. One: the result that we have just noted is true for all f(x) — not for a single function, but for all functions, whatever functions we are considering. Secondly: since we are not using the value of the function, only the signs, and we are always keeping the root located in an interval (a0, b0), the method can never fail; in the bisection method the question of failure does not arise. But if that is so, do we need any other method at all? The trouble is that the method is very slow, in the sense that if you start with a big interval, say 0 to 50, you may finally have to do forty or fifty iterations, or maybe more, to get an accuracy of six decimal places. It is for this reason that we need a better numerical method which gives the solution much faster, even though bisection itself never fails. Therefore, initially, when you are starting the iteration procedure, one can adopt the bisection method so that we are sure of getting an interval in which the root lies; then we can jump from the bisection method to any other method which converges very fast compared to the bisection method.
Now let us see, for a typical example, what will be the number of iterations required. Let us say that the root lies between zero and one. I have taken a simple interval because in this case one term becomes zero and the computation is easy. Substituting a0 = 0 and b0 = 1 in the criterion, we get n greater than (log(b0 minus a0) minus log epsilon) divided by log 2; since log(1 minus 0) is log 1 = 0, that term goes off and we are left with n greater than minus log epsilon divided by log 2. Now let us give values for epsilon, the accuracy that we want in a particular problem. If epsilon is ten to the power of minus two, I can take the minus two outside and write n greater than two times log 10 divided by log 2, which is about 6.64; rounding up to the next integer, I need seven iterations. If I take ten to the power of minus three I get ten iterations, and if I take ten to the power of minus four I get fourteen iterations. Similarly I can compute the count for any other given epsilon. Now if the interval were not 0, 1 but, say, zero to twenty or zero to fifteen, the number of iterations would increase further because log(b0 minus a0) would be much larger. Due to this reason, even though it never fails, bisection is still a very slow iterative method.
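As a small sketch, the iteration count above can be computed directly from the stopping criterion (b0 minus a0)/2^n < epsilon; the function name below is my own, not from the lecture.

```python
import math

def bisection_iterations(a0, b0, eps):
    """Smallest integer n with (b0 - a0) / 2**n < eps,
    i.e. n > (log(b0 - a0) - log(eps)) / log(2)."""
    return math.ceil((math.log(b0 - a0) - math.log(eps)) / math.log(2))

# Interval (0, 1): log(1 - 0) = 0, so n > -log(eps) / log(2)
for eps in (1e-2, 1e-3, 1e-4):
    print(eps, bisection_iterations(0, 1, eps))   # 7, 10 and 14 iterations
```

Note that for a wider interval such as (0, 20), the extra log(b0 minus a0) term pushes the count up by a few more iterations, as described above.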
Now let us illustrate with a simple example: find the real root of x cubed minus five equal to zero.
First of all we locate an interval of length unity. We have f(0) = minus 5, f(1) = minus 4, f(2) = 3. Since f(1) and f(2) are of opposite signs, the product f(1) f(2) is negative, and therefore the root lies in the interval 1, 2. Once I have located this interval, I can go for bisection and find the next interval in which the root lies.
The midpoint of one and two is three by two, and f(3/2) = 27/8 minus 5, which is less than zero.
Now I test whether the root lies in 1, 3/2 or in 3/2, 2, since we had taken the interval as one and two. I find the product f(3/2) f(2) to be negative, therefore the root lies in 3/2, 2. Taking the midpoint of three by two and two, I get seven by four. Again I evaluate f(7/4); for x cubed minus five this is 343/64 minus 5 = 23/64, which is greater than zero. The product f(3/2) f(7/4) is less than zero, therefore the root lies in 3/2, 7/4. So I can proceed, each time bisecting and getting a smaller interval. The procedure stops when the length of the interval is less than the given epsilon; until then the iteration proceeds.
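The bisection steps just described can be sketched in code as follows; this is a minimal version under my own naming, keeping one function evaluation per iteration as discussed below.

```python
def bisect(f, a, b, eps=1e-6):
    """Bisection method: requires f(a) and f(b) of opposite signs."""
    fa = f(a)
    if fa * f(b) >= 0:
        raise ValueError("f(a) and f(b) must differ in sign")
    while b - a > eps:
        m = (a + b) / 2
        fm = f(m)             # the one new function evaluation per iteration
        if fa * fm < 0:       # sign change in (a, m): root lies there
            b = m
        else:                 # otherwise the root lies in (m, b)
            a, fa = m, fm
    return (a + b) / 2

print(bisect(lambda x: x**3 - 5, 1, 2))   # approaches 5**(1/3) = 1.70997...
```

Starting from (1, 2), the first two midpoints are 3/2 and 7/4, exactly as in the worked example.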
When we discuss the other methods, we would like to compare the computational cost of the different methods and say which takes less computing time. The maximum time taken here is in the computation of f(x), because f(x) may be a complicated function. So the major cost is the computation of f(x); it is not in the one extra division or extra addition that you are doing. The whole cost is based on the evaluation of f(x). Therefore we can say the computational cost of bisection is one evaluation of f, because each iteration requires the evaluation of the function at one new point, the midpoint.
Now suppose that there are three roots in an interval, say the interval 1, 2; since an odd number of simple roots forces a sign change, f(1) f(2) is less than zero.
Let us assume that the roots are 1.1, 1.2 and 1.3. If we bisect the interval 1, 2 we get the intervals 1, 1.5 and 1.5, 2. We find f(1) f(1.5) less than zero; therefore there is one root or three roots in the interval 1, 1.5, and in fact all three lie there. There is no root in 1.5, 2, since f(1.5) f(2) is greater than zero. Now bisect the interval 1, 1.5: we get the intervals 1, 1.25 and 1.25, 1.5. Since f(1.25) f(1.5) is less than zero there is at least one root in 1.25, 1.5; the root 1.3 lies in it. But f(1) f(1.25) will be greater than zero, since an even number of roots cannot be captured by the intermediate value theorem; here the two roots 1.1 and 1.2 both lie in 1, 1.25. However, if we bisect the interval 1, 1.25 then we get the intervals 1, 1.125 and 1.125, 1.25. Now f(1) f(1.125) is less than zero, hence at least one root lies in 1, 1.125; in the present case the root 1.1 lies in it. Again f(1.125) f(1.25) is less than zero, hence there is a root in the interval 1.125, 1.25; the root 1.2 lies in it.
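We can check these sign patterns numerically with a polynomial having exactly these roots; the cubic below is constructed purely for illustration.

```python
# Cubic with the three roots 1.1, 1.2 and 1.3 used in the example
f = lambda x: (x - 1.1) * (x - 1.2) * (x - 1.3)

print(f(1) * f(2) < 0)         # True: odd number of roots in (1, 2)
print(f(1) * f(1.5) < 0)       # True: all three roots lie in (1, 1.5)
print(f(1.5) * f(2) > 0)       # True: no root in (1.5, 2)
print(f(1.25) * f(1.5) < 0)    # True: the root 1.3 lies in (1.25, 1.5)
print(f(1) * f(1.25) > 0)      # True: the even pair 1.1, 1.2 is hidden
print(f(1) * f(1.125) < 0)     # True: the root 1.1 lies in (1, 1.125)
print(f(1.125) * f(1.25) < 0)  # True: the root 1.2 lies in (1.125, 1.25)
```

The fifth line is the key point: a sign test cannot see an even number of roots in a subinterval, which is why further bisection is needed to separate 1.1 and 1.2.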
Now we would like to move on to methods better than the bisection method. One family is the methods based on a first degree equation to solve f(x) = 0. Let us look at the concept behind this. Consider the graph of y = f(x); the point where it crosses the x axis is the root that is required. We know that if we are given any curve and take any point on it, then in the neighborhood of that point the curve can always be approximated by a straight line: an infinitesimal segment of a curve can be replaced by a straight line. Since we may not be able to work with an infinitesimal length of the curve, what I do is take a straight line as an approximation to the curve in the neighborhood of the root and then successively improve this approximation so that it becomes exact at the root xi. Therefore, in the neighborhood of an exact root we approximate the curve by a straight line of the form f(x) = a0 x + a1 = 0. Now, the root required is the intersection of the curve with the x axis; if I approximate the curve by the straight line, the approximation to the root is the intersection of this line with the x axis. That means, setting y equal to zero, I get x = minus a1 upon a0 as the approximation to the root.
Now we have approximated the curve by this straight line. How do we approximate it, that is, how do we get the values of a0 and a1? Different procedures for finding the constants a0 and a1 give us a number of methods.
The first method that we shall take up is the secant or chord method. What is the secant method? Let x_(k-1) and x_k be any two approximations to the root. We are not saying that the root lies between x_(k-1) and x_k; if the root is 1.2, I may take 0.58 as an approximation to the root. So these are any two approximations to the given root. Then what I do is take the corresponding points on the curve. We denote f(x_(k-1)) by f_(k-1) and f(x_k) by f_k. Now (x_(k-1), f_(k-1)) is a point on the curve and (x_k, f_k) is also a point on the curve. But we have just approximated f(x) by the straight line a0 x + a1, which means these points are taken to lie on the line, and therefore they satisfy its equation. So I can substitute the two points in the equation of the line to get f_(k-1) = a0 x_(k-1) + a1 and f_k = a0 x_k + a1. I have got two linear equations for the two unknowns a0 and a1. I can subtract the two equations to get a0 and then substitute back to get a1.
On subtracting, a1 cancels and we get f_k minus f_(k-1) = a0 (x_k minus x_(k-1)), hence a0 = (f_k minus f_(k-1)) / (x_k minus x_(k-1)). Once a0 is solved, I can determine a1 from either of the two equations. Taking the second one, a1 = f_k minus a0 x_k. Now I have determined a0 and a1, and we have said the approximation to the root is x = minus a1 upon a0. So I substitute the values of a1 and a0 and write down the next approximation: x_(k+1) = minus a1 / a0 = x_k minus f_k / a0; substituting the value of a0, this gives x_(k+1) = x_k minus f_k (x_k minus x_(k-1)) / (f_k minus f_(k-1)).
Now this is the required formula, which we shall call the secant method or the chord method. However, we can also simplify it further by putting everything over the common denominator: x_(k+1) = (x_k (f_k minus f_(k-1)) minus f_k (x_k minus x_(k-1))) / (f_k minus f_(k-1)). The two terms x_k f_k cancel, and I get x_(k+1) = (x_(k-1) f_k minus x_k f_(k-1)) / (f_k minus f_(k-1)). So this is called the secant method, and we can use either of these two forms, form two or form three, to get the next iterates starting from an initial value x0. As I said, this x0 is obtained either from the intermediate value theorem or through the bisection procedure, but we can use either form. Sometimes one form is computationally convenient, and for error analysis form two is very convenient.
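Form two of the iteration can be sketched as follows; the function name and stopping rule are my own choices, and the code reuses the previous function value so that each step costs only one new evaluation, matching the cost discussion below.

```python
def secant(f, x0, x1, eps=1e-10, max_iter=100):
    """Secant method, form two:
    x_{k+1} = x_k - f_k * (x_k - x_{k-1}) / (f_k - f_{k-1})."""
    f0 = f(x0)
    for _ in range(max_iter):
        f1 = f(x1)                 # the one new function evaluation per step
        if f1 == f0:
            break                  # flat chord: cannot divide, stop here
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < eps:
            return x2
        x0, f0, x1 = x1, f1, x2
    return x1

print(secant(lambda x: x**3 - 5, 1.0, 2.0))   # converges to 5**(1/3)
```

Note that the starting values 1.0 and 2.0 happen to bracket the root, but the method does not require that; any two reasonable approximations will do.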
We shall see in the next lecture how these two forms are useful in this procedure. Now a very important thing to note is this. At the start we said that x_(k-1) and x_k are any two approximations to the root; we do not ask or require that the root should lie between x_(k-1) and x_k. The method gives convergence to the root from one side of the root. Suppose the root is 1.5 and it lies between zero and three: the iteration will take one or two iterations to jump to one side of the root, going either to the side from 1.5 to 3 or to the side from 0 to 1.5, and then start converging from that side only. That means if you are using the secant method and you start with the root lying inside an interval, you may lose one or two iterations before the convergence settles on one side. The convergence is always from one side for the secant method.
Now the cost of the computation: at each iteration we need one new function evaluation, f(x_(k+1)), which is the major cost in applying this method. Again the cost is the same as bisection; here we have one function evaluation per iteration, and there also we had one function evaluation. Therefore the cost is the same but the procedure is different. Of course, here we can also count, for each iteration, two multiplications, two subtractions and one division, so there are other operations as well, and the operation count per iteration can be stated in this way.