Mod - 01 lec - 26 l26 - Solution of non - Linear equations

So, we have considered system of linear equations. Now, we will consider single equation f x is equal to 0, where f is any function. We are going to consider methods, such as Bisection method, then Newton's method, Secant method and a variation of Secant method which is known as Regular False method. Bisection method is the simplest method, but the drawback is the convergence is going to be very slow. Newton's method, when it converges, it is going to converge fast. It is going to converge, what is known as quadratically. In case of Newton's method, one needs to know what is the function value and what is the derivative value. So, if the computation of derivatives is difficult, then the Newton's method is not advisable. Then, one considers what is known as secant method, the price one pays is the Secant method will not converge as fast as the Newton's method. Now, these methods, they are not going to converge unconditionally. It will be only under certain conditions. If you have chosen your starting point appropriately, then it is going to converge, but when it converges, Newton's method converges fast and that is going to be the best method. Now, these Newton's method, secant method, these are best on the idea that f is a general function. So, finding its 0 is difficult, but if f is a linear function, that means if it is a straight line, then, we know how to calculate its 0. We have to see where the straight line crosses x axis. So, one approximates the given function by a straight line and that gives us the Newton's method and Secant method. These methods, they are going to be useful when one considers the Eigen value problems. So, in case of Eigen value problem, one needs to find 0s of polynomial. So, let me first explain, what is the Bisection method. Also, when we want to analyze the convergence of Newton's method, convergence of Secant method, it is useful to connect the 0 of a function to a fix point of a function. So, if f of c is equal to 0, then we say that c is a 0 or root of our function f. If f of c is equal to c, then we say that c is a fix point. So, in order to calculate fix point approximately, we will be considering Picard's iteration and then, we will see how the Newton's method, it can be put in the setting of fix point iteration. So, today we are first going to describe what Bisection method is, then I will describe what Newton's method is, what secant method is and then, we will go to fix point. We will define the Picard iteration, we will prove conditions for existence and uniqueness of fix point and then, we will see how the Newton's method can be put in this setting and the convergence of Newton's method and Secant method, we will be considering later. Now, our setting is f is a function defined on closed interval a, b taking real values. Our aim is to find c, such that f of c is equal to 0. If the function f is continuous and f of a, into f of b is less than or equal to 0, so f of a, into f of b is equal to 0 will mean that either f of a, is 0 or f of b is 0. If it is strictly less than 0, then it means f a, and f b, they will be of opposite signs and then, by the intermediate value theorem, f of c is equal to 0. For some c belonging to interval a, b. So, if you have got f a into f b to be less than or equal to 0, we know that a root of function f lies in the interval a, b. Now, this is the simplest method, Bisection method. So, f is from a, b to r is continuous, f a, into f b is less than 0 and f as a unique 0, c in a b. So, this much is given to us, f a, and f b can be of opposite signs and they can have more than one 0, like look at this is say, point a, this is point b. So, I have got f a, into f b is less than 0 and then, there is this 0, this 0, and this 0. So, there are 3 0s. So, in the Bisection method, we assume that we are given that f as a unique 0, c in a b. So, the method is, look at the midpoint. If f of m is equal to 0, then fine, c equal to m. Otherwise, either f a into f m is less than 0 or f m into f b is less than 0. This will mean that our 0 lies in the interval a to m. F m into f b less than 0 will mean, that 0 lies in the second half, that is in the interval m to b. So, you choose a 1 to be equal to a, and b 1 to be equal to m in this case. In the other case, you choose a 1 is equal to m and b 1 is equal to b. So, we started with the interval a b. We subdivided it into 2 equal parts. We know that our function has got only one 0, so if the midpoint is 0, then fine. Then, we have found a 0, otherwise our 0 will be either in the interval a to a plus b by 2, or it will be in the interval a plus b by 2 to b and that can be checked by looking at sign of f of a plus b by 2 f of b. So, when the 2 end points are of the opposite sign that interval is going to contain our 0. So, staring with the interval a b, we have found an interval of half the length which contains our 0 and then, you continue. Suppose, it is in first half, that means a to a plus b by 2. Divide it into 2 equal parts. If the midpoint is 0, then well and good, otherwise there will be one of the intervals which will contain your 0. So, take that interval. So, it is a very simple method, easy to apply. Only the convergence is going to be slow. Like this way, we are constructing the sequence of intervals a n b n and both a n and b n, they are going to converge to c, as n tends to infinity, but at a time what we are doing is, we are dividing the interval into half. So, at every stage, you are going to gain only one binary digit. So, that is the drawback of this Bisection method, but still it is useful to do some, like when we want to apply apply Newton's method, we need a starting point. So, one can do a few steps of Bisection method, try to get the smaller small interval in which our 0 is going to contain and then, take x 0 to be a point in that interval. So, let me describe the Bisection algorithm. It is a 0 equal to a b 0 equal to b and then, for n is equal to 0, 1, 2, m is going to be equal to a n plus b n by 2. If f of a n f m, if they are of opposite e signs, then a n plus 1 is equal to a n, b n plus 1 is equal to m. If this is not the case, then a n plus 1 equal to m b n plus 1 equal to b n and then, you continue. So, in this algorithm, one can give a statement that, if at any stage f of m is equal to 0, then you stop. Another stopping criterion one needs to give is when to stop. So, then, either you have to specify the value of n, the maximum value or you can give it in terms of the, say length of the interval like when the length of the interval, say it is less than 10 raise to minus 6, then you stop. So, you have to give the stopping criteria. So, this is the simplest method which gives us an approximation to a 0. Under I have said, under the condition that it should be a continuous function and it should be given to us, that there is only one 0, but it can be modified like, suppose it has got more than one 0, like if they have opposite signs, the number of 0s is going to be odd and all. So, one can modify. So, this is Bisection method the simplest form and the convergence is going to be very slow. So, now, here is an example, f x is equal to x cube minus x minus 1, x belonging to 1 to 2. It being a polynomial, it is continuous. If at 1 is going to be equal to minus 1, if at 2 is equal to 5. So, f 1 and f 2, they are of opposite signs. The derivative of f is given by 3 x square minus 1 and in the interval 1 to 2, it is going to be bigger than 0. Now, if the derivative is bigger than 0, that tells us that f is strictly increasing. If it is strictly increasing, it has a unique 0 in 1 to 2. So, all the 3 conditions are satisfied. It is a continuous function. The end points have opposite signs and it has a unique 0 in the interval 1 to 2. So, let us look at the midpoint value f of 3 by 2. So, f of 3 by 2 is 7 by 8, which is 0.875. So, this being positive and f of 1 is equal to minus 1, these being of opposite signs, our 0 is going to lie in the interval 1 to1.5, and then, one can continue this and obtain an approximation to 0 of our function. So, this is Bisection method. Now, let us describe what Newton's method is. So, we have got a function. Let us assume that the function is differentiable and at no point, the tangent is becoming horizontal. So, that means the derivative is not vanishing. I am telling you sufficient conditions. So, you have interval a b. You start with a point x 0. At x 0, look at the tangent to the curve at x 0. See where that tangent cuts the x axis. Where ever it cuts, that is going to be our point x 1. The next step is, consider tangent to the curve at x1. See where it cuts x axis. That is going to be our x 2 and so on. So, this is going to be our Newton's method. You have to start with some point x 0 and then, you approximate your function by a tangent. Look at the 0 of that tangent, where the tangent crosses x axis. That gives us our next point. Now, whether this convergence or not, we are going to study this later on in detail. At present, I just want to describe what Newton's method is. I said that the tangent should not be horizontal because if the tangent is horizontal, then it would not cut the x axis. It will be parallel to the x axis and then we cannot proceed. So, that is why when I assume these things. Another important thing will be that, I start with x 0. My function f is defined on interval a b. I am starting with a point x 0. I am looking at the tangent to the curve at point x 0 and then, I look at the point at which the tangent intersects the x axis. So, this point of intersection can be outside the domain of our function. So, these things we will see more in detail. So, at present, let us assume that we do not face such difficulties and then, that describes our Newton's method and another thing is in the Newton's method, we will assume that it has a simple 0. That means, f of c is equal to 0, but the derivative does not vanish. So, here is the graphical representation. This is the Curve. So, this is our function f. The intersection is the point in which we are interested in. You start with x 0. Look at the tangent to the curve at x 0. So, this is the tangent. So, this tangent cuts x axis at x 1. In the next step, look at tangent to the curve at x 1. So, this is the tangent. It cuts x axis at x 2. Then, look at the tangent to the curve at x 2. It will cut the x axis at x 3 and so on. Here is the equation of the tangent at x 0, or if you want at x 0, f x 0. So, y minus f x 0 divided by x minus x 0 is equal to f dash x 0, that is the slope of our tangent and that gives you y is equal to f x 0 plus f dash x 0 x minus x 0. That is the equation of the tangent at x 0 x. Intersection point with the x axis will be given by equating this equation to 0. So, when you equate it to 0 and suppose, that intersection point is x 1, you will get x 1 is equal to x 0 minus f x 0 upon f dash x 0. In general, x n plus 1 will be equal to x n minus f x n upon f dash x n. X 0 is the starting point. It is given or we choose the starting point x 0 and then, you get this. Now, I said that the Newton's method is we are going to consider it for a simple 0. So, we have got f of c is equal to 0, f dash c not equal to 0. So, suppose our function f dash is continuous, then if f dash c is not equal to 0 in a neighborhood of c, f dash x also will not be 0. So, this Condition that our iteration is x n plus 1 is equal to x n minus f x n upon f dash x n. So, whenever f dash x n is 0, our procedure is going to break down, but then if you are assuming that f dash c is not equal to 0, your function f dash is continuous and if you are remaining in the interval around c, then this condition will be satisfied. So, now whether you remain in the interval, or whether you remain in the neighborhood, that is another point. So, for that, we will have some sufficient condition. So, in fact for the Newton's method, this starting point x 0, it is going to be a crucial point. Now, in the Newton's method we need f dash of x n. Now, the function may not be easily differentiable. In that case, what we can do is, we can replace f dash x n by a divided difference. We had considered the numerical differentiation. We had f dash at a, is approximately equal to f of a plus h minus f a divided by h. In a similar manner, we can replace f dash x n by. Now, let us look at the points x n and x n minus 1. So, f dash x n will be approximately equal to f x n minus f of x n minus 1 divided by x n minus x n minus 1. So, if you do this substitution, then what you are going to get is the Secant method. So, in the Newton's method, x 0 is the initial case. In Secant method, we need to have 2 points x 0 and x1. You replace f dash x n by divided difference based on x n minus 1 and x n. So, that is going be equal to f x n minus f x n minus 1 divided by x n minus x n minus 1. That defines the Secant method to be x n plus 1 is equal to x n minus f x n divided by this divided difference. So, let us see what it means graphically. So, you are starting with two points. So, suppose I take the end point x 0 here, and x1 here. Look at the equation of the Secant through x 0 f x 0 and x1 f x1. Equitation of these straight lines, which passes through these two points, it will be given by y minus f x 0 divided by x minus x 0 is equal to the slope f x1 minus f x 0 divided by x1 minus x 0. So, this is the equation of the Secant or the straight line, which passes through these 2 points. When you simplify this, what you get is y is equal to f x 0 plus f x1 minus f x 0 divided by x 1 minus x 0 into x minus x 0. Notice that this is nothing, but divided difference based on x 0 x 1. Equate this to 0, and when you equate this to 0, whatever you get, call it x 2 and that is going to give you, here this should be x 2 not x 1. So, we are we are going to have y is equal to f x 0 plus f of x 0 x 1, the divided difference based on x minus x 0. So, I am going to equate it to 0. So, I will get f x 0 plus f of x 0 x 1 into x 2 minus x 0 to be equal to 0. When I solve it for x 2, I will get x 2 is equal to x 0 minus f x 0 divided by f of x 0 x 1, and in general, I can write x n plus 1 to be equal to x n minus f x n divided by f of x n minus 1 x n, n is equal to 1, 2 and so on. So, this is the secant formula. Now, we are going to look at the fix point of a function. So, here is the definition g from a b to a b. It is a function. A point c is said to be a fix point if g of c is equal to c. So, that means, the fix point is going to be intersection of the graph of g and the straight line y is equal to x, whatever is the intersection that is going to be our fix point. Now, a fix point is related to 0 of a function. For example, if I define f x to be equal to g x minus x, then f of c will be 0 if and only if g of c is equal to c. There can be more such functions, but this is just I want to tell you that fix point is not something totally different than the 0 of a function. The two notions, they are related. So, now, let us look at conditions under which we know existence of fix point. Suppose, your function g from a b to a b is continuous, then g has a fix point c in a b. So, what is important is interval a b should map into interval a b. It should be continuous and then, g has a fix point c in interval a b. The proof is simple. We are going to make use of intermediate value theorem for continuous function. If g of a, is equal to a, then a is a fix point. If g of b is equal to b, then b is a fix point. Now, consider the case when g of a is not equal to a, g of b not equal to b. Since, g maps interval a b to interval a b, when I look at g of a, g of a has to be in the interval a b. It is not equal to a. That means, I will get g of a to b bigger than a. When I look at g of b, so g of b again it has to be in the interval a b and it should be, it is not equal to b. So, g of b is less than b. So, now, look at f x is equal to g x minus x, g of a is bigger than a, g of b less than b will imply that, f of a is greater than 0, f of b is less than 0, f from a b to r continuous because g is given to be continuous. Apply the intermediate value theorem to obtain c, such that f of c is equal to 0 and this implies that g of c is equal to c. So, we have conditions that your function g should be a continuous function. It should map interval a b to interval a b. Then g is going to have a fix point. Now, the condition that interval a b, it should map to interval a b, that is necessary that one can easily imagine. Now, what will happen if instead of interval a b, closed interval a b, if I take open interval a b or instead of the bounded interval a b, if I take infinite interval. Whether the result will still hold? That means, whether g will have a fix point. So, it is not true. So, we will give counter examples. So, the first is the result is not true if closed interval a b is replaced by open interval a b, or one of the Counter example is g x is equal to x square, x belonging to 0 to 1. So, this g is a continuous function. It is mapping open interval 0, 1, 2 itself, but g x is equal to x will mean that, either x is equal to 0 or x is equal to 1 and both the points, they are not in the domain. So, the statement about existence of fix point is not true if you replace the closed interval a b by open interval a b. Next, if the interval a b is replaced by infinite interval a to infinity, still it is not true and here is a counter example. If you look at g x is equal to x plus 1, x belonging to 0 to infinity, so this function is continuous. It maps interval 0 to infinity to itself, but it does not have a fix point. In this case, the function had fix points, but they were not belonging to our domain. Here, g x is equal to x plus 1. It is the translation. So, it does not have fix point at all. The third is continuity, that if g is not continuous, then it need not have fix points. So, this you can define easily. What I have done is I defined g x is equal to x square, x belonging to open interval 0 to 1. I know that this functions fix point is 1 and 0. So, what I do is, I define g x is equal to 1 when x internal equal to 0 and 0, when x is equal to 1. So, I do not need the function to be continuous. So, this function, when the continuity is violated, then it need not have fix point. So, this is about the existence. Now, what about uniqueness whether a fix point is going to be unique? Now, there can be more than 1 fix point. Like look at our example g x is equal to x square on the closed interval 0 to 1, then it has got a 2 fix points. If you look at function g x is equal to x on interval, again it is closed interval 0 to 1. Then, it has got infinitely many fix points and you can easily construct the example when it has got a unique fix point. So, now, what we want to do is, we want to find a sufficient condition which will guarantee uniqueness of the fix point. Now, that Condition is going to be in terms of the derivative. So, for the existence of fix point, continuity was enough. It should map the interval a b to a b, a b should be a closed and bounded interval and it should be continuous. Now, we are going to show that if in addition, g is differentiable on open interval a b, then and if the derivative modulus is less than 1, then it has a unique fix point and the proof is straight forward. What we are going to do is, use the mean value theorem. That if you have got g c 1 is equal to c 1, g c 2 is equal to c 2, then consider g c 1 minus g c 2 apply mean value theorem and conclude that c 1 has to be equal to c 2. Here is the uniqueness of the fix point, where g is a map from a b to a b continuous map. It is differentiable on open interval a b with modulus of dash x to be less than 1 for x belonging to open interval a b. Then, g has a unique fix point in a b. Existence is already proved. So, let us look at uniqueness. So, let g c 1 is equal to c 1, g c 2 is equal to c 2 for c 1 and c 2 in the interval a b, c 1 minus c 2 will be equal to g c 1 minus g c 2. The function g is going to be continuous on the closed interval c 1 to c 2, differentiable on open interval c 1 to c 2 because g is continuous on a bigger set a b and it is differentiable in possibly bigger set open interval a b. So, by the mean value theorem, this is going to be equal to c 1 minus c 2 into g dash d, where d is some point in the interval c 1 to c 2. Modulus of c 1 minus c 2 takes modulus of both the sides. So, you will have mod of c 1 minus c 2 is equal to mod of c 1 minus c 2 into modules of g dash d. If modules of g dash d is less than 1 because that is our assumption, then c 1 minus c 2 has to be 0 because if it is not 0, then what you get will be modulus of c 1 minus c 2 is strictly less than modulus of c 1 minus c 2. So, using this fact, one gets c 1 is equal to c 2. So, we have obtained now the necessary and or we have obtained conditions for the uniqueness and existence of fix point. These are sufficient conditions that if these conditions are satisfied, then definitely g has a fix point. Now, next comes the question that, how to find an approximation to this fix point. As I told you that fix point of a function can be a 0 of another function. So, if we have a way of finding approximation to fix point, this can be translated into a method for finding 0 of a function. So, we are going to now define what is known as Picard's fix point iteration scheme. The convergence of this scheme, we will prove under slightly stronger condition that on the interval, open interval a b, modulus of g dash x should be less than or equal to some constant k, which is less than 1. For the uniqueness, our condition was modulus of g dash x should be less than 1. Now, we are saying modulus of g dash x should be less than or equal to k less than 1. In the first case, modulus of g dash x less than 1 means, g dash x can go arbitrarily near to 1 in the second case when we are saying modulus of g dash x should be less than or equal to k less than 1. It means it has to stay away from 1. Now, this stronger condition we are assuming for the convenience that result is true, also when modulus of g dash x is less than 1, but if we assume this condition, then life becomes easier, the proof becomes simpler. So, under this condition, we define Picard's iteration scheme which is nothing, but go on applying g. Start with x 0, apply g. So, g of x 0 that will be our x 1 and then, you continue that gives you iterates and then, we will show that these iterates, they converge to the unique fix point c. Once again, we will be using the mean value theorem. So, here these are our assumptions, g is from a b to a b continuous, differentiable on open interval a b, modulus of g dash x less than or equal to k less than 1 for x belonging into a b, then g has a unique fix point c in interval a b. This part we have already proved. Start with x 0 in a b and define x n is equal to g x n minus 1, n is equal to 1, 2 and so on. X 0 can be any point in the interval a b, then x n converges to c as n tends to infinity. So, let us prove this result. So, Consider x n minus c. X n minus C is equal to g of x n minus 1 minus g of c because x n is equal to g x n minus 1. By definition, c is equal to g of c because it is the fix point. Apply the mean value theorem. So, you will get this to be equal to x n minus 1 minus c g dash of d n, where this d n is going lie between x n minus 1 and c. Since, modulus of g dash x is less than or equal to k less than 1, you get mod of x n minus c to be less than or equal to k times modulus of x n minus 1 minus c. Apply the same argument and get this to be less than or equal to k square modulus of x n minus 2 minus c and like that, less than or equal to k raise to n mod of x 0 minus c. Now, it is here to that we use the fact that k is less than 1. So, k strictly less than 1 implies k raise to n will tend to 0 as n tends to infinity and this implies that x n converges to c as n tends to infinity. So, we have proved this Picard's fix point iteration that if the derivative is less than or equal to k less than 1, then for any starting point x 0 if you define x n as g x n minus 1, then x n is going to converge to c as n tends to infinity. Now, we are going to talk about the order of convergence. I had mentioned in the beginning that Newton's method has got quadratic convergence, then the Secant method; it will have convergence better than linear Convergence, but less than quadratic convergence. So, let me now make these statements precise. So, we are going to define what the order of convergence is. So, we have, suppose we have got a sequence x 0, x1, x n is a sequence which is converging to c. We set e n to be equal to x n minus c. If there exists a constant m not equal to 0 and a real number p, such that limit as n tends to infinity modulus of e n plus 1 divided by mod e n raise to p is equal to n. Then, M is Called asymptotic error constant and this p is called the order of convergence. If p is equal to 1, what it means is limit of mod of e n plus 1 by mod e n should be equal to M. If p is equal to 2, we want mod e n plus 1 by mod e n square as n tends to infinity is equal to M. So, this m is asymptotic constant and what we are saying is that limit mod e n plus 1 upon mod e n raise to p as n tends to infinity is equal to m. This means, mod e n plus one is approximately equal to m time's modulus of e n raise to p. If you have got p is equal to 2, then modulus of e n plus 1 will be approximately equal to modulus of e n square. This is error at nth stage. This is error at n plus first stage. Our error is going to be small. So, I can assume mod e n to be less than 1. So, here whatever is the error at n stage, at n plus first stage, it is becoming approximately square of that. Since mod e n is less than 1, mod e n square will be still smaller. So, this will be the quadratic convergence. If you have got p is equal to 1, in that case, we are going to have p is equal to 1, means modulus of e n plus 1 is approximately equal to m time's modulus of e n. So, let us look at some illustrative examples which will make the idea little more clear. Look at x n is equal to 1 by n. It tends to 0 as n tends to infinity. E n plus 1 is going to be equal to x n plus 1 minus 0. So, that is 1 upon n plus one divided by e n which is 1 by n. I do not need to take modulus because mod e n plus 1 is equal to e n plus 1. This is equal to n upon n plus 1, which converges to 1 as n tends to infinity. So, that means, here p is equal to 1 and the asymptotic error constant m is also equal to 1. So, this is example of linear convergence. Look at 1 upon root n. This 1 upon root n also tends to 0 and n tends to infinity. When I look at e n plus 1 by e n, this is going to be equal to root n by root of n plus 1. The limit of this as n tends to infinity is again equal to 1. So, for 1 upon root n also, it is a linear convergence with p is equal to 1 and the asymptotic error constant m is equal to 1. Here is the third example. If I define x 0 to be equal to 1 by 3, x 1 to be 1 by 3 square, x 2 to be 1 by 3 raise to 4, so I am defining x 1 is equal to x 0 square, x 2 is equal to x 1 square, x n plus 1 is equal to x n square. Then, this x n tends to 0. When I look at e n plus 1 upon e n square, e n plus one, that means, it is x n plus 1 minus 0. So, that is x n square, e n is going to be x n, so e n square will be x n square which is equal to 1. So, here the asymptotic constant m is equal to 1 and then, p is equal to 2. So, this is the example of a quadratic convergence. Now, in our next lecture what we will show is that if the fix point, which we are looking at, so g is a function and g of c is equal to c. If g dash c is not equal to 0, then the Picard's iteration has linear convergence. If g dash c is equal to 0, then we will get quadratic convergence and that is going to be the case for the Newton's method. So, in the next lecture, we are going to look at conditions, sufficient conditions under which Newton's method converges, then the rate of convergence of Newton's method, rate of convergence of Secant method and we are going to define a method what is known as Regular False method. Thank you.