We are considering Gauss elimination with partial pivoting. So, our assumption is that
the coefficient matrix A is invertible. Earlier we had considered Gaussian elimination
without partial pivoting, in which case we were assuming a stronger condition, that the
determinant of A_k is not equal to 0, where A_k is the principal leading submatrix, which
consists of the first k rows and first k columns. And in that case, we proved that Gauss elimination
is equivalent to LU decomposition of our matrix A.
Now, in case of Gauss elimination with partial pivoting, we assume only the condition that
the determinant of A is not equal to 0. In this case, we are going to show that Gauss
elimination with partial pivoting is equivalent to writing the LU decomposition not of A, but
of P times A, where P is going to be a permutation matrix. A permutation matrix is obtained
from the identity matrix by a finite number of row interchanges.
In the Gauss elimination method with partial pivoting, it may be necessary to
interchange rows, and this interchange of rows is accounted for by the permutation matrix
P. So, our setting is: A is an invertible matrix. We have got the system A x = b,
where the determinant of A is not equal to 0 and we have got n equations in n unknowns. The
right hand side b1, b2, ..., bn is given to us; x1, x2, ..., xn is the unknown vector. Since
the determinant of A is not equal to 0, this equation is going to have a unique solution. So, what
we do is we first look at the first column, and in that we consider the element which
has got maximum modulus.
If it is in the k-th row, then we interchange the first and k-th rows. Now, since A is invertible,
our a_k1 is not equal to 0, because if a_k1 were 0, then each entry in the first column
would be 0. So, you would have a zero column, which would mean that the determinant of A is 0.
So, you interchange the first and the k-th rows; after the interchange, our matrix looks
like this. We are not changing the system, we are just changing the order of our equations:
what was the first equation has now become the k-th equation, and what was the k-th equation
has become the first equation. So, now, when you look at the element in the first row and first
column, that is going to have maximum modulus in the first column.
Now, we are going to look at the multipliers; now we will do Gauss elimination, that means
we want to introduce zeros below the diagonal in the first column. So, our multipliers
m_i1 are going to be equal to a_i1 divided by a_11 (the entries after the interchange),
for i = 2 up to n. Now, since a_11 has maximum modulus in the first column, the modulus of
m_i1 will be less than or equal to 1. Let R_i tilde denote the rows after the interchange;
the modified rows become R_i tilde minus m_i1 times R_1 tilde.
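As an aside, this first step can be sketched in a few lines of Python (an illustrative sketch on a made-up 3 by 3 matrix, not part of the lecture):

```python
# One step of Gauss elimination with partial pivoting (illustrative sketch).
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [2.0, 8.0, 10.0]]

# Find the row k whose first-column entry has maximum modulus.
k = max(range(3), key=lambda i: abs(A[i][0]))   # here k = 1, since |4| is largest
A[0], A[k] = A[k], A[0]                          # interchange the first and k-th rows

# Multipliers m_i1 = a_i1 / a_11; |m_i1| <= 1 because a_11 now has maximum modulus.
for i in range(1, 3):
    m = A[i][0] / A[0][0]
    assert abs(m) <= 1.0
    A[i] = [A[i][j] - m * A[0][j] for j in range(3)]

print(A)   # the first column below the diagonal is now zero
```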
So, we introduce zeros here, and then we have got the new system in this manner. Now, whatever
we have done for the n by n matrix, we are going to work on this submatrix of order
n minus 1. Our aim is to introduce zeros in the second column below the diagonal. So,
you look at the maximum among the elements a_22^(1) up to a_n2^(1) of the modified matrix.
Suppose it is the modulus of a_k2^(1), the maximum over 2 less than or equal to i less than
or equal to n; we do not want to disturb the first row, so we are working only on this
(n minus 1) by (n minus 1) matrix. Now, since A is invertible, again this will not be equal
to 0. Now, interchange the 2nd and k-th equations; the multipliers m_i2 will be a_i2^(1)
divided by a_22^(1) (after the interchange), and then you continue. So, like that, in Gauss
elimination with partial pivoting, at every step there may be an interchange of rows.
And we are going to show that that interchange of rows can be achieved by pre-multiplying
our matrix by a permutation matrix. So, some notation: A is our n by n matrix, e_j is
the canonical vector with 1 at the j-th place and 0 elsewhere. When you look at A times
e_j, that is going to give us the j-th column. So, you have A e_j: A is this n by n matrix
multiplied by the vector (0, ..., 0, 1, 0, ..., 0), where the 1 is at the j-th place. When
you do the matrix-into-vector multiplication, only the entry corresponding to the 1 is going
to contribute. So, what you get is (a_1j, a_2j, ..., a_nj), and that is our j-th column.
In a similar manner, one can show that e_i transpose A gives us the i-th row (a_i1,
a_i2, ..., a_in). Now, look at the matrix A: its n columns are given by A e_1, A e_2, ...,
A e_n. Look at another matrix B: its columns are given by B e_1, B e_2, ..., B e_n,
where e_1, e_2, ..., e_n are the canonical vectors; I denote them C_1, C_2, ..., C_n.
When you consider the matrix multiplication A into B, its columns will be given by A B e_1,
A B e_2, ..., A B e_n. But B e_1 is nothing but C_1. So, the first column of A B is
obtained by multiplying A and the first column of B; A C_2 will be the second column of
A B, and A C_n will be the n-th column of A B.
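These two facts, that A e_j picks out the j-th column and that the columns of A B are A times the columns of B, are easy to check on a small made-up example (plain Python, no libraries):

```python
def matvec(M, v):
    """Schoolbook matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def matmul(M, N):
    """Schoolbook matrix-matrix product."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
e1, e2 = [1, 0], [0, 1]

# A e_j picks out the j-th column of A.
print(matvec(A, e1))   # [1, 3], the first column
print(matvec(A, e2))   # [2, 4], the second column

# The j-th column of A B equals A times the j-th column of B.
AB = matmul(A, B)
assert [row[0] for row in AB] == matvec(A, [row[0] for row in B])
```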
Now, we have seen before that when we wanted to do R_i minus m_i1 times R_1, that is,
to subtract a multiple of the first row from the i-th row, this operation can be performed
by pre-multiplying our matrix A by a matrix which we had called E1. The matrix E1
has ones along the diagonal, and in the first column, from the second entry onwards, the
entries are minus m_21, minus m_31, ..., minus m_n1; and when you consider E1 times A,
you get the modified matrix A^(1), which was the first step of the Gauss elimination method.
Now, when you look at the matrix E1, in fact this matrix E1 is obtained from the identity
matrix by doing this same transformation. Let me repeat: we have our matrix A, and what we
are doing is modifying the i-th row by R_i minus m_i1 R_1.
We multiply the first row by m_i1 and subtract it from the i-th row. Suppose you do this
operation on the identity matrix: your starting point is the identity matrix, with ones on
the diagonal, and when you do R_2 minus m_21 R_1, you multiply the first row by m_21 and
subtract it from the second row, so you get minus m_21 in the (2, 1) position and 1 on the
diagonal. Similarly, for R_n minus m_n1 R_1, you multiply the first row by m_n1 and subtract
it from the last row, so you get minus m_n1 in the (n, 1) position and 1 on the diagonal.
So, whatever row transformations you want to do, you perform them on the identity matrix
to get a matrix E1, and then you consider E1 times A; that is going to give us the modified matrix.
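A quick numerical check of this construction (hypothetical 3 by 3 matrix and multipliers; the helper matmul is just schoolbook matrix multiplication):

```python
def matmul(M, N):
    """Schoolbook product of two square matrices."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]

# Multipliers for the first column (no interchange needed in this example).
m21 = A[1][0] / A[0][0]      # 2.0
m31 = A[2][0] / A[0][0]      # 4.0

# Apply R_i <- R_i - m_i1 R_1 to the identity matrix: that gives E1.
E1 = [[1.0, 0.0, 0.0],
      [-m21, 1.0, 0.0],
      [-m31, 0.0, 1.0]]

A1 = matmul(E1, A)           # same as doing the row operations on A directly
print(A1)                    # first column below the diagonal is zero
```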
Now, what we want to show is this: in Gauss elimination with partial pivoting, we are
interchanging the first and k-th rows, and that is the first step. This step can be achieved
by pre-multiplying by a matrix, say P1. This matrix P1 will be obtained from the identity
matrix by interchanging the first and k-th rows.
So, exactly the same operation: whatever you want to do on your matrix A, you do it on the
identity matrix; you will get an n by n matrix; pre-multiply your matrix A by that matrix,
and then you will have the desired effect. So, look at the identity matrix: we want to
interchange the first and k-th rows. The matrix P1 which you get: you start with the identity
matrix and interchange the first and k-th rows. So, the 1 that was in the first row now
occupies the position in the k-th row and first column, and the diagonal entry at (k, k),
which was 1, gets interchanged with the first row, so the (1, k) entry becomes 1. This is
our matrix P1. Now, it is clear that P1 squared will be the identity: you have the identity
matrix, you interchange the first and k-th rows, you get the matrix P1; now, again you
interchange the first and k-th rows, and you get back your earlier matrix. That is why P1
squared is going to be the identity, and the matrix P1 is going to be a symmetric matrix,
so you will have P1 transpose equal to P1.
Now, when you look at the identity matrix, its columns are nothing but e_1, e_2, ..., e_n,
the canonical vectors. In the matrix P1, the first column becomes e_k and the k-th column
becomes e_1; all the remaining columns remain the same. So, you have P1 = P1 transpose =
(e_k, e_2, e_3, ..., e_{k-1}, e_1, e_{k+1}, ..., e_n): the first column is e_k, then e_2,
e_3, ..., e_{k-1} as before, the k-th column is e_1, and then e_{k+1} up to e_n.
Now, look at A P1. We have seen that when you want to take the multiplication of A into B,
you look at the columns of B, call them C_1, C_2, ..., C_n, and then A B is obtained column
by column: the first column is A C_1, the second column is A C_2, and the n-th column is A C_n.
So, now consider A P1. Its columns are A e_k, then A e_2, ..., A e_{k-1}, A e_1, A e_{k+1},
..., A e_n. But what is A e_k? It is the k-th column of A. So, the columns of A P1 are C_k,
C_2, ..., C_{k-1}, C_1, C_{k+1}, ..., C_n. This means that when you consider A P1, you are
interchanging the 1st and the k-th columns of A. So, we have obtained a permutation matrix
P1, which was obtained from the identity matrix by interchanging the k-th and the first rows.
Now, if you post-multiply your matrix A by P1, what it does is interchange the k-th and the
first columns of A. So, post-multiplication by a permutation matrix means interchange of
columns. And now, let us show that if instead of A P1 you look at P1 A, that means
pre-multiplying, then that will amount to the interchange of the corresponding rows of A.
We have seen that e_i transpose A is going to be the i-th row of A: A is an n by n matrix,
e_i is a column vector of size n by 1, so its transpose will be a row vector of size 1 by n;
1 by n multiplied by n by n gives a row vector, and that is the vector R_i, the i-th row of A.
And now, we are going to look at P1 times A.
Now, the first row of this P1 is nothing but e_k transpose, the second row is e_2 transpose,
and so on. So, you look at P1: the first row is e_k transpose, the row vector with 1 at the
k-th place and 0 elsewhere; the second row is e_2 transpose; the (k-1)-th row is e_{k-1}
transpose; the k-th row is e_1 transpose; and so on.
So, now look at P1 into A: P1's first row, second row, and so on, into A, gives us
e_k transpose A, e_2 transpose A, and so on, and e_i transpose A equals R_i, the i-th row
of A. So, when you look at P1 A, its first row is the k-th row of our matrix A, and the
k-th row becomes the first row. So, the interchange of the first and the k-th rows of our
matrix A is done by pre-multiplying by P1. So, let me summarize: you have got the identity
matrix and you interchange the first and the k-th rows; for the identity matrix, if instead
of interchanging the first and k-th rows you had interchanged the first and k-th columns,
you would have obtained the same result. So, we obtain a matrix P1. If I look at A P1, the
effect is the interchange of the k-th and first columns of A; if I pre-multiply, that is,
if I look at P1 A, then the effect is the interchange of the k-th and the first rows. So,
in the case of Gauss elimination with partial pivoting, at each step you may be interchanging
rows, and this interchange of rows can be achieved by pre-multiplying by a permutation matrix
obtained from the identity matrix by the interchange of just two rows.
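All four facts about P1, namely P1 squared = I, P1 = P1 transpose, row interchange under pre-multiplication, and column interchange under post-multiplication, can be verified on a small made-up example (here k = 3; an illustrative sketch, plain Python):

```python
def matmul(M, N):
    """Schoolbook product of two square matrices."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# P1: interchange the first and third rows of the identity (k = 3).
P1 = [I[2], I[1], I[0]]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

assert matmul(P1, P1) == I                                       # P1 squared = I
assert P1 == [[P1[j][i] for j in range(3)] for i in range(3)]    # P1 symmetric
print(matmul(P1, A))   # rows 1 and 3 of A interchanged
print(matmul(A, P1))   # columns 1 and 3 of A interchanged
```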
So, the reduction of A to an upper triangular matrix U using partial pivoting can be
represented by pre-multiplying your matrix A by E_{n-1} P_{n-1} E_{n-2} P_{n-2} ... E_1 P_1,
where the P_i's are permutation matrices, which account for the interchange of rows,
and E_1, E_2, ..., E_{n-1} account for subtracting a multiple of an appropriate row from
another row. So, now what we have got is this: the effect of Gauss elimination with partial
pivoting on the coefficient matrix A can be written as E_{n-1} P_{n-1} E_{n-2} P_{n-2} ...
E_1 P_1 A = U. Earlier, without pivoting, we had only E_{n-1} E_{n-2} ... E_1 A = U. What
we did was: all these E_{n-1}, E_{n-2}, ..., E_1 were lower triangular matrices, so their
product is also lower triangular, and each is an invertible matrix. So, you take its inverse,
and then you get A equal to the inverse of a lower triangular matrix into U; U is upper
triangular, and the inverse of that product gave us L.
Now, what is happening is, we have got the permutation matrices in between. So, we want
to show that this equation can be written in the form E_{n-1} dash E_{n-2} dash ... E_1 dash,
and then all the permutation matrices together, into A, equals U. We want to show that all
these matrices E_i dash are going to be lower triangular and invertible, and that
P_{n-1} P_{n-2} ... P_1 together will give us a permutation matrix P. In P_1, P_2, ...,
P_{n-1} we were interchanging only one pair of rows at a time; in P we will be interchanging
finitely many rows.
So, I am going to show this for 3 by 3 matrices; this is to give you an idea, and then a
similar argument works in general. What we want to do is this: we have got lower triangular
matrices and permutation matrices alternating, and I want to take all the lower triangular
matrices together and all the permutation matrices together. That is what we want to do,
and we are going to illustrate it for a 3 by 3 matrix.
So, you consider E_2 P_2 E_1 P_1 A = U. So, n = 3, A is a 3 by 3 matrix, U is an upper
triangular matrix, P_1 and P_2 are obtained by interchanging one pair of rows of the
identity matrix, and E_1 and E_2 are going to be lower triangular matrices.
We know that P_2 squared is the identity. So, what I can do is, after E_1, I can introduce
P_2 squared, because P_2 squared is the identity. So, you will get U = E_2 (P_2 E_1 P_2)
P_2 P_1 A. Now, this P_2 E_1 P_2 I denote by E_1 dash. So, I have got U = E_2 E_1 dash
P_2 P_1 A. So, now I have achieved having my permutation matrices together, but I need to
show that the E_1 dash which I obtained is a lower triangular matrix, because I start with
E_1, which is a lower triangular matrix; in fact, it is a unit lower triangular matrix.
So, for E_1 dash I have P_2 E_1 P_2, and I need to show that it retains the lower triangular
structure. So, let us look at E_1. E_1 is the matrix with rows (1, 0, 0), (-m_21, 1, 0),
(-m_31, 0, 1). That means this E_1 is the matrix which accounts for R_2 minus m_21 R_1 and
R_3 minus m_31 R_1. P_2 is the matrix in which you interchange the second and third rows;
then, when you look at the matrix P_2 E_1, that is going to be the interchange of the 2nd
and 3rd rows of E_1. We know that if you pre-multiply by a permutation matrix, that means an
interchange of rows; if you post-multiply by a permutation matrix, it corresponds to an
interchange of columns.
So, what we are doing is: you consider E_1; that is the first step of the Gauss elimination
method, and you get a modified matrix. In that modified matrix, you now want to introduce
zeros in the second column below the diagonal, and you may have to interchange rows; because
I am taking a 3 by 3 matrix, the only possible row interchange is of the second and third rows.
So, that is what I am doing, and now I am trying to show that since E_1 is lower triangular,
P_2 E_1 P_2 is also going to be lower triangular. So, now, P_2 E_1 is the interchange of the
2nd and 3rd rows of E_1: its rows are (1, 0, 0), (-m_31, 0, 1), (-m_21, 1, 0).
What was the 2nd row becomes the 3rd row, and what was the 3rd row becomes the 2nd row. Now,
if you look at this matrix, it is not a lower triangular matrix; you have got a 1 above the
diagonal. But now consider P_2 E_1 multiplied by P_2 on the right. What it will do is
interchange the second and third columns. So, now, when you interchange the second and third
columns, the matrix which you get has rows (1, 0, 0), (-m_31, 1, 0), (-m_21, 0, 1). What was
the second column becomes the third column, and what was the third column becomes the second
column. So, now, this is a unit lower triangular matrix. So, we had P_2 E_1 P_2, which was
our E_1 dash; that is going to be lower triangular, and E_2 is
also a lower triangular matrix.
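This 3 by 3 computation can be checked numerically with made-up multipliers m_21 and m_31 (an illustration, not the lecture's own code):

```python
def matmul(M, N):
    """Schoolbook product of two square matrices."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

m21, m31 = 0.5, 0.25                     # hypothetical multipliers
E1 = [[1.0, 0.0, 0.0],
      [-m21, 1.0, 0.0],
      [-m31, 0.0, 1.0]]
P2 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]   # interchange rows 2 and 3

E1_dash = matmul(matmul(P2, E1), P2)     # E1' = P2 E1 P2
print(E1_dash)
# E1' has the two multipliers interchanged but is again unit lower triangular.
assert all(E1_dash[i][j] == 0 for i in range(3) for j in range(3) if j > i)
assert all(E1_dash[i][i] == 1 for i in range(3))
```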
So, we have U = E_2 (P_2 E_1 P_2) P_2 P_1 A, and the matrix in the brackets is E_1 dash.
So, U = E_2 E_1 dash P_2 P_1 A. We have proved that E_1 dash is unit lower triangular, and
E_2 is also unit lower triangular. Because they are unit lower triangular, that means the
diagonal entries are equal to one, they are going to be invertible. So, we will get P_2 P_1 A
equal to (E_2 E_1 dash) inverse into U, and this inverse will be unit lower triangular. So,
that will be our L into U, and P_2 P_1 together will be P. So, you
have got P into A equal to L into U.
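The whole equivalence can be sketched as a small program: record the interchanges in a permutation and the multipliers in L, and verify P A = L U at the end (a minimal illustration in plain Python; the function name plu is my own choice):

```python
def plu(A):
    """Gauss elimination with partial pivoting: return (perm, L, U) such that
    row perm[i] of A equals row i of L U, i.e. P A = L U. Sketch only."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for j in range(n - 1):
        # Pivot: entry of maximum modulus in column j, among rows j..n-1.
        k = max(range(j, n), key=lambda i: abs(U[i][j]))
        U[j], U[k] = U[k], U[j]
        L[j], L[k] = L[k], L[j]          # carry already-computed multipliers along
        perm[j], perm[k] = perm[k], perm[j]
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]        # |m| <= 1 by the choice of pivot
            L[i][j] = m
            for c in range(j, n):
                U[i][c] -= m * U[j][c]
    for i in range(n):
        L[i][i] = 1.0                    # L is unit lower triangular
    return perm, L, U

A = [[0.0, 1.0, 2.0],
     [2.0, 4.0, 2.0],
     [1.0, 1.0, 1.0]]
perm, L, U = plu(A)
# Check P A = L U: row perm[i] of A must equal row i of L U.
LU = [[sum(L[i][k] * U[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
assert all(abs(A[perm[i]][j] - LU[i][j]) < 1e-12 for i in range(3) for j in range(3))
print(perm, U)
```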
So, now, Gauss elimination with partial pivoting is equivalent to P into A equal to L into U,
where P is a permutation matrix. When we considered the Gauss elimination method without
these row interchanges, that means without partial pivoting, we said that we have got
two equivalent ways. Either you proceed with the Gauss elimination method, store your
multipliers at the appropriate places, and then construct your matrix L, with U being
the final matrix which we obtain from the Gauss elimination method. The other way was:
start with the matrix A, try to write it as L into U, treat the elements of L and U as
unknowns, take the multiplication, equate it to the corresponding entry in A, and that is
how you can determine L and U. And in both cases, the number of computations is going to be
of the same order. So, we have got an equivalent way of doing it, and then from the LU
decomposition we went to the Cholesky decomposition and so on. Now, here we have written
P into A equal to L into U. So, can I try to do it directly? Here the problem is that we
do not know beforehand what row interchanges we are going to need.
When you do Gauss elimination with partial pivoting, it becomes clear: look at the first
column, look at the entry which has got the maximum modulus, and then interchange the
corresponding row and the first row. But this we do not know beforehand, and
it may be necessary at every stage. So, this P into A equal to L into U is
going to be useful for the backward error analysis.
But when you want to do Gauss elimination with partial pivoting, then you have to proceed,
and what we have proved is the existence of such an LU decomposition, not for the matrix A,
but for the matrix P A. Now, the permutation matrix P: you start with the identity matrix
and you do a finite number of row interchanges, because we have got an n by n matrix, so
you will be doing the row interchange a finite number of times. Now, look at the determinant
of P. When you do a row interchange, the sign of the determinant changes, and the determinant
of the identity matrix is 1. Now, it will depend on whether you are doing an even number of
interchanges or an odd number of interchanges. So, depending on that, the determinant of P
is going to be plus or minus 1. Then look at P into A equal to L into U: L is unit lower
triangular, so the determinant of L is going to be equal to plus 1, and the determinant of U
will be equal to the determinant of P into the determinant of A. So, the determinant of U
will be plus or minus the determinant of A, and it will be not equal to zero, because we are
assuming A to be an invertible matrix. So, we know how to solve a system A x = b; what if
you want to find the inverse of a matrix A?
Then we have got a formula in terms of the adjoint, the classical formula, which involves
determinants; that is very expensive. So, if you need to calculate the inverse of a matrix,
what you can do is as follows. Suppose your A is an invertible matrix, and the aim is
to find A inverse. Now, the first column of A inverse will be
A inverse e_1, the second will be A inverse e_2, and the last one will be A inverse e_n.
So, look at A inverse e_1; let me call it u_1; this means A u_1 = e_1. So, you solve your
system of linear equations: the matrix A is there, it is invertible; take your right hand
side to be the vector (1, 0, ..., 0). When you take that as your right hand side and solve,
you are going to get a vector u_1; write this as the first column of your A inverse. Then
you look at the system A u_2 = e_2, where e_2 is the canonical vector (0, 1, 0, ..., 0); you
solve it, and that is going to be your second column of A inverse, and so on.
So, you will be solving n systems of equations in which the coefficient matrix A is the same;
it is only the right hand side that is different. So, whatever work you do, like Gauss
elimination with partial pivoting, you keep track of the operations which you
do and of the upper triangular matrix which you are getting.
The classical formula is A inverse = (1 / determinant of A) into adjoint of A. Instead,
you look at A X = I, where A x_i = e_i for i going from 1, 2, up to n; what I was calling
u_1, u_2, ..., u_n here I am writing as x_1, x_2, ..., x_n. You do the LU decomposition only
once; that will be n cubed over 3 flops, and then you will be doing forward and backward
substitution n times. The forward and backward substitution is of the order
of n squared, and this you will be doing n times, so it is n cubed.
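A sketch of the column-by-column inversion (for brevity this illustrative solve re-eliminates for every right hand side; the lecture's point is that in practice you would factor once and reuse the LU decomposition):

```python
def solve(A, b):
    """Solve A x = b by Gauss elimination with partial pivoting (sketch)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix [A | b]
    for j in range(n - 1):
        k = max(range(j, n), key=lambda i: abs(M[i][j]))   # pivot row
        M[j], M[k] = M[k], M[j]
        for i in range(j + 1, n):
            m = M[i][j] / M[j][j]
            for c in range(j, n + 1):
                M[i][c] -= m * M[j][c]
    x = [0.0] * n                                      # back substitution
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
# Columns of A inverse: solve A u_j = e_j for each canonical vector e_j.
cols = [solve(A, e) for e in ([1.0, 0.0], [0.0, 1.0])]
A_inv = [[cols[j][i] for j in range(2)] for i in range(2)]   # assemble by columns
print(A_inv)   # should approximate [[0.6, -0.2], [-0.2, 0.4]]
```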
So, that means the inverse of A can be obtained in n cubed over 3 plus n cubed, which
is equal to 4 n cubed over 3 flops, and it is a much more efficient way than using the
classical formula for calculating the inverse. Now, I have been saying that the calculation
of the determinant is expensive if you use the classical formula: it takes more than
n factorial operations. So, why not use the LU decomposition? Suppose your matrix A you can
write as L into U; then the determinant of U is going to be equal to the determinant of A.
Or, if you are using Gaussian elimination with partial pivoting, then your P into A equals
L into U, and then your determinant of U will be equal to plus or minus the determinant of A.
In fact, one does such things by hand as well: if you want to calculate a determinant by
hand, one tries to simplify, to introduce zeros; if you have many zeros, then the calculation
of the determinant becomes easier. So, here, using a computer, you can reduce your matrix A
to upper triangular form, and the determinant of that U will be equal to the determinant of A
if there are no row interchanges, or it will be plus or minus the determinant of A depending
on the number of row interchanges. So, the LU decomposition can be useful even to calculate
the determinant of a matrix, when you need to calculate a determinant.
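A sketch of this determinant computation, reducing to upper triangular form with partial pivoting and flipping the sign on each interchange (illustrative values; the function name is my own):

```python
def det_via_elimination(A):
    """Determinant by reduction to upper triangular form with partial pivoting.
    Each row interchange flips the sign; the result is sign * product of pivots."""
    n = len(A)
    U = [row[:] for row in A]
    sign = 1.0
    for j in range(n - 1):
        k = max(range(j, n), key=lambda i: abs(U[i][j]))
        if k != j:
            U[j], U[k] = U[k], U[j]
            sign = -sign                 # one interchange changes the sign
        for i in range(j + 1, n):
            m = U[i][j] / U[j][j]
            for c in range(j, n):
                U[i][c] -= m * U[j][c]
    d = sign
    for i in range(n):
        d *= U[i][i]                     # det(U) = product of diagonal entries
    return d

A = [[0.0, 2.0], [3.0, 4.0]]             # one interchange needed; det(A) = -6
print(det_via_elimination(A))
```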
Now, for Gauss elimination with partial pivoting, there is one case in which you will not
need any row interchange, and that is when your matrix is diagonally dominant by columns.
That precisely means that when you look at the diagonal entry, its modulus is bigger than
or equal to the sum of the moduli of all the remaining entries in that column. So, that is
the definition of a matrix diagonally dominant by columns.
So, suppose you have got the summation over i from 1 to n, i not equal to j, of the modulus
of a_ij; that means you are looking at the j-th column and the entries except the entry on
the diagonal. If you have got this to be less than or equal to the modulus of a_jj, for
j = 1 up to n, then in particular, looking at the first
column, you are going to have the summation over i from 2 to n of the modulus of a_i1 less
than or equal to the modulus of a_11. So, what we were doing was: you look at the first
column and then you look at the entry which has got maximum modulus. Now, suppose your first
column is such that the modulus of a_11 is not only bigger than each entry in the column,
but is bigger than or equal to the sum of the remaining entries. Then, obviously,
a_11 is the entry with the maximum modulus in that column.
So, there is no need of a row interchange. Now, this is for the first step; what about the
second one? I do not need a row interchange at the first step; I will be subtracting
appropriate multiples of the first row from the remaining rows and I get a modified matrix.
So, what is the guarantee that at the second stage also I do not need a row interchange,
because now you get a different matrix? That we are going to do as a tutorial problem.
So, suppose your matrix A is diagonally dominant by columns; then this A gets transformed
after the first step to a matrix A^(1). The matrix A^(1) will have the first row as it is,
a_11, a_12, ..., a_1n; the first column below the diagonal is 0; and in the remaining block
you are going to have some matrix, let me call it B. The matrix B is going
to be an (n-1) by (n-1) matrix, and we are now going to work on this matrix B. In the
tutorial problem we will show that B is also diagonally dominant by columns.
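One can at least check the first instance of the tutorial problem numerically (a made-up matrix that is diagonally dominant by columns; the function name is my own):

```python
def column_diagonally_dominant(M):
    """|a_jj| >= sum of |a_ij| over i != j, for every column j."""
    n = len(M)
    return all(abs(M[j][j]) >= sum(abs(M[i][j]) for i in range(n) if i != j)
               for j in range(n))

A = [[4.0, 1.0, 1.0],
     [1.0, 5.0, 2.0],
     [2.0, 1.0, 6.0]]
assert column_diagonally_dominant(A)

# One elimination step without any interchange (a_11 already dominates column 1).
B = [[A[i][j] - (A[i][0] / A[0][0]) * A[0][j] for j in range(1, 3)]
     for i in range(1, 3)]
print(B)
# The trailing 2 by 2 submatrix is again diagonally dominant by columns;
# this is the general claim of the tutorial problem mentioned in the lecture.
assert column_diagonally_dominant(B)
```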
So, now, when we started with Gauss elimination with partial pivoting, I had said that you
should not divide by a small number, because that introduces error and you lose accuracy.
So, then we said that in the column you look at the element of maximum modulus.
Now, maybe a better way of doing it is this: look at the whole matrix; there are n squared
entries, and among those n squared entries you look at the element which has got maximum
modulus, and then you bring that entry to the place in the first row and first column.
So, when you do this, what you will have to do is interchange rows and also interchange
columns. When you interchange rows, you have the same system; you do not change your system.
When you interchange columns, what you are doing is renaming your variables: suppose I have
a system in which x_1, x_2, x_3, ..., x_n are the unknowns; if I interchange the first and
third columns, that means what was my variable x_1 becomes x_3 and what was my variable x_3
becomes x_1. So, maybe that is a better way of doing it,
and it is known as Gauss elimination with complete pivoting.
So, A x = b, with A an invertible matrix. In the first step you look at the modulus of a_kl,
the entry in the k-th row and l-th column which is maximum among all the elements;
that means you will have to do n squared comparisons. Now, you interchange the 1st and
k-th rows and the 1st and l-th columns; in order to do that you will be considering P_1 A Q_1,
where P_1 and Q_1 are permutation matrices. Pre-multiplication by a permutation matrix
means interchange of rows; post-multiplication by a permutation matrix means interchange
of columns. So, you will be doing that, and this should be better in terms
of stability than partial pivoting, and that is true, but one has to keep the cost in mind.
So, in practice, partial pivoting works well, and that is why one does not really do
complete pivoting.
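The first step of complete pivoting can be sketched as follows (illustrative 3 by 3 matrix; zero-based indices in the code, so the entry a_22 of the lecture's numbering is A[1][1]):

```python
A = [[1.0, 2.0, 3.0],
     [4.0, 9.0, 6.0],
     [7.0, 8.0, 5.0]]

# Find (k, l) with |a_kl| maximal over the whole matrix: n^2 comparisons.
k, l = max(((i, j) for i in range(3) for j in range(3)),
           key=lambda t: abs(A[t[0]][t[1]]))
print(k, l)                  # here the largest entry is 9, in row 2, column 2

A[0], A[k] = A[k], A[0]      # interchange 1st and k-th rows (the P_1 A part)
for row in A:                # interchange 1st and l-th columns (the A Q_1 part);
    row[0], row[l] = row[l], row[0]   # this renames the unknowns x_1 and x_l
print(A)                     # the (1, 1) entry now has maximum modulus
```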
But complete pivoting will be equivalent to doing the LU decomposition not of the matrix
A, but of P A Q, where P and Q are permutation matrices; that means they are obtained from
the identity matrix by finitely many row interchanges or, equivalently, column interchanges.
So, Gauss elimination without pivoting is A = L U, with partial pivoting it is
P A = L U, and complete pivoting means P A Q = L U, and you will need of the order of
n cubed comparisons. So, it is too expensive, and that is why complete pivoting is not
done. In our next lecture, we are going to consider vector norms and the induced matrix
norm. Thank you.