Tip:
Highlight text to annotate it
X
So, today, we shall discuss about the RSA crypto system. So, we will continue with the
RSA crypto system, and what we shall discuss is, about, how to factor the public modulus.
Therefore, this is a quite big topic, but what we shall discuss is about some commonly
known techniques to do so.
So, essentially we shall be discussing about three main algorithms. One is called the Pollard
p minus 1 algorithm and the other one is called Pollard RHO algorithm, and then, we will conclude
with that so, another algorithm is a random squares algorithm is called Dixon's random
squares algorithm.
So, some basic points about factoring algorithms. So, most obvious way to attack the RSA, would
be to try to factor the public modulus. So, what we have discussed is that, if we are
able to factor the public modulus of RSA, then, we shall be able to compute p n q, because
n is a product of two prime numbers p and q, and if we know how to factor them, we will
note that value of a p and q. So, if you know p and q, then, from there we will be able
to ascertain the value of phi n, also, the torsion function. So, if collision; therefore,
if you know what is phi n, then from the encryption exponent, we shall be also able to compute
the decryption exponent. Therefore, for after that, you know it, RSA security is compromised.
So, this is a most obvious way of attacking an RSA algorithm. Therefore, we shall study
about some commonly known algorithms. But the modern algorithms are some sieving techniques,
like Quadratic sieve, Elliptic curve, factoring sieve and Number field sieve. But we really
not going depth in the sieving techniques. But, what we will essentially discuss is,
about something which is, which are also very important algorithms, but may be, you can
think, of these algorithms, are something like precursors to these algorithms.
So, what we shall discuss is, about p minus 1 Pollard algorithm. So, here is a p minus
1 factoring algorithms. So, in this algorithm, you will, we shall try to factor p minus 1.
And because, you know that, if p is a prime, then p minus 1 is an even number. It is a
composite number. So, it is based upon the factorization of p minus 1. And, the other
very interesting algorithm is called Pollard's RHO algorithm and it is yet another application
of the birthday paradox to cryptanalysis. So, of course, we have the trial division
algorithm, but the trial division is not very efficient when the sizes become large.
So, we know that a complexity of a trial division is like, it works like this; if n is composite,
then, n has a prime factor, which is less than square root of n. So, if your n is typically
less than 2 power of 40, this is quite ok. But the thing is that, if you, if the trial,
if your n becomes larger, then, you need to do something better than the ordinary trial
division technique. So, we shall study about two algorithms. So, in this case, note that,
what we are searching is for non-trivial factors of n. So, which means that, we are not really
going into the prime factorizations. So, because, I mean, what we will study, and
if you are able to do this, then using our previous preliminary tests, which we discussed
and there are many more actually, we can ascertain whether a number is prime or not, with some
reasonably high probability. If it is not prime number, then we shall again use the
same technique. Therefore, we can again and again apply this, these factoring algorithms
and we can factor. But in this discussion, rather in today's discussion, we shall be
concentrating on finding out non-trivial devices or non-trivial factors. So, non-trivial means,
of course, not itself and not 1. So, excluding them, we shall be trying to find out what,
compute the corresponding factors.
So, the Pollard's factoring algorithm is based upon the factorization of p minus 1. So, concentrate
on this, that is if p minus 1. See, you remember that p is a prime number; therefore, p minus
1 is a composite number. So, if p minus 1 is a composite number, then, we shall be able
to factor out p minus 1; therefore, p minus 1 can be factored in terms of prime factors.
So, you see that p minus 1 will be, in that case, equal to q 1 to the power of e 1, q
2 to the power of e 2 and this is not K. So, this is again another problem in the software.
So, this is q 1 to the power of e 1, q 2 to the power of e 2 and so on. So, it works like
this. P, this q 1 to the power of e 1, q 2 to the power of e 2 and so on. So, consider
there are K distinct prime factors. So, now here, for this p minus 1 algorithm, discovered
by Pollard, actually, you need a bound. So, this bound is denoted by say B. So, how is
that bound chosen, I mean we, although we do not really know this factors, but the assumption
is that, suppose, I know a fact, that is, all this prime powers which divides p minus
1 are lesser than or equal to p. So, suppose we know this fact. So, what does it mean,
that is all these prime powers like say q 1 to the power of e 1, q 2 to the power of
e 2. So, for example, if I arrange them like this,
like if I know that these are all, I mean without loss of generative, I can do such
kind of ordering. So, what I can assume is that, all these prime powers are less than
equal to B. Suppose, I know, I do not know these prime powers, but I know that, there
is a bound like this. So, from this we know that all these prime powers.. So, if I consider
the series of B factorial.. So, what is B factorial? B factorial is B into B minus 1,
B minus 2 and so on till 1. So, you note that, because of this particular assumption, in
this terms like B, B minus 1, B minus 2 and so on, these terms like q 1 to the power of
e 1 or q 2 to the power of e 2 or q k to the power of e k must have occurred at least once;
because these are distinct primes. So, they are not equal, that is, you will
not find out two powers which will match. So, all of them will be distinct terms. So,
in this series, they must appear somewhere. So, what does it mean? It means that, p minus
1 actually divides B factorial. Is this clear to us? Because, p minus 1 is nothing, but
these powers. So, if these powers are there in p minus 1 factorial somewhere, therefore,
this particular term must have divided B factorial. So, this is the basic observation, that is,
if you know that, there is a bound like this, then, you know that p minus 1 will divide
the factorial of p.
So, now, you see that, what we are essentially doing in this algorithm is like this; that
is, you see that, for, you start like this, that is you just assume that, a is equal to
2 and then what you do is that, you start from 2 to B and you compute a power j mod
n. So, what are we essentially doing at the end of the for loop? You see that, first of
all, you have got a j equal to 2; then, you have got j equal to 3; then, you have got
j equal to 4 and so on. So, what are you computing? a to the power of B factorial.
So, at the end of the for loop, you have got a is equal to a to the power of B factorial
and modulo n. So, at the end of the for loop, what we essentially have is... So, at the
end of the for loop, you have got a is equal to a to the power of B factorial mod of n.
So, now, you observe another property, which says that, at the end of the for loop.. Therefore,
you have got a equal to a congruent to, a to the, 2 to the power of B factorial, because,
you started with a equal to 2. Therefore, I can say that a is equal to 2 to the power
of B factorial mod n. So, now, you see that, I can write from here that, a is equal to,
because of this, I can write a equal to k n plus 2 to the power of B factorial; because,
if I divide by n, then, 2 to the power of B factorial is a remainder.
So, what is n? n is p into q, product of two prime numbers. So, plus 2 to the power of
B factorial. Now, if I take a modulo p, then I can write that, a equal to 2 to the power
of B factorial mod of p. So, from Fermat's Little theorem, we know that, a to the power
of p minus 1 is equal to 1 modulo p; and we also know, or rather seems, p minus 1 divides
B factorial; then, this implies that, a which is equal to 2 to the power of B factorial
modulo p, is actually equal to 1 mod p; because B factorial divides p minus 1; or rather p
minus 1 divides B factorial and we know that, 2 power of p minus 1 is equal to 1 and therefore,
2 to the power of p minus 1 some multiple will also be 1.
Do you follow this? Therefore, what I am saying is this. That is, p minus 1 divides B factorial;
therefore, B factorial is some constant times p minus 1. Therefore, if 2 to the power of
p minus 1 is equal to 1 mod p, then, 2 to the power of p minus 1 into that constant
will also be equal to 1. So, that means, 2 to the power of B factorial is equal to 1.
So, therefore, from here we know that, p divides a minus 1, correct. But, that p divides n
also. Therefore, which means that, if you compute the GCD of.. Therefore, that p divides
a minus 1 and p divides n. So, therefore, if a is not equal to 1, then the GCD of a
minus 1 and n will give me 1 non-trivial factor of n; so, that means, the GCD of a minus 1
and n should give me 1 non-trivial factor of n. But if a equal to 1, then, that means,
that, this gives a failure; this algorithm fails.
So, which means that, for you have to again start with another bound B. Therefore, this
is, actually, how the algorithm works. And, we can see some examples for that. So, for
example, if you come to this example, then n equal to 15770708441. So, this will be a
quite good number. Therefore, if you set a B equal to 180.. So, see that, this setting
of B equal to 180 is arbitrary, actually. So, what is the assumption that, all the factors
which are there, actually, all the prime powers must be less than or equal to 180. So, if
that be so, then, you say that, here in this case, if you do this, a equal to 2 to the
power of B factorial modulo n, then, actually after the end of the for loop, you get a value
of like this. Example. We shall continue with this example.
Therefore, n is equal to 15770708441, is the number which we have to factor. So, you see
that, what we have done here is that, we have set a value of B equal to 180. So, what is
the assumption that, all the prime powers of n are essentially less than equal to 180?
Therefore, this is an, again an assumption. So, we will start with some assumption and
if I do this computation of 2 to the power of B factorial mod n, then a becomes, after
the end of the for loop, it becomes equal to this particular value.
So, now let us compute d, which is a GCD of a minus 1 and n. So, you see that, a in this
case, is not equal to 1, mod of what you have seen. Therefore, in this, if I consider or
compute the value of GCD of a minus 1 comma n, then I get 135979, which is indeed a factor
of this number. So, you see that, this particular technique, actually relies upon, largely upon
the value of 180; that if 180 is properly chosen or not, this bound is properly chosen
or not. And, you see that, if this bound is large, like for example, if I make the bound
very large, then, this algorithm is guaranteed to succeed.
But if this value of B is small, than the actual value, then, this may, this algorithm
will fail. So, which means that, you see that, if this B is very large, like if I set the
value of B to be very large, then, although, the algorithm will succeed, but actually your
number of computations will also increase. So, very slowly, it will go to the trial division
algorithm. But if the value of B is a, indeed small, then, this algorithm will succeed.
So, which means what? That in order to stop this factorization, by this particular technique,
you need to ensure that, p minus 1 has also got large factors.
So, that means, if p minus 1 has got all these factors very small, then, the Pollard p minus
1 algorithm will be engaged to break or factor the value of n. So, which means that, you
see that, there are some finer points here that, there are B minus 1 modular exponentiations.
So, you see that, what we have done is that, in that for loop, we have computed all these
modular exponentiations; and you remember, from the square and multiply algorithm, which
we engage to compute the exponentiation, there are actually around 2 log B modulo multiplications,
so, taking into account, the multiplication as well as the squaring operation.
Now, the GCD can be computed in O log n whole cubed, using the Euclidean algorithm. And
therefore, the overall complexity is like this, like something like, B log B log n square;
because, multiplication takes log n square plus log n cube. And therefore, if you make
your B to be something like log n or log in n, then actually, we have a polynomial time
algorithm to solve this problem. So, but the thing is that, if this value of B is large
and if these are of the order of n itself, then, this algorithm will not be so useful.
So, which means that... However, if B increases, then, the success probability increases; but
the algorithm becomes as slow as the trial division algorithm.
So, therefore, the modulus n should be such that, p minus 1 does not have all prime powers
small. So, these are the observations. So, if you want to implement or choose RSA parameters,
then, this is also another criteria, that we should keep in mind. That is, p minus 1
should not have all prime powers which are small; should have at least 1 large prime
power. So, that is the observation that we obtain from the Pollard p minus 1 algorithm;
that is the lesson we learn from p minus 1 algorithm.
So, this is the description of the Pollard p minus 1 algorithm. So, Pollard gave yet
another, nice algorithm which is called the Pollard's Rho algorithm. So, any doubts till
this point. So, Pollard rho algorithm works like this. Suppose, you have been given some
value of n, which I say, assume we say 7171 and the question is, what is a value of p
which will divide the value of n. And this is a very interesting algorithm. You see that,
we know, that p must be lesser than or equal to square root of n. So, we know that. Therefore,
a possible method could be like this; like, you randomly choose 2 points a and b and you
find out modulo p. So, if a p is a small number, then, there is a very large chance, that a
and b will be congruent to each other modulo p.
So, what does it mean? p divides a minus b. And, we also know, p divides n. So, that means,
if I compute the GCD of a minus b and n, then, I should be able to get 1 non-trivial factor.
So, that is a basic point. Therefore, now, from, therefore, GCD of a minus b comma n,
indeed gives us a non-trivial factor of n. And from Birthday Paradox, if the number of
elements which I choose, is of the order of, say around square root of, square root of
p, then we have got a large chance of a collision. So, you have got distinct point, say p distinct
points, and from that, if you choose 2 square root of, if you choose square root of p elements
randomly, then, from Birthday Paradox, we know that, there is a high probability, that
both the elements will actually collide. And therefore, this technique should work. So,
but the thing, so, this is the basic principle behind the Pollard's rho technique. But there
are some pitfalls in this straight forward application. What is the pitfall? That is,
from p points, if I choose all possible points, then, what can be in the worst case, the number
of points. It will be, say p c 2; that is, if I choose p points, from there I have to
choose all pairs, all possible pairs and find out which pair satisfies the relation..
See, one thing you have to keep in mind, that you do not know the value of p. So, you do
not know, you will not be able to check this condition that a and b are congruent to each
other modulo. So, what you will do is that, you will compute the GCD of a minus b and
n, and see whether you get a non-trivial value. You do not have a handle over p, because you
do not know p. So, what you can do is that, you can randomly pick two numbers a and b
and compute the GCD of a minus b comma n and hope that, it is a non-trivial factor. So,
what we are, giving you guarantee from the Birthday Paradox is that, if a and b are congruent
to each other modulo p, then, this should give you a non-trivial factor. And if the
number of points you choose, are of the order of square root of p, then, you have a large
probability of such an occurrence. There is a significant probability of this particular
occurrence. But the point is, that, if I give you, for
example, square root of p points; even from there, how many possible pairs can you have.
There can be square root of p c 2 possible pairs. So, square root of p c 2, what is the
order? It is p and what is the order of p? It is square root of n. Therefore, you still
have the efficiency, same as that of the trial division algorithm. So, which means, we have
to reduce the number of GCD computations. The number of GCD computations by the straight
forward method is still quite high. You understand.
So, therefore, what I am saying is this; that is, if I pick say a and b and I compute the
GCD of a comma b, I will choose another value of c and then I need to compute the GCD of
a comma c, GCD of b comma c and so on. Again, you pick up d; you have to compute the GCD
of d and a, GCD of d comma b, GCD of d comma c and so on. So, you see that, if the size
that you choose from, is of the order of square root of p..
Why? Because, if the order is square root of p, then, you have a large probability that,
at least, you will get 2 colliding a and b; you will get 1 pair of colliding a and b.
But then, the number of choosings that you are doing is, or rather number of GCD computations
you are doing, is still of the order of p; because, you are doing, I mean, it is x c
2. Therefore, the order is still p, and that means, the order is still square root of n.
So, which means that, the algorithm is no better, than the trial division algorithm,
till now. The memory is still O root n; time is also still O root n. Therefore, we need
to do something better than this. So, there are some improvements, that we will discuss.
So, therefore, the main objective is that, we have a fairly good idea to find out the
factors; but what we need to do is, still is, to compute less GCDs and do our job. So,
in order to do so, what we do is this; that is, we choose a polynomial f x, which is equal
to x square plus a. Because, this is some random polynomial, that we have chosen. So,
f x equal to x square plus a, to randomly choose the numbers modulo n. So, what we do
is that, we choose a number x modulo n and compute f x equal to x square plus a modulo
n. So, therefore, the idea is that, this should be, should be giving me a random next number
actually. So, before I am engaging this polynomial to give my random numbers. Therefore, I give
a seed value of x. I get x square plus a; I again make that x and again apply it. So,
I keep on applying the same effects again and again.
So, there are some restrictions here, that x, that a cannot be equal to 0 or minus 2
modulo n. Why, because if f x equal to x square only, then, there is a problem. What is a
problem? That, there can be some fixed points. Like, for example, if I choose f x, I mean
x equal to 0, then it gets stuck at 0; similarly, if you choose x equal to 1, it is still stuck
at 1 and if it is minus 2, then, there is another stuck at point, thinks minus 1. So,
minus 1 square, plus. So, if you make the value of x equal to 1, then it is 1 square
minus 2, which means, it is minus 1. So, if you feed minus 1, you get minus 1 square,
means 1, minus 2. So, that is still minus 1. So, therefore, we shall, we shall actually
avoid the value of a equal to 0 or minus 2. So, we can just choose f x equal to x square
plus 1; this is a very simple function.
So, the main idea of the Pollard Rho algorithm is like this. Suppose, you choose x i and
x j which are same modulo p. So, what we will prove from here is that, in that case, if
I mean, rather, we know that, if x i and x j are same modulo p, then f x i and f x j
are also same modulo p; because, f is actually a polynomial, in all integer coefficients.
So, therefore, you have got, if you apply, I mean that, this is a standard result. Therefore,
I know that f x i and f x j will also be congruent to each other modulo p. So, now, you say that,
suppose, I need to compute x i plus 1 and if x i plus 1 is congruent to say f x i mod
n, so, that is a next computation. So, you plug in x i and I compute x i plus
1, which is congruent to f x, I mean, which is actually equal to f x i mod n. So, this
is the next computation of the function of f. So, f x i mod n will be equal to... So,
we have got x i plus 1. So, we have got this result, that is x i plus 1 mod p.. We have
got x i plus 1 mod p equal to f x i mod n mod p and that is equal to f x i mod p. Why,
because x i plus 1 mod p. Therefore, what I am saying is that, which you have taken
x i , x i plus 1 is equal to f x i mod n. So, what you do is that, you compute a mod
p on both sides. So, if you do a mod p on both sides, then what does happen. So, what
I am saying is this, that is, suppose, we have got this result x i plus 1 which is equal
to f x i mod n; what we do is this; that is, I compute x i plus 1 modulo p.
So, this should be equal to f x i modulo p. Why, I have proved this actually; similar
thing in the Pollard p minus 1 algorithm. It is a straight forward thing. So, you know,
I mean, there is nothing to prove there actually. So, therefore, x i plus 1 mod p is congruent
to f x i mod p. So, from this, actually, we know that f x i mod p is actually congruent
to f x j mod p. Therefore, I can write here that, this is nothing, but f x j modulo p,
correct. So, therefore, f x j mod p is equal to from... Therefore, what we are essentially
trying to say is that, x i plus 1 will also be equal to x j plus 1 modulo p. Therefore,
if x i, I mean, if you get x i which is congruent to x j mod p, then actually, using this computation
of f, I mean, this is the next modulo n computation. Then, from there, you can say that, x i plus
1 will also be equal; congruent to x j plus 1 mod p. Do all of you see that? Yes. So,
it is all obvious. Therefore, from, therefore, now, we can say from here that, x i plus delta
will actually be congruent to x j plus delta modulo p.
I can evaluate the delta. Therefore, I can say that delta is, for that is, for all delta
which is greater than 0, this is true. So, what does it mean? It means that, if you find
2 such values x i and x j, which are same, then, after that and because after this collision,
all the values in modulo p are actually locked with each other.
So, therefore, you can imagine that, this particular relation gives you that.. If I,
if I just randomly choose some points, then, you will find that, if... So, if I, if I plot
them, that if I just start to note this numbers. So, I get, for example, x i, I get the next
number, I get the next number, I get the next number and so on. So, you keep on getting
some numbers, which are distinct actually. So, after some point, if you keep on continuing
like this, you will find some value for which this x i and x j will become equal to each
other. So, that means, basically you get a letter like a rho. So, that is an idea, why
it is called a Pollard's rho algorithm. It looks like the Greek letter rho actually.
So, after that, actually, all these computations will give you the same result. So, we have
a tail part here, and after that, there is a circular part. So, that is the reason, why
it is called a Pollard's rho algorithm.
So, therefore, it will again it look like... So, this is an example from corona , which
says that, if you choose, for example, mod 1387, you see that, you get some rho like
this; if you take modulo 19, you get a rho which looks like this; this is a smaller rho;
if you take mod 73, you get a rho which looks like this. So, therefore, what, the idea is
that, whatever you say that, whatever modulo you take, after some value, there is, if you
just randomly choose some points, you are quite expected to get a collision. And once
you get a collision, after that, this follows. Therefore, x i plus 1, x i plus 1 will be
equal to x j plus 1; x i plus 2 will be equal to x j plus 2 modulo p.
So, therefore, you will always get a lock like this. So, it will be x i plus 1. Therefore,
you see that, in this case, may be, it is not so clear from the diagram. You will find
that, if there is a lock here, then, after sometime, you get a 6 here. So, I will spell
out the numbers. It is, I will start with 2; the next number is 3; the next number is
8; the next number is 6; the next number is 16; after that, you again get an 8; that means,
that it keeps on colliding. So, therefore, in the sequence, actually, there are two colliding
numbers and after that, it keeps on colliding, in this fashion.
So, now, our goal is to find out 2 such terms, for which x i and x j are congruent to each
other modulo p. And what we have seen is that, if x, if x i and x j are same modulo p, then
x i plus delta is equal to x j plus delta modulo p, for all delta, which is greater
than or equal to 0. So, now, you see that, j is j minus i is suppose equal to l. So,
which means that, l is the length of the cycle. Therefore, if you consider these numbers,
say x i, x i plus 1, till x j minus 1... So, which means that, all these numbers are where,
are in this particular cycle.
So, you start from x i; the next number is x i plus 1; the next number is x i plus 2
and so on. So, I come till this point, say x j minus 1 and then, you come to x i, which
is equal to x j modulo p. Therefore, this particular rho is for a prime factor of n.
So, therefore, I keep on doing this, and therefore, how many numbers are there in this particular
thing? There are, how many elements are there? there are j minus i; that is l. There are
l elements in this cycle. So, in this cycle, there are l elements. So, out of them, definitely,
from l consecutive numbers, there, that, there will be 1 number which will be a multiple
of l. So, therefore, let that number be, say some
x i dash, for which, x i dash, such that i dash is actually a multiple of l. So, l is
equal to j minus i. Correct or not? Now, what? So, this is the first observation. Therefore,
there must be 1 element whose index is actually a multiple of l. So, now what the Pollard
rho algorithm does is, like this. Therefore, if, you know that, you have to find out 2
values, 2 values for which you, I mean, between which you will calculate the GCD. So, I mean,
rather, you will take the difference and then compute the GCD with n. So, in the Pollard's
rho algorithm, it progresses like this. What you do is, like this. You start from, say
some x i value and you start with some next value, which is some x dash value. So, you
start from 2 values. So, therefore, how do you obtain x dash from
x i? You compute the f x function. So, you can call that x 1. Also, you compute f x 1
and you compute the value x dash. So, x dash is nothing, but equal to f x and what is the
modulo. Modulo n. We are computing the original function, is it clear. So, then what you do
is that, for the next part, you again take this and you apply one function of f x and
you compute the next number. But for this one, you actually take a bigger loop, and
you, and you compute 2 f x values. So, therefore, you again obtain 2 new numbers. Then, what
you do is that, you again compute one loop here and you get the next number; for the
next part, you again apply a bigger loop and you have to obtain 2 numbers; I mean, you
obtain a number here, a new number here and a new number here. And you keep on doing that.
So, what are you doing is that, in each iteration, you are actually generating 2 new numbers,
a and b. Why are you doing? See, what you need to do is, you need to detect the cycle.
So, therefore, that is, therefore, that is the main objective. So you know that, I mean,
I mean, just as a sort of, we give it a digression actually; that if I, if I tell you that, there
is a long linked list and if I tell you that, you have to find out a cycle, so, what algorithm
will you adopt? So, there is one algorithm, very straight forward, which says that, you
actually release 2, say split cards to them or two cards to them, 1 card with the double
the speed than the other, then, there is bound that both the cards will collide, if there
is a cycle. So, therefore, this is a basis of any cycle finding algorithm. Yes, whatever,
but this is the basic algorithm of a cycle finding algorithm. So, what you are doing
here is, in this case, that, you just start from x i, you obtain x i plus 1, x i plus
2 and so on; but since, that, there is an i dash, which is a multiple of l equal to
j minus i. So, whatever, what are you doing? You are
basically calculating x 1 and this is, but this is what? This is the double, I mean each
value is a double of the second one. So, therefore, you start with x 1; this is say x 2. So, this
is what? This is x 4; this is x 8 and so on. So, therefore, what you are obtaining is,
in one case, you are obtaining x 2 i and in the other case, you are obtaining x I; and
what is the.. So, therefore, for this case, when you obtain, when you, when you come like,
when you are computing the incremental function at some, at some place, you compute x i dash.
and that i dash is actually a multiple of l. So, when you are computing x i dash, what
is the other value that you compute? You compute x 2 i dash; because, there is one point, there
is one progression which is like x 1, x 2 and so on and the other one which is actually
jumping 2 steps. So, therefore, that these 2 numbers will also
collide, because, that x i plus delta is equal to x j plus delta modulo p. So, what was the
starting point? That if x i and x j are same modulo p, then x i plus delta will be also
equal to x j plus delta modulo p and j minus i was equal to l. So, therefore, if there
is one element, which is l times ahead of the other element, then they are bound to
collide. So, you see that, j, if I say, j minus i is l, then, this particular element
is l times ahead of this number, correct. So, therefore, this implies that, x i dash
will also be equal to x 2 i dash modulo p. Why? Now, because 2 i dash is i dash times
ahead of i dash and i dash is a multiple of l. Slightly tricky.
So, maybe you can just, you can observe, when you go back, but just think on this. That
is, what I am saying is this; that is, if x i and x j modulo p are same, then x i plus
delta will be congruent to x j plus delta modulo p, for any delta, greater than equal
to 0. So, therefore, that i dash is actually a multiple of l, then, x i dash will be congruent
to x 2 i dash modulo p; because, this number or rather, this index is still multiple of
l times ahead of this index. So, that means, these 2 will collide and you know that,t such
an element is bound to occur; because, if you take l consecutive elements, you will
definitely find one such value. So, that means, such type of, you are bound to get some kind
of number, which will satisfy this property. Though, therefore, the idea is that, if you
get a collision like this, then you are bound to be able to deduct this.
But the thing is that, you are not necessarily finding out the first collision; but you are
finding out some collision; and any collision is ok with me; I do not require to calculate
this particular point itself. If I find 2 values, which collides in the sequence, then,
I am done. So, what will be my next step? My next step will be, to calculate the GCD
of x 2 i dash minus x i dash comma n; because p will divide x i dash minus x 2 i dash; and
p will also divide n. So, p divides n and p also divides a difference of these 2 numbers.
So, therefore, if I compute this GCD and if I get a non-trivial factor, then I am successful.
So, this is the basic working of the Pollard's rho algorithm.
So, the idea is that, if you see this, that is, when you say that x i plus delta is congruent
to x j plus delta modulo p, for all delta which is greater than equal to 0. So, l equal
to j minus i and l is a length of the cycle. Therefore, now l consecutive terms like x
i, x i plus 1 and so on, till x j minus 1; there is 1 index, say i dash, which is divisible
by l. If l divides i dash, then, this is, this implies that, l also divides 2 i dash
minus i dash. Thus, as i dash is greater than i and 2 i dash minus i dash is a multiple
of l, therefore, x of 2 i dash will also be congruent to x i dash modulo p. So, therefore,
p will divide x 2 i dash minus x i dash and p also divides n. Therefore, if I compute
the GCD of x 2 i dash minus x i dash and with n, then, I should get 1 non-trivial factor
of n. And, if this returns a value like n, then, actually it is a failure; because, that
is a trivial factor.
So, what I am looking for, is a non-trivial factor. Now, if I... Therefore, you see that,
in this case, this is a previous case, like when we, this number was 1387. So, what it
says is that, if you, if you just consider these numbers, like x 3 dash, x 4 dash, x
5 dash... So, if you just observe this cycle is modulo 19. So, maybe it is not so visible,
but you can see the dimension of the cycle. Like, if modulo 19 is a smaller cycle. Therefore,
you see that. So, how are the cycles made? Have you understood that, that, that, this
is 1387. Therefore, what we have done is that, we have engaged the function f x to find out
all these numbers; and all these numbers have been reduced modulo 19, in this particular,
this part; that is, in this rho, all the numbers are reduced modulo 19. So, therefore, you
see that, in this case, the cycle is, has got a length of 3. So, what is this x 1, x
2, x 3. So, you see that, x 3 is, in this case, this 3 is the first number, which is
divisible by the cycle length. So, therefore, x 3 and x 6 should also collide.
Therefore, if I compute x 6 minus x 3 and I take the GCD with 1387, then, I should get
one non-trivial factor, which is indeed the case here. I compute GCD of x 6 minus x 3
with 1387 and I get 19. So, you see that, in order to compute x 6 and x 3, I do not
need 19; because x 6 and x 3 are computed modulo 1387. So, then, what I do is that,
I compute their difference. So, in this case, is this number 1186 and the other number is
8. Therefore, it is 1186 and this number is 8.
So, you take the difference and you compute the GCD is 1387; it turns out be 19. Therefore,
this is how? Put together, you can see that, you start with one x 1 value; you start with
another x dash value. So, you see that, there is a first starting point x 1 and this is
the next point, which is f x. So, first you compute the GCD of x minus x dash and n. If
you get a value of p, which is, in this case.. Therefore, what do you expect here? If you
get a value, I mean, which is equal to n, then, that means, you have, what you have
obtained is actually a trivial factor. If you have obtained a value of p equal to 1,
then, you start again this loop. So, what you do is that, there is one computation which
is f x, again f x and I mean repeated applications on f x and this part, like you see that, do,
you take x dash, you compute f x and you, again you compute f x.
So, therefore, these two lines are two applications of the function f x. So, this is, this leap
actually here. So, first you start with this point and this point, you enter the while
loop. Then, in one case, you start incrementing like this and the other case, you start applicating,
applying the function f x twice. So, you come here and again I have come here, and similarly,
like that, you progress. Is it clear to you. So, you start with x equal to x 1 here, and
this is the x dash value. And the next value, what you do is that, you compute x equal to
f x. So, therefore, you compute x equal to f x
means... Therefore, you compute this value of f x here and in the second case, you compute
two applications of f x and in the, and in, inside the loop, for the first computation
of f x is this. So, therefore, you get these two numbers and you compute the GCD, among
these two numbers. So, the main, the main point is that, there is one progression, which
takes like x i and the other progression which takes like x 2 i; and then, you have seen
the rest of the thing, that is, if you know that, there is an x i and x j which collides,
then, there is obviously, an x i dash, which will collide with x 2 i dash modulo p.
So, I mean, therefore, I mean, you can work out this simple example, which says that 7171
is what we need to factor. And we know that 7171 can be... Therefore, suppose we do not
know this factor 71 and 101. So, you start with some f x value, which is f x function,
which is equal to x square plus 1 and you start with say x 1 equal to 1. So, x 1 equal
to 1. Therefore, the sequence of x is begin as follows. You start with 1; you compute
the next number is 1 square plus 1; that is, 2. Then, you start with 2 square plus 1 is
5; 5 square plus 1 is 26; what is 26 square, 676; 676 plus 1 is 677. So, till now, it is
ok, but after that, this is also, I mean, see, that next number is 6557 and similarly,
you compute the numbers. I have noted an extra number 5 7 4 here, because, it will be required.
Therefore, if I, although I do not know the value of 71, let us consider the numbers modulo
71. So, in this case, if I, if I reduce all these elements modulo 71, then you find that,
these are the numbers.
So, you see that, these numbers are inside the modulo 71 field and therefore, you find
that, there are repetitions like, you find that, there are repetitions, means that, you
get 58 here and a 58 there. So, there are repetitions. So, you get 1 2 5, 26, 38, 25,
58, 28, 4, 17 and so on. So, you get repetitions in this list. So, the first collision in the
above list is x 7, x 7 mod 71, which is equal to x 18 mod 71, which is 58. Therefore, this
is the first collision. So, you know that, I do not know the value of 71, but I am just
observing the property. So, you see that, 58 is the first colliding number. But there
are subsequent colliding numbers also, but 58 is a first colliding number and the idea
is that, what is the length between this number and this number. It is actually equal to 18
minus 7, that is 11. Therefore, in this particular list, from this point to this point, there
will be, the next one will be x 11. So, therefore, if I, if I apply the Pollard
rho algorithm, I may not be able to find out this particular collision, but I will compute
x 11 at some point of time. And the other element will be x 22. So, what is x 22? x
22 is actually equal to 574; and x 11 is actually equal to 219. So, there were 1, 2, 3, 4, 5,
6, 7, 8, 9, 10 and 11. So, x 11 is 219 and the corresponding x 22 value is 574. So, you
take 574 subtract 219 and compute the GCD with 7171 and you will find that, the factor
is 71. So, therefore, you see that, this gives you a very nice technique, but in this case,
you are, you still, you are not aware of the value of the factor and you are able to compute
the factor. So, how many, what is the run time of this algorithm now?
So, what we have done is that, can you guess the run time of this algorithm? It will be
r n square root of p, because in each loop, you are computing one GCD and what is the
maximum length of, I mean, sequence, we that, you can go, you can start from i and you can
continue to j and what value should you keep for j? From the Birthday paradox, you will
keep j equal to O of square root of p. Therefore, the algorithm can run maximum up to O root
p; after that, you should be able to get some decision, alright. So, therefore, if you get
O square root of p... So, what is our order of p itself is square root of n. So, in terms
of n, the complexity is O n to the power of 1 by 4. Therefore, this is a significant improvement
over the trial division algorithm.
So, therefore, you have got O n to the power of 1 by 4, which is a significant improvement
over the trial division algorithm.
So, we will conclude with a very simple technique which is called the Dixon's random squares
algorithm. It is a very simple idea, which works like this. So, till this part, of all
that, is it clear to us? Pollard p minus 1 factorization and Pollard Rho algorithm. This
is also very quite simple, I mean, like, suppose, therefore, again there is a supposition here.
That is, suppose there is an equation like this x square equal to y square mod n and
suppose, the x and y are not same to each other modulo n.
So, in that case, therefore... So, here actually, there will be a plus minus. So, what I am
saying is that, x square equal to y square modulo n, but 2 obvious values will be x equal
to plus y and x equal to minus y. So, including them, if I am able to find out 2 solutions...
So, what does it mean? That n divides x square minus y square. So, n divides x minus y into
x plus y, but neither x minus y nor x plus y is actually divisible by n. So, what does
it mean? That the factors of n are the p and q. Therefore, we can have like x minus y is
divisible by p and say, x plus y is divisible by q or the other way round.
So, which means that, if I am, if I compute the GCD of x minus y and n or the GCD of x
plus y and n, I should get non-trivial factors of n. But the thing is that, I need to find
out such pairs. So, you can consider one, so very simple example, like n equal to 77.
You see that, there are two numbers 10 and 32, for which 10 square and 32 square are
same modulo 77 and neither 10 nor 32 are same or they are minus of each other. So, therefore,
if you compute the GCD of 10 plus 32 and 77, then you get 7, which is 1 of the factors
of 77.
So, the next thing is that, how do I compute this? Therefore, I am not really going to
the depth of this algorithm, but this is a simple, I mean, I think, if through this example,
we will get an idea. And, this is the basic of something, which you called as the quadratic
sieve. So, I am not going to that, but this is just an outline. Therefore, you see that,
suppose, I start with n equal to 1 8 2 9 and you consider, the factor base of n equal to...
So, therefore, there is something called a factor base. So, the factor base means that,
if you take, I mean, so, this is a factor, factor base means basically, some consecutive
prime numbers. And, all these prime numbers essentially have ensured one thing, that whatever
I mean. So, this is, suppose, there is a library of prime numbers and this, this and this factor
base is, supposedly a consecutive prime numbers and which are small numbers; that means, I
am not considering the big numbers. So, the first 10 prime number or first 12
prime numbers, first 20 prime numbers, something like that. Therefore, consider a factor base
and we have included minus 1 for convenience, here, actually. Therefore, you see that, the
numbers are 2, 3, 5, 7, 11 and 13. So, this is a factor base and you compute the numbers
like square root of k n. So, k is an integer. So, you start with some numbers like... So,
therefore, you start with like numbers subsequently will be 42.77, 60.48 and something like that.
so that, are all decimal numbers. So, if I take integer values, we are close to these
numbers, which have been generated like square root of k n; then I get, I choose a number
say 42, 43, 61, 74, 85 and 86. So, now, you consider these following congruences
modulo n. So, you see that, z 1 square is what? 42 square and 42 square if I take a
modulo n, that is modulo 1829, it will actually equal to minus 65. So, now you start factoring
minus 65. So, you see that, minus 65 will be factored into minus 1 into 5 into 13. So,
here comes the factor base; that is, all these factors should actually belong to my prime
library. So, minus 1 belongs to this library; 5 also belongs to this; 13 also belongs to
this library or factor base, whatever you say. Similarly, the next number is 43 square.
So, 43 square is 20, which is 2 square into 5. So, in this case also, you find that, all
these belong to these libraries. So, see, you can continue in this fashion and you will
find that, all these numbers are, essentially, you can factor them using this primes in the
factor base. So, now from these linear congruences or these
quadratic congruences, you choose some congruences. So, what you can do is that, you can represent,
see, I mean, I mean, if you say that, this I can represent, this particular factorization
as a tuple of ones and zeros; this tuple ones and zeros would indicate, which prime factor
I have chosen, like whether if I have chosen minus 1, I will indicate that by 1. If I have
not chosen minus 1, the first number will be 0. So, in this case, so, the, you see that,
2 has not been chosen. So, it will be 0 here; again, 3 has not been chosen, 0; 5 has been
chosen. So, 1. So, therefore, the right hand side I can represent as a 1 0 vector. So,
similarly I can represent all this prime factorizations as 1 0 vectors.
So, now from these quadratic congruences, you choose those congruences, for which, if
you add up, you get a all 0 vector. Why, because in that case, the right hand side, if, I mean,
if I multiply the left hand side, then the right hand side, I would be able to express
as something square; and that, I can actually apply some Gaussian technique to do that.
So, but I am not really going to those things in depth. So, therefore, if you consider the
congruence like 42 into 43 into 61 into 85 square, in this particular example, this comes
out to 2 into 3 into 5 into 7 into 13 square. So, therefore, you find that, you have been
able to find out 2 such numbers, which are not equal to each other or not negative of
each other, but for whom, the squares are equal modulo n. So, that means, now, if I
compute the GCD of 1459 plus 901 with 1829, then I get one of the factors of 1829.
So, this forms a basis of quadratic sieves. So, I have not really gone into this thing,
but just as I mentioned, I can just say that, the complexity of this will be like, I mean,
this is a sub exponential algorithm. So, it will be like, e to the power of 1 plus O n
and we remember correctly, it will be square root of l n n l n of l n of n. So, this is
some, I am not going in to the proof or anything of that set, but you see that, this is actually
a quite efficient algorithm and these quadratic sieves actually, have been followed by something,
which is probably even more powerful; it is called the number field sieve.
And there is something, which is called an elliptic curve sieve or something. So, all
these sieving techniques are actually, what are being used in today's, present day factoring
things and the projection is that, in the year of around 2020, 1024 bit will also, will
get factored.
So, therefore, you need to have higher values of bit security, if you are using RSA; may
be 2048 bits. So, I have followed Douglas Stinson's Cryptography, Theory and Practice
as a reference and next day, we shall continue with some comments on the security of RSA.
So, we shall follow tomorrow on this.