Hello there, welcome to this lecture of Probability Methods in Civil Engineering. Today is the 5th lecture, and in this lecture we will cover some standard discrete probability distributions that are very useful for different problems in civil engineering. Basically, in this class and maybe one or two more classes, we will cover some standard distributions of random variables. In today's class we will mostly discuss discrete random variables, and in the next one or two classes we will cover the different distributions for continuous random variables. These distributions, even though limited in number, have applications in different problems, which we will discuss one after another. We will start our presentation with a quick recapitulation of the pmf of a discrete random variable. So, today we will discuss discrete probability distributions, and a list of the different discrete probability distributions is given here.
The list given here is not exhaustive; these are not the only discrete distributions, and there may be others, but the applications in civil engineering problems are mostly limited to these distributions. So, in today's lecture we will first start with the pmf of a discrete random variable, a part we covered in the last couple of classes; we will just quickly see what this distribution is in general. Then we will take up, one after another, the binomial distribution, the multinomial distribution, the Poisson distribution, the geometric distribution, the negative binomial distribution, and the hypergeometric distribution.
These distributions have different applications in different civil engineering problems. For example, with the binomial distribution we generally think of the rate of success or failure of a particular event. With the multinomial distribution we generally go for more than two outcomes, more than two possibilities. Similarly, with the Poisson distribution we talk in terms of the occurrence of an event over time, or over a space, or over an area; problems such as rainfall phenomena, whether a day is rainy or non-rainy, or railway accidents, are generally related to the Poisson distribution. Then, one after another, come the geometric distribution, the negative binomial distribution, and the hypergeometric distribution; we will see all of these. First, we will see the properties of these different distributions, particularly their pmf; you know that for a discrete random variable we use the probability mass function. So, we will first see what the probability mass function is for each distribution, and then some of the moments, the first moment and the second moment. And then we will discuss some of the applications in different civil engineering problems.
So, to start with the discrete random variable: as we have discussed earlier, a discrete random variable is a function that can take only a finite number of values. So, its values are not continuous over the domain, over the sample space; it can take only some finite number of values. In the most general case these values are usually equidistant, one, two, three and so on, even though that is not compulsory. The probability distribution of a discrete random variable indicates the correspondence between the values taken by the random variable and their associated probabilities. The probability is concentrated as a mass at each particular value, and the distribution is generally known as the probability mass function.
As we discussed, when the random variable can take only a finite number of values, the probability is not treated as a density; rather, we treat the probability as being concentrated at the values which the random variable can take. So, it can be treated as a mass concentrated at each particular value, and that is why this distribution is generally known as the probability mass function.
So, the probability mass function (PMF) is the probability distribution of a discrete random variable. If the discrete random variable is denoted as X, the PMF is generally denoted by p_X(x). Here, as we have discussed earlier, the capital X denotes the random variable and the small x denotes a particular value of that random variable. The values of X are now finite in number, some specific values that it can take. The small p indicates that this is the probability mass function; when we indicate the cumulative distribution function, we replace this small p with a capital letter, as we discussed earlier.
So, p_X(x) indicates the probability that the random variable X takes the particular value x, that is, P(X = x). This is now a function which gives the probability for each particular value that the random variable can take.
Now, as we discussed, this function should follow some properties to be a valid PMF, a valid probability mass function. These properties are: for each and every value that the random variable can take, the probability should be greater than or equal to zero, that is, p_X(x) >= 0 for all possible values of x; and the summation of all these probabilities should be equal to 1.
Now, for the different probability distributions of discrete random variables that we have just listed, we can test these two properties. So, for each of the standard discrete distributions available, we can check whether these two conditions required for a valid PMF are satisfied or not.
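To make this concrete, here is a minimal Python sketch of such a validity check; the toy PMF used (masses 0.2, 0.5, 0.3 on the points 1, 2, 3) consists of assumed numbers for illustration only.

```python
def is_valid_pmf(pmf, tol=1e-9):
    """Check the two PMF conditions: non-negative masses that sum to 1."""
    return all(p >= 0 for p in pmf.values()) and abs(sum(pmf.values()) - 1) < tol

# A toy PMF on the values 1, 2, 3 (assumed numbers, illustration only):
print(is_valid_pmf({1: 0.2, 2: 0.5, 3: 0.3}))   # True
```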
We will start with the binomial distribution. The binomial distribution is used to find the probability of getting x occurrences of a particular event in a sequence of n repeated trials. These trials are called Bernoulli trials, provided that the following assumptions hold good. First, there are only two possible outcomes for each trial, which are arbitrarily called success and failure. So, we are talking about a random experiment having two specific outcomes, and the labels success and failure are arbitrary. If I take the very basic example of tossing a coin, I can say that coming up heads is success and coming up tails is failure. This is arbitrary; it is not that heads is always success and tails is failure, I can just reverse the notation as well. For a specific civil engineering example, suppose a reservoir has some particular high flood level, above which we consider the water level dangerous. The outcomes are two: the water level is either below or above the high flood level. In this case, I can denote being above the high flood level as my success case and below the high flood level as the failure case. So, this has nothing to do with the real-life desirability of the outcome; I can call one particular outcome success and the other failure.
For the binomial distribution, we have to remember that we are considering only two possible outcomes. These two possible outcomes should also be mutually exclusive, such that the occurrence of one automatically indicates the non-occurrence of the other possible outcome. So, when we say that these are Bernoulli trials, the first assumption that must hold good is that there are only two possible outcomes, and these outcomes are arbitrarily called success and failure.
The second assumption is that the probability of success is the same for each trial. When we say that the probability of a particular outcome is the same across the different trials, what we mean is this: if we take the basic example of tossing a coin, I can say that the probability of getting a head equals some number, say 0.5. That number is fixed for each trial; if I repeat that particular trial, the probability of that particular outcome should not change. Similarly, for the reservoir problem, whether the water level crosses the high flood level on a particular day should have some probability, and that probability should remain the same for the different trials if we consider the experiment to be a Bernoulli sequence. Now, how to assign that particular probability is a different issue that we will discuss in the successive classes. What you should remember at this point is that the probability of the particular event we are arbitrarily naming success should be known to us, and it should be fixed for all the trials that we are going to conduct. That is why it says that the probability of success is the same for each trial.
The third assumption is that the outcomes of different trials are independent. This is important in the sense that when I conduct a particular trial, its outcome, whether success or failure, should not depend on what we got in the immediately previous trial. So, successive trials are independent of each other. The last assumption of the Bernoulli trials is that there is a fixed number of trials to be conducted. So, how many trials we are going to conduct to get the x occurrences of the particular event which we are calling success, this n should be known.
So, there are two things that we must know prior to defining the binomial distribution: one is the total number of trials that we are considering, and the other is the probability of success for each trial. With these two pieces of information known, and with all four assumptions satisfied, we can define the binomial distribution.
So, suppose the probability of success, that is, the probability of occurrence of the event in each trial, is given by p. Again I repeat that success is arbitrary: out of the two possible outcomes, I can call either particular one the success. This p is known to us, as I just discussed. Then the probability of getting exactly x successful events among the n trials in a Bernoulli sequence is given by the binomial probability mass function. The number of trials n is known to us, and the probability p is also known to us. The number of successes x is the random variable that we are considering in the binomial distribution. The probability of exactly x successes is expressed as

p_X(x) = C(n, x) p^x (1 - p)^(n - x),    x = 0, 1, 2, ..., n

Each of these terms has some meaning. C(n, x) means the number of combinations of x out of n, that is, in how many different ways we can select the x successes out of the total n trials. This is multiplied by the probability of success raised to the power x, because each success is independent of the others. And, as I told you, the two outcomes, success and failure, are mutually exclusive, which means that the probability of failure is automatically 1 - p, since the total probability is 1. So, if p is the probability of success, the probability of failure is 1 - p; and if there are x successes, there must be n - x failures. That is why these terms are multiplied together to get the probability that the number of successes is exactly x out of the total n trials, which is C(n, x) p^x (1 - p)^(n - x), where x can of course take any integer value between 0 and n.
Now, C(n, x), the number of combinations of x out of n, is expressed as n! / (x! (n - x)!), and this is known as the binomial coefficient. You can see that if we put any value of x, for x = 0, 1, 2 up to n, all these probabilities are positive; and if you take the summation of the individual probability masses concentrated at x = 0, 1, 2, ..., n, it equals 1. So this is a valid PMF, and it is the PMF of the binomial distribution.
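As a quick illustration, a minimal Python sketch of this binomial PMF follows; the reservoir numbers used here (n = 10 trials, daily exceedance probability p = 0.1) are assumed values chosen only for the example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable: n trials, success prob p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Assumed example: probability that a reservoir exceeds its high flood level
# on exactly 2 days out of 10, if the daily exceedance probability is 0.1.
n, p = 10, 0.1
print(binomial_pmf(2, n, p))                              # ~0.1937

# Check the two PMF conditions: the masses over x = 0..n sum to 1.
print(sum(binomial_pmf(x, n, p) for x in range(n + 1)))   # ~1.0
```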
Now, the binomial distribution has one very interesting property, known as the additive property of the binomial distribution. It says that if X is a random variable with a binomial distribution having parameters n_1 and p, that is, p is the probability of success and the total number of trials is n_1, and there is another random variable Y which also has a binomial distribution, with parameters n_2 and p, then their sum Z = X + Y is a random variable which again has a binomial distribution, with parameters n and p, where n = n_1 + n_2.
So, when we add two binomial random variables, we get another binomial random variable; while adding, we should ensure that the probability of success is the same, namely p, for both random variables. Then the sum also has a binomial distribution with that same probability of success p, and the total number of trials equal to the summation of the number of trials for the first random variable and the number of trials for the second. Note that here the b denoting the binomial distribution is generally written as a capital letter, so these two should be capital B.
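This additive property can be checked empirically. The following sketch, with assumed parameters n1 = 5, n2 = 7 and p = 0.3, simulates X + Y and compares its distribution against B(n1 + n2, p).

```python
from math import comb
import numpy as np

rng = np.random.default_rng(0)

# X ~ B(n1, p), Y ~ B(n2, p); the parameter values below are assumptions.
n1, n2, p = 5, 7, 0.3
x = rng.binomial(n1, p, size=100_000)
y = rng.binomial(n2, p, size=100_000)
z = x + y

# Compare the empirical distribution of Z with the exact B(n1 + n2, p) PMF.
for k in range(4):
    empirical = np.mean(z == k)
    exact = comb(n1 + n2, k) * p**k * (1 - p)**(n1 + n2 - k)
    print(k, round(float(empirical), 4), round(exact, 4))
```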
Now let us see the different moments of this distribution, which we discussed in the last class. The first moment, the mean, of the binomial distribution is np, and the variance of the binomial distribution is np(1 - p). The coefficient of skewness of the binomial distribution is given by

gamma = (1 - 2p) / sqrt(np(1 - p))

I will come to the skewness a little later; before that, the mean and variance can easily be shown as follows. For one individual trial, we are claiming that the probability of success is p, and we are saying that there are n such trials. We also assumed, during the discussion of the Bernoulli trials, that the trials are independent, so the successive outcomes are independent. Hence, if the expected number of successes is p for one trial, and there are n such trials independent of each other, then the total expected number of successes is p for one trial, 2p for two trials, and similarly np for n trials. That is why np is the expectation of the random variable X.
Similarly, for the variance, we can arbitrarily say that success is 1 and failure is 0; then for one particular trial the variance is p(1 - p), the probability of success multiplied by the probability of failure. As there are n different independent trials, the variance for the whole set of n trials is simply multiplied by n, giving Var(X) = np(1 - p).
Similarly, we can look at the skewness. The interesting point here is that the skewness (and recall we discussed positively skewed, negatively skewed and symmetric distributions) depends on the probability of success. If we put the probability of success equal to 0.5, you can see that the skewness becomes zero; and if the coefficient of skewness is zero, we know that the distribution is symmetric.
So, if the probability of success and the probability of failure are exactly equal to each other, the resulting binomial distribution is symmetric. If p is greater than 1 - p, then gamma is negative and the distribution is skewed to the left, that is, negatively skewed. And if p is less than 1 - p, then gamma is positive and the distribution is skewed to the right, that is, positively skewed. So, depending on the probability of success, the binomial distribution changes from positively skewed to symmetric to negatively skewed.
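A short sketch of these moment formulas, with an assumed n = 20, shows the skewness changing sign as p crosses 0.5.

```python
from math import sqrt

def binomial_moments(n, p):
    """Mean, variance and coefficient of skewness of B(n, p)."""
    mean = n * p
    var = n * p * (1 - p)
    skew = (1 - 2 * p) / sqrt(n * p * (1 - p))
    return mean, var, skew

# n = 20 is an arbitrary illustrative choice:
for p in (0.2, 0.5, 0.8):
    print(p, binomial_moments(20, p))
# p = 0.2 -> positive skew, p = 0.5 -> zero (symmetric), p = 0.8 -> negative skew
```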
Next we will discuss the multinomial distribution. It is similar to the binomial distribution in the sense that, as we told, in the binomial distribution there are only two possible outcomes; if there are more than two possible outcomes, the resulting random variable becomes a vector, and the distribution of that vector is what we call the multinomial distribution. Suppose there are n independent trials, with each trial allowing k mutually exclusive outcomes whose probabilities are p_1, p_2, ..., p_k. When we say there are k mutually exclusive outcomes, just remember that in the binomial case we were talking about two mutually exclusive outcomes.
So, here we are generalizing, mostly to more than two outcomes. As these outcomes are mutually exclusive, the summation of the probabilities should be equal to 1, which is written here as the sum of all p_i from i = 1 to k equals 1. Then we ask for the probability of getting exactly x_1 outcomes of the first kind, exactly x_2 outcomes of the second kind, and in this way x_k outcomes of the k-th kind. Since, as we have already stated, there are n independent trials, the summation x_1 + x_2 + ... + x_k should be equal to n, which is written here.
The probability of exactly these numbers, x_1, x_2, ..., x_k, of outcomes of the first, second, ..., k-th kind respectively is given by the distribution

P(X_1 = x_1, ..., X_k = x_k) = n! / (x_1! x_2! ... x_k!) * p_1^{x_1} p_2^{x_2} ... p_k^{x_k}

If you compare it with the binomial: in the binomial there are only two possible outcomes, so the subscript was not needed and the probability of success was simply p; if I replace p_1 by p, then obviously p_2 equals 1 - p_1, which is exactly what we had in the binomial distribution.
Now for the mean and variance of this multinomial distribution. The joint probability distribution whose values are given by these probabilities is called the multinomial distribution; it is so called because, for the different values of x_i, the probabilities are given by the corresponding terms of the multinomial expansion of (p_1 + p_2 + ... + p_k)^n. The mean of this distribution can be shown to be n p_i for a particular, the i-th, outcome, that is, the total number of trials multiplied by the probability of success for that particular outcome. Similarly, the variance for that particular outcome is n p_i (1 - p_i); this can be shown in the same way as for the binomial distribution.
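A minimal sketch of this PMF follows; the traffic-classification example (10 vehicles classified as car, truck or bus with probabilities 0.6, 0.3, 0.1) uses assumed numbers for illustration only.

```python
from math import factorial, prod

def multinomial_pmf(xs, ps):
    """P(X1 = x1, ..., Xk = xk) for n = sum(xs) trials with class probs ps."""
    coeff = factorial(sum(xs))
    for x in xs:
        coeff //= factorial(x)          # multinomial coefficient n!/(x1!...xk!)
    return coeff * prod(p**x for p, x in zip(ps, xs))

# Assumed example: 10 vehicles classified as car/truck/bus with probabilities
# 0.6/0.3/0.1; probability of observing exactly 6 cars, 3 trucks and 1 bus.
print(multinomial_pmf([6, 3, 1], [0.6, 0.3, 0.1]))   # ~0.1058
```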
Another important discrete distribution is the Poisson distribution, and the corresponding process is known as the Poisson process. This is important when we are modelling the occurrence of rainfall, or the occurrence of road accidents or rail accidents in transportation engineering in particular; there we generally use this kind of distribution. The Poisson process is analogous to the binomial process, but it corresponds to the occurrence of the event along a continuous time or space scale, whereas the binomial process corresponds to the occurrence of the event along a discrete scale of trials.
This distinction is important. When we talk about the binomial process, we talk about n different trials, where n is a particular integer, and we investigate the number of successes out of those independent trials. When we talk about the Poisson process, it is generally on a continuous time scale: over a particular span of time, what is the probability of the different possible numbers of occurrences of a particular event? If we take rail accidents, then over a month of time, the number of rail accidents that can happen in that month; this is what is known as a Poisson process.
The Poisson distribution is used to model a particular event that can occur at any time or at any point in space. So, this is not only over time; we can also consider occurrences along a stretch, say along the stretch of a highway or along a railway line. So, this can happen in the time direction or in the spatial direction, and it can even be extended to an area, the number of occurrences over a particular area, or to a volume, the number of occurrences over a particular volume. So, over a continuous medium, which can be time, or one-dimensional space, or two-dimensional space, or three-dimensional space, the number of occurrences over that domain is modelled through the Poisson process.
Now, how can we make the analogy with the Bernoulli process? Suppose we consider a Bernoulli process over a certain time interval, where p is the probability of occurrence of the event within that interval. If the time interval decreases, then the probability p also decreases, obviously, whereas the number of trials n should increase. If the decrease of p and the increase of n occur in such a manner that the product np remains constant, then the binomial distribution approaches the Poisson distribution.
Now let us see what assumptions should hold for a particular process to be called a Poisson process. The Poisson process is based on the following assumptions. First, a particular event can occur at random at any point in time or space; here space means over a line segment, over an area, et cetera. Second, the number of occurrences of the event in a given time or space interval is independent of that in any other non-overlapping time (or space) interval. So, in the temporal direction, the number of occurrences over, say, t_1 to t_2 is totally independent of the number of occurrences from t_2 to t_3.
That means, on a time line, the numbers of occurrences in the intervals t_1 to t_2 and t_3 to t_4 are independent of each other, as long as t_3 is greater than t_2 so that the two zones are non-overlapping. The same thing can be extended to areas, as well as to combined time and space directions. So, I repeat: the number of occurrences of an event in a given time or space interval is independent of that in any other non-overlapping time (or space) interval.
Third, the probability of occurrence of the event in a small interval delta t is given by lambda times delta t, where lambda is the mean rate of occurrence of the event. This lambda is the parameter of the distribution; the mean rate means the number of occurrences per unit time, that is, how many times the particular event occurs on average in a unit time. That is designated by lambda, the parameter of the Poisson process, and it should be known beforehand so that we can define the Poisson distribution. Fourth, the probability of more than one occurrence of the event in the small interval delta t is negligible.
So, we are saying there is at most a single occurrence in the small time interval delta t; the probability of more than one occurrence in that small interval is negligible. With these assumptions of the Poisson process, the number of occurrences of the event in time t is given by the Poisson distribution. One thing I want to mention here: whenever we talk about any particular distribution, it is very essential to know which quantity we are calling the random variable; this holds for each and every distribution that we are discussing.
This applies to the binomial distribution that we discussed just now, to the Poisson distribution, and to all the distributions that we are going to cover in this class as well as in the successive classes. First, you should try to understand what the random variable involved is: is it a count, is it along the temporal direction over time, and so on. If we understand which quantity is the random variable being referred to, then understanding that particular distribution becomes easier. So, that is why I am repeating here what the Poisson distribution describes: the random variable is the number of occurrences of the event in the time t.
That number is the random variable, shown here as X, and a particular value of the random variable is x. With the parameter lambda, over the time t, the distribution is

P(X = x) = (lambda t)^x e^{-lambda t} / x!,    x = 0, 1, 2, ...

where lambda > 0 and t > 0, and the discrete random variable x can take the values 0, 1, 2, ..., going mathematically up to infinity. Here lambda is the mean rate of occurrence, that is, the average number of occurrences per unit time; when we are talking specifically about a unit time, the t is not required. If you multiply lambda by t, you get the mean total number of occurrences over that particular time t.
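For instance, a minimal sketch of this PMF follows; the mean rate of 2 rail accidents per month is an assumed illustrative value.

```python
from math import exp, factorial

def poisson_pmf(x, lam, t=1.0):
    """P(X = x) occurrences in time t, with mean rate lam per unit time."""
    m = lam * t                        # mean number of occurrences over time t
    return m**x * exp(-m) / factorial(x)

# Assumed example: rail accidents occur at a mean rate of 2 per month.
# Probability of exactly 0, 1, 2, 3 accidents in one month:
for x in range(4):
    print(x, round(poisson_pmf(x, lam=2.0), 4))
```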
Now let us look at the first few moments of the Poisson distribution. The mean of the Poisson distribution is given by the expectation E[X] = lambda t, and the mean and variance of this distribution are the same, both equal to this parameter (simply lambda over a unit time). The coefficient of skewness of the Poisson distribution can be shown to be the parameter to the power minus one half, that is, 1 / sqrt(lambda t). This indicates that with an increase in the value of the parameter, that is, if the mean rate of occurrence increases, the distribution shifts from a positively skewed distribution to a nearly symmetric distribution. So, for a low or small value of the parameter the distribution is clearly positively skewed, and as the parameter increases, it generally approaches a symmetric distribution.
The additive property also holds for the Poisson distribution. If there are two Poisson random variables with parameters lambda_1 and lambda_2, then their sum is also a Poisson random variable, with parameter lambda such that lambda = lambda_1 + lambda_2. So, we can add Poisson random variables, and the summation is also a random variable with a Poisson distribution, whose parameter is the summation of the parameters of the random variables being summed. This is the additive property of the Poisson distribution.
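This property can again be checked by simulation; the parameters lambda_1 = 1.5 and lambda_2 = 2.5 below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

lam1, lam2 = 1.5, 2.5                 # illustrative parameter values (assumed)
z = rng.poisson(lam1, 100_000) + rng.poisson(lam2, 100_000)

# Mean and variance of the sum should both be close to lam1 + lam2 = 4.0,
# as expected for a Poisson random variable with parameter lam1 + lam2.
print(z.mean(), z.var())
```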
Now, another distribution is known as the geometric distribution. The number of trials until the first success, that is, the first occurrence of a particular event (again, the success here is arbitrarily chosen), in a Bernoulli sequence is given by the geometric distribution. So what is the random variable here? As I was stressing, it is the number of trials until the first occurrence. If I start one sequence of a Bernoulli process, how many trials do I have to conduct to get the first success? After the first few failures, the first success will come; that number of trials is the random variable which follows the geometric distribution.
Suppose the probability of occurrence of the event in any particular trial is p. Recall from the Bernoulli process that successive trials are independent of each other, and for a particular trial the probability of success is p. Then the probability that the first occurrence of the event is on the t-th trial, that is, that the random variable T takes the value t, is given by p (1 - p)^{t - 1}. How do we get this? Here 1 - p is the probability of failure, which has occurred t - 1 times before, at the t-th trial, we get the first success.
The trials are independent, so we multiply the probabilities to get the distribution, which is nothing but p multiplied by (1 - p)^{t - 1}: t - 1 failures have occurred before the first success has come. That is why t can take the values 1, 2, 3, and so on.
There is one related concept known as the shifted geometric distribution, where the definition is changed to the number of failures before the first success. What we discussed above is the number of trials until we get the first success; in some other texts you will see the number of failures before the first success instead.
When we count the number of failures before the first success, the distribution is a little shifted, so it is supported on zero: if even the first trial itself is a success, the number of failures before the success is zero. So the support starts from zero and goes up to infinity. Both describe essentially the same experiment, but to differentiate the two, some texts call that version the shifted geometric distribution. Here, what we are considering is the number of trials until the first success, so our support is from 1, 2, up to infinity.
So, the pmf of the geometric distribution can be denoted as p_X(x) = p (1 - p)^{x - 1}, where p is the probability of success in each trial, and x can take values from 1 to infinity. The expected value of this geometric distribution is important in the sense that we can interpret it as the return period. By return period we mean this: the expected value of the number of trials until the first success indicates how frequently that particular success, the particular event we are referring to, is coming or returning.
This is a very important term known as the return period, which we will discuss again in the context of frequency analysis in successive modules. So, this return period, that is, how often a particular event returns, is the expected value of the geometric distribution. The average time between two successive occurrences of an event in a Bernoulli sequence is called the mean recurrence time or the return period. What we are discussing here is for a discrete random variable, but the same idea applies to continuous random variables as well, which we will discuss later; here we discuss it only with respect to the geometric distribution. The expected value of the geometric distribution follows from the basic equation: we multiply the variable by its PMF and sum over the support. The support here is 1 to infinity, and if you evaluate this infinite series, the expected value comes to

E[T] = sum over t from 1 to infinity of t p (1 - p)^{t - 1} = 1/p
So, the mean of this geometric distribution is 1/p. For the other moments, the variance of the geometric distribution can be shown to be (1 - p) / p^2, and the coefficient of skewness of the geometric distribution (the skewness coefficient gamma, to be precise) can be shown to be gamma = (2 - p) / sqrt(1 - p). So, these are the moments of the geometric distribution.
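A small sketch of the geometric PMF and its moments follows; the annual exceedance probability p = 0.01, giving the familiar 100-year return period, is an assumed illustrative value.

```python
from math import sqrt

def geometric_pmf(t, p):
    """P(first success on trial t) in a Bernoulli sequence with success prob p."""
    return p * (1 - p)**(t - 1)

def geometric_moments(p):
    mean = 1 / p                       # the return period
    var = (1 - p) / p**2
    skew = (2 - p) / sqrt(1 - p)
    return mean, var, skew

# Assumed example: a flood level exceeded with probability p = 0.01 in any
# year has a return period of 1/p = 100 years.
print(geometric_moments(0.01)[0])      # 100.0
print(geometric_pmf(1, 0.01))          # chance the first exceedance is in year 1
```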
Now, another discrete distribution that is also used in different civil engineering problems is the negative binomial distribution. The negative binomial distribution is used to find the trial of the k-th occurrence of an event in a series of Bernoulli trials. Again, see exactly what the random variable is that we are talking about: earlier we were talking about the first occurrence of the event, of the success; here we are talking about the k-th occurrence of the success. That number of trials follows the negative binomial distribution.
distribution. So, in a series of n Bernoulli trials if T
k is the number of trials until the kth occurrence of an event, then how we get that what is
the distribution of this T k. So, the probability that T k is equals to t, which is nothing
but here PMF for that particular trial, that particular required trial is equals to given
by t minus 1 combination k minus 1 multiplied by p power k 1 minus p power t minus k, where
this t can take the value from k, k plus 1 up to infinity and for t less than k this
is equals to 0. Now, if we see this distribution, the basis of this distribution we can say
here that what we are taking is the; add the T kth.
So, that kth occurrence, there is kth occurrence in this trial then just before this one so,
if this is equals to t so, at the t th trial we got that kth success or kth occurrence
of that event. Then, what we can say up to that t minus 1 trial, k minus 1 success is
occurred. Now, if we just take that what is the probability that k minus 1 success will
come out of t minus 1 trials then this is a simple binomial distribution and that distribution
that probability will be given by; I can write it here.
That is t minus 1 combination k minus 1 power this is p power total number of success that
is k minus 1 multiplied by 1 minus p power t minus k. So, here from this binomial distribution
what we are getting total number of trial is t minus 1 and total number of success is
k minus 1 so, this distribution is given by this. Now, immediate next what is there immediate
next trial that is the t th trial, we are getting that kth we are getting the kth; that
kth success. So, for this particular trial, what is the probability of a success that
is p again this is independent of this earlier whatever has happened. So, this t minus 1
number of trials has been taken, where thus k minus 1 success is there so, we have to
calculate this probability multiplied by; what is the probability that at t th observation
we will get one success. So, this one is the; we are getting directly
from this binomial distribution, which is t minus 1 combination k minus 1 probability
of success, number of success k minus 1 multiplied by probability of failure power t minus k.
Now, at this t th trial the probability of success is p, because this is independent.
So, that is why as it is independent we can multiply directly with this one, which is
resulting you that this required distribution t minus 1 C k minus 1 p power k 1 minus p
power t minus k. So, this is the distribution which is known as; now, here this t is taking
the value from kth onwards. So, k, k plus 1 until this and for this t less than k this
is naught, this is zero. Now, in some of the text you will find that
this one is shifted, this k is shifted to it is arranged in such a way that this can
be shifted to zero. So, that the support is mathematically shown from this 0 1 2 3 so
that you have to replace in such a way this distribution that this support should instead
of k, k plus 1 up to infinity, it should be 0 1 2 3 in this way that is also possible.
But here, we are taking the support from k onwards so, the distribution function looks
like this that is t minus 1 combination k minus 1 p power k 1 minus p power t minus
k.
So, from the binomial law here sometimes this we also call as this law this distribution,
from this binomial law if there are k minus 1 occurrence of an event in the first t minus
1 trial has been occurred. And the kth occurrence is that t th trial then, this probability
of T k equals to; this t is equals to t minus 1 combination the this one. Just what we have
discussed is the; this we will get from binomial process, multiplied by this p that is thus
kth occurrence at the t th trial and we will get this distribution for this in negative
binomial distribution.
Now, the mean of this negative binomial distribution can be shown to be k/p, and the variance of the negative binomial distribution equals k (1 - p) / p^2. Whatever distributions we are discussing in this class will be used in the next-to-next module, where we will talk about the different applications to particular civil engineering problems; for the negative binomial distribution, these two moments are what we will need.
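A minimal sketch of this PMF and its moments follows; the values k = 3 and p = 0.2 are assumed for illustration.

```python
from math import comb

def negative_binomial_pmf(t, k, p):
    """P(T_k = t): the k-th success occurs on trial t (t = k, k+1, ...)."""
    if t < k:
        return 0.0
    return comb(t - 1, k - 1) * p**k * (1 - p)**(t - k)

# Assumed example: success probability p = 0.2 per trial; probability that
# the 3rd success occurs exactly on the 10th trial, then mean and variance.
k, p = 3, 0.2
print(negative_binomial_pmf(10, k, p))        # ~0.0604
print(k / p, k * (1 - p) / p**2)              # mean = 15.0, variance = 60.0
```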
Next we talk about the hypergeometric distribution. In a series of n repeated trials where the outcomes of the trials are not independent, the probability of x successes and n - x failures can be determined by the hypergeometric distribution. Consider a group of N items, out of which m are defective and the remaining N - m are good. If a sample of n items is chosen at random, the probability of x defective items in this sample is given by this distribution, as shown here. One basic difference from the earlier distributions is that this is sampling without replacement: once we take an item out, we do not replace it back into the population, which is why the trials are not independent.
the sample so, back to the population. So, here that is why this one that exactly
x success and n minus x failures is considered out of these n repeated trials, which is shown
by this n, there n is the number of defective item here. So, n combination x multiplied
by N minus m combination n minus x divided by n combination C and this x can take value
from value 1 to m. So, minimum possible value is 1 and the maximum value that it can take
is the m, because the total number of defective item is m.
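A minimal sketch of this PMF follows; the lot of N = 50 pipes containing m = 5 defectives, sampled n = 10 at a time, is an assumed example. The final line confirms the masses sum to 1.

```python
from math import comb

def hypergeometric_pmf(x, N, m, n):
    """P(x defectives in a sample of n, drawn without replacement
    from N items of which m are defective)."""
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)

# Assumed example: 5 defective pipes in a lot of 50; sample 10 at random.
N, m, n = 50, 5, 10
print(hypergeometric_pmf(1, N, m, n))              # chance of exactly 1 defective
print(sum(hypergeometric_pmf(x, N, m, n)
          for x in range(0, min(n, m) + 1)))       # ~1.0
```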
The mean of this hypergeometric distribution is nm/N, and the variance of the hypergeometric distribution can be shown to follow the expression given here, namely n (m/N) (1 - m/N) (N - n)/(N - 1).
So, in this class some discrete distributions have been discussed, and some continuous distributions will be covered in the next class or the class after that. Whatever distributions we are learning, with their basic properties and basic assumptions, we will see some specific applications while modelling different problems in civil engineering for different disciplines; that we will see later. At that time, it will be helpful for us to use the particular distribution appropriate to the problem at hand.
So, once we understand which random variable we are talking about and what its probable behaviour is, based on that we can select the distribution from among those we are discussing now, and that will help us to model the particular random variable in different civil engineering problems. We will do that mostly in the next-to-next module. What we discussed in this class are the discrete distributions; in the next class and the class after that, we will discuss some of the standard continuous distributions. Thank you.