The Poisson distribution arises in many situations. It is safe to say that it is one of the three most important discrete probability distributions (the other two being the uniform and the binomial distributions). The Poisson distribution can be viewed as arising from the binomial distribution or from the exponential density.

This law was introduced by Poisson in his treatise "Research on the Probability of Judgments in Criminal and Civil Matters", published in 1837. His goal was to estimate the influence of the size of the jury and of the majority rule on the reliability of verdicts.

We will use an "industrial" example to introduce this law, which has
been applied in very varied fields (communications, astrophysics, nuclear
power, queueing...). A factory manufactures a textile whose density
of defects (the average number of defects per m²) is λ. Assuming
the defects are distributed independently of one another, we want to determine
the law of the number of defects in a piece of textile of 1 m².

Let N denote the number of defects in the piece of textile. We propose the following model:

We divide the piece of textile into n pieces of the same, sufficiently
small, size. Writing N_{i} for the number of defects in the i-th piece,
we may suppose that the random variables N_{1}, ..., N_{n} are independent,
take values in {0, 1}, and have expectation λ/n. In other words,
N_{1}, ..., N_{n} are Bernoulli variables with the same law B(λ/n), and
consequently N = N_{1} + ... + N_{n}
follows a binomial law B(n, λ/n).
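As a quick sanity check of this model (a sketch, not part of the original text; the values λ = 3 defects per m², n = 500 pieces, and the trial count are arbitrary choices), we can simulate the subdivided piece of textile and verify that the total defect count has mean close to λ:

```python
import random

def simulate_defects(lam, n, trials, seed=0):
    """Divide 1 m^2 of textile into n small pieces, give each piece a
    defect with probability lam/n, and return the total defect count
    N = N_1 + ... + N_n for each trial."""
    rng = random.Random(seed)
    p = lam / n
    return [sum(1 for _ in range(n) if rng.random() < p)
            for _ in range(trials)]

counts = simulate_defects(lam=3.0, n=500, trials=5000)
mean = sum(counts) / len(counts)
# the empirical mean should be close to lam = 3.0
```

Each trial is a draw from B(n, λ/n), so the empirical mean estimates n · (λ/n) = λ.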

**Proposition on the convergence of the binomial distribution
towards the Poisson distribution**

Let (p_{n})_{n≥1} be a sequence of reals in ]0, 1[ and λ > 0 such that n p_{n} → λ as n → ∞.

Then, if S_{n} follows a law **B**(n, p_{n}), we have, for every fixed k ≥ 0:

P(S_{n} = k) = C(n, k) p_{n}^{k} (1 - p_{n})^{n-k} → e^{-λ} λ^{k} / k!

To see this, it suffices to write

C(n, k) p_{n}^{k} (1 - p_{n})^{n-k} = (n(n-1)...(n-k+1) / n^{k}) · ((n p_{n})^{k} / k!) · (1 - p_{n})^{n-k}

and to pass to the limit in each factor of the expression: the first tends to 1, the second to λ^{k}/k!, and the third to e^{-λ}. □
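This convergence can be observed numerically. The following sketch (with λ = 2 and p_{n} = λ/n, an arbitrary illustrative choice) compares the binomial probabilities to their Poisson limit for increasing n:

```python
from math import comb, exp, factorial

def binom_pmf(n, p, k):
    # P(S_n = k) for S_n ~ B(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(lam, k):
    # limiting probability e^{-lam} lam^k / k!
    return exp(-lam) * lam**k / factorial(k)

lam = 2.0
# with p_n = lam/n, the binomial probabilities approach the Poisson ones
gaps = {n: max(abs(binom_pmf(n, lam / n, k) - poisson_pmf(lam, k))
               for k in range(10))
        for n in (10, 100, 10_000)}
# the largest gap over k = 0..9 shrinks as n grows
```

Already at n = 10,000 the two distributions agree to within about λ²/n.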

The Poisson distribution also serves, however, to model phenomena of a type different from the repetitions of rare events that we have just discussed; with this in mind, we will now carry out the first nontrivial "dynamic" probabilistic modeling of this course.

Suppose that we have a situation in which a certain kind of occurrence happens at random over a period of time. For example, the occurrences that we are interested in might be incoming telephone calls to a police station in a large city. We want to model this situation so that we can consider the probabilities of events such as more than 10 phone calls occurring in a 5-minute time interval. Presumably, in our example, there would be more incoming calls between 6:00 and 7:00 P.M. than between 4:00 and 5:00 A.M., and this fact would certainly affect the above probability. Thus, to have a hope of computing such probabilities, we must assume that the average rate, i.e., the average number of occurrences per minute, is a constant. This rate we will denote by λ. (Thus, in a given 5-minute time interval, we would expect about 5λ occurrences.) This means that if we were to apply our model to the two time periods given above, we would simply use different rates for the two time periods, thereby obtaining two different probabilities for the given event.

Our next assumption is that the numbers of occurrences in two non-overlapping time intervals are independent. In our example, this means that the events that there are j calls between 5:00 and 5:15 P.M. and k calls between 6:00 and 6:15 P.M. on the same day are independent.

We can use the binomial distribution to model this situation. We imagine that a given time interval is broken up into n subintervals of equal length. If the subintervals are sufficiently short, we can assume that two or more occurrences happen in one subinterval with a probability which is negligible in comparison with the probability of at most one occurrence. Thus, in each subinterval, we are assuming that there is either 0 or 1 occurrence. This means that the sequence of subintervals can be thought of as a sequence of Bernoulli trials, with a success corresponding to an occurrence in the subinterval.

To decide upon the proper value of p, the probability of an occurrence in a given subinterval, we reason as follows. On the average, there are λt occurrences in a time interval of length t. If this time interval is divided into n subintervals, then we would expect, using the Bernoulli trials interpretation, that there should be np occurrences. Thus we want λt = np, so p = λt/n.
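The arithmetic above can be checked directly. The numbers below (3 calls per minute, a 5-minute window, 1000 subintervals) are hypothetical choices, not values from the text:

```python
# hypothetical numbers: 3 calls per minute on average, a 5-minute window
lam = 3.0   # average rate (occurrences per minute)
t = 5.0     # interval length in minutes
n = 1000    # number of equal subintervals
p = lam * t / n   # probability of an occurrence in one subinterval
# n * p recovers the expected number of occurrences, lam * t = 15
```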

We now wish to consider the random variable X, which counts the number of occurrences in a given time interval. We want to calculate the distribution of X. For ease of calculation, we will assume that the time interval is of length 1; for a time interval of arbitrary length t, one simply replaces λ by λt.

We know that P(X = 0) = b(n, p, 0) = (1 - p)^{n} = (1 - λ/n)^{n}

For large n, this is approximately e^{-λ}. It is easy to calculate that for any fixed k, we have

b(n, p, k) / b(n, p, k - 1) = (λ - (k - 1)p) / kq

which, for large n (and therefore small p), is approximately λ/k.
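This ratio also gives a convenient way to compute the limiting probabilities numerically: starting from P(X = 0) ≈ e^{-λ}, each successive probability is obtained by multiplying by λ/k. A sketch (λ = 4 is an arbitrary illustrative value):

```python
from math import exp, factorial

def poisson_probs(lam, kmax):
    """P(X = 0), ..., P(X = kmax) via the recursion p(k) = p(k-1) * lam / k,
    starting from p(0) = e^{-lam}."""
    probs = [exp(-lam)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * lam / k)
    return probs

probs = poisson_probs(4.0, 10)
# each entry agrees with the closed form e^{-lam} * lam**k / k!
```

The recursion avoids computing large factorials and powers separately.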

Thus, we have P(X = 1) ≈ λ e^{-λ}, and in general:

P(X = k) ≈ (λ^{k} / k!) e^{-λ}

The above distribution is the Poisson distribution. We note that it must be checked that the distribution really is a distribution, i.e., that its values are non-negative and sum to 1.
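Both checks can be carried out numerically (a sketch; λ = 4 and the truncation at 100 terms are arbitrary choices). Non-negativity is clear term by term, and the series Σ λ^{k}/k! = e^{λ} makes the total 1:

```python
from math import exp, factorial

lam = 4.0
terms = [exp(-lam) * lam**k / factorial(k) for k in range(100)]
total = sum(terms)
# every term is non-negative, and the partial sum is numerically 1
```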

The Poisson distribution is used as an approximation to the binomial distribution when the parameter n is large and p is small. However, the Poisson distribution also arises in situations where it may not be easy to interpret or measure the parameters n and p.

**Example 1:** In his book, Feller discusses the statistics of flying
bomb hits in the south of London during the Second World War. Assume that
you live in a district of size 10 blocks by 10 blocks so that the total
district is divided into 100 small squares. How likely is it that the square
in which you live will receive no hits if the total area is hit by 400
bombs? We assume that a particular bomb will hit your square with probability
1/100. Since there are 400 bombs, we can regard the number of hits that
your square receives as the number of successes in a Bernoulli trials process
with n = 400 and p = 1/100. Thus we can use the Poisson distribution with
λ = 400 * (1/100) = 4 to approximate the probability
that your square will receive j hits. This probability is p(j) = e^{-4} 4^{j} / j!.

The expected number of squares that receive exactly j hits is then 100 * p(j).
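The computation in this example is easy to carry out directly (a sketch following the numbers in the text):

```python
from math import exp, factorial

lam = 400 * (1 / 100)   # 4 expected hits per square

def p(j):
    # Poisson approximation to the number of hits on one square
    return exp(-lam) * lam**j / factorial(j)

p0 = p(0)                  # chance your square receives no hits
expected_empty = 100 * p0  # expected number of unhit squares
# e^{-4} is about 0.018, so roughly 1.8 of the 100 squares go unhit
```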

If the reader would rather not consider flying bombs, he is invited to instead consider an analogous situation involving cookies and raisins. We assume that we have made enough cookie dough for 500 cookies. We put 600 raisins in the dough, and mix it thoroughly. One way to look at this situation is that we have 500 cookies, and after placing the cookies in a grid on the table, we throw 600 raisins at the cookies.
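The cookie version is easy to simulate (a sketch, not from the original text; the fixed seed is an arbitrary choice). Throwing each of the 600 raisins at a uniformly chosen cookie, the fraction of raisin-free cookies should be close to the Poisson prediction e^{-λ} with λ = 600/500 = 1.2:

```python
import random
from math import exp

def raisin_counts(n_cookies=500, n_raisins=600, seed=1):
    """Throw each raisin at a uniformly chosen cookie and count the
    raisins landing on each cookie."""
    rng = random.Random(seed)
    counts = [0] * n_cookies
    for _ in range(n_raisins):
        counts[rng.randrange(n_cookies)] += 1
    return counts

counts = raisin_counts()
frac_empty = counts.count(0) / len(counts)
# the Poisson approximation predicts e^{-1.2}, about 0.30
```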

**Example 2:** Suppose that in a certain fixed amount A of blood,
the average human has 40 white blood cells. Let X be the random variable
which gives the number of white blood cells in a random sample of size
A from a random individual. We can think of X as binomially distributed
with each white blood cell in the body representing a trial. If a given
white blood cell turns up in the sample, then the trial corresponding to
that blood cell was a success. Then p should be taken as the ratio of A
to the total amount of blood in the individual, and n will be the number
of white blood cells in the individual. Of course, in practice, neither
of these parameters is very easy to measure accurately, but presumably
the number 40 is easy to measure. But for the average human, we then have
40 = np, so we can think of X as being Poisson distributed, with parameter
λ = 40. In this case, it is easier to model the
situation using the Poisson distribution than the binomial distribution.
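With λ = 40 the Poisson model is immediately usable even though n and p are unknown. As an illustration (a sketch; the threshold of 25 cells is an arbitrary choice, not from the text), we can compute how unlikely an unusually low count would be:

```python
from math import exp, factorial

lam = 40  # average number of white cells in a sample of size A

def poisson_pmf(lam, k):
    return exp(-lam) * lam**k / factorial(k)

p_40 = poisson_pmf(lam, 40)                          # most likely count
p_low = sum(poisson_pmf(lam, k) for k in range(26))  # P(X <= 25)
# a count of 25 or fewer is quite unlikely when the mean is 40
```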


**Some exercises:**

**A little bit of linguistics...: exo6**

**A story of TGV: exo7**
