Virtual University of Pakistan Lecture No. 22 of the course on Statistics and Probability by Miss Saleha Naghmi Habibullah

IN THE LAST LECTURE, YOU LEARNT • Independent and Dependent Events • Multiplication Theorem of Probability for Independent Events • Marginal Probability

TOPICS FOR TODAY • Bayes’ Theorem • Discrete Random Variable • Discrete Probability Distribution • Graphical Representation of a Discrete Probability Distribution • Mean, Standard Deviation and Coefficient of Variation of a Discrete Probability Distribution • Distribution Function of a Discrete Random Variable

First of all, let us discuss BAYES’ THEOREM. This theorem deals with conditional probabilities in an interesting way:

BAYES’ THEOREM If the events A1, A2, …, Ak form a PARTITION of a sample space S (that is, the events Ai are mutually exclusive and exhaustive, i.e. their union is S), and if B is any other event of S such that it can occur ONLY IF ONE OF THE Ai OCCURS, then

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{k} P(A_j)\,P(B \mid A_j)}$$

for i = 1, 2, …, k.

Stated differently: if A1, A2, …, Ak are mutually exclusive events of which one must occur, then

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2) + \cdots + P(A_k)\,P(B \mid A_k)}$$

If k = 2, we obtain:

Bayes’ Theorem for two mutually exclusive events A1 and A2:

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}, \quad \text{where } i = 1, 2.$$

In other words,

$$P(A_1 \mid B) = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}$$

and

$$P(A_2 \mid B) = \frac{P(A_2)\,P(B \mid A_2)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}$$

EXAMPLE In a developed country where cars are tested for the emission of pollutants, 25 percent of all cars emit excessive amounts of pollutants. When tested, 99 percent of all cars that emit excessive amounts of pollutants will fail, but 17 percent of the cars that do not emit excessive amounts of pollutants will also fail.

What is the probability that a car that fails the test actually emits excessive amounts of pollutants?

SOLUTION
1. Let A1 denote the event that a car emits EXCESSIVE amounts of pollutants.
2. Let A2 denote the event that a car does NOT emit excessive amounts of pollutants. (In other words, A2 is the complement of A1.)
3. Let B denote the event that a car FAILS the test.

The first thing to note is that any car will either emit or not emit excessive amounts of pollutants. In other words, A1 and A2 are mutually exclusive and exhaustive events, i.e. A1 and A2 form a PARTITION of the sample space S. Hence, we are in a position to apply Bayes’ theorem.

4. We need to calculate P(A1|B), and, according to Bayes’ theorem:

$$P(A_1 \mid B) = \frac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}$$

Here P(A1) = 0.25, so that P(A2) = 1 − 0.25 = 0.75; also P(B|A1) = 0.99 and P(B|A2) = 0.17.

Substituting the above values in Bayes’ theorem, we obtain:

$$P(A_1 \mid B) = \frac{(0.25)(0.99)}{(0.25)(0.99) + (0.75)(0.17)} = \frac{0.2475}{0.2475 + 0.1275} = \frac{0.2475}{0.3750} = 0.66$$

This is the probability that a car which fails the test ACTUALLY emits excessive amounts of pollutants.
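The substitution above can be checked numerically. The following is a minimal Python sketch and is not part of the original lecture; the function name `bayes_posterior` and its argument layout are assumptions chosen purely for illustration.

```python
def bayes_posterior(priors, likelihoods, i):
    """Return P(A_i | B) for events A_1..A_k that form a partition.

    priors[j]      = P(A_j)      (assumed to sum to 1)
    likelihoods[j] = P(B | A_j)
    i is a zero-based index into both lists.
    """
    # Denominator of Bayes' theorem: P(B), summed over the whole partition.
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / total


# Car-emission example: A1 = emits excessively, A2 = does not.
priors = [0.25, 0.75]          # P(A1), P(A2)
likelihoods = [0.99, 0.17]     # P(fail | A1), P(fail | A2)

p_a1_given_fail = bayes_posterior(priors, likelihoods, 0)
print(round(p_a1_given_fail, 2))  # 0.66
```

Because the function takes lists, the same sketch works for three or more mutually exclusive and exhaustive events.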

The example that we just considered pertained to the simplest case, in which we have only two mutually exclusive and exhaustive events A1 and A2.

As stated earlier, Bayes’ theorem can be extended to the case of three, four, five or more mutually exclusive and exhaustive events.

Let us consider another example:

EXAMPLE In a bolt factory, 25% of the bolts are produced by machine A, 35% are produced by machine B, and the remaining 40% are produced by machine C. Of their outputs, 2%, 4% and 5% respectively are defective bolts. If a bolt is selected at random and found to be defective, what is the probability that it came from machine A?

In this example, we realize that “a bolt is produced by machine A”, “a bolt is produced by machine B” and “a bolt is produced by machine C” represent three mutually exclusive and exhaustive events i.e. we can regard them as A1, A2 and A3. The event “defective bolt” represents the event B.

Hence, in this example, we need to determine P(A1|B). The students are encouraged to work on this problem on their own, in order to understand the application and significance of Bayes’ Theorem.
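After working the problem by hand, the result can be checked with a short script. This is only a sketch under the assumptions stated in the example (production shares 25%, 35%, 40% and defect rates 2%, 4%, 5%); the dictionary names `share`, `defect_rate` and `posterior` are illustrative and not part of the lecture. Exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction

# Priors: share of total output produced by each machine.
share = {"A": Fraction(25, 100), "B": Fraction(35, 100), "C": Fraction(40, 100)}

# Likelihoods: P(defective | machine).
defect_rate = {"A": Fraction(2, 100), "B": Fraction(4, 100), "C": Fraction(5, 100)}

# Total probability of drawing a defective bolt (denominator of Bayes' theorem).
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Posterior probability of each machine, given that the bolt is defective.
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

for machine, p in posterior.items():
    print(machine, p, round(float(p), 4))
```

The three posterior probabilities must add up to 1, which is a useful check on a hand calculation.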

This brings us to the END of the discussion of various basic concepts of probability. We now begin the discussion of a very important concept in mathematical statistics, i.e., the concept of PROBABILITY DISTRIBUTIONS. As stated in the very beginning of this course, there are two types of quantitative variables --- the discrete variable, and the continuous variable. Accordingly, we have the discrete probability distribution as well as the continuous probability distribution. We begin with the discussion of the discrete probability distribution. In this regard, the first concept that we need to consider is the concept of RANDOM VARIABLE.

RANDOM VARIABLE A numerical quantity whose value is determined by the outcome of a random experiment is called a random variable.

For example, if we toss three coins together, and let X denote the number of heads, then the random variable X can take the values 0, 1, 2, and 3. Obviously, in this example, X is a discrete random variable.

Let us now discuss the concept of discrete probability distribution in detail with the help of the following example:

EXAMPLE: If a biologist is interested in the number of petals on a particular flower, this number may take the values 3, 4, 5, 6, 7, 8, 9, and each one of these numbers will have its own probability.

Suppose that upon observing a large number of flowers, say 1000 flowers, of that particular species, the following results are obtained:

No. of Petals (X)    Frequency (f)
3                    50
4                    100
5                    200
6                    300
7                    250
8                    75
9                    25
Total                1000

Since 1000 is quite a large number, the proportions f/∑f can be regarded as probabilities, and hence we can write:

No. of Petals (X)    P(x)
x1 = 3               0.05
x2 = 4               0.10
x3 = 5               0.20
x4 = 6               0.30
x5 = 7               0.25
x6 = 8               0.075
x7 = 9               0.025
Total                1

Properties of a Discrete Probability Distribution: (1) 0 ≤ P(Xi) ≤ 1 for each Xi (i = 1, 2, …, 7), and (2) ∑ P(Xi) = 1.

And, since the number of petals on a flower can only be a whole number, the variable X is known as a discrete random variable, and the probability distribution of this variable is known as a DISCRETE probability distribution. In other words: any discrete variable that is associated with a random experiment, and to whose various values are attached various probabilities (such that $\sum_{i=1}^{\infty} P(X_i) = 1$), is known as a Discrete Random Variable, and its probability distribution is known as a Discrete Probability Distribution.
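The step from observed frequencies to probabilities, together with the two properties of a discrete probability distribution, can be sketched in Python as follows. This is an illustrative sketch only; the variable names are assumptions and do not come from the lecture.

```python
from fractions import Fraction

petals = [3, 4, 5, 6, 7, 8, 9]
freq = [50, 100, 200, 300, 250, 75, 25]

# Relative frequency f / sum(f), taken as the probability of each value.
n = sum(freq)
prob = {x: Fraction(f, n) for x, f in zip(petals, freq)}

# Property (1): every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in prob.values())

# Property (2): the probabilities add up to 1.
assert sum(prob.values()) == 1

for x, p in prob.items():
    print(x, float(p))
```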

Just as we can depict a frequency distribution graphically, we can draw the GRAPH of a probability distribution.

EXAMPLE Going back to the probability distribution of the number of petals on the flowers of a particular species, i.e.:

No. of Petals (X)    P(x)
x1 = 3               0.05
x2 = 4               0.10
x3 = 5               0.20
x4 = 6               0.30
x5 = 7               0.25
x6 = 8               0.075
x7 = 9               0.025
Total                1

This distribution can be represented in the form of a line chart.

[Figure: Line Chart Representation of the Discrete Probability Distribution. Horizontal axis: No. of Petals (x), from 3 to 9. Vertical axis: Probability P(x), from 0 to 0.30.]

Evidently, this particular probability distribution is approximately symmetric.

In addition, this graph clearly shows that, just as in the case of a frequency distribution, every discrete probability distribution has a CENTRAL point and a SPREAD. Hence, similar to a frequency distribution, the discrete probability distribution has a MEAN and a STANDARD DEVIATION.

How do we calculate the mean and the standard deviation of a probability distribution? Let us first consider the computation of the MEAN:

We know that in the case of a frequency distribution such as

X    f
1    1
2    2
3    4
4    2
5    1

the mean is given by

$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{\sum Xf}{\sum f}$$

In the case of a discrete probability distribution, such as the one that we have been considering, i.e.

No. of Petals (X)    P(x)
x1 = 3               0.05
x2 = 4               0.10
x3 = 5               0.20
x4 = 6               0.30
x5 = 7               0.25
x6 = 8               0.075
x7 = 9               0.025
Total                1

the mean is given by:

$$\mu = E(X) = \frac{\sum X\,P(X)}{\sum P(X)} = \frac{\sum X\,P(X)}{1} = \sum X\,P(X)$$

Hence we construct the column of XP(X), as shown below:

No. of Petals (x)    P(x)     xP(x)
x1 = 3               0.05     0.15
x2 = 4               0.10     0.40
x3 = 5               0.20     1.00
x4 = 6               0.30     1.80
x5 = 7               0.25     1.75
x6 = 8               0.075    0.60
x7 = 9               0.025    0.225
Total                1        5.925

Hence µ = E(X) = ∑XP(X) = 5.925, i.e. the mean of the given probability distribution is 5.925.

In other words, considering a very large number of flowers of that particular species, we would expect that, on the average, a flower contains 5.925 petals --- or, rounding this number, 6 petals.

This interpretation points to the reason why the mean of the probability distribution of a random variable X is technically called the EXPECTED VALUE of the random variable X. (Given that the probability that the flower has 3 petals is 5%, the probability that the flower has 4 petals is 10%, and so on, we EXPECT that on the average a flower contains 5.925 petals.)
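The expected value is simply a probability-weighted sum, which can be sketched in a few lines of Python. The list names `petals` and `probs` are illustrative assumptions; the figures are those of the petal distribution.

```python
petals = [3, 4, 5, 6, 7, 8, 9]
probs = [0.05, 0.10, 0.20, 0.30, 0.25, 0.075, 0.025]

# The column xP(x): each value multiplied by its probability.
x_times_p = [x * p for x, p in zip(petals, probs)]

# E(X) is the total of that column.
mean = sum(x_times_p)
print(round(mean, 3))  # 5.925
```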

Next, let us consider the computation of the STANDARD DEVIATION:

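COMPUTATION OF THE STANDARD DEVIATION: In the case of a frequency distribution, we have

$$S = \sqrt{\frac{\sum f\,(X - \bar{X})^2}{\sum f}} = \sqrt{\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2} = \sqrt{\frac{\sum X^2 f}{\sum f} - \left(\frac{\sum Xf}{\sum f}\right)^2}$$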

Similarly, in case of a probability distribution, we have

$$\sigma = \text{S.D.}(X) = \sqrt{\frac{\sum X^2 P(X)}{\sum P(X)} - \left(\frac{\sum X\,P(X)}{\sum P(X)}\right)^2} = \sqrt{\sum X^2 P(X) - \left[\sum X\,P(X)\right]^2} \qquad \left(\because \sum P(X) = 1\right)$$

In the above example:

No. of Petals (x)    P(x)     xP(x)    x²P(x)
x1 = 3               0.05     0.15     0.45
x2 = 4               0.10     0.40     1.60
x3 = 5               0.20     1.00     5.00
x4 = 6               0.30     1.80     10.80
x5 = 7               0.25     1.75     12.25
x6 = 8               0.075    0.60     4.80
x7 = 9               0.025    0.225    2.025
Total                1        5.925    36.925

Hence:

$$\sigma = \text{S.D.}(X) = \sqrt{36.925 - (5.925)^2} = \sqrt{36.925 - 35.106} = \sqrt{1.819} \approx 1.3$$

[Figure: Graphical Representation. Line chart of Probability P(x) against No. of Petals (x), with the mean µ = 5.925 and the standard deviation σ = 1.3 marked on the horizontal axis.]

Now that we have both the mean and the standard deviation, we are in a position to compute the coefficient of variation of this distribution:

Coefficient of Variation:

$$\text{C.V.} = \frac{\sigma}{\mu} \times 100 = \frac{1.3}{5.925} \times 100 = 21.9\%$$
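The standard deviation and the coefficient of variation can be reproduced with the short-cut formula σ = √(∑X²P(X) − [∑XP(X)]²). The sketch below is illustrative only and its variable names are assumptions. Note that it carries σ at full precision, whereas the calculation above rounds σ to 1.3 before dividing, so the coefficient of variation printed here comes out somewhat larger than 21.9%.

```python
import math

petals = [3, 4, 5, 6, 7, 8, 9]
probs = [0.05, 0.10, 0.20, 0.30, 0.25, 0.075, 0.025]

# Sum of xP(x) and sum of x^2 P(x), i.e. the totals of the two table columns.
mean = sum(x * p for x, p in zip(petals, probs))
mean_of_squares = sum(x ** 2 * p for x, p in zip(petals, probs))

# Short-cut formula for the variance, then its square root.
variance = mean_of_squares - mean ** 2
sd = math.sqrt(variance)

# Coefficient of variation, expressed as a percentage of the mean.
cv = sd / mean * 100

print(round(mean_of_squares, 3))  # 36.925
print(round(sd, 2))               # 1.35
print(round(cv, 1))               # 22.8
```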

Let us consider another example to understand the concept of discrete probability distribution:

EXAMPLE a) Find the probability distribution of the sum of the dots when two fair dice are thrown. b) Use the probability distribution to find the probabilities of obtaining (i) a sum that is greater than 8, and (ii) a sum that is greater than 5 but less than or equal to 10.

SOLUTION a) The sample space S is represented by the following 36 outcomes:

S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6);
     (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6);
     (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6);
     (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6);
     (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6);
     (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}

Since each of the 36 outcomes is equally likely to occur, therefore each outcome has probability 1/36.

Let X be the random variable representing the sum of dots which appear on the dice. Then the values of the r.v. are 2, 3, 4, …, 12.

The probabilities of these values are computed as below:

f(2) = P(X = 2) = P[{(1, 1)}] = 1/36, as there is only one outcome resulting in a sum of 2,

f(3) = P(X = 3) = P[{(1, 2), (2, 1)}] = 2/36,

f(4) = P(X = 4) = P[{(1, 3), (2, 2), (3, 1)}] = 3/36.

Similarly, f(5) = 4/36, f(6) = 5/36, f(7) = 6/36, f(8) = 5/36, f(9) = 4/36, f(10) = 3/36, f(11) = 2/36 and f(12) = 1/36.

Therefore the desired probability distribution of the r.v. X is

xi     f(xi)
2      1/36
3      2/36
4      3/36
5      4/36
6      5/36
7      6/36
8      5/36
9      4/36
10     3/36
11     2/36
12     1/36

The probabilities in the above table clearly indicate that if we draw the line chart of this distribution, we will obtain a triangular-shaped graph. The students are encouraged to draw the graph of this probability distribution, in order to be able to develop a visual picture in their minds.
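The same distribution can be obtained by enumerating the 36 equally likely outcomes and counting how many give each sum. The following Python sketch is illustrative; the names `outcomes`, `counts` and `f` are assumptions rather than notation from the lecture (apart from f, which mirrors f(xi)).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die): the sample space S.
outcomes = list(product(range(1, 7), repeat=2))

# Number of outcomes that produce each possible sum.
counts = Counter(a + b for a, b in outcomes)

# Each outcome has probability 1/36, so f(x) = count / 36.
f = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x in f:
    print(x, f"{counts[x]}/{len(outcomes)}")
```

The counts rise by one from 2 up to 7 and then fall by one down to 12, which is the triangular shape referred to above.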

b) Using the probability distribution, we get the required probabilities as follows:

i) P(a sum that is greater than 8) = P(X > 8) = P(X = 9) + P(X = 10) + P(X = 11) + P(X = 12) = f(9) + f(10) + f(11) + f(12) = 4/36 + 3/36 + 2/36 + 1/36 = 10/36

ii) P(a sum that is greater than 5 but less than or equal to 10) = P(5 < X ≤ 10) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = f(6) + f(7) + f(8) + f(9) + f(10) = 5/36 + 6/36 + 5/36 + 4/36 + 3/36 = 23/36.
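Both probabilities in part (b) are sums of f(x) over the values of X that satisfy a condition, which can be sketched as follows. As an assumption made for brevity, the sketch generates the numerators 1, 2, …, 6, …, 2, 1 of the table with the expression 6 − |x − 7| instead of typing them out; this expression is not part of the lecture.

```python
from fractions import Fraction

# f(x) for x = 2..12; 6 - |x - 7| reproduces the numerators 1, 2, ..., 6, ..., 2, 1.
f = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

# (i) P(X > 8)
p_greater_than_8 = sum(p for x, p in f.items() if x > 8)

# (ii) P(5 < X <= 10)
p_between = sum(p for x, p in f.items() if 5 < x <= 10)

# Fraction prints in lowest terms, so 10/36 appears as 5/18.
print(p_greater_than_8)  # 5/18
print(p_between)         # 23/36
```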

Next, we consider the concept of the DISTRIBUTION FUNCTION of a discrete random variable:

DISTRIBUTION FUNCTION The distribution function of a random variable X, denoted by F(x), is defined by F(x) = P(X ≤ x). The function F(x) gives the probability of the event that X takes a value LESS THAN OR EQUAL TO a specified value x.

The distribution function is abbreviated to d.f. and is also called the cumulative distribution function (cdf), as it is the cumulative probability function of the random variable X from the smallest value up to a specific value x.

Let us illustrate this concept with the help of the same example that we have been considering --- that of the probability distribution of the sum of the dots when two fair dice are thrown. As explained earlier, the probability distribution of this example is:

xi     f(xi)
2      1/36
3      2/36
4      3/36
5      4/36
6      5/36
7      6/36
8      5/36
9      4/36
10     3/36
11     2/36
12     1/36

For example, F(5) = P(X ≤ 5) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) = 1/36 + 2/36 + 3/36 + 4/36 = 10/36

The term ‘distribution function’ implies the cumulation of the probabilities similar to the cumulation of frequencies in the case of the frequency distribution of a discrete variable.

xi     f(xi)    F(xi)
2      1/36     1/36
3      2/36     3/36
4      3/36     6/36
5      4/36     10/36
6      5/36     15/36
7      6/36     21/36
8      5/36     26/36
9      4/36     30/36
10     3/36     33/36
11     2/36     35/36
12     1/36     36/36

If we are interested in finding the probability that we obtain a sum of five or less, the column of cumulative probabilities immediately indicates that this probability is 10/36.
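The column of cumulative probabilities is a running total of f(xi), so it can be sketched with a cumulative sum. The code below is an illustrative sketch; the names `values`, `f`, `F` and `cdf` are assumptions, and the expression 6 − |x − 7| is only a shortcut for the numerators of f(xi) in the two-dice example.

```python
from fractions import Fraction
from itertools import accumulate

values = list(range(2, 13))
f = [Fraction(6 - abs(x - 7), 36) for x in values]

# Running total of the probabilities: F(x_i) = f(x_1) + ... + f(x_i).
F = dict(zip(values, accumulate(f)))


def cdf(x):
    """F(x) = P(X <= x) for any real x, not only the tabulated values."""
    return sum((p for v, p in zip(values, f) if v <= x), Fraction(0))


# Fraction prints in lowest terms, so 10/36 appears as 5/18.
print(F[5])      # 5/18
print(cdf(5.5))  # 5/18
print(F[12])     # 1
```

The helper `cdf` illustrates that F(x) stays constant between successive values of X and jumps only at the values 2, 3, …, 12.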

In the next lecture, we will discuss the concept of distribution function in further detail, after which we will begin a formal discussion of the concept of MATHEMATICAL EXPECTATION.

IN TODAY’S LECTURE, YOU LEARNT • Bayes’ Theorem • Discrete Random Variable • Discrete Probability Distribution • Graphical Representation of a Discrete Probability Distribution • Mean, Standard Deviation and Coefficient of Variation of a Discrete Probability Distribution • Distribution Function of a Discrete Random Variable

IN THE NEXT LECTURE, YOU WILL LEARN • Graphical Representation of the Distribution Function of a Discrete Random Variable • Mathematical Expectation • Mean, Variance and Moments of a Discrete Probability Distribution • Chebychev’s Inequality

STA301_LEC22