The Normal Distribution
Section 4.4

 

A normal distribution is a distribution that is nearly symmetric, with most of the data in the center, so that its graph resembles a bell-shaped curve.

 

Discrete Variables: values that are distinct and separate and can be counted: {0, 1, 2, 3, . . .}.  For example, number of children is discrete.

 

Continuous Variables: values that can be any number in an interval of the real number system, not distinct or separate values.  For example, height is continuous.

 

Characteristics of a Normal Distribution:

  1. Data points occur most frequently in the middle and less frequently farther from the center.

  2. The distribution is symmetric: one side mirrors the other side.

 

Because of #1 and #2, the mean, median and mode all occur at the center in a normal distribution.

Also, in a normal distribution, more than 2/3 of the data (68.26%) falls within one standard deviation of the mean, 95.44% falls within two standard deviations of the mean, and 99.74% falls within three standard deviations of the mean.
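These percentages can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class; the helper name area_within is a hypothetical one introduced here for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the standard normal curve between -k and +k (hypothetical helper)."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The computed areas differ from the quoted percentages by about .01 percentage points. The quoted figures are consistent with doubling four-decimal table entries (for example, 2 × .3413 = .6826), so the gap looks like table rounding.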

Notation for Sample Data (Subset):

Variance = s²

Standard deviation = s

Mean = x̄  (x-bar)

 

For Population Data (Universal Set):

 

Variance = σ²  (lowercase sigma squared)

Standard deviation = σ  (lowercase sigma)

Mean = μ  (lowercase mu)

 

Recall that relative frequency is a type of probability.  For example, if there are 14 English majors in a group of 100 students, then

        rf = 14/100 = .14 = 14%

 

The bell-shaped curve represents probability, with the total area under the curve equal to 1:

        P(S) = 1 = 100%

 

Most of the data (a large percentage) falls in the middle, where the area under the curve is greatest.  Since normal distributions are symmetric, 50% falls to the right of the center (the mean μ) and 50% falls to the left.

        P( x < μ ) = .5                        P( x > μ ) = .5

 

                        _________________|_____|__________|___________

                                         a     μ          b

 

To find the probability of

        x < a    or         x > b    or         a < x < b

find the area under the curve over the corresponding interval.

 

In order to find the area under the curve, mathematicians have standardized the bell curve so that the mean μ = 0 and the standard deviation σ = 1.  Nearly all of the area (99.74%) lies within three standard deviations of the mean, between -3 and 3.

 

 

 

 

 

                        _____|_____|_____|_____|_____|_____|_____|_____

                            -3    -2    -1     0     1     2     3

 

A standardized normal distribution is called a z-distribution; it uses values of z to distinguish it from a general normal distribution, which uses x.  The z-value is the number of standard deviations a value lies from the mean μ = 0.

Appendix VI is a body table.  For a given value c, it gives P(0 < z < c), the probability of the interval in the body (middle) of the bell curve between the mean 0 and c.

 

Example 1: 

Using the body table, find:

a)     P(0 < z < .75) = .2734

 

b)     P(z > 1.2)             Subtract the body area P(0 < z < 1.2) = .3849 from .5 to get the tail.

        .5 – .3849 = .1151

That means about 11.5% of the z values lie to the right of 1.2.

 

c) P(.15 < z < 1.35)               Subtract the area P(0 < z < .15) from the area P(0 < z < 1.35).

        P(0 < z < 1.35) – P(0< z < .15) = .4115 – .0596 = .3519

 

d) P(-1.56 < z < 2.11)           Add the Left and Right areas together.

 


 

Since the normal distribution is symmetric,

        P(-1.56 < z < 0) = P(0 < z < 1.56)

 

        P (0 < z < 1.56) + P(0 < z < 2.11)    = .4406 + .4826 = .9232

So about 92% of the z values fall in this interval.
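The table lookups in Example 1 can be cross-checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class in place of Appendix VI; the helper name body_area is a hypothetical one introduced here, not part of the notes.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def body_area(c):
    """Body-table style area P(0 < z < c) for c >= 0 (hypothetical helper)."""
    return standard_normal.cdf(c) - 0.5

# a) P(0 < z < .75): read directly from the body of the curve
part_a = body_area(0.75)

# b) P(z > 1.2): subtract the body area from .5 to get the right tail
part_b = 0.5 - body_area(1.2)

# c) P(.15 < z < 1.35): difference of two body areas
part_c = body_area(1.35) - body_area(0.15)

# d) P(-1.56 < z < 2.11): by symmetry, add the left and right body areas
part_d = body_area(1.56) + body_area(2.11)

for label, value in [("a", part_a), ("b", part_b), ("c", part_c), ("d", part_d)]:
    print(f"{label}) {value:.4f}")
```

Each part follows the same rule as the hand calculation (read, subtract from .5, subtract, or add), so the printed values should agree with the table answers up to rounding.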


Converting to Standard Normal Distribution.

Every value x in a normal distribution has a corresponding value z in a standard normal distribution.  In order to use the body table, x values must be converted to z values.

 

        z = (x – μ) / σ

        μ = mean
        σ = standard deviation

       

Example 2: 

Let 

        σ = 3,    μ = 65

Find

        P(62 < x < 70)

 

When x = 62,

        z = (62 – 65) / 3 = -3/3 = -1

When x = 70,

        z = (70 – 65) / 3 = 5/3 ≈ 1.67

So we must find 

        P(-1 < z < 1.67) = P(0 < z < 1) + P(0 < z < 1.67)

        = .3413 + .4525 = .7938
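Example 2 can be reproduced in code. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class; the function name z_score is a hypothetical one introduced here for the conversion equation.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Convert x from a normal distribution (mean mu, sd sigma) to a z value."""
    return (x - mu) / sigma

mu, sigma = 65, 3
z_low = z_score(62, mu, sigma)    # (62 - 65) / 3 = -1
z_high = z_score(70, mu, sigma)   # (70 - 65) / 3 = 5/3, about 1.67

# Area under the standard normal curve between the two z values.
standard_normal = NormalDist()
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)
print(f"z from {z_low:.2f} to {z_high:.2f}: P = {probability:.4f}")
```

The computed probability may differ from .7938 in the last decimal places, since the hand calculation rounds z to 1.67 before using the table.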

 

Application: 

Given heights of men:

        μ = 68",        σ = 4"

a)     Find the percentage of men over 6 feet tall; that is, find P(x > 72).

b)     Find the percentage of men between 66 inches and 71 inches tall; that is, find P(66 < x < 71).

 

a)     z = (72 – 68) / 4 = 4/4 = 1

Find 

        P(z > 1) = .5 – P(0 < z < 1) = .5 –  .3413 = .1587 ~ 16%

 

b)     z = (66 – 68) / 4 = -2/4 = -.5                       z = (71 – 68) / 4 = 3/4 = .75

 

Find 

        P(-.5 < z < .75)  = P(0 < z < .5) + P(0 < z < .75) 

        = .1915 + .2734 = .4649

~ 46% of the men are between 5’6” and 5’11”.
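Both parts of the application can also be computed without converting to z by hand. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library statistics.NormalDist class, which accepts the mean and standard deviation directly; the variable names are illustrative.

```python
from statistics import NormalDist

# Heights of men in inches: mean 68, standard deviation 4 (from the notes).
heights = NormalDist(mu=68, sigma=4)

# a) P(x > 72): area to the right of 72 inches
over_six_feet = 1 - heights.cdf(72)

# b) P(66 < x < 71): area between 66 and 71 inches
between_66_and_71 = heights.cdf(71) - heights.cdf(66)

print(f"a) {over_six_feet:.4f}")
print(f"b) {between_66_and_71:.4f}")
```

The results should come out near 16% for part a and near 46% for part b, matching the body-table answers up to rounding.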

 

How do we find z values given their probabilities (percentages)?  Work backwards.

 

Example 3: 

Find c such that P(0 < z < c) = .4495

Look in the body table for area = .4495; the corresponding z value is 1.64.  So

        P(0 < z < 1.64) = .4495

 

Find c such that P(z > c) = .6950 = .5 + .1950

Looking in the body table for area = .1950 gives z = .51, but since the probability is greater than .5, c lies to the left of the mean μ = 0, so

        c = -.51          P(z > -.51) = .6950
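Working backwards from an area to a z value corresponds to an inverse lookup. The following is a minimal sketch, assuming Python 3.8 or later and the inv_cdf method of the standard-library statistics.NormalDist class; inv_cdf takes the area to the left of a value, so each body-table area is first rewritten as a left-hand area. The variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# P(0 < z < c) = .4495 means the area to the left of c is .5 + .4495.
c_first = standard_normal.inv_cdf(0.5 + 0.4495)

# P(z > c) = .6950 means the area to the left of c is 1 - .6950.
c_second = standard_normal.inv_cdf(1 - 0.6950)

print(f"first c:  {c_first:.2f}")
print(f"second c: {c_second:.2f}")
```

The sign of the second result comes out negative automatically, which matches the reasoning that the value lies to the left of the mean.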

 

How do we find x values given their probabilities (percentages)?

How tall is a man if he is taller than 85% of the men?  He is at the cutoff for the top 15% of the distribution.

Find c such that P(x > c) = .15

First look in the body table to find the z value whose area is

        .5 – .15 = .35

The closest table entry gives

        z = 1.04

Then use the conversion equation to find x.

        1.04 = (x – 68) / 4         4.16 = x – 68

        x = 72.16

A man 72.16 inches tall (just over 6 ft) is taller than 85% of the men.
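The same height can be found with an inverse lookup on the height distribution itself. The following is a minimal sketch, assuming Python 3.8 or later and the standard-library statistics.NormalDist class with the mean of 68 inches and standard deviation of 4 inches used in the application; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=68, sigma=4)  # inches, from the application above

# Taller than 85% of men means 85% of the area lies to the left.
exact_height = heights.inv_cdf(0.85)

# Table-based version: z = 1.04 from the body table, then x = mu + z * sigma.
table_height = 68 + 1.04 * 4

print(f"inverse CDF: {exact_height:.2f} inches")
print(f"body table:  {table_height:.2f} inches")
```

The two results differ slightly because the body table only lists z to two decimal places, so 1.04 is the nearest available entry rather than the exact value.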

 

