Chapter 9
Probability and Integration
9.3 Continuous Probability
Section Summary
We started this section by considering models for data distributed continuously over an interval, finite or infinite. Such a model consists of
- a distribution function `F` with the property that `F(t)` is the probability that a random data value is less than `t`, and
- a probability density function `f` that is the derivative of `F`.
For such a model the probability that a random data value lies between `a` and `b` is
`\int_a^b f(t)\,dt = F(b) - F(a).`
In particular the integral of `f` over the entire domain must be `1`. The expected value for a random data value from a set with this distribution is the integral of `t f(t)` over the entire domain. Our initial examples of continuous distributions were the exponential, Cauchy, and uniform distributions.
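The three facts above can be checked numerically. The sketch below uses an exponential distribution as the example; the rate `RATE = 2.0`, the interval from `0.5` to `1.5`, the truncation point `UPPER`, and the helper `integrate` (a plain midpoint rule) are all illustrative assumptions, not values taken from the section.

```python
import math


def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


RATE = 2.0  # hypothetical rate parameter for the exponential example


def density(t):
    """Density f(t) of the exponential distribution on t >= 0."""
    return RATE * math.exp(-RATE * t)


def distribution(t):
    """Distribution function F(t); its derivative is density(t)."""
    return 1.0 - math.exp(-RATE * t)


# The domain is infinite, so we truncate it (an assumption of this sketch);
# the tail beyond UPPER is far smaller than the integration error.
UPPER = 40.0

# Probability that a value lies between a = 0.5 and b = 1.5, computed two ways.
prob_by_integral = integrate(density, 0.5, 1.5)
prob_by_F = distribution(1.5) - distribution(0.5)

# The density integrates to 1 over the whole domain.
total = integrate(density, 0.0, UPPER)

# Expected value: the integral of t * f(t) over the whole domain.
mean = integrate(lambda t: t * density(t), 0.0, UPPER)

print(prob_by_integral, prob_by_F)
print(total, mean)
```

For this distribution the expected value is known to be `1/RATE`, so the last number printed should be close to `0.5`.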
We moved on to investigate the most important class of data distributions — normal distributions. We saw that the standard deviation of a data set is a measure of the spread of the data around the mean. Any data set can be standardized by subtracting the mean from each data value and then dividing each of the resulting values by the standard deviation. If the original data set was normally distributed, then the standardized data set is normally distributed with mean `0` and standard deviation `1` — the standard normal distribution. Thus questions about the original data set may be recast as questions about the standard normal distribution.
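Standardization as described above takes only a few lines. In this minimal sketch the data values are made up, `standardize` is a hypothetical helper name, and the population standard deviation (dividing by the number of values) is assumed; a text that divides by one less than the number of values would use `statistics.stdev` instead.

```python
import statistics


def standardize(data):
    """Subtract the mean from each value, then divide by the standard deviation."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population standard deviation (assumed)
    return [(x - mean) / sd for x in data]


# Made-up data values, for illustration only.
heights = [61.0, 64.5, 66.0, 67.5, 70.0, 72.0]
z_scores = standardize(heights)

# The standardized data has mean 0 and standard deviation 1 (up to rounding).
print(statistics.fmean(z_scores), statistics.pstdev(z_scores))
```

Note that standardizing changes only the location and scale of the data. The result follows the standard normal distribution only if the original data was normally distributed.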
The density function for the standard normal distribution is given by the formula
`f(t) = c\,e^{-t^2/2},`
where `c = 1/\sqrt{2\pi} \approx 0.3989`. The corresponding distribution function is
`F(t) = c \int_{-\infty}^{t} e^{-s^2/2}\,ds.`
Although we do not have a formula for this distribution function in terms of elementary functions, we can calculate its values by numerical approximation to the integral or by using a computer algebra system. Such a system will describe the distribution function in terms of a new (to us) function, the error function. In the next chapter we investigate how such a system might evaluate this function.
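Both routes mentioned above can be compared directly. The sketch below evaluates the standard normal distribution function once through the error function, using the standard identity `F(t) = (1 + erf(t/\sqrt{2}))/2`, and once by Simpson's rule. The function names, the choice of Simpson's rule, and the number of subintervals are assumptions of this sketch; it is not a description of how any particular computer algebra system works.

```python
import math

C = 1.0 / math.sqrt(2.0 * math.pi)  # the constant c, approximately 0.3989


def density(t):
    """Standard normal density: c * exp(-t^2 / 2)."""
    return C * math.exp(-t * t / 2.0)


def cdf_erf(t):
    """Distribution function written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def cdf_simpson(t, n=1000):
    """Distribution function by Simpson's rule (n must be even).

    By symmetry F(0) = 1/2, so we integrate the density only from 0 to t;
    a negative t gives a negative step and hence a signed integral.
    """
    h = t / n
    total = density(0.0) + density(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 + total * h / 3.0


for t in (-1.0, 0.0, 1.0, 1.96):
    print(t, cdf_erf(t), cdf_simpson(t))
```

Integrating from `0` rather than from negative infinity avoids truncating an infinite interval, which is why the symmetry of the density is used here.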