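Formally now, we calculate how far the estimated mean $\hat m$ is likely to be from the true mean $m$ for a sample of length $n$.
The expected squared difference is the variance of the sample mean and is given by $(\Delta m)^2 = \sigma_{\hat m}^2$, where

\begin{align}
\sigma_{\hat m}^2 &= E[(\hat m - m)^2] \tag{17} \\
&= E\left\{ \left[ \left( \sum_t w_t x_t \right) - m \right]^2 \right\} \tag{18}
\end{align}

Here $w_t$ are the weights of the sample mean. Because the weights sum to one, $m = \sum_t w_t m$, so

\begin{align}
\sigma_{\hat m}^2 &= E\left\{ \left[ \sum_t w_t (x_t - m) \right]^2 \right\} \tag{19} \\
&= E\left\{ \left[ \sum_t w_t (x_t - m) \right] \left[ \sum_s w_s (x_s - m) \right] \right\} \tag{20} \\
&= E\left[ \sum_t \sum_s w_t w_s (x_t - m)(x_s - m) \right] \tag{21}
\end{align}

The expectation is itself a summation, so it can be taken inside the sums over $t$ and $s$:

$$
\sigma_{\hat m}^2 = \sum_t \sum_s w_t w_s \, E[(x_t - m)(x_s - m)] \tag{22}
$$

If $x_t$ and $x_s$ are independent, the expectation vanishes for $t \ne s$ and equals the variance $\sigma_x^2$ for $t = s$. In terms of the Kronecker delta,

$$
E[(x_t - m)(x_s - m)] = \sigma_x^2 \, \delta_{ts} \tag{23}
$$

\begin{align}
\sigma_{\hat m}^2 &= \sum_t \sum_s w_t w_s \, \sigma_x^2 \, \delta_{ts} \tag{24} \\
&= \sum_t w_t^2 \, \sigma_x^2 \tag{25} \\
\sigma_{\hat m} &= \sigma_x \sqrt{\sum_t w_t^2} \tag{26}
\end{align}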
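The end result of this derivation, that a weighted mean of independent samples has standard deviation $\sigma_x \sqrt{\sum_t w_t^2}$, can be checked numerically. The following is a minimal sketch, assuming Gaussian samples and NumPy; the function name `weighted_mean_std` and the two weight choices are illustrative and not part of the original text.

```python
import numpy as np

def weighted_mean_std(weights, sigma_x=1.0, true_mean=0.0, trials=20000, seed=0):
    """Return (empirical, predicted) standard deviation of a weighted sample mean."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize so the weights sum to one
    rng = np.random.default_rng(seed)
    # Each row is one independent sample of length n = len(w).
    x = rng.normal(true_mean, sigma_x, size=(trials, w.size))
    m_hat = x @ w                         # one weighted mean per trial
    empirical = m_hat.std()               # scatter of the estimates across trials
    predicted = sigma_x * np.sqrt(np.sum(w ** 2))
    return empirical, predicted

# Uniform weights (each 1/n) versus triangular weights of the same length.
uniform = weighted_mean_std(np.ones(100))
triangle = weighted_mean_std(np.bartlett(102)[1:-1])  # drop the two zero endpoints

print("uniform  (empirical, predicted):", uniform)
print("triangle (empirical, predicted):", triangle)
```

Under these assumptions the empirical and predicted values should agree to within a few percent. The triangular weights give a somewhat larger spread than uniform weights of the same length, because unequal weights make $\sum_t w_t^2$ larger.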
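For $n$ weights, each of size $1/n$, the standard deviation of the sample mean is

$$
\Delta m = \sigma_{\hat m} = \sigma_x \sqrt{\sum_{t=1}^{n} \left( \frac{1}{n} \right)^2} = \frac{\sigma_x}{\sqrt{n}} \tag{27}
$$

where $\sigma_x$ is the standard deviation of a single sample.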
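This is the most important property of random numbers that is not intuitively obvious. Informally, the result (27) says this: given a sum $y$ of $n$ terms with random polarity, whose theoretical mean is zero, then

$$
y = \underbrace{\pm 1 \pm 1 \pm 1 \cdots}_{n \ \text{terms}} \tag{28}
$$

The sum $y$ is a random variable whose standard deviation is $\sigma_y = \sqrt{n}$, so its typical size grows only as $\sqrt{n}$, not as $n$.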
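The informal statement can be tried directly: a sum of $n$ independent terms of random polarity should scatter about zero with a standard deviation near $\sqrt{n}$. This is a small sketch assuming NumPy; the trial count and the values of `n` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 2000                  # number of independent sums per value of n
results = {}

for n in (100, 400, 1600):
    # Each row holds n terms, each +1 or -1 with equal probability.
    signs = rng.choice([-1, 1], size=(trials, n))
    y = signs.sum(axis=1)      # one random-polarity sum per trial
    results[n] = y.std()
    print(f"n={n:5d}  std of y = {results[n]:7.2f}  sqrt(n) = {np.sqrt(n):7.2f}")
```

Quadrupling `n` should roughly double the measured standard deviation, which is the square-root behavior stated in (27).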
If we are trying to estimate the mean of a random series that has a time-variable mean, then we face a basic dilemma.
Including many numbers in the sum in order to make $\Delta m$ small
conflicts with the possibility of seeing $m_t$ change during the measurement.
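The dilemma can be illustrated with a running mean of adjustable length. The sketch below assumes a hypothetical sinusoidal mean with a period of 1000 samples buried in unit-variance Gaussian noise, and a boxcar window. None of these choices come from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
nt = 4000
t = np.arange(nt)
m_t = np.sin(2 * np.pi * t / 1000.0)       # hypothetical slowly varying true mean
x = m_t + rng.normal(0.0, 1.0, size=nt)    # observed series: mean plus unit noise

def running_mean_error(x, m_t, n):
    """RMS error of an n-point boxcar running mean used as an estimate of m_t."""
    box = np.ones(n) / n                   # n weights, each of size 1/n
    m_hat = np.convolve(x, box, mode="same")
    core = slice(n, len(x) - n)            # skip edges where the window is truncated
    return np.sqrt(np.mean((m_hat[core] - m_t[core]) ** 2))

errors = {n: running_mean_error(x, m_t, n) for n in (5, 101, 1001)}
for n, err in errors.items():
    print(f"window n={n:5d}  RMS error = {err:.3f}")
```

With these assumed parameters, the short window is dominated by noise of order $1/\sqrt{n}$, the very long window averages the variation of the mean away, and the intermediate window gives the smallest error.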
The "variance of the sample variance" arises in many contexts.
Suppose we want to measure the storminess of the ocean.
We measure water level as a function of time and subtract the mean.
The storminess is the variance about the mean.
We measure the storminess in one minute and call it a sample storminess.
We compare it to other minutes and other locations and we find
that they are not all the same.
To characterize these differences,
we need the variance of the sample variance, $\sigma^2_{\hat\sigma^2}$.
Some of these quantities can be computed theoretically, but the computations become very cluttered and dependent on assumptions that may not be valid in practice, such as that the random variables are independently drawn and that they have a Gaussian probability function. Since we have such powerful computers, we might be better off ignoring the theory and remembering the basic principle that a function of random numbers is also a random number. We can use simulation to estimate the function's mean and variance.
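As one possible illustration of the simulation approach, the sketch below estimates the mean and variance of the sample variance for two unit-variance distributions. It assumes NumPy, 60 samples per measurement (for instance, one water-level reading per second for a minute, a hypothetical choice), and independent draws. The helper names `simulate_statistic` and `sample_variance` are illustrative.

```python
import numpy as np

def simulate_statistic(draw, statistic, n, trials=20000, seed=3):
    """Estimate the mean and variance of statistic() over many samples of length n."""
    rng = np.random.default_rng(seed)
    samples = draw(rng, (trials, n))       # each row is one simulated measurement
    values = statistic(samples)            # one value of the statistic per row
    return values.mean(), values.var()

def sample_variance(samples):
    """Unbiased sample variance of each row."""
    return samples.var(axis=1, ddof=1)

n = 60
# Gaussian samples with unit variance.
gauss = simulate_statistic(lambda rng, shape: rng.normal(0.0, 1.0, shape),
                           sample_variance, n)
# Laplace samples with scale 1/sqrt(2), which also have unit variance.
laplace = simulate_statistic(lambda rng, shape: rng.laplace(0.0, 1 / np.sqrt(2), shape),
                             sample_variance, n)

print("Gaussian: mean, variance of sample variance =", gauss)
print("Laplace:  mean, variance of sample variance =", laplace)
print("Gaussian theory, 2 / (n - 1) =", 2 / (n - 1))
```

For the Gaussian case the simulated variance of the sample variance should land near the textbook value $2\sigma^4/(n-1)$. The heavier-tailed Laplace case gives a noticeably larger value even though both distributions have the same variance, which shows how much the theoretical answer depends on the assumed probability function.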
Basically we are always faced with the same dilemma: an accurate estimate of the variance requires a large number of samples, which limits our ability to measure a time-varying variance.