Probability Chapter 5: Central Limit Theorem
5.1 Introduction
The normal distribution is one of the most important distributions in the world of statistics and data science. Many problems involve the normal distribution—even when the underlying data is not normally distributed. Why is this?
The answer lies in the Central Limit Theorem (CLT).
5.2 What the Central Limit Theorem Says
Let \(X_1, X_2, \ldots, X_n\) be independent and identically distributed random variables with mean \(\mu\) and finite variance \(\sigma^2\). The Central Limit Theorem states that when \(n\) is large, the standardized sum
\[ Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \]
approximately follows the standard normal curve, regardless of the common distribution of the \(X_i\).
The CLT justifies the widespread use of the normal distribution in statistical procedures such as confidence intervals and hypothesis testing. Even when the underlying population is skewed, discrete, or heavy-tailed, the sampling distribution of the sample mean becomes approximately normal as the sample size grows.
5.3 Illustration: Sampling from a Skewed Distribution
Imagine sampling from a skewed distribution, such as an exponential distribution. A histogram of sample means computed from small samples (small \(n\)) will still resemble the skewed parent distribution. As you increase the sample size, the histogram of sample means becomes more symmetric and bell-shaped. By the time \(n\) is large enough, the sampling distribution of the sample mean is often close enough to normal for practical purposes.
Example:
Suppose you are conducting an environmental study measuring the concentration of a contaminant in different water samples. The actual distribution of contaminant levels might be heavily skewed. However, if you collect 40 samples and compute the average concentration, the distribution of these sample averages (if the experiment is repeated many times) will be approximately normal, allowing you to make reliable statistical inferences.
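The contaminant scenario above can be simulated directly. The sketch below (a minimal illustration, not real data) assumes contaminant levels follow an exponential distribution with mean 1.0, draws many samples of size 40, and checks that the sample means cluster around \(\mu\) with spread close to \(\sigma/\sqrt{n}\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical contaminant levels: exponential with mean 1.0 (heavily skewed).
n, reps = 40, 10_000
samples = rng.exponential(scale=1.0, size=(reps, n))
sample_means = samples.mean(axis=1)

# The CLT predicts the sample means are approximately N(mu, sigma^2 / n).
# Here mu = sigma = 1.0, so sigma / sqrt(n) = 1 / sqrt(40) ≈ 0.158.
print(sample_means.mean())  # close to 1.0
print(sample_means.std())   # close to 0.158
```

Plotting a histogram of `sample_means` would show the familiar bell shape, even though each individual measurement is drawn from a skewed distribution.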
5.4 Conditions for the CLT
To apply the Central Limit Theorem, a few conditions must be met:
- The random variables must be independent.
- They must be identically distributed with a finite mean \(\mu\) and finite variance \(\sigma^2\).
- For non-normal distributions, a larger sample size (typically \(n \geq 30\)) is recommended for the approximation to hold.
5.5 Dice Example
Let’s say we roll a six-sided die, where outcomes 1 through 6 are equally likely. The expected outcome is 3.5. If we roll the die once, the outcome is not normally distributed.
However, if we roll it 30 times and compute the average, then repeat that entire process 1,000 times, the histogram of those 1,000 sample means will approximate a normal distribution.
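This dice experiment is easy to run in code. The sketch below simulates 1,000 repetitions of averaging 30 rolls; a fair die has mean 3.5 and variance \(35/12\), so the CLT predicts the sample means are approximately normal with mean 3.5 and standard error \(\sqrt{35/12}/\sqrt{30} \approx 0.31\):

```python
import numpy as np

rng = np.random.default_rng(1)

# Roll a fair six-sided die 30 times, average the rolls,
# and repeat the whole experiment 1,000 times.
rolls = rng.integers(1, 7, size=(1000, 30))
means = rolls.mean(axis=1)

# Theory: mean 3.5, standard error sqrt(35/12)/sqrt(30) ≈ 0.31.
print(means.mean())
print(means.std())
```

A histogram of `means` approximates a normal curve centered at 3.5, even though a single roll is uniform on {1, ..., 6}.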
5.6 CLT vs. Law of Large Numbers (LLN)
The Law of Large Numbers (LLN) tells us that the sample mean converges to the population mean as \(n \to \infty\), but it doesn’t tell us how the sample mean is distributed.
The CLT fills that gap by describing the distribution of the sample mean when \(n\) is large.
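The two results can be seen side by side in a short simulation. Using the exponential distribution with \(\mu = \sigma = 1\) again, the first part shows the LLN (a single running mean settling toward \(\mu\)); the second shows the CLT (the fluctuations around \(\mu\), rescaled by \(\sqrt{n}\), have a stable, normal-sized spread):

```python
import numpy as np

rng = np.random.default_rng(2)

# LLN: one running sample mean converges toward mu = 1.0 as n grows.
x = rng.exponential(scale=1.0, size=100_000)
running_mean = x.cumsum() / np.arange(1, x.size + 1)
print(running_mean[-1])  # near 1.0

# CLT: rescaling the deviation by sqrt(n) reveals its distribution.
n = 10_000
means = rng.exponential(scale=1.0, size=(2_000, n)).mean(axis=1)
z = (means - 1.0) * np.sqrt(n)  # approximately N(0, sigma^2) with sigma = 1
print(z.std())  # near 1.0
```

The LLN says `running_mean[-1]` is close to \(\mu\); the CLT says exactly how the remaining deviation is distributed.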
5.7 Standard Error
The standard deviation of the sampling distribution of the sample mean is known as the standard error:
\[ \text{SE} = \frac{\sigma}{\sqrt{n}} \]
This becomes essential when applying the CLT in practice, as it quantifies the variability of the sample mean from one sample to another.
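The formula is a one-liner in code. The helper below (a hypothetical name, not from any particular library) also illustrates the key consequence: because \(\sqrt{n}\) appears in the denominator, quadrupling the sample size only halves the standard error:

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Quadrupling n from 25 to 100 halves the standard error.
print(standard_error(2.0, 25))   # 0.4
print(standard_error(2.0, 100))  # 0.2
```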