Statistics and Data Science: A Modeling Approach
9.8 Exploring the Properties of Sampling Distributions
By now you should have some idea of what a sampling distribution is and how to use one for thinking about the accuracy of an estimate such as the mean.
If you have been paying attention, you may have started to notice some patterns among sampling distributions of the mean. In particular:
Shape: Regardless of the hypothesized shape of the population distribution (e.g., die throws were uniform in shape, thumb lengths were normal), the sampling distribution of the mean tends to have a normal shape.
Center: At least in our simulations, the mean of the sampling distribution of means seems to be the same as the mean of the hypothesized population distribution.
Spread: Sampling distributions of the mean appear to be less variable (less spread out) than the population distributions that they come from.
In this section we want to explore whether these three properties of sampling distributions of the mean are always true, or just happen to have been true in the particular simulations we have done.
We also want to explore some other sampling distributions, of statistics other than the mean. We can imagine a sampling distribution of any statistic we can calculate based on a sample (e.g., variance or median). Do these sampling distributions have the same properties as the sampling distribution of means?
Effect of the Shape of the Population on the Shape of the Distributions of Means
Let’s do some experiments to investigate how the shape of a population distribution affects the distribution of means.
Here’s a video with Dr. Ji where we try this out with uniform, skewed, normal, and crazy distributions. The video uses an online simulation (http://onlinestatbook.com/stat_sim/sampling_dist/). Press the “begin” button on the upper left to start the simulation.
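If you would like to try a similar experiment in code, here is a minimal sketch in Python. It is not the simulation used in the video. The two populations (a fair die and an exponential distribution), the number of samples, and the helper names `skewness` and `sample_means` are all assumptions chosen for illustration. Skewness is used as a rough numerical stand-in for "shape": values near 0 suggest a symmetric distribution.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric shape, positive for a right skew."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(draw, n, reps):
    """Take `reps` samples of size `n` with `draw()` and return the mean of each."""
    return [statistics.fmean([draw() for _ in range(n)]) for _ in range(reps)]

rng = random.Random(42)  # fixed seed so the run is repeatable

# Two hypothetical populations, chosen only for illustration.
populations = {
    "uniform (die throws)": lambda: rng.randint(1, 6),
    "right-skewed (exponential)": lambda: rng.expovariate(1.0),
}

results = {}
for name, draw in populations.items():
    population_draws = [draw() for _ in range(5000)]  # a look at the population itself
    means = sample_means(draw, n=25, reps=2000)       # simulated sampling distribution
    results[name] = (skewness(population_draws), skewness(means))
    print(f"{name}: population skew = {results[name][0]:.2f}, "
          f"skew of sample means = {results[name][1]:.2f}")
```

In a run like this, the skewed population should show a clearly positive skew, while the means of samples of 25 drawn from it should be much closer to symmetric.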
Effect of Sample Size on the Shape of the Distributions of Means
So far, we have this kind of miraculous idea that all sampling distributions might be normal. But notice that in the prior video we only took samples of size n = 25. What about small samples versus large samples? We’ll experiment with sample size in the next video.
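The same question can be explored with a short Python sketch. This is an illustration under assumptions, not the simulation from the video: the population is a hypothetical right-skewed (exponential) distribution, and the sample sizes 2, 5, 25, and 100 are arbitrary choices. Again, skewness near 0 is used as a rough indicator of a symmetric, more normal-looking shape.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric shape, positive for a right skew."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(7)  # fixed seed so the run is repeatable
reps = 2000             # number of simulated samples per sample size

skew_by_n = {}
for n in (2, 5, 25, 100):
    # Each entry is the mean of one sample of size n from the skewed population.
    means = [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
             for _ in range(reps)]
    skew_by_n[n] = skewness(means)
    print(f"n = {n:3d}: skew of sample means = {skew_by_n[n]:.2f}")
```

With very small samples, the distribution of means should still look noticeably skewed, like the population. As n grows, the skew should shrink toward 0.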
Effect of Sample Size on the Mean of the Sampling Distribution and Standard Error
We’ve focused a lot on the shape of the sampling distribution. Now let’s turn our attention to finding some patterns about the center and spread of sampling distributions.
The whole point of making models is to figure out what is going on at the population level. Let’s use our simulations (where we can know everything about the population in advance) to see if there are any relationships between the mean and standard deviation of the population and the mean and standard deviation of the sampling distribution.
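Here is one way to look for those relationships in Python, as a sketch under stated assumptions. The population is a hypothetical normal distribution with mean 60 and standard deviation 9 (values made up for this example), and the sample sizes are arbitrary. For comparison, the last column shows the population standard deviation divided by the square root of n, a standard result for the standard error of the mean that this section has not stated itself.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
mu, sigma = 60.0, 9.0    # hypothetical population mean and standard deviation
reps = 2000              # number of simulated samples per sample size

summary = {}
for n in (4, 25, 100):
    means = [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
             for _ in range(reps)]
    center = statistics.fmean(means)   # mean of the sampling distribution
    se_sim = statistics.stdev(means)   # its standard deviation (the standard error)
    se_formula = sigma / math.sqrt(n)  # comparison value: sigma / sqrt(n)
    summary[n] = (center, se_sim, se_formula)
    print(f"n = {n:3d}: mean of means = {center:.2f}, "
          f"SD of means = {se_sim:.2f}, sigma/sqrt(n) = {se_formula:.2f}")
```

In a run like this, the mean of the sample means should stay close to the population mean at every sample size, while the spread of the sample means should get smaller as n gets larger.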
Sampling Distributions of Other Statistics
We can imagine a sampling distribution for any summary statistic (e.g., variance, standard deviation, or median). But will it be similar to the sampling distribution of the mean?
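One way to start exploring this question is to simulate several statistics from the same samples. The Python sketch below does that. The standard normal population, the sample size of 10, and the number of samples are assumptions picked for illustration, and skewness is again only a rough summary of shape.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score: near 0 for a symmetric shape, positive for a right skew."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(3)  # fixed seed so the run is repeatable
n, reps = 10, 2000      # hypothetical sample size and number of samples

estimates = {"mean": [], "median": [], "variance": []}
for _ in range(reps):
    # Hypothetical population: normal with mean 0 and standard deviation 1.
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    estimates["mean"].append(statistics.fmean(sample))
    estimates["median"].append(statistics.median(sample))
    estimates["variance"].append(statistics.variance(sample))

center = {k: statistics.fmean(v) for k, v in estimates.items()}
spread = {k: statistics.stdev(v) for k, v in estimates.items()}
shape = {k: skewness(v) for k, v in estimates.items()}

for k in estimates:
    print(f"{k:8s}: center = {center[k]:.2f}, spread = {spread[k]:.2f}, "
          f"skew = {shape[k]:.2f}")
```

For this population and sample size, the sample medians should vary somewhat more than the sample means, and the sampling distribution of the variance should be skewed to the right rather than symmetric. Other populations and sample sizes may behave differently, which is worth trying for yourself.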