Chapter 12 - Parameter Estimation and Confidence Intervals

12.1 From Hypothesis Testing to Confidence Intervals

In previous chapters our focus has been on using data to evaluate the empty model of the DGP. We created sampling distributions based on the empty model, then asked whether we could reject the empty model based on our data. If the evidence was not strong enough to justify rejecting the empty model, we would keep the empty model as a plausible model. If we rejected the empty model, on the other hand, we would adopt the complex model we had made to fit the data.

The problem with this approach is that it only considers two possible models of the DGP, one in which \(\beta_1\) is 0, and one in which \(\beta_1\) is the same as the \(b_1\) estimate (e.g., 6.05 in the tipping study). But deep down, we know that both models might be wrong.

In the tipping study, we failed to reject the empty model, even though the tables that got the smiley faces tipped 6.05 percentage points more, on average, than the other tables. This evidence was not strong enough to cause us to reject the empty model, but does that mean the true \(\beta_1\) is actually 0? It’s possible. But in fact, there are many possible values of \(\beta_1\) that would be consistent with the data.

In this chapter we will use the same sampling distributions we have been using for model comparison, but use them in a more flexible way to ask a different question: What is the range of possible values for the parameter we are trying to estimate? In the case of the tipping study, it’s nice to know that the true \(\beta_1\) in the DGP might be 0, but what else might it be? If our best estimate, based on the data, is 6.05, we want to know how accurate that estimate might be and how much uncertainty we should have about the estimate.

Reviewing the Null Hypothesis Test Using \(b_1\)

Let’s start by reviewing the logic behind the null hypothesis test, i.e., the way we evaluated the empty model. As shown in the picture below, we start by imagining a world where the empty model is true, where there is no effect of smiley face on Tip. We represent this idea by putting a 0 in the red box on the upper blue line, which is where we put our hypotheses about the true \(\beta_1\) in the DGP.

It is important to remember that we don’t know if \(\beta_1 = 0\) or not. We are simply hypothesizing that it’s 0 so that we can then work out the implications that might follow from such a world. Later we will hypothesize other values of \(\beta_1\), moving the red box right and left to represent larger or smaller values of \(\beta_1\).

Based on the assumption that \(\beta_1 = 0\), we used shuffle() to create a sampling distribution (shown as a light blue histogram in the figure above) that shows us the variation in sample \(b_1\)s that would be expected to occur by chance alone if the empty model were true. (This sampling distribution is approximately normal in shape, and is usually modeled with the t-distribution. We show the t-distribution as a smooth curve overlaid on the histogram.)
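To make the shuffling idea concrete, here is a minimal sketch in Python. It is a stand-in, not the shuffle() function used in this book. The data values, the group labels, and the helper names `b1` and `shuffled_b1s` are all made up for illustration and are not the tipping study data. The sketch scrambles the condition labels many times and records the \(b_1\) that results from each scramble, which is one way to build a sampling distribution under the assumption that \(\beta_1 = 0\).

```python
import random
import statistics

def b1(tips, conditions):
    """Difference in mean tip: smiley-face group minus control group."""
    smiley = [t for t, c in zip(tips, conditions) if c == "smiley"]
    control = [t for t, c in zip(tips, conditions) if c == "control"]
    return statistics.mean(smiley) - statistics.mean(control)

def shuffled_b1s(tips, conditions, n_shuffles=1000, seed=0):
    """Shuffle the condition labels repeatedly, recording b1 each time.

    Shuffling breaks any real link between condition and tip, so the
    resulting b1s vary only by chance, as they would if beta_1 were 0.
    """
    rng = random.Random(seed)
    labels = list(conditions)  # copy so the caller's list is untouched
    results = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        results.append(b1(tips, labels))
    return results

# Made-up tip percentages for illustration only (NOT the tipping study data).
tips = [22, 31, 18, 40, 27, 35, 29, 33, 15, 25, 38, 20]
conditions = ["smiley"] * 6 + ["control"] * 6

sampling_distribution = shuffled_b1s(tips, conditions)
print("sample b1:", round(b1(tips, conditions), 2))
print("mean of shuffled b1s:", round(statistics.mean(sampling_distribution), 2))
```

Because the labels are assigned at random in each shuffle, the shuffled \(b_1\)s should center near 0, even though any single shuffled \(b_1\) can land well above or below it.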

Having created a sampling distribution, we next located the sample \(b_1\) (6.05) in relation to it. We have represented the sample \(b_1\) with a black dot on the bottom blue line, which represents the data. Unlike the hypothesized \(\beta_1\) in the red box, the sample \(b_1\) is not something we imagine or hypothesize. It’s the parameter estimate the researchers calculated from the sample data. It is fixed and can’t be moved.

Because the sample \(b_1\) did not fall in the outer tails of this sampling distribution (the most extreme .05 of values, our alpha criterion), we decided not to reject the empty model (or null hypothesis). The p-value was approximately .08, meaning that if the empty model were true, there would be about a .08 chance of getting a sample \(b_1\) as extreme as, or more extreme than, 6.05 just by chance.
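The decision rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used to get the p-value reported above. In place of a shuffled sampling distribution, it draws simulated \(b_1\)s from a normal distribution centered at 0. The standard error of 3.5 is an assumed value chosen only so the example runs, and the helper name `two_tailed_p` is hypothetical. Only the sample \(b_1\) of 6.05 and the alpha of .05 come from the text.

```python
import random

def two_tailed_p(sampling_distribution, sample_b1):
    """Proportion of simulated b1s at least as far from 0 as the sample b1."""
    extreme = [b for b in sampling_distribution if abs(b) >= abs(sample_b1)]
    return len(extreme) / len(sampling_distribution)

# Stand-in sampling distribution for a world where beta_1 = 0.
# ASSUMED_SE is a hypothetical value, not one taken from the study.
rng = random.Random(1)
ASSUMED_SE = 3.5
simulated_b1s = [rng.gauss(0, ASSUMED_SE) for _ in range(10000)]

sample_b1 = 6.05  # the sample estimate from the tipping study
alpha = 0.05      # our criterion for "unlikely"

p_value = two_tailed_p(simulated_b1s, sample_b1)
reject_empty_model = p_value < alpha

print("p-value:", round(p_value, 3))
print("reject the empty model?", reject_empty_model)
```

Note that the function counts simulated \(b_1\)s in both tails: values at or below -6.05 are just as extreme as values at or above 6.05, so both count toward the p-value.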
