12.10 Confidence Interval for \(\beta_0\)

We have spent a lot of time working with the confidence interval for \(\beta_1\) in the two-group model, the model that we used to explain variation in the tipping experiment. But we can create confidence intervals for other parameters as well.

We don’t typically create confidence intervals around F because the F-distribution is not symmetrical, making the confidence interval harder to interpret. But for any of the parameters we label with a \(\beta\) we can use the same methods to find the confidence interval. Let’s look at a few examples, starting with \(\beta_0\).

In the tipping study, we have put most of our emphasis on the confidence interval for the effect of smiley face on Tip, represented as \(\beta_1\). But we also are estimating another parameter in this two-group model: \(\beta_0\). The complete model we are trying to estimate looks like this:

\[Y_i=\beta_0+\beta_{1}X_i+\epsilon_i\]

The \(\beta_0\) parameter is the mean Tip for the control group in the study. If we fit and then run confint() on this model, we get 95% confidence intervals for both the \(\beta_0\) and \(\beta_1\) parameters.

require(coursekata)

# we’ve created the model for you
Condition_model <- lm(Tip ~ Condition, data = TipExperiment)

# use confint to find the 95% confidence intervals for both parameters
confint(Condition_model)

You’ve seen this output before when we used confint() to get the confidence interval for \(\beta_1\).

                         2.5 %   97.5 %
(Intercept)          22.254644 31.74536
ConditionSmiley Face -0.665492 12.75640

This time we will focus on the line labeled (Intercept), because that shows us the confidence interval for \(\beta_0\). (It’s called the intercept because it’s the predicted value of Tip when \(X_i=0\).)
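
To spell out the intercept reasoning: assuming (as the output label ConditionSmiley Face suggests) that \(X_i\) is coded 0 for control tables and 1 for smiley face tables, the fitted model's predictions for the two groups work out as follows.

\[X_i=0:\quad \hat{Y}_i=b_0+b_1(0)=b_0\]

\[X_i=1:\quad \hat{Y}_i=b_0+b_1(1)=b_0+b_1\]

So \(b_0\) is the estimate of the control group mean, and \(b_1\) is the estimated difference between the smiley face group mean and the control group mean.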

\(\beta_0\) represents the average tip in the DGP for tables that do not get smiley faces. It’s the population mean for tables in the control group. The confidence interval around \(\beta_0\) acknowledges that although the best point estimate for the control group mean in the DGP is our \(b_0\), we are 95% confident that the true value is between 22.25 and 31.75 percent.
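
For readers curious about where the intercept interval comes from, here is a minimal sketch of the usual calculation for an lm() model: the estimate plus or minus a critical t value times the standard error. It uses simulated data, not TipExperiment. The data frame sim_data, its group means and standard deviation, and the object names are all made up for illustration.

```r
# Simulated two-group data (hypothetical values, NOT the real TipExperiment)
set.seed(10)
sim_data <- data.frame(
  Condition = factor(rep(c("Control", "Smiley Face"), each = 22)),
  Tip = c(rnorm(22, mean = 27, sd = 11), rnorm(22, mean = 33, sd = 11))
)

# Fit the two-group model; "Control" is the reference group (first factor level)
sim_model <- lm(Tip ~ Condition, data = sim_data)

# Pull out the estimate (b0) and its standard error from the coefficient table
coef_table <- summary(sim_model)$coefficients
b0 <- coef_table["(Intercept)", "Estimate"]
se_b0 <- coef_table["(Intercept)", "Std. Error"]

# Critical t for a 95% interval, using the model's residual degrees of freedom
t_crit <- qt(0.975, df = df.residual(sim_model))

# Interval built by hand: estimate plus or minus the margin of error
manual_ci <- c(b0 - t_crit * se_b0, b0 + t_crit * se_b0)

# Interval from confint(), intercept row only
builtin_ci <- confint(sim_model)["(Intercept)", ]

print(manual_ci)
print(builtin_ci)
```

The two printed intervals should agree. Because the interval is centered on the estimate, the midpoint of the (Intercept) row in any confint() output recovers \(b_0\).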

What if you wanted to find the confidence interval for \(\beta_0\) for the empty model of Tip? In other words, what would be the average percentage tipped by all tables (both control and smiley face) in the DGP? What is the confidence interval for this average tipping percentage? Again, we can use confint(), which can take in any type of model.

require(coursekata)

# here’s code to find the best-fitting empty model of Tip
empty_model <- lm(Tip ~ NULL, data = TipExperiment)

# use confint with the empty model (instead of Condition_model)
confint(empty_model)

               2.5 %   97.5 %
(Intercept) 26.58087 33.46459

In the table below we show the confint() output for both the condition model and the empty model.

confint(Condition_model)
                         2.5 %   97.5 %
(Intercept)          22.254644 31.74536
ConditionSmiley Face -0.665492 12.75640

confint(empty_model)
               2.5 %   97.5 %
(Intercept) 26.58087 33.46459

The condition model had two parameters (\(\beta_0\) and \(\beta_1\)) whereas the empty model had only one (\(\beta_0\)). confint() will calculate the confidence intervals for each parameter in the model so it will return different lines of output depending on the number of parameters.

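Notice that the confidence interval around \(\beta_0\) from the empty model goes from 26.58 to 33.46, meaning that we can be 95% confident that the true mean tip percent in the DGP is between these two boundaries. These numbers are different from the confidence interval around \(\beta_0\) from the condition model (22.25 and 31.75). The two intervals differ because \(\beta_0\) means something different in each model: in the empty model it is the mean of all tables, while in the condition model it is the mean of the control group only.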
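
A short sketch can make the two roles of the intercept concrete. It again uses simulated data rather than TipExperiment, and sim_data and the other object names are hypothetical. It checks that the empty model's intercept equals the mean of all tables, while the two-group model's intercept equals the mean of the control group alone.

```r
# Simulated two-group data (hypothetical values, NOT the real TipExperiment)
set.seed(20)
sim_data <- data.frame(
  Condition = factor(rep(c("Control", "Smiley Face"), each = 22)),
  Tip = c(rnorm(22, mean = 27, sd = 11), rnorm(22, mean = 33, sd = 11))
)

# Fit both models to the same data
sim_empty <- lm(Tip ~ NULL, data = sim_data)
sim_condition <- lm(Tip ~ Condition, data = sim_data)

# Sample means computed directly from the data
grand_mean <- mean(sim_data$Tip)
control_mean <- mean(sim_data$Tip[sim_data$Condition == "Control"])

# Intercept estimates (b0) from each model
b0_empty <- unname(coef(sim_empty)["(Intercept)"])
b0_condition <- unname(coef(sim_condition)["(Intercept)"])

# Each pair should match
print(c(b0_empty = b0_empty, grand_mean = grand_mean))
print(c(b0_condition = b0_condition, control_mean = control_mean))

# The two intercept intervals are centered on different estimates
print(confint(sim_empty))
print(confint(sim_condition))
```

Since the two intercepts estimate different means, their confidence intervals are centered in different places, which is the pattern seen in the tipping output above.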
