High School / Advanced Statistics and Data Science I (ABC)
12.10 Confidence Interval for \(\beta_0\)
We have spent a lot of time working with the confidence interval for \(\beta_1\) in the two-group model, the model we used to explain variation in the tipping experiment. But we can create confidence intervals for other parameters as well.
We don’t typically create confidence intervals around F because the F-distribution is not symmetrical, making the confidence interval harder to interpret. But for any of the parameters we label with a \(\beta\) we can use the same methods to find the confidence interval. Let’s look at a few examples, starting with \(\beta_0\).
In the tipping study, we have put most of our emphasis on the confidence interval for the effect of smiley face on Tip, represented as \(\beta_1\). But we are also estimating another parameter in this two-group model: \(\beta_0\). The complete model we are trying to estimate looks like this:
\[Y_i=\beta_0+\beta_{1}X_i+\epsilon_i\]
The \(\beta_0\) parameter is the mean Tip for the control group in the study. If we fit this model and then run confint() on it, we get 95% confidence intervals for both the \(\beta_0\) and \(\beta_1\) parameters.
require(coursekata)

# we’ve created the model for you
Condition_model <- lm(Tip ~ Condition, data = TipExperiment)

# use confint to find the 95% confidence intervals for both parameters
confint(Condition_model)
You’ve seen this output before when we used confint() to get the confidence interval for \(\beta_1\).
                         2.5 %   97.5 %
(Intercept)          22.254644 31.74536
ConditionSmiley Face -0.665492 12.75640
This time we will focus on the line labeled (Intercept), because that shows us the confidence interval for \(\beta_0\). (It’s called the intercept because it’s the predicted value of Tip when \(X_i=0\).)
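The coding behind \(X_i\) can be inspected directly. The sketch below uses a small made-up data frame, toy_tips (hypothetical values, not the TipExperiment data), to show that R codes the control group as \(X_i=0\) and that the intercept equals the model's prediction for that group.

```r
# toy_tips is a hypothetical data set made up for illustration only
toy_tips <- data.frame(
  Tip = c(20, 25, 30, 27, 33, 35, 28, 36),
  Condition = factor(rep(c("Control", "Smiley Face"), each = 4),
                     levels = c("Control", "Smiley Face"))
)

toy_model <- lm(Tip ~ Condition, data = toy_tips)

# the X_i column R builds for the model: 0 = Control, 1 = Smiley Face
x_i <- model.matrix(toy_model)[, "ConditionSmiley Face"]
x_i

# the intercept (b0) ...
b0 <- coef(toy_model)[["(Intercept)"]]

# ... is the predicted Tip for a table where X_i = 0 (the control group)
control_table <- data.frame(
  Condition = factor("Control", levels = levels(toy_tips$Condition))
)
pred_control <- predict(toy_model, newdata = control_table)

c(b0 = b0, pred_control = unname(pred_control))
```

The two printed values should match, which is why the (Intercept) line describes the control group in this model.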
\(\beta_0\) represents the average tip in the DGP for tables that do not get smiley faces. It’s the population mean for tables in the control group. The confidence interval around \(\beta_0\) acknowledges that although the best point estimate for the control group mean in the DGP is our \(b_0\), we are 95% confident that the true value is between 22.25 and 31.75 percentage points.
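One way to see where such an interval comes from is to rebuild it by hand. For models fit with lm(), confint() uses the estimate plus or minus a critical t value times the standard error. The sketch below checks this on a small made-up data frame, toy_tips (hypothetical values, not the TipExperiment data), so its numbers will not match the output above.

```r
# toy_tips is a hypothetical data set made up for illustration only
toy_tips <- data.frame(
  Tip = c(20, 25, 30, 27, 33, 35, 28, 36),
  Condition = factor(rep(c("Control", "Smiley Face"), each = 4),
                     levels = c("Control", "Smiley Face"))
)

toy_model <- lm(Tip ~ Condition, data = toy_tips)

# pull the estimate (b0) and its standard error from the coefficient table
coef_table <- coef(summary(toy_model))
est <- coef_table["(Intercept)", "Estimate"]
se  <- coef_table["(Intercept)", "Std. Error"]

# critical t for a 95% interval, using the model's residual degrees of freedom
t_crit <- qt(0.975, df = df.residual(toy_model))

# interval built by hand
by_hand <- c(est - t_crit * se, est + t_crit * se)

# interval reported by confint() for the same parameter
from_confint <- confint(toy_model)["(Intercept)", ]

by_hand
from_confint
```

Because the interval is the estimate plus or minus the same margin, \(b_0\) always sits at its midpoint.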
What if you wanted to find the confidence interval for \(\beta_0\) for the empty model of Tip? In other words, what would be the average percentage tipped by all tables (both control and smiley face) in the DGP? What is the confidence interval for this average tipping percentage? Again, we can use confint(), which works on the empty model just as it does on the two-group model.
require(coursekata)

# here’s code to find the best-fitting empty model of Tip
empty_model <- lm(Tip ~ NULL, data = TipExperiment)

# use confint with the empty model (instead of Condition_model)
confint(empty_model)
               2.5 %   97.5 %
(Intercept) 26.58087 33.46459
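In the table below we show the confint() output for both the condition model and the empty model.
| Model | Parameter | 2.5 % | 97.5 % |
| Condition model | (Intercept) | 22.254644 | 31.74536 |
| Condition model | ConditionSmiley Face | -0.665492 | 12.75640 |
| Empty model | (Intercept) | 26.58087 | 33.46459 |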
The condition model had two parameters (\(\beta_0\) and \(\beta_1\)) whereas the empty model had only one (\(\beta_0\)). confint() calculates a confidence interval for each parameter in the model, so it returns a different number of lines of output depending on the number of parameters.
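The one-row-per-parameter behavior is easy to check. The sketch below again uses a small made-up data frame, toy_tips (hypothetical values, not the TipExperiment data), and also shows the parm argument of confint(), which can be used to request the interval for a single parameter.

```r
# toy_tips is a hypothetical data set made up for illustration only
toy_tips <- data.frame(
  Tip = c(20, 25, 30, 27, 33, 35, 28, 36),
  Condition = factor(rep(c("Control", "Smiley Face"), each = 4),
                     levels = c("Control", "Smiley Face"))
)

toy_condition_model <- lm(Tip ~ Condition, data = toy_tips)
toy_empty_model     <- lm(Tip ~ NULL, data = toy_tips)

condition_ci <- confint(toy_condition_model)
empty_ci     <- confint(toy_empty_model)

# number of rows of output matches the number of parameters in each model
nrow(condition_ci)
nrow(empty_ci)

# ask for only the intercept's interval from the two-parameter model
intercept_only <- confint(toy_condition_model, parm = "(Intercept)")
intercept_only
```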
Notice that the confidence interval around \(\beta_0\) from the empty model goes from 26.58 to 33.46, meaning that we can be 95% confident that the true mean tip percent in the DGP is between these two boundaries. These numbers are different from the confidence interval around \(\beta_0\) from the Condition model (22.25 to 31.75).
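The two intervals differ because \(\beta_0\) stands for a different mean in each model: the control group mean in the Condition model, and the mean of all tables in the empty model. A minimal sketch with a small made-up data frame, toy_tips (hypothetical values, not the TipExperiment data), shows the two intercept estimates landing on those two different means.

```r
# toy_tips is a hypothetical data set made up for illustration only
toy_tips <- data.frame(
  Tip = c(20, 25, 30, 27, 33, 35, 28, 36),
  Condition = factor(rep(c("Control", "Smiley Face"), each = 4),
                     levels = c("Control", "Smiley Face"))
)

# b0 from each model
b0_condition <- coef(lm(Tip ~ Condition, data = toy_tips))[["(Intercept)"]]
b0_empty     <- coef(lm(Tip ~ NULL, data = toy_tips))[["(Intercept)"]]

# the means each b0 is estimating
control_mean <- mean(toy_tips$Tip[toy_tips$Condition == "Control"])
grand_mean   <- mean(toy_tips$Tip)

# b0 in the Condition model matches the control group mean;
# b0 in the empty model matches the mean of all tables
c(b0_condition = b0_condition, control_mean = control_mean)
c(b0_empty = b0_empty, grand_mean = grand_mean)
```

Since the point estimates are centered on different means, the confidence intervals built around them are different too.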