11.6 What Affects the Width of the Confidence Interval
Because our goal is to gain a more accurate picture of the DGP, it would be better to have a narrower confidence interval than a wider one. If the interval is narrower, then we will have less uncertainty in our parameter estimate, and we can make more accurate predictions about future samples. For this reason, it’s worth thinking for a bit about what determines the width of the confidence interval.
Level of Confidence
We’ve focused a lot on an alpha of .05 (when evaluating the empty model or null hypothesis), and on the corresponding 95% confidence interval. Hopefully we’ve convinced you that those two go together. But .05 and 95% are not the only criteria we could use. We could have a 99% or 90% confidence interval, or any other level.
The desired level of confidence affects the width of the confidence interval. Given the same data, if we want to be more confident that the true parameter value in the DGP falls within the interval, we will have to make the interval wider.
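Consider this extreme example: if we want to be 100% confident that the true value of $\beta_1$ falls within our interval, the interval would have to include every possible value, from negative infinity to positive infinity.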
The more confidence we want (99%), the wider the interval will have to be. But how much wider?
Using confint() For Different Confidence Levels
You can use the confint() function to calculate the 90% or 99% confidence intervals (or any other level of confidence) by simply adding the argument level = .90 (or .99) to the code below. (The default if you leave off this argument is .95.)
confint(Condition_model, level = .90)
Try calculating the 90% and 99% confidence interval for the Condition model’s parameters by modifying the code below. Observe: how much wider is the 99% confidence interval?
require(coursekata)
# here we have saved the Condition model
Condition_model <- lm(Tip ~ Condition, data=TipExperiment)
# modify these to find the 90% and 99% CI
confint(Condition_model)
confint(Condition_model)
The lower bound of the 99% confidence interval is now -2.93, and the upper bound is 15.02. By increasing our confidence, we also increased the size of the confidence interval.
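To see how the level of confidence drives the width, the interval can be rebuilt by hand as the estimate plus or minus a critical t value times the standard error. The sketch below is only an illustration, not the course's own code: it starts from the 99% bounds reported above, assumes 42 error degrees of freedom (44 tables minus 2 estimated parameters), and recovers the estimate and standard error implied by those bounds. The helper name ci_for_level is hypothetical.

```r
# Values taken from the text: the 99% interval for the Condition parameter
lower_99 <- -2.93
upper_99 <- 15.02

# Assumption: 44 tables minus 2 estimated parameters
df_error <- 44 - 2

# The estimate is the midpoint of the interval; the standard error is the
# half-width divided by the critical t value for a .005 tail
b1 <- (lower_99 + upper_99) / 2
se <- ((upper_99 - lower_99) / 2) / qt(0.995, df_error)

# Hypothetical helper: interval for any confidence level
ci_for_level <- function(level) {
  t_crit <- qt(1 - (1 - level) / 2, df_error)
  c(lower = b1 - t_crit * se, upper = b1 + t_crit * se)
}

conf_levels <- c(0.90, 0.95, 0.99)
intervals <- t(sapply(conf_levels, ci_for_level))
widths <- intervals[, "upper"] - intervals[, "lower"]

print(data.frame(level = conf_levels, intervals, width = widths))
```

The only thing that changes from row to row is the critical t value, so the width grows steadily as the level of confidence goes up. The implied standard error should come out close to the value of about 3.3 used later in this section.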
More Exploration of Level of Confidence and Width of the Interval
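The unlikely tails are smaller in the sampling distributions used to determine the 99% confidence interval (versus the 95% one). In order to make sure the sample $b_1$ sits right at the cutoff of these smaller tails, the two sampling distributions have to be moved farther apart.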
The animation above shows us that as we move the lower and upper bounds apart (thus moving their respective sampling distributions apart), the tails beyond the sample $b_1$ get smaller.
Let’s take a closer look at this idea by just looking at the lower bound of the confidence interval for two different levels of confidence.
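[Figure: two panels showing the sampling distribution positioned at the lower bound of the confidence interval, for the 95% level (left) and the 99% level (right).]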
Take a look at the left panel of the figure above. To find the lower bound of the 95% confidence interval, we constructed a sampling distribution and slid it down until we found the value of $\beta_1$ at which the sample $b_1$ sat right at the cutoff of the upper .025 tail.
If we change the desired level of confidence to 99%, the tails that we define as unlikely will be smaller. With a smaller tail (now .005), we need to slide the sampling distribution further down (to the left) until the sample $b_1$ sits right at the cutoff of that smaller tail.
To go from a 95% confidence interval to a 99% interval, we need to move the sampling distribution at the upper bound up and the one at the lower bound down. This moves the two sampling distributions farther apart, making the 99% confidence interval wider than the 95% confidence interval.
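The sliding idea can be sketched numerically. The code below is an illustration under stated assumptions, not the method used to draw the figures: it takes the estimate to be 6.045 (the midpoint of the 99% bounds reported earlier), a standard error of 3.3, and 42 degrees of freedom, and it models the sampling distribution as a scaled t distribution. The function names are hypothetical. For each candidate center, it computes the area of the upper tail beyond the sample estimate, then searches for the center at which that area equals the chosen tail size.

```r
# Assumed values for illustration
b1 <- 6.045      # midpoint of the 99% bounds reported in the text
se <- 3.3        # standard error used in this section
df_error <- 42   # assumption: 44 tables minus 2 estimated parameters

# Area of the upper tail beyond b1 when the sampling distribution
# is centered at `center`
upper_tail_beyond_b1 <- function(center) {
  pt((b1 - center) / se, df_error, lower.tail = FALSE)
}

# Slide the center down from b1 until the tail beyond b1 equals tail_area
find_lower_bound <- function(tail_area) {
  uniroot(
    function(center) upper_tail_beyond_b1(center) - tail_area,
    interval = c(b1 - 20 * se, b1),
    tol = 1e-9
  )$root
}

lower_95 <- find_lower_bound(0.025)
lower_99 <- find_lower_bound(0.005)

print(c(lower_95 = lower_95, lower_99 = lower_99))
```

Because the .005 tail is smaller than the .025 tail, the search has to move the center farther from the estimate before the tail beyond it shrinks enough, so the 99% lower bound lands below the 95% one.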
Standard Error
Other than the level of confidence, the other factor that affects the width of the confidence interval is the standard error. The greater the standard error – i.e., the wider the sampling distribution – the wider the confidence interval will be.
We can illustrate this idea in the pictures below. In the first picture, we have again illustrated the confidence interval for the Condition model in the tipping study. We constructed a sampling distribution, then moved it down and up until the sample $b_1$ sat right at the cutoff of the unlikely tail in each direction.
Now, if we artificially make the standard error lower (e.g., revising it from 3.3 down to 2.0), you can see in the picture below that the two sampling distributions get narrower. If we don’t move their centers from the previous lower and upper bounds, you can see that the sample $b_1$ now falls beyond the cutoffs, inside the unlikely tails of both distributions.
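To find the 95% confidence interval, we want to make the sample $b_1$ sit right at the .025 cutoff again, so we have to move the two sampling distributions closer together, which narrows the interval.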
In general, therefore, as standard error gets smaller, the confidence interval gets narrower, and as standard error increases, the confidence interval gets wider.
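A quick calculation makes the relationship concrete. The sketch below assumes 42 error degrees of freedom (44 tables minus 2 estimated parameters) and compares the two standard errors mentioned above, 3.3 and 2.0. At a fixed level of confidence, the width of the interval is twice the critical t value times the standard error.

```r
# Assumption: 44 tables minus 2 estimated parameters
df_error <- 42

# Critical t value for a 95% interval (.025 in each tail)
t_crit <- qt(0.975, df_error)

# The two standard errors compared in the text
se_values <- c(original = 3.3, smaller = 2.0)

# Width of the 95% confidence interval for each standard error
widths <- 2 * t_crit * se_values

print(widths)
print(widths[["smaller"]] / widths[["original"]])
```

The ratio of the two widths equals the ratio of the two standard errors, because the critical t value is the same in both cases.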
What Affects Standard Error?
There are two things that affect standard error. One is the standard deviation of the outcome variable, in this case Tip, in the DGP. This is something you have little control over, unless you are designing the outcome measure and can make it less subject to measurement error.
The other thing that has a major effect on standard error is the sample size in the study. We’ve looked at this before when we looked at the effect of increasing the number of tables studied from n=44 to n=88. The larger the sample, the lower the standard error. For this reason, if you want less uncertainty in your estimate of $\beta_1$, one thing you can do is collect a larger sample.
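Both influences can be seen in the standard formula for the standard error of a difference between two group means. The sketch below assumes two equal-sized groups and a single standard deviation s shared by both groups; the function name se_difference and the standard deviations of 11 and 8 are made up for illustration and are not taken from the tipping study.

```r
# Standard error of the difference between two group means,
# assuming equal group sizes and a common standard deviation s
se_difference <- function(s, n_per_group) {
  s * sqrt(1 / n_per_group + 1 / n_per_group)
}

# Hypothetical standard deviation of 11, with 22 vs. 44 tables per group
se_n22 <- se_difference(s = 11, n_per_group = 22)
se_n44 <- se_difference(s = 11, n_per_group = 44)

# Same sample size, but a less variable outcome (hypothetical s = 8)
se_low_sd <- se_difference(s = 8, n_per_group = 22)

print(c(se_n22 = se_n22, se_n44 = se_n44, se_low_sd = se_low_sd))
```

Doubling the sample size divides the standard error by the square root of 2, while lowering the standard deviation shrinks the standard error in direct proportion.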
In the code window below, try your hand at calculating the 95% confidence interval for the original data with 44 tables, and for the doubled data set with 88 tables (TipExp2). We predict the one calculated from 88 tables will be narrower.
require(coursekata)
TipExp2 <- rbind(TipExperiment, TipExperiment)
# this calculates the confidence interval from the original 44 tables
confint(lm(Tip ~ Condition, data = TipExperiment))
# calculate the confidence interval for TipExp2 containing 88 tables
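The same comparison can be tried without the course package by using simulated data. The sketch below is only an illustration: the data frame sim_data, its columns group and outcome, the group means, and the standard deviation are all invented, and the helper ci_width is hypothetical. It fits a two-group model with lm(), doubles the data with rbind() as in the exercise above, and compares the widths of the two 95% confidence intervals.

```r
set.seed(11)  # arbitrary seed so the simulated data are reproducible

# Invented data: 22 observations in each of two groups
n_per_group <- 22
sim_data <- data.frame(
  group = rep(c("control", "treatment"), each = n_per_group),
  outcome = c(rnorm(n_per_group, mean = 27, sd = 11),
              rnorm(n_per_group, mean = 33, sd = 11))
)

# Hypothetical helper: width of the 95% interval for the group difference
ci_width <- function(data) {
  ci <- confint(lm(outcome ~ group, data = data))
  unname(ci[2, 2] - ci[2, 1])  # row 2 holds the group parameter
}

width_44 <- ci_width(sim_data)
width_88 <- ci_width(rbind(sim_data, sim_data))

print(c(width_44 = width_44, width_88 = width_88))
```

Stacking a data set on top of itself leaves the estimate unchanged but lowers the standard error, so the interval from the doubled data is narrower. Keep in mind that duplicating rows is only a way to see the effect of sample size; it does not add new information about the DGP.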