
# Statistics and Data Science: A Modeling Approach

## 10.5 Interpreting Confidence Intervals

To summarize where we are: confidence intervals are constructed around our parameter estimate, which in the case of the empty model is the sample mean. A 95% confidence interval tells us that we can be 95% confident that the interval contains the true parameter (here, the mean of the population), in the sense that 95% of intervals constructed this way from repeated samples would contain it. The interval is typically symmetrical around the sample mean, extending the same distance below the sample mean as it does above it.

The size of a confidence interval tells us how much fluctuation there is in our parameter estimate. It can be expressed in the original units of measurement (e.g., mm) or as a number of standard errors above and below the mean. The larger the standard error, the wider the confidence interval. Because the confidence interval depends on the standard error, its width is determined in part by 1) our degrees of freedom and 2) the standard deviation of the population.
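To make the link between standard error, degrees of freedom, and interval width concrete, here is a minimal sketch in R. The data are simulated and purely illustrative (the values, `set.seed(1)`, and all variable names are assumptions, not part of the course data), and the sketch assumes the interval is based on the t distribution, which is what confint() uses for lm() models.

```r
# Simulated measurements; these values are made up for illustration only
set.seed(1)
y <- rnorm(30, mean = 60, sd = 9)

n      <- length(y)
df     <- n - 1              # degrees of freedom for the empty model
se     <- sd(y) / sqrt(n)    # estimated standard error of the mean
t_crit <- qt(0.975, df)      # number of standard errors for a 95% interval
margin <- t_crit * se        # margin of error in the original units

# Interval built by hand: sample mean plus or minus the margin of error
manual_ci <- c(mean(y) - margin, mean(y) + margin)

# Interval from the empty model, fit the same way as elsewhere in this chapter
model_ci <- confint(lm(y ~ NULL))

manual_ci
model_ci
```

The two intervals should agree. A larger standard deviation or fewer degrees of freedom would increase `se` or `t_crit`, and so widen the interval.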

### Units of the Confidence Interval

The actual size of the 95% confidence interval is in the units of the estimate. In the case of the empty model of thumb length, the 95% confidence interval is shown below.

```r
confint(Empty.model)
```

```
               2.5 %   97.5 %
(Intercept) 58.72794 61.47938
```
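As a quick check on the units, the width and center of this interval can be worked out directly from the two printed bounds. This is only a sketch: the variable names are hypothetical and the bounds are copied by hand from the output above.

```r
# Bounds copied from the confint() output for the thumb length model
lower <- 58.72794
upper <- 61.47938

width    <- upper - lower          # full width of the interval
margin   <- width / 2              # half the width (the margin of error)
midpoint <- (lower + upper) / 2    # center of the interval

c(width = width, margin = margin, midpoint = midpoint)
```

All three results are in the same units as the thumb lengths themselves. If the interval is symmetrical, the midpoint should match the estimate of the mean from the empty model.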

Try computing the 95% confidence interval for PoundsLost by housekeepers in the MindsetMatters data frame. Remember, the confidence interval is computed based on model estimates, so fit and print the empty model first.

```r
# Load the packages used in this chapter
packages <- c("mosaic", "Lock5withR", "Lock5Data", "supernova", "ggformula", "okcupiddata")
lapply(packages, library, character.only = TRUE)

# Create PoundsLost as the change in weight (Wt2 - Wt)
MindsetMatters$PoundsLost <- MindsetMatters$Wt2 - MindsetMatters$Wt

# Fit and save the empty model for PoundsLost
Empty.model <- lm(PoundsLost ~ NULL, data = MindsetMatters)

# Print your empty model
Empty.model

# Compute the confidence interval around this estimate
confint(Empty.model)
```

Hint: You'll want to use lm() and confint().

Compute the margin of error (in pounds) around the estimate of $$b_{0}$$ using this confidence interval.

```r
# This saves the empty model
Empty.model <- lm(PoundsLost ~ NULL, data = MindsetMatters)

# There are many ways to calculate the margin of error
# One way: upper bound (element 2) minus the mean
confint(Empty.model)[2] - mean(MindsetMatters$PoundsLost)

# Another way: half the distance between the upper (2) and lower (1) bounds
(confint(Empty.model)[2] - confint(Empty.model)[1]) / 2
```

Remember, you can find the margin of error by dividing the width of the confidence interval by 2, or by taking the distance between the mean and either the upper or lower bound of the confidence interval.

```
0.631077
```

You can think of this confidence interval (-1.7 to -0.44) as telling us about the variability in our estimate. The average of PoundsLost in the sample of housekeepers was -1.07 pounds (negative because PoundsLost is computed as Wt2 minus Wt, so on average the housekeepers lost weight). Even so, we are reasonably confident that the true population mean could be as low as -1.7 or as high as -0.44.
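As a check, the interval can be rebuilt from the estimate and the margin of error. This uses the rounded values quoted in the text (an estimate of -1.07 and a margin of error of about 0.63), so the result is approximate:

$$
b_{0} \pm \text{margin of error} = -1.07 \pm 0.63 \;\Rightarrow\; (-1.07 - 0.63,\; -1.07 + 0.63) = (-1.70,\; -0.44)
$$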