High School / Advanced Statistics and Data Science I (ABC)
7.6 Graphing Residuals From the Model
You might wonder why we bother to generate and save residuals. There are many reasons, but the short answer is that residuals help us understand the error around our model and can suggest ways of improving it.
Just as the first thing we do when looking at a data set is to examine the distributions of the variables, it is good to get in the habit of examining the distributions of residuals after we fit a new model.
In the following window, we have provided the code to create histograms of Thumb in a facet grid by Gender. Try modifying it to generate histograms of Gender_resid in a facet grid by Gender. Compare the histograms of residuals from the Gender_model with the histograms of thumb length.
require(coursekata)
# this creates the residuals from the Gender_model
Gender_model <- lm(Fingers$Thumb ~ Fingers$Gender)
Fingers$Gender_resid <- resid(Gender_model)
# this creates histograms of Thumb for each Gender
# modify it to create histograms of Gender_resid for each Gender
gf_histogram(~Thumb, data = Fingers) %>%
  gf_facet_grid(Gender ~ .)
# one solution: the same plot, with Gender_resid in place of Thumb
gf_histogram(~Gender_resid, data = Fingers) %>%
  gf_facet_grid(Gender ~ .)
Here we’ve depicted the histograms of Thumb by Gender (in teal) next to the histograms of Gender_resid by Gender (in darker gray).
[Figure: two panels of histograms faceted by Gender. Left: Thumb. Right: Gender_resid.]
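To see why the two sets of histograms resemble each other, it can help to compute the residuals by hand on a very small data set. The sketch below uses a hypothetical data frame named toy with made-up values (it is not the Fingers data) and base R only. It suggests that each residual is simply the observed thumb length minus that person's group mean, so within each group the spread stays the same and only the center moves.

```r
# Hypothetical toy data standing in for Fingers (values are made up)
toy <- data.frame(
  Gender = factor(c("female", "female", "female", "male", "male", "male")),
  Thumb  = c(55, 58, 61, 60, 65, 70)
)

# Fit the two-group model and save its residuals, as in the exercise above
toy_model <- lm(Thumb ~ Gender, data = toy)
toy$Gender_resid <- resid(toy_model)

# ave() returns, for each row, the mean Thumb of that row's Gender group
group_mean <- ave(toy$Thumb, toy$Gender)

# Each residual equals the observed value minus the group mean
same_as_by_hand <- all.equal(unname(toy$Gender_resid), toy$Thumb - group_mean)
print(same_as_by_hand)

# Within each group, the standard deviation is the same before and after
sd_thumb <- tapply(toy$Thumb, toy$Gender, sd)
sd_resid <- tapply(toy$Gender_resid, toy$Gender, sd)
print(sd_thumb)
print(sd_resid)
```

Because every value in a group is shifted by the same amount, a histogram of residuals for that group has the same shape as the histogram of Thumb, just re-centered.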
The residuals of the Gender_model represent the variation left over after taking out the part of the variation that can be explained by Gender. The figures below show the mean Thumb length and mean Gender_resid of the two Gender groups.
[Figure: two panels. Left: mean Thumb of each group. Right: mean Gender_resid of each group.]
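The comparison of group means can also be checked numerically. The following is a minimal sketch on a hypothetical data frame named toy with made-up values (not the Fingers data), using base R only. It illustrates that the groups differ in mean Thumb, while the mean residual in each group is 0 once the group means have been taken out.

```r
# Hypothetical toy data standing in for Fingers (values are made up)
toy <- data.frame(
  Gender = factor(c("female", "female", "female", "male", "male", "male")),
  Thumb  = c(55, 58, 61, 60, 65, 70)
)

# Fit the two-group model and save its residuals
toy_model <- lm(Thumb ~ Gender, data = toy)
toy$Gender_resid <- resid(toy_model)

# Mean Thumb of each group: these differ between the groups
mean_thumb <- tapply(toy$Thumb, toy$Gender, mean)
print(mean_thumb)

# Mean residual of each group: rounded to remove tiny floating-point error
mean_resid <- round(tapply(toy$Gender_resid, toy$Gender, mean), 10)
print(mean_resid)
```

The difference between the groups now lives in the model's predictions, and what remains in the residuals is the variation within each group.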