High School / Statistics and Data Science II (XCD)
10.8 Thinking of Factorial Models in Terms of Intercepts and Slopes
In earlier examples of multivariate models (e.g., those with one categorical and one continuous predictor, or those with two continuous predictors), we used y-intercepts and slopes to differentiate between additive and interaction models. For additive models, the y-intercepts were allowed to differ, but the slopes were constrained to be equal. These models looked like parallel lines. For interaction models, both the y-intercepts and slopes were allowed to vary (i.e., the lines were allowed to be non-parallel).
In factorial models, like the one predicting tip_percent with condition and gender (both categorical variables), we don't typically draw lines that have slopes. (Some people even think it's wrong to do so!) But doing so can, by analogy, deepen our understanding of the distinction between additive and interaction models.
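To make the distinction concrete, both models can be fit with lm(). This is a minimal sketch using a simulated data frame (here called toy, with made-up tip percentages), not the actual tipping-experiment data:

```r
# Simulated stand-in for the tipping experiment (made-up numbers)
set.seed(10)
toy <- expand.grid(
  condition = c("control", "smiley"),
  gender    = c("female", "male"),
  rep       = 1:10
)
toy$tip_percent <- 25 + 5 * (toy$condition == "smiley") + rnorm(nrow(toy), sd = 4)

# additive model: condition effect constrained to be the same for both genders
additive_model <- lm(tip_percent ~ condition + gender, data = toy)

# interaction model: condition effect allowed to differ by gender
interaction_model <- lm(tip_percent ~ condition * gender, data = toy)

coef(additive_model)     # 3 estimates: intercept, condition effect, gender effect
coef(interaction_model)  # 4 estimates: the condition:gender term adds one more
```

The interaction model spends one extra parameter (the condition:gender term), which is what buys it the freedom to be "non-parallel."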
In the figure below we re-graphed the same data from the tipping experiment, but this time connected the model predictions for each gender (female and male) with a different line. On the left we have represented the model predictions of the additive model, and on the right, the interaction model.
[Figure: two side-by-side plots of the tipping data, with the model predictions for each gender (female and male) connected by a line. Left panel: Additive Model (parallel lines); right panel: Interaction Model (non-parallel lines).]
Previously, we had established that interaction models can be thought of as models with multiple “lines”, one for each value of a predictor variable, with each line having its own y-intercept and slope. We can use this same idea to think about interaction models with categorical predictors.
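One way to see the "own y-intercept and slope" idea numerically: a two-way factorial interaction model is saturated, so its predictions reproduce all four cell means exactly. Here is a self-contained sketch with simulated data (toy and its group means are made-up stand-ins for the experiment):

```r
# Simulated stand-in data with a built-in interaction (made-up cell means)
set.seed(7)
toy <- expand.grid(
  condition = c("control", "smiley"),
  gender    = c("female", "male"),
  rep       = 1:10
)
cell_base <- c("control.female" = 27, "smiley.female" = 33,
               "control.male"   = 24, "smiley.male"   = 26)
toy$tip_percent <- cell_base[paste(toy$condition, toy$gender, sep = ".")] +
  rnorm(nrow(toy), sd = 4)

interaction_model <- lm(tip_percent ~ condition * gender, data = toy)

# observed cell means vs. model predictions for each cell
cells <- aggregate(tip_percent ~ condition + gender, data = toy, FUN = mean)
cells$predicted <- predict(interaction_model,
                           newdata = cells[, c("condition", "gender")])
cells  # predicted equals the observed cell mean in every row
```

Because the predictions hit every cell mean, each gender effectively gets its own "line" across conditions, with its own intercept and slope; the additive model, by contrast, forces those lines to be parallel.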
Using the ANOVA Table to Compare the Interaction Model to the Additive Model
Finally, let's look at the ANOVA table for the interaction model to get a quantitative comparison with the additive model. In the code window below, generate the ANOVA table for the interaction model. Also include the argument verbose = FALSE to generate a slightly less wordy ANOVA table.
require(coursekata)

# no models have been created for you
# generate the ANOVA table for the interaction model (remember to use verbose = FALSE)
supernova(lm(tip_percent ~ condition * gender, data = tip_exp), verbose = FALSE)
# alternatively: supernova(lm(tip_percent ~ gender * condition, data = tip_exp), verbose = FALSE)
Analysis of Variance Table (Type III SS)
Model: tip_percent ~ condition * gender
SS df MS F PRE p
---------------- | --------- -- -------- ------ ------ -----
Model | 3194.544 3 1064.848 11.394 0.2868 .0000
condition | 421.742 1 421.742 4.513 0.0504 .0365
gender | 291.965 1 291.965 3.124 0.0355 .0807
condition:gender | 660.006 1 660.006 7.062 0.0767 .0094
Error | 7943.854 85 93.457
---------------- | --------- -- -------- ------ ------ -----
Total | 11138.398 88 126.573
We can see from the ANOVA table that the interaction term accounts for about .08 (PRE = .0767) of the error remaining after fitting the additive model. This is the additional explanatory power we get for spending one additional degree of freedom.
Moreover, the p-value for the condition:gender effect is quite low: .0094. This means that if the additive model were true in the DGP, the likelihood of getting a PRE for the interaction term as high as the one observed (.0767) would be less than 1 percent. We will probably want to adopt the interaction model.
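As a check on this reasoning, both numbers can be recomputed directly from the sums of squares and F statistic in the table above, using only base R:

```r
# Values copied from the condition:gender and Error rows of the ANOVA table
ss_interaction <- 660.006
ss_error       <- 7943.854

# PRE: proportion of the additive model's leftover error explained
# by adding the interaction term
pre <- ss_interaction / (ss_interaction + ss_error)
round(pre, 4)  # 0.0767, matching the table

# p-value from the F statistic, with 1 and 85 degrees of freedom as in the table
p <- pf(7.062, df1 = 1, df2 = 85, lower.tail = FALSE)
round(p, 4)  # approximately .0094
```

This is the same comparison supernova() reports: the interaction row's PRE divides its sum of squares by that sum plus the error remaining after everything, and the p-value comes from the F distribution with the interaction's 1 df and the error's 85 df.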