16.8 Thinking of Factorial Models in Terms of Intercepts and Slopes
In earlier examples of multivariate models (e.g., those with one categorical and one continuous predictor, or those with two continuous predictors), we used y-intercepts and slopes to differentiate between additive and interaction models. For additive models, the y-intercepts were allowed to differ, but the slopes were constrained to be equal. These models looked like parallel lines. For interaction models, both the y-intercepts and slopes were allowed to vary (i.e., the lines were allowed to be non-parallel).
In factorial models, like the one predicting tip_percent with condition and gender (both categorical variables), we don’t typically draw lines that have slopes. (Some people even think it’s wrong to do so!) But doing so can help us, by analogy, deepen our understanding of the distinction between additive and interaction models.
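If you’d like to see the two factorial models side by side in R, here is a minimal sketch. It uses the same tip_exp data as the exercise below; the names additive_model and interaction_model are just our choices:

require(coursekata)

# additive model: condition and gender effects, but no interaction
additive_model <- lm(tip_percent ~ condition + gender, data = tip_exp)

# interaction model: the effect of condition is allowed to differ by gender
interaction_model <- lm(tip_percent ~ condition * gender, data = tip_exp)

additive_model
interaction_model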
In the figure below we re-graphed the same data from the tipping experiment, but this time connected the model predictions for each gender (female and male) with a different line. On the left we have represented the model predictions of the additive model, and on the right, the interaction model.
[Figure: the tipping-experiment data graphed twice, with a line connecting the model predictions for each gender. Left panel: Additive Model. Right panel: Interaction Model.]
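If you want to try re-creating a graph like this yourself, one approach is to save each model’s predictions and connect them with gf_line(). This is just a sketch, assuming the additive_model and interaction_model fit above (the _pred column names are ours):

# save each model's predictions in the data frame
# (assumes no missing values, so predictions line up row by row)
tip_exp$additive_pred <- predict(additive_model)
tip_exp$interaction_pred <- predict(interaction_model)

# jittered data points with the interaction model's predictions connected by gender;
# swap in additive_pred to draw the additive model's parallel pattern instead
gf_jitter(tip_percent ~ condition, color = ~ gender, data = tip_exp, width = .1) %>%
  gf_line(interaction_pred ~ condition, color = ~ gender, data = tip_exp)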
Previously, we established that interaction models can be thought of as models with multiple “lines,” one for each value of a predictor variable, with each line having its own y-intercept and slope. We can use this same idea to think about interaction models with categorical predictors.
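You can see this mapping in the estimates themselves. With two two-level categorical predictors, the interaction model has four coefficients, which we can read, by analogy, as two “lines” (this sketch assumes the interaction_model fit above):

coef(interaction_model)

# with R's default dummy coding, the four estimates map onto two "lines":
#   b0: predicted tip_percent for the reference gender in the reference condition (a y-intercept)
#   b1: the change across condition for the reference gender (that line's "slope")
#   b2: the difference in y-intercept for the other gender
#   b3: the difference in "slope" for the other gender -- the interaction effect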
Using the ANOVA Table to Compare the Interaction Model to the Additive Model
Finally, let’s look at the ANOVA table for the interaction model to get a quantitative comparison of the interaction model to the additive model. In the code window below, generate the ANOVA table for the interaction model. Note that we haven’t created a model for you. You can create one and then run supernova(), or just create the model right within the supernova() function. Also, you can try including the argument verbose = FALSE in the supernova() function to generate a slightly less wordy ANOVA table.
require(coursekata)
# no models have been created for you
# generate the ANOVA table for the interaction model
supernova(lm(tip_percent ~ condition * gender, data = tip_exp), verbose = FALSE)
# alternatively: supernova(lm(tip_percent ~ gender * condition, data = tip_exp), verbose = FALSE)
Analysis of Variance Table (Type III SS)
Model: tip_percent ~ condition * gender
SS df MS F PRE p
---------------- | --------- -- -------- ------ ------ -----
Model | 3194.544 3 1064.848 11.394 0.2868 .0000
condition | 421.742 1 421.742 4.513 0.0504 .0365
gender | 291.965 1 291.965 3.124 0.0355 .0807
condition:gender | 660.006 1 660.006 7.062 0.0767 .0094
Error | 7943.854 85 93.457
---------------- | --------- -- -------- ------ ------ -----
Total | 11138.398 88 126.573
We can see from the ANOVA table that the interaction term explains about .08 (PRE = .0767) of the error that remains after fitting the additive model. This is the additional explanatory power we get in exchange for spending one additional degree of freedom.
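You can verify that PRE straight from the table: it is the interaction’s sum of squares divided by the sum of that SS and the error SS.

# PRE for the interaction term, computed by hand from the ANOVA table
660.006 / (660.006 + 7943.854)   # 0.0767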
Moreover, the p-value for the condition:gender effect is quite low: .009. This means that if the additive model were true in the DGP, the likelihood of getting a PRE for the interaction term as high as the .08 we observed would be less than 1 percent. We will probably want to adopt the interaction model.
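As a check on this comparison, the same F-test can be run as an explicit nested-model comparison with base R’s anova() function, which should reproduce the condition:gender row of the supernova() table:

# compare the additive model to the interaction model directly
additive_model <- lm(tip_percent ~ condition + gender, data = tip_exp)
interaction_model <- lm(tip_percent ~ condition * gender, data = tip_exp)

anova(additive_model, interaction_model)
# F is about 7.06 and p is about .0094, matching the interaction row above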