9.8 Comparing the Interaction Model to the Additive Model (Part 2)
Interaction Term Row (condition:base_0)
The rows underneath the Model row (condition, base_0, and condition:base_0) are interpreted in the same way as they were for additive models. Each row compares the full model (in this case the interaction model) with a simpler model that excludes just the term on that row (what we called a targeted model comparison in the previous chapters).
The interaction model row (labeled condition:base_0 in the ANOVA table) provides a targeted comparison of the interaction model to the simpler, additive model, which includes everything except the interaction term. This row can help us decide whether or not to include the interaction term in the final model, and specifically, whether the additional PRE we get by adding the interaction term could simply be the result of random chance.
Analysis of Variance Table (Type III SS)
Model: later_anxiety ~ condition + base_0 + condition * base_0
                        SS df      MS      F    PRE     p
---------------- | ------- -- ------- ------ ------ -----
Model            | 477.433  3 159.144 34.991 0.5675 .0000
condition        |  86.527  1  86.527 19.025 0.1921 .0000
base_0           | 232.976  1 232.976 51.224 0.3904 .0000
condition:base_0 |   7.922  1   7.922  1.742 0.0213 .1907
Error            | 363.853 80   4.548
---------------- | ------- -- ------- ------ ------ -----
Total            | 841.286 83  10.136
What does it mean to compare the interaction model to the additive model? Imagine that we have fit the additive model and explained all the error we can with that model. The remaining unexplained error is the SS Error (372) found in the ANOVA table for the additive model. The interaction row shows us that adding the interaction term to the model further reduces this error by 7.922.
The PRE of the interaction row tells us the proportion of the remaining error (372) that is reduced by adding the interaction term to the model (7.922). If we divide 7.922 by 372, we get a PRE of .02, as reported on the interaction row of the table.
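The arithmetic behind the interaction row's PRE can be checked directly from the values in the table. The sketch below uses Python purely as a calculator (the course itself works in R), and the variable names are illustrative. It assumes that the additive model's SS Error can be reconstructed as the interaction model's SS Error plus the SS on the interaction row, rather than reading it from the additive model's own table.

```python
# Values copied from the ANOVA table for the interaction model.
ss_interaction = 7.922          # SS on the condition:base_0 row
ss_error_interaction = 363.853  # SS Error left after fitting the interaction model

# Assumption: the error left by the additive model equals the error left by the
# interaction model plus the error that the interaction term removed.
ss_error_additive = ss_error_interaction + ss_interaction

# PRE for the interaction row: share of the additive model's leftover error
# that is removed by adding the interaction term.
pre_interaction = ss_interaction / ss_error_additive

print(round(ss_error_additive, 3))  # 371.775, which rounds to the 372 in the text
print(round(pre_interaction, 4))    # 0.0213
```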
Let’s now consider the p-value for the interaction term, which is .19. The p-value helps us compare two models of the DGP. In this row, the two models of the DGP being compared are the interaction model of the DGP and the additive model of the DGP.
This p-value reinforces our intuition that maybe it’s not worth including the interaction in the final model. The p-value of .19 means that if in the DGP the true effect of the interaction is equal to 0 (i.e., \(\beta_3=0\)), then there is a .19 probability that we would observe a PRE of .02 or larger for the interaction effect just by chance. Since this PRE (and the F, too) could well have occurred just by chance, we won’t bother including the interaction term in our final model.
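As a rough check on the F and p reported on the interaction row, the sketch below recomputes both from the SS and df values in the table. It is written in Python with only the standard library, so the p-value is obtained by numerically integrating the t density; this relies on the fact that an F statistic with 1 numerator degree of freedom is the square of a t statistic. The function names are hypothetical, and this is not how the table itself was produced.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def p_value_from_f(f_stat, df_error, upper=60.0, steps=20000):
    """Upper-tail probability for an F statistic with (1, df_error) df.

    Uses P(F >= f) = 2 * P(t >= sqrt(f)) and Simpson's rule on the t density.
    The integral is cut off at `upper`, where the remaining tail is negligible;
    `steps` must be even.
    """
    lower = math.sqrt(f_stat)
    h = (upper - lower) / steps
    total = t_pdf(lower, df_error) + t_pdf(upper, df_error)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(lower + i * h, df_error)
    return 2 * total * h / 3

# Values copied from the ANOVA table.
ms_interaction = 7.922 / 1   # SS / df on the condition:base_0 row
ms_error = 363.853 / 80      # SS / df on the Error row

f_stat = ms_interaction / ms_error
p = p_value_from_f(f_stat, df_error=80)

print(round(f_stat, 3))  # 1.742
print(round(p, 2))       # 0.19
```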
Interpreting the base_0 and condition Rows in the ANOVA Table
Finally, we turn our attention to the base_0 and condition rows in the ANOVA table for the interaction model. Bottom line: these do not have the same interpretation as they do for the additive model. In fact, it’s typically best to just ignore these rows for interaction models.
Analysis of Variance Table (Type III SS)
Model: later_anxiety ~ condition + base_0 + condition * base_0
                        SS df      MS      F    PRE     p
---------------- | ------- -- ------- ------ ------ -----
Model            | 477.433  3 159.144 34.991 0.5675 .0000
condition        |  86.527  1  86.527 19.025 0.1921 .0000
base_0           | 232.976  1 232.976 51.224 0.3904 .0000
condition:base_0 |   7.922  1   7.922  1.742 0.0213 .1907
Error            | 363.853 80   4.548
---------------- | ------- -- ------- ------ ------ -----
Total            | 841.286 83  10.136
Why should we ignore them? The short answer is that the models being compared on these rows are weird.
For example, the condition row compares the full interaction model (the complex model) with a simpler one that includes all terms of the interaction model except condition (represented in the notation below as \(\text{Dog}_i\)):
Complex model: \(\text{later}_i = b_0 + b_1\text{Dog}_i + b_2\text{base0}_i + b_3\text{Dog}_i * \text{base0}_i + e_i\)
Simple model: \(\text{later}_i = b_0 + b_1\text{base0}_i + b_2\text{Dog}_i * \text{base0}_i + e_i\)
Notice that the simple model in this case still has a term that includes condition, namely the interaction term (\(\text{Dog}_i*\text{base0}_i\)), even though the main effect of condition has been removed. The comparison is weird because the effect of condition isn’t entirely removed from the simple model, which makes that row of the table very difficult to interpret.
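To see why this comparison is hard to interpret, it can help to plug numbers into the simple model. The sketch below uses made-up coefficients (not fitted values) and assumes that \(\text{Dog}_i\) is coded 0 or 1. The function name is hypothetical.

```python
def predict_simple(b0, b1, b2, dog, base0):
    """Prediction from the simple model: b0 + b1*base0 + b2*Dog*base0."""
    return b0 + b1 * base0 + b2 * dog * base0

# Hypothetical coefficients chosen only for illustration.
b0, b1, b2 = 2.0, 0.5, -0.25

# Compare the two condition groups at several baseline values.
for base0 in (0, 4, 8):
    no_dog = predict_simple(b0, b1, b2, dog=0, base0=base0)
    with_dog = predict_simple(b0, b1, b2, dog=1, base0=base0)
    print(base0, no_dog, with_dog, with_dog - no_dog)
```

With any coefficients, the two groups get the same prediction when base0 is 0, yet their predictions drift apart as base0 grows. In other words, the simple model still lets condition matter through the slope while forcing both groups to share one intercept, which is not a model anyone would normally set out to test.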
In the ANOVA tables for additive models, the rows for each variable have clear meaning: they tell us how much PRE is lost by removing each variable from the model. But for interaction models, the rows don’t have such a clear meaning, and so it’s fine to ignore them.