9.2 Additive versus Non-Additive Models

What makes a model additive?

Figure: jitter plot of later_anxiety predicted by base_anxiety, with points colored by condition (Control vs. Dog) and the additive model's regression lines overlaid.

In the plot above, the data are represented by dots (each dot is one patient). The additive model is represented by the two regression lines, one for each of the two conditions (Dog and Control). The lines represent the model predictions for later_anxiety based on both condition and base_anxiety.

What makes this model additive can be seen in the graph: the two regression lines are parallel to each other. In a non-additive model, by contrast, the two lines would not be parallel (as shown in the figure below).

Figure (left panel, Additive Model): the additive model's regression lines, shown without data points; the lines are parallel.

Figure (right panel, Non-Additive Model): the non-additive model's lines, shown without data points; the lines are not parallel and funnel slightly apart as base_anxiety goes up.

Figure: the additive model's parallel regression lines, shown without data points. Three equal-length arrows at the beginning, middle, and end of the lines point from the top (Control) line to the bottom (Dog) line.

In the additive model, the predicted effect of condition on later_anxiety is the same regardless of patients’ base_anxiety on arriving at the ER. No matter the base anxiety (e.g., 2, 5, or 8), having a therapy dog is predicted to reduce later anxiety by roughly 2 points (shown by the vertical black lines above), corresponding to the lm() function’s parameter estimate for conditionDog of -2.03.

Likewise, each additional point of base anxiety has the same effect on later anxiety regardless of which group a patient is in, which is another way of saying that the slopes of the two regression lines are the same: 0.81.

The parallel lines show that the difference between the dog and control groups is constant across all levels of base anxiety. Models are additive when the effect of one predictor variable on an outcome does not change based on values of a second predictor variable.

By writing the R code lm(later_anxiety ~ condition + base_anxiety, data = er) we are telling R to fit an additive model to the data. The additivity (you can think of it as parallelism) is a characteristic of the model; it may or may not be true of the data or of the DGP.
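The course fits models in R, but the additive model's predictions are easy to compute by hand. The sketch below uses Python, with the conditionDog estimate (-2.03) and common slope (0.81) reported in this section; the intercept value is a made-up placeholder, since the section does not report \(b_0\).

```python
# Additive model: later = b0 + b1*Dog + b2*base
# b1 (conditionDog) and b2 (slope) come from the text;
# b0 is a hypothetical placeholder -- the section does not report it.
B0 = 6.00   # hypothetical intercept
B1 = -2.03  # conditionDog estimate from the text
B2 = 0.81   # common slope from the text

def predict_additive(base_anxiety, dog):
    """Predicted later_anxiety; dog is 1 for the Dog condition, 0 for Control."""
    return B0 + B1 * dog + B2 * base_anxiety

# The Dog-minus-Control gap is the same at every level of base anxiety:
gaps = [round(predict_additive(b, 1) - predict_additive(b, 0), 2)
        for b in (2, 5, 8)]
print(gaps)  # -> [-2.03, -2.03, -2.03]
```

Whatever intercept the model estimates, the gap between the two lines works out to \(b_1\) at every value of base anxiety; that constancy is the additivity.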

The Additive Model in GLM Notation

The additivity of the model is also represented in the GLM equation:

\[\text{later}_i=b_0+b_1\text{Dog}_i+b_2\text{base}_i\]

(Note we’ve abbreviated \(\text{later\_anxiety}\) and \(\text{base\_anxiety}\) to \(\text{later}\) and \(\text{base}\) to make the equations look a little shorter. We’ve similarly shortened the dummy variable \(\text{conditionDog}\) to \(\text{Dog}\). \(\text{Dog}\) is coded 1 if the patient was in the dog therapy condition, 0 if not.)

Notice that there is a part of this equation that represents the y-intercept and a part that represents the slope.

\[\text{later}_i=\underbrace{b_0+b_1\text{Dog}_i}_{\text{y-intercept}}+\underbrace{b_2}_{\text{slope}}\text{base}_i\]

Figure: the additive model's parallel regression lines, shown without data points. The intercept of the top (Control) line is labeled \(b_0\), the intercept of the bottom (Dog) line is labeled \(b_0 + b_1\), and the common slope of both lines is labeled \(b_2\).

Because the model is additive, the regression lines are constrained to be parallel. For this reason, a single parameter estimate (\(b_2\)) is all that is needed to represent the slope of both lines.

The group difference is represented by the different intercepts of the two parallel lines. For patients in the control group the intercept is \(b_0\). For patients in the dog group the intercept is \(b_0+b_1\). The average difference between the two conditions is \(b_1\), and it is not affected by base anxiety.

\[\text{later}_i = b_0 + \underbrace{b_1\text{Dog}_i}_{\substack{\text{adjustment to} \\ \text{y-intercept} \\ \text{when Dog}_i = 1 }} + b_2\text{base}_i\]
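Subtracting the two groups’ prediction equations makes this concrete: the \(b_2\text{base}_i\) term appears in both, so it cancels.

\[
\begin{aligned}
\text{Dog } (\text{Dog}_i = 1):\quad & b_0 + b_1 + b_2\text{base}_i \\
\text{Control } (\text{Dog}_i = 0):\quad & b_0 + b_2\text{base}_i \\
\text{Difference:}\quad & b_1
\end{aligned}
\]

No matter what value \(\text{base}_i\) takes, the difference between the two predictions is always \(b_1\).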

Imagining a Non-Additive Model

You now have some sense of what an additive model is. But what is the alternative to an additive model? One alternative is what is generally referred to as an interaction model.

Figure (left panel, Additive Model): the additive model's parallel lines, shown without data points, with three equal-length arrows at the beginning, middle, and end of the lines pointing from the top (Control) line to the bottom (Dog) line.

Figure (right panel, Interaction Model): the interaction model's lines, shown without data points; the lines are not parallel and funnel slightly apart as base_anxiety goes up. The three arrows from the Control line to the Dog line grow in length: the gap is smallest at the first arrow and largest at the third.


Figure: the interaction model's lines, shown without data points, with arrows indicating the distance between the two lines and labels explaining that the reductions near the intercept are smaller than those further away from it.

More specifically, the interaction model depicted above predicts that the reduction in later anxiety produced by having a therapy dog grows with base anxiety: for each unit increase in base anxiety, the predicted reduction gets bigger (shown by how the vertical black lines increase in size from left to right).

Definition of an interaction: An interaction model is one in which the relationship of one predictor to the outcome variable depends on the value of a second predictor.

In a model like this one, with one quantitative and one categorical predictor, the interaction of the two predictors will result in two regression lines that are not parallel to each other.

If we choose to go with the interaction model, and if someone asks us, “What’s the relationship between base anxiety and later anxiety?”, we would have to answer like this: “It depends on which condition they were in.” This is the way it works with interactions.
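In GLM notation, an interaction model adds a product term, \(b_3\text{Dog}_i\text{base}_i\), so the Dog-minus-Control gap becomes \(b_1 + b_3\text{base}_i\) rather than a constant \(b_1\). The Python sketch below illustrates this dependence; all four coefficient values are made up, since this section reports no estimates for the interaction model.

```python
# Interaction model: later = b0 + b1*Dog + b2*base + b3*(Dog*base)
# All coefficient values here are hypothetical -- the section reports
# no estimates for the interaction model.
B0, B1, B2, B3 = 6.00, -0.50, 0.81, -0.25

def predict_interaction(base_anxiety, dog):
    """Predicted later_anxiety; the Dog effect now depends on base_anxiety."""
    return B0 + B1 * dog + B2 * base_anxiety + B3 * dog * base_anxiety

# The Dog-minus-Control gap now changes with base anxiety: b1 + b3*base
gaps = [round(predict_interaction(b, 1) - predict_interaction(b, 0), 2)
        for b in (2, 5, 8)]
print(gaps)  # -> [-1.0, -1.75, -2.5]
```

This is the "it depends" answer in numeric form: with these hypothetical coefficients, the therapy dog is predicted to help a little for patients with low base anxiety and much more for patients with high base anxiety.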
