High School / Advanced Statistics and Data Science I (ABC)
9.3 Interpreting the Parameter Estimates for a Regression Model
Previously, we used the lm() function to fit the Height model of Thumb and saved it as Height_model:
Height_model <- lm(Thumb ~ Height, data = Fingers)
Let’s now look at the parameter estimates for this model and see how to interpret them. Use the code block below to print out the parameter estimates for the height model.
library(coursekata)

# saves the Height model
Height_model <- lm(Thumb ~ Height, data = Fingers)

# print it out
Height_model
Call:
lm(formula = Thumb ~ Height, data = Fingers)

Coefficients:
(Intercept)       Height
    -3.3295       0.9619

The Intercept corresponds to \(b_0\) and the Height coefficient corresponds to \(b_1\). We can write our fitted model as:
\[Thumb_i=-3.33 + 0.96Height_i+e_i\]
Or, equivalently, using GLM notation, it can be written:
\[Y_i=-3.33 + 0.96X_i+e_i\]
\(b_0\), which is -3.33, is the y-intercept. It’s the predicted \(Y_i\) (Thumb) when \(X_i\) (Height) equals 0.
Neither a height of 0 inches nor a thumb length of -3.33 mm is possible. Not all predictions from a regression model make sense. We should always be thinking about which values of the predictors, and which predictions, are reasonable.
How Regression Models Make Predictions
We can use the Height model to predict the thumb length of students of different heights (just like we used the Height2Group model to predict the thumb length of short and tall groups of students).
Recall that thumb length (and predicted thumb length) are expressed in millimeters. \(b_0\) (-3.33) is the predicted thumb length in millimeters for a student with a height of 0 inches. If we stretched out the x-axis to include 0, we would expect the regression line to cross the y-axis at -3.33. (Notice, however, that in the plot below there are no actual students who are 0 inches in height, for obvious reasons!)
The \(b_1\) estimate (0.96) is the slope: for every 1-unit increase in Height, our model predicts a 0.96-unit increase in Thumb. The fact that height is measured in inches and thumb length in millimeters is not a problem; the regression line is a function (the \(b_0 + b_1Height_i\) part) that takes in inches and then makes a prediction in millimeters. This means that students who are 1 inch taller are predicted by our model to have thumbs that are 0.96 millimeters longer (on average). Here’s a visual representation:
[Figure: two views of the regression line, default scale and zooming in]
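The slope interpretation can also be checked with a little arithmetic. The sketch below is not part of the original exercise: it hard-codes the rounded estimates quoted in the text (-3.33 and 0.96), and predict_thumb() is a hypothetical helper name introduced only for illustration. It compares the predictions for two students whose heights differ by exactly 1 inch.

```r
# Rounded estimates from the fitted Height model (copied from the text)
b0 <- -3.33   # intercept, in millimeters
b1 <- 0.96    # slope, in millimeters per inch

# Hypothetical helper: predicted thumb length (mm) for a height in inches
predict_thumb <- function(height_in) {
  b0 + b1 * height_in
}

# Two students whose heights differ by exactly 1 inch
diff_one_inch <- predict_thumb(66) - predict_thumb(65)
diff_one_inch
```

Whichever pair of heights you pick, a 1-inch gap changes the prediction by \(b_1\) millimeters (up to floating-point rounding), because the intercept cancels out of the difference.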



The predicted thumb length of a student who is 71 inches tall is 64.83 mm. This is the value of \(Y\) (Thumb) on the regression line when \(X\) (Height) is 71.
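To see where a prediction like this comes from, you can plug a height into the fitted equation by hand. The sketch below assumes only the estimates shown earlier in this section; the variable names are illustrative, not from the course materials. It computes the prediction twice: once with the two-decimal estimates used in the equations, and once with the four-decimal estimates from the printed model output.

```r
height_in <- 71  # height of the student, in inches

# Prediction using the two-decimal estimates from the fitted equation
pred_rounded <- -3.33 + 0.96 * height_in

# Prediction using the four-decimal estimates from the printed output
pred_printed <- -3.3295 + 0.9619 * height_in

round(pred_rounded, 2)  # 64.83
round(pred_printed, 2)  # 64.97
```

The small gap between the two results is just rounding in the estimates. If you have the fitted model available, predict(Height_model, newdata = data.frame(Height = 71)) should do the same calculation with the unrounded estimates.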
Regression Coefficients are Not Symmetrical
When you fit a regression model, it matters which variable is the outcome and which is the explanatory variable. For example, if you fit the model Thumb ~ Height you won’t get the same y-intercept and slope you would if you fit the model Height ~ Thumb.
Coefficients: (Intercept) Height

Coefficients: (Intercept) Thumb

The reason for this is that the units, and the distributions of the variables, are different. If the outcome is Thumb, then the slope is the adjustment to predicted thumb length for a one-inch increase in height. But if the outcome is Height, then the slope is the adjustment to predicted height for a one-millimeter increase in thumb length. These are two entirely different things.
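One way to see the asymmetry for yourself is to fit both models and compare their estimates. This is a minimal sketch, assuming the coursekata package and its Fingers data frame are available as in the code earlier in this section; the object names thumb_on_height and height_on_thumb are made up for illustration.

```r
library(coursekata)  # provides the Fingers data used in this section

# Thumb as the outcome: slope is in millimeters of thumb per inch of height
thumb_on_height <- lm(Thumb ~ Height, data = Fingers)

# Height as the outcome: slope is in inches of height per millimeter of thumb
height_on_thumb <- lm(Height ~ Thumb, data = Fingers)

# Print both sets of estimates side by side for comparison
coef(thumb_on_height)
coef(height_on_thumb)
```

The first model's slope is labeled Height and the second's is labeled Thumb, a reminder that each slope describes a change in a different outcome, measured in different units.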