7.4 Interpreting the Parameter Estimates for a Multivariate Model
Using the Parameter Estimates to Make Predictions
We use the parameter estimates to make predictions in the same way as we did before, but this time we adjust our prediction based on two variables: what neighborhood a home is in (Neighborhood) and the amount of living space in the home (HomeSizeK).
Here is the best-fitting model that we found on the previous page:
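The fitted model output is not reproduced here, but a two-predictor model like this one could be fit in R along these lines (PriceK is an assumed name for the outcome variable; check your data frame for the actual name):

```r
# Fit a two-predictor model of home price (assumes the Smallville data
# frame is loaded and the outcome variable is named PriceK)
multi_model <- lm(PriceK ~ Neighborhood + HomeSizeK, data = Smallville)
multi_model   # prints the parameter estimates
```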
We use the parameter estimates to make predictions from the multivariate model in the same way we did for single-predictor models: each estimate (other than the intercept) tells us how much to adjust the prediction based on a home's value on that variable.
The model, which is a function, generates a predicted home price (in 1000s of dollars) by starting with the intercept, then adjusting that prediction based on the home's neighborhood and its amount of living space.
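As a sketch of the prediction arithmetic (the estimates below are made up for illustration, not the actual Smallville estimates, and Eastside is assumed to be the reference group):

```r
# Hypothetical parameter estimates, for illustration only
b0 <- 300   # intercept: predicted PriceK for an Eastside home with HomeSizeK = 0
b1 <- 50    # adjustment for Downtown homes
b2 <- 100   # adjustment per 1-unit (1000 sq ft) increase in HomeSizeK

# Predicted price (in 1000s) for a Downtown home with HomeSizeK = 2
b0 + b1 * 1 + b2 * 2
```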
Interpreting the Parameter Estimates
So far, the parameter estimates from the multivariate model might seem pretty similar to the parameter estimates from single-predictor models. But the estimates themselves are not exactly the same. This is because the estimates have a slightly different meaning in the multivariate model than they do in the single-predictor models.
In the two-predictor model, each home’s value is assumed to be a function of both Neighborhood and HomeSizeK. But these two things are not independent of each other. To help explain what we mean, take a look at the faceted histogram below. You’ve seen it before, but this time we colored two of the homes red.
These two homes happen to be the two most expensive homes in the Smallville data frame. But what makes them expensive? Is it the neighborhood they are in, or is it that they are also among the larger homes in the data frame? The answer is probably a little of both.
Because the two variables are related to each other (i.e., Downtown tends to have larger houses than Eastside), the parameter estimate for the single-predictor HomeSizeK model absorbs some of the adjustment that really belongs to Neighborhood.
When we add Neighborhood into the model (making a multivariate model), R takes some of that adjustment that was originally attributed completely to HomeSizeK and attributes it to Neighborhood. Take a look at the estimates for the single-predictor HomeSizeK model alongside those from the multivariate model:
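In R, you could compare the two sets of estimates yourself with something like this (again assuming the outcome variable is named PriceK):

```r
# Single-predictor model: all of the adjustment is attributed to HomeSizeK
lm(PriceK ~ HomeSizeK, data = Smallville)

# Multivariate model: some of that adjustment shifts to Neighborhood,
# so the HomeSizeK estimate comes out a little smaller
lm(PriceK ~ Neighborhood + HomeSizeK, data = Smallville)
```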
Because the parameter estimate for HomeSizeK in the multivariate model knows that Neighborhood is also in the model, it is a little less extreme in the multivariate model than in the single-predictor model.
To return to our cooking analogy, in the single-predictor HomeSizeK model, home size did all the work in seasoning the predictions. But in the model with both HomeSizeK and Neighborhood, neighborhood will add a little bit of the seasoning that was previously added by home size. Because parameter estimates moderate the contributions of each variable, they are sometimes called “weights”.
In the single-predictor model, the HomeSizeK estimate carried all of the weight; in the multivariate model, some of that weight is shared with Neighborhood.