Statistics and Data Science: A Modeling Approach

11.1 PRE and F Ratio Revisited

Constructing confidence intervals around parameters is a perfectly fine way to test the difference between two models. But in this chapter we are going to explore a new method, the F test. The F test is probably the most widespread method for comparing statistical models, and by now you have all the concepts you need to understand how it works.

Let’s start by going back to the model we just fit, the one we saved in an object called Condition.model. In the DataCamp window below, add code to run supernova() to get the ANOVA table for Condition.model.

require(tidyverse) require(mosaic) require(Lock5Data) require(okcupiddata) require(supernova) Condition.model <- lm(Tip ~ Condition, data = TipExperiment) # add code to get the supernova table for this model Condition.model <- lm(Tip ~ Condition, data = TipExperiment) # add code to get the supernova table for this model supernova(Condition.model) ex() %>% check_output_expr("supernova(Condition.model)")
DataCamp: ch11-5

 Analysis of Variance Table
 Outcome variable: Tip
 Model: lm(formula = Tip ~ Condition, data = TipExperiment)
                              SS df     MS     F    PRE     p
 ----- ----------------- ------- -- ------ ----- ------ -----
 Model (error reduced) |  402.02  1 402.02 3.305 0.0729 .0762
 Error (from model)    | 5108.95 42 121.64                   
 ----- ----------------- ------- -- ------ ----- ------ -----
 Total (empty model)   | 5510.98 43 128.16

In Chapters 7 and 8 we introduced supernova() and spent considerable time developing the concept of PRE, and a little less time developing the F ratio.

PRE, or Proportional Reduction in Error, indicates the percentage of total variation in the outcome variable that can be explained by the more complex model (in this case, the two-group model we called Condition.model). The F ratio is closely related to PRE, though it takes into account the number of degrees of freedom used to fit the model. Both are ways of quantifying the strength of the relationship between the explanatory and outcome variables. You can think of F as a measure of the strength of a relationship (like PRE) per parameter included in the model.

The fact that the two-group model has a positive PRE is pretty much a given. Our data shows that adding Condition to the model helps explain .07 of the error that remained unexplained. But is that .07 meaningful? Is that a lot? The only way PRE would be equal to 0 is if there were no mean difference at all between the two groups in the sample. In that case, knowing which group a table was in (Smiley Face or Control) would add no predictive value, and thus result in 0 reduction in error.

But just as finding a difference between the two means does not by itself rule out the possibility that the true difference in the DGP could be 0, the same is true of PRE. Just because in the sample distribution the two-group model reduces error by .07, that doesn’t necessarily rule out the possibility that the true PRE in the population might be 0.

A sampling distribution of \(b_1\) help us put the observed difference of means in context. So far, we have generated sampling distribution by bootstrapping—resampling from a population with individuals represented in our sample. This is a hypothetical DGP with a true mean difference that is exactly the same as our sample.

But now we want to generate a sampling distribution to help us answer this question: could the mean difference observed by the researchers have occurred just by chance if the true mean difference in the DGP was 0?

Previously, we learned methods of simulation and bootstrapping to help us implement various DGPs that we dreamed up. How would we implement a situation where there was no relationship, that is, a \(\beta_1=0\), between Condition and Tip? One way to break any relationship in the data is to shuffle the values of Condition so that the Tip would be randomly related to whether there was a smiley face, or nothing, on the check.