2.5 Measurement

Measurement is the process of turning variation in the world into data. When we measure, we assign numbers or category labels to some sample of cases in order to represent some attribute or dimension along which the cases vary.

Let’s make this more concrete by looking at some more measurements, in a data set called Fingers. A sample of college students filled in an online survey in which they were asked a variety of basic demographic questions. They also were asked to measure the length of each finger on their right hand.

require(coursekata)

# Setup: convert factors to numeric codes, sort by Sex (descending),
# and set two values in the first row
Fingers <- Fingers %>%
  mutate_if(is.factor, as.numeric) %>%
  arrange(desc(Sex)) %>%
  {.[1, "FamilyMembers"] <- 2; . } %>%
  {.[1, "Height"] <- 62; . }

# A way to look at a data frame is to type its name
# Look at the data frame called Fingers
Fingers

You’ll notice that trying to look at the whole data frame can be very cumbersome, especially for larger data sets.

# Remember the head() command?
# Use it to look at the first six rows of Fingers
head(Fingers)
  Sex RaceEthnic FamilyMembers SSLast Year Job MathAnxious Interest GradePredict Thumb Index Middle Ring Pinkie Height Weight
1   2          3             2     NA    3   1           4        1          3.3 66.00  79.0   84.0 74.0   57.0     62    188
2   2          3             4      9    2   2           5        3          4.0 58.42  76.2   91.4 76.2   63.5     70    145
3   2          3             2      3    2   2           2        3          4.0 70.00  80.0   90.0 70.0   65.0     69    175
4   2          1             5      7    2   1           1        3          3.7 59.00  83.0   87.0 79.0   64.0     72    155
5   2          5             2      9    3   1           5        3          4.0 64.00  76.0   89.0 76.0   69.0     70    180
6   2          3             7   7037    3   1           5        2          3.3 67.00  83.0   95.0 86.0   75.0     71    145

The head() function shows the first six rows of a data frame by default. To look at a different number of rows, add that number as a second argument, like this.

# Try it and see what happens
head(Fingers, 3)
  Sex RaceEthnic FamilyMembers SSLast Year Job MathAnxious Interest GradePredict Thumb Index Middle Ring Pinkie Height Weight
1   2          3             2     NA    3   1           4        1          3.3 66.00  79.0   84.0 74.0   57.0     62    188
2   2          3             4      9    2   2           5        3          4.0 58.42  76.2   91.4 76.2   63.5     70    145
3   2          3             2      3    2   2           2        3          4.0 70.00  80.0   90.0 70.0   65.0     69    175
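
The same call can be tried outside the course environment. The sketch below is only an illustration: it builds a small data frame by hand from a few of the values printed above (the name fingers_mini is made up here, and only three of the columns are copied), then passes the number of rows to head() through its n argument.

```r
# Hand-typed subset of the Fingers output shown above (illustrative only)
fingers_mini <- data.frame(
  Thumb  = c(66.00, 58.42, 70.00, 59.00, 64.00, 67.00),
  Height = c(62, 70, 69, 72, 70, 71),
  Weight = c(188, 145, 175, 155, 180, 145)
)

# With no second argument, head() returns up to six rows
first_six <- head(fingers_mini)

# The second argument is named n, so these two calls are equivalent
first_three <- head(fingers_mini, 3)
first_three_named <- head(fingers_mini, n = 3)

print(first_three)
```

Writing n = 3 instead of a bare 3 is optional, but it makes the purpose of the number easier to see when the code is read later.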

Notice that to interpret these values, you need to know something about how they were measured. Was Height measured in inches? Which number represents which Sex? Does FamilyMembers include the person answering the question? (Sex can be a controversial variable,[1] but in the Fingers data set, students answered this question by selecting one of two categories.)
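
The question of which number stands for which category comes up whenever category labels are stored as numbers. As a rough sketch, the base R code below uses a made-up factor (the labels "female" and "male" are assumptions for illustration and may not match the labels used in Fingers) to show what is lost when a factor is converted with as.numeric().

```r
# Hypothetical category labels; the real Fingers labels may differ
sex <- factor(c("male", "female", "male"))

# levels() lists the labels in the order R uses to number them
levels(sex)

# as.numeric() drops the labels and keeps only the integer codes
codes <- as.numeric(sex)
codes

# The codes can be translated back only if the levels are still known
levels(sex)[codes]
```

Once the labels are gone, a 2 in the Sex column is just a 2. Only a record of how the variable was measured and coded can say what it means.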

We will be talking a lot about what measurements mean throughout the class. But before we go on, let’s learn one more way to take a quick look at a data frame.

# Try using tail() to look at the last 6 rows of the Fingers data frame.
tail(Fingers)
    Sex RaceEthnic FamilyMembers SSLast Year Job MathAnxious Interest GradePredict Thumb Index Middle Ring Pinkie Height Weight
152   1          4             7      6    3   1           5        2          3.0    59    69     79   72     56   67.5    193
153   1          4             7      3    3   1           5        2          3.0    50    71     78   75     57   65.5    145
154   1          4             8   2354    2   2           3        2          2.7    64    70     76   70     51   59.0    114
155   1          4             3    789    1   1           4        2          2.7    50    70     85   74     55   64.0    165
156   1          3             8      0    3   2           4        2          3.7    57    67     73   65     55   63.0    125
157   1          1             6     NA    2   1           5        3          3.3    56    69     76   72     60   72.0    133
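
Like head(), tail() accepts a second argument for the number of rows. The sketch below uses a made-up ten-row data frame (measurements is a hypothetical name, not part of the course data) to show this, and to show that tail() keeps the original row numbers. This is why the output above is labeled 152 through 157.

```r
# Made-up data frame with ten rows (not the Fingers data)
measurements <- data.frame(
  id    = 1:10,
  value = c(5, 7, 9, 11, 13, 15, 17, 19, 21, 23)
)

# tail() mirrors head(): the argument n sets how many rows to return
last_three <- tail(measurements, n = 3)

# The original row numbers are kept as row names
print(last_three)
rownames(last_three)
```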

  1. Many people use sex and gender interchangeably, but they are distinct concepts. Sex is a classification based on biological characteristics, including DNA and anatomy. Gender refers to the socially constructed roles, behaviors, expressions, and identities of girls, women, boys, men, and gender diverse people. There is some evidence to suggest that neither sex nor gender is made up of binary categories and that both are instead expressed on a spectrum. Many people’s bodies possess a combination of physical characteristics typically thought of as biologically “male” or “female.” It has been estimated that babies with intersex traits may account for as many as 2% of live births (Blackless et al., 2000). However, sex is often measured as a binary categorical variable in the publicly available data frames included in this textbook. This may change as researchers develop new methods of measuring sex.
