1.5 Save Your Work In R Objects
Have you ever had an experience where you have forgotten to save your work? It’s a terrible feeling. Saving your work is also important in R. In R, we don’t just do calculations and look at the results on the R console. We usually save the results of the calculations somewhere we can find them later.
Pretty much anything, including the results of any R function, can be saved in an R object. This is accomplished by using an assignment operator, which looks kind of like an arrow (<-). You can make up any name you want for an R object. Most combinations of upper case letters, lower case letters, numbers, or even a period or underscore can be used in the names of R objects, so long as you start the name with a letter.
Here’s a simple example to show how it’s done. Let’s make up a name for an R object; we will call it my_favorite_number. Then let’s think of what our favorite number is (say, 20), and save it in the R object. Go ahead and run the code below to see how this works.
# This code will assign the number 20 to the R object my_favorite_number
my_favorite_number <- 20
# you can revise the code to use your actual favorite number (if it's not 20).
Notice that after you run the code my_favorite_number <- 20, nothing happens. That’s because you saved the number 20 in my_favorite_number, but you didn’t tell R to print it out. Go back and add this line of code to the window above, then run it again:
my_favorite_number
Now it not only saves your favorite number, but prints it out. Notice that you don’t need to use the print() function to print the contents of an R object; you can just type the name of the object.
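To make the difference between saving and printing concrete, here is a small sketch. It uses a made-up object name, lucky_number, which is only an illustration and not part of the exercise above.

```r
# Assigning a value saves it but prints nothing
lucky_number <- 7

# Typing the name of the object prints its contents
lucky_number

# print() does the same thing, just more explicitly
print(lucky_number)
```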
Now remember, R is case sensitive. Try assigning 5 to num and 10 to NUM.
# Assign 5 to num and 10 to NUM
num <- 5
NUM <- 10
# Write the name of the object that contains 10 and then press the <Run> button
# Doing so prints out the contents of that object
NUM
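As a sketch of what case sensitivity means in practice, the code below creates both objects and then compares them. The comparison with == is an extra step added here for illustration; it is not required by the exercise.

```r
# These are two separate objects because R treats
# lower case and upper case letters as different
num <- 5
NUM <- 10

# Each name prints its own value
num
NUM

# Check whether the two objects hold the same value (they do not)
num == NUM
```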
NOTE: When you save an R object in one of the code windows it will only be saved until you leave the page. If you re-load the page later it won’t be there.
Vectors
We’ve used R objects so far to store a single number. But in statistics we are dealing with variation, which by definition means more than one—and sometimes many—numbers. An R object can also store a whole set of numbers, called a vector. You can think of a vector as a list of numbers (or values).
The R function c() can be used to combine a list of individual values into a vector. You could think of the “c” as standing for “combine.” So in the following code we have created two vectors (we just named them my_vector and my_vector_2) and put a list of values into each vector.
# Here is the code to create two vectors, my_vector and my_vector_2. We just made up those names.
my_vector <- c(1, 2, 3, 4, 5)
my_vector_2 <- c(10, 10, 10, 10, 10)
# Now write some code to print out these two vectors in the R console. Run the code and see what happens.
my_vector # or print(my_vector)
my_vector_2 # or print(my_vector_2)
If you ask R to perform an operation on a vector, it will assume that you want to work with the whole vector, not just one of the numbers.
So if you want to multiply each number in my_vector by 100, then you can just write my_vector * 100. Try it in the code window below.
my_vector <- c(1, 2, 3, 4, 5)
# write code to multiply each number in my_vector by 100
my_vector * 100
Notice that when you do a calculation with a vector, you’ll get a vector of numbers as the answer, not just a single number.
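The same idea extends to calculations that involve two vectors of the same length. As a sketch, using the two vectors defined earlier on this page, R pairs up the values by position:

```r
my_vector <- c(1, 2, 3, 4, 5)
my_vector_2 <- c(10, 10, 10, 10, 10)

# Add the vectors position by position: 1 + 10, 2 + 10, and so on
my_vector + my_vector_2

# Multiply every value in my_vector by 100
my_vector * 100
```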
After you multiply my_vector by 100, what will happen if you print out my_vector? Will you get the original vector (1,2,3,4,5), or one that has the hundreds (100,200,300,400,500)? Try running this code to see what happens.
# Run the code below to see what happens
my_vector <- c(1,2,3,4,5)
my_vector * 100
# This will print out my_vector
my_vector
Remember, R will do the calculations, but if you want something saved, you have to assign it somewhere. Try writing some code to compute my_vector * 100 and then assign the result back into my_vector. If you do this, it will replace the old contents of my_vector with the new contents (i.e., the product of my_vector and 100).
# This creates `my_vector` and stores 1, 2, 3, 4, 5 in it
my_vector <- c(1, 2, 3, 4, 5)
# Now write code to save `my_vector * 100` back into `my_vector`
my_vector <- my_vector * 100
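If you would rather keep the original values, one option is to save the result under a different name instead of overwriting. The name my_vector_hundreds below is made up for this sketch.

```r
my_vector <- c(1, 2, 3, 4, 5)

# Save the result in a new object; my_vector itself is unchanged
my_vector_hundreds <- my_vector * 100

my_vector
my_vector_hundreds
```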
There may be times when you just want to know one of the values in a vector, not all of the values. You can pick out a single position in a vector by putting the position number in square brackets after the vector’s name, like this: [1]. So if we wanted to print out the contents of the first position in my_vector, we could write my_vector[1].
my_vector <- c(1, 2, 3, 4, 5)
my_vector <- my_vector * 100
# Write code to get the 4th value in my_vector
my_vector[4]
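Here is a short sketch of indexing with the vector from the exercise above. The last line, which asks for two positions at once by putting a vector of positions inside the brackets, goes slightly beyond what this page covers and is included only as an illustration.

```r
my_vector <- c(100, 200, 300, 400, 500)

# First position
my_vector[1]

# Fourth position
my_vector[4]

# First and fourth positions together
my_vector[c(1, 4)]
```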
Many functions will take in a vector as the input. For example, try using sum() to total up the five values saved in my_vector. Note that we have already saved some values in my_vector for you.
my_vector <- c(100, 200, 300, 400, 500)
# Use sum() to total up the values in my_vector
sum(my_vector)
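sum() is just one example. As a sketch, a few other built-in R functions that also accept a whole vector are shown below; length(), mean(), and max() are not introduced on this page and appear here only for illustration.

```r
my_vector <- c(100, 200, 300, 400, 500)

# Total of all five values
sum(my_vector)

# How many values the vector contains
length(my_vector)

# The average of the values
mean(my_vector)

# The largest value
max(my_vector)
```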
We will learn about other R objects that help us organize and visualize data as we go along in the class.