1.5 Save Your Work In R Objects

Have you ever forgotten to save your work? It’s a terrible feeling. Saving your work matters in R, too. In R, we don’t just run calculations and look at the results in the R console; we usually save the results somewhere we can find them later.

Pretty much anything, including the results of any R function, can be saved in an R object. This is accomplished by using an assignment operator, which looks kind of like an arrow (<-). You can make up any name you want for an R object. Most combinations of upper case letters, lower case letters, numbers, or even a period or underscore can be used in the names of R objects, so long as you start the name with a letter.
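As a rough illustration of these naming rules, the sketch below saves a few values under made-up names. None of the names come from the course; they are only there to show which characters are allowed.

```r
# Each of these names starts with a letter, so R accepts it.
total_score <- 10   # lower case letters and an underscore
total.score <- 20   # a period is allowed too
Score2 <- 30        # upper case, lower case, and a number

# The result of a function can be saved in the same way.
root_of_16 <- sqrt(16)

# A name such as 2score would be rejected, because it starts with a number.
```

Note that total_score and total.score are two separate objects; the underscore and the period are different characters.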

Here’s a simple example to show how it’s done. Let’s make up a name for an R object; we will call it my_favorite_number. Then let’s think of what our favorite number is (say, 20), and save it in the R object. Go ahead and run the code below to see how this works.

# This code will assign the number 20 to the R object my_favorite_number
my_favorite_number <- 20

# you can revise the code to use your actual favorite number (if it's not 20).

Notice that after you run the code my_favorite_number <- 20, nothing is printed. That’s because you saved the number 20 in my_favorite_number, but you didn’t tell R to print it out. Go back and add this line of code to the window above, then run it again:

my_favorite_number

Now it not only saves your favorite number, but prints it out. Notice that you don’t need to use the print() function to print the contents of an R object; you can just type the name of the object.
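The sketch below puts the two ways of printing side by side. The last line uses a shortcut that this page does not cover: wrapping an assignment in parentheses, which saves and prints in one step.

```r
my_favorite_number <- 20     # saves 20 and prints nothing

my_favorite_number           # typing the name prints the contents
print(my_favorite_number)    # print() produces the same output

# Not covered on this page: parentheses around an assignment
# save the value and print it in one step.
(my_favorite_number <- 20)
```

In the R console, each of the last three lines prints [1] 20.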

Now remember, R is case sensitive. Try assigning 5 to num and 10 to NUM.

# Assign 5 to num and 10 to NUM
num <- 5
NUM <- 10

# Write the name of the object that contains 10 and then press the <Run> button
# Doing so prints out the contents of that object
NUM
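To see that R really treats num and NUM as two separate objects, a short check like the one below can help. It is only a sketch; identical() is a base R function that this page does not introduce.

```r
num <- 5
NUM <- 10

num                   # prints 5
NUM                   # prints 10
identical(num, NUM)   # FALSE: the two names hold different values
num + NUM             # 15: both objects can be used together

# Typing Num (a third spelling that was never assigned) would give
# an "object not found" error in a fresh R session.
```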

NOTE: When you save an R object in one of the code windows, it is saved only until you leave the page. If you reload the page later, it won’t be there.

Vectors

So far, we’ve used R objects to store a single number. But in statistics we are dealing with variation, which by definition involves more than one number, and sometimes many. An R object can also store a whole set of numbers, called a vector. You can think of a vector as a list of numbers (or values).

The R function c() can be used to combine a list of individual values into a vector. You could think of the “c” as standing for “combine.” So in the following code we have created two vectors (we just named them my_vector and my_vector_2) and put a list of values into each vector.

# Here is the code to create two vectors my_vector and my_vector_2. We just made up those names.
# Run the code and see what happens
my_vector <- c(1, 2, 3, 4, 5)
my_vector_2 <- c(10, 10, 10, 10, 10)

# Now write some code to print out these two vectors in the R console. Run the code and see what happens.
my_vector    # or print(my_vector)
my_vector_2  # or print(my_vector_2)
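Because c() combines whatever it is given, it can also join whole vectors end to end. The sketch below assumes the same two vectors as above; the name combined is made up for illustration, and length() is a base R function not introduced on this page.

```r
my_vector <- c(1, 2, 3, 4, 5)
my_vector_2 <- c(10, 10, 10, 10, 10)

# c() accepts vectors as well as single values.
combined <- c(my_vector, my_vector_2)

combined           # the five values of my_vector, then the five 10s
length(combined)   # 10: the number of values stored in the vector
```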

If you ask R to perform an operation on a vector, it will assume that you want to work with the whole vector, not just one of the numbers.

So if you want to multiply each number in my_vector by 100, then you can just write my_vector * 100. Try it in the code window below.

my_vector <- c(1, 2, 3, 4, 5)

# write code to multiply each number in my_vector by 100
my_vector * 100

Notice that when you do a calculation with a vector, you’ll get a vector of numbers as the answer, not just a single number.
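The same whole-vector behavior applies to other operators, and to calculations that involve two vectors of the same length. The sketch below is an illustration under that assumption, reusing the two vectors defined earlier on this page.

```r
my_vector <- c(1, 2, 3, 4, 5)
my_vector_2 <- c(10, 10, 10, 10, 10)

my_vector + 1             # adds 1 to every value: 2 3 4 5 6
my_vector + my_vector_2   # adds position by position: 11 12 13 14 15
my_vector * my_vector_2   # multiplies position by position: 10 20 30 40 50
```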

After you multiply my_vector by 100, what will happen if you print out my_vector? Will you get the original vector (1,2,3,4,5), or one that has the hundreds (100,200,300,400,500)? Try running this code to see what happens.

# Run the code below to see what happens
my_vector <- c(1, 2, 3, 4, 5)
my_vector * 100

# This will print out my_vector
my_vector

Remember, R will do the calculations, but if you want something saved, you have to assign it somewhere. Try writing some code to compute my_vector * 100 and then assign the result back into my_vector. If you do this, it will replace the old contents of my_vector with the new contents (i.e., the product of my_vector and 100).

# This creates `my_vector` and stores 1, 2, 3, 4, 5 in it
my_vector <- c(1, 2, 3, 4, 5)

# Now write code to save `my_vector * 100` back into `my_vector`
my_vector <- my_vector * 100
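Assigning back to the same name is one option; saving the result under a new name is another, and it keeps the original values available. The sketch below contrasts the two. The name my_vector_x100 is made up for illustration.

```r
my_vector <- c(1, 2, 3, 4, 5)

# Saving under a new name leaves the original untouched.
my_vector_x100 <- my_vector * 100
my_vector        # still 1 2 3 4 5
my_vector_x100   # 100 200 300 400 500

# Assigning back to the same name replaces the old contents.
my_vector <- my_vector * 100
my_vector        # now 100 200 300 400 500
```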

There may be times when you want to know just one of the values in a vector, not all of them. We can index a position in the vector by using square brackets with a number inside them, like this: [1]. So if we wanted to print out the contents of the first position in my_vector, we could write my_vector[1].

# Setup: my_vector as it stands after the previous exercise
my_vector <- c(1, 2, 3, 4, 5)
my_vector <- my_vector * 100

# Write code to get the 4th value in my_vector
my_vector[4]
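Brackets can also pick out more than one position at a time. The sketch below goes slightly beyond this page: it puts a vector of positions inside the brackets, and it uses the colon operator (2:4), which stands for the positions 2, 3, and 4.

```r
my_vector <- c(100, 200, 300, 400, 500)

my_vector[1]         # first value: 100 (R counts positions starting at 1)
my_vector[c(1, 5)]   # first and fifth values: 100 500
my_vector[2:4]       # positions 2 through 4: 200 300 400
```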

Many functions will take in a vector as the input. For example, try using sum() to total up the five values saved in my_vector. Note that we have already saved some values in my_vector for you.

# Setup: these values are already saved in my_vector
my_vector <- c(100, 200, 300, 400, 500)

# Use sum() to total up the values in my_vector
sum(my_vector)
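A few other base R functions take a vector in the same way as sum(). The sketch below shows three that this page does not cover (mean(), max(), and length()), and saves one result in an object with a made-up name, total.

```r
my_vector <- c(100, 200, 300, 400, 500)

sum(my_vector)      # 1500: the total of the five values
mean(my_vector)     # 300: the average
max(my_vector)      # 500: the largest value
length(my_vector)   # 5: how many values there are

# Like any other result, the output of a function can be saved.
total <- sum(my_vector)
total
```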

We will learn about other R objects that help us organize and visualize data as we go along in the class.
