1  Intro to R

R is a programming language for statistical computation and data visualization. It is widely used by stacticians, data scientists, and researchers and has a rich ecosystem of packages and libraries to facilitate data manipulation, visualization, statistical modeling, and more. R is free to download, use, and modify.

R can be installed to your computer, but in class we will use R entirely in the browser. No installation necessary!

Let’s get started by looking at some code. Press the “Run Code” button or click inside the text-editor and press Ctrl or Cmd + Enter to execute the code. It may take a few seconds for your browser to load the R software.

How many times was a 2 rolled?

To create a bar graph, we can use the barplot() function:

Exercise 1.1  

  1. Recall that roll_count contains a table counting the number of 1s, 2s, 3s, etc. whereas dice_rolls contains each roll. Try replacing roll_count with dice_rolls in the bar graph. What does this bar graph show?
  2. Hit the “Start Over” button to reset the code box. Suppose after these first 20 rolls, you roll an additional 5, 5, 2, 1, 6. Add this to the dice_rolls variable and then run both code boxes again.
How many 5s have been rolled after adding these new rolls?

1.1 Syntax and Functions

You may have noticed some symbols like # or <- or c(...) in the code. These are called syntax and I will summarize new syntax and functions at the end of each section.

  • # denotes a comment, this is a term in programming to mean text that is not computed, but merely serves to describe the adjacent code. For instance, the following two lines are identical as far as R is concerned—the comment is only read by us and not by R.
1 + 1 # Computes 1 + 1
1 + 1
  • <- is called an assignment operator and is used to store previous values into a variable that can be used later on. For instance, we might write “if \(x = 3\) and \(y = 4\) then \(x * y = 12\).” In R, this looks like

Exercise 1.2 In the box above, let \(x = 3, y = 4, z = -2, w = 1\) and compute \(x + y * z - w\).

  • c(1, 2, 3, 4, ...) is R’s notation for a list. The c stands for “combine” or “collect.”
  • table(...) takes a list of data and returns a table counting the number of occurrences of each value.
  • barplot(...) creates a bar graph