1  R, RStudio, and Quarto

Learn R live!

Prefer to learn via live instruction? Register for my Introduction to R for Data Analysis seminar via Instats on January 15-16 2025.

1.1 What is R?

R is a “programming language” that once we can speak (write) it, we can use it to tell our computer to do things with our data. Just like learning a real language, learning a programming language involves learning an entirely new vocabulary along with the grammar rules that glues it all together. .

Many of you may be familiar with other software programs, like Excel, SAS, SPSS, STATA, or MATLAB. The main issues with these software programs, is that they are proprietary and they cost money.

R, however, is what is called an “open source” programming language. This means that it is completely free, and also that anyone can be an R developer. The result is that there are massive diverse communities of people who have come together to contribute to the R programming language, helping turn it into the powerful programming tool that it is today.

1.2 What is RStudio?

The RStudio “IDE”, which is the computer application that most people use to write their R code (RStudio is to R code what a Word Doc is to text).

Positron

I should probably note here that Posit, the company behind RStudio, has recently developed a new application or “IDE” called Positron that will likely eventually supersede RStudio, but Positron still in “Beta” mode and, for now, my recommendation is to stick with RStudio. Rest assured, I’ll update this page once I feel like Positron is ready for general use.

1.3 Downloading R and RStudio

If you will be using R “locally” (i.e., on your own computer, rather than in the cloud), then you will need to download R from the CRAN website (https://cran.rstudio.com/). While this will download an “R” application in which you can technically write R code, I recommend that you instead write your R code inside a separate desktop application called RStudio. You can download RStudio from the Posit website https://posit.co/downloads/.

Updating R and RStudio

Even if you already have R and RStudio on your computer, I recommend that you re-download them to ensure that you have the latest versions.

It is good practice to keep up-to-date with the latest versions. In general, I recommend actively re-installing the latest new versions of R and RStudio at least every 6 months or so.

Alternatively, if you prefer not to (or cannot) download applications from the web onto your computer, you can use R and RStudio directly in your web browser with Posit cloud.

1.4 A tour of RStudio

Since we will be using R inside RStudio, let’s start with a quick tour of RStudio.

Whether you’re using RStudio “locally” on your own computer, or in the cloud, when you open RStudio, it should look something like this:

If you go ahead and select File > New File > Quarto Document, type “My first R” in the “Title” bar and your name in the “Author” bar, then your RStudio should look something like this:

Congratulations, you’ve just created your first R document (the top-left panel that just appeared in your RStudio application)! Technically, this is called a quarto document, but you can think of it like a Word document in which you are going to write both regular text and R code text to tell a data-driven story! We will talk about the contents of this document in a moment.

If you’ve used RStudio before, you might have re-arranged the four panels that you see in the image above, but your version should have the same general features as in the image above:

  • A document panel (the top-left panel in the image), which is where the document that you’re currently writing in lives.

  • A console panel (The bottom-left panel in the image), which is where we can run the code that we write.

  • An environment panel (the top-right panel in the image), which will show the “objects” that exist in your R environment. We haven’t run any code yet, so this is empty.

  • The files panel (which is also the plot panel and viewer panel), which shows the files in the current local “directory” (the folder on your computer). When you first open RStudio, this is typically your computer’s home page.

Note that the size of each panel can be changed by dragging the border between two adjacent panels.

The most important panels at the moment are the documents and console panels on the left, so let’s take a closer look at these.

1.5 Quarto documents

The top-left documents panel contains the document that you’re currently working in. There are several types of documents that you could create in which you could write your R code, but I recommend using quarto documents.

Quarto versus R Markdown

If you’re already familiar with R Markdown, quarto is just its modern successor and at first glance, quarto is pretty much the same as R Markdown, with minor syntactic differences. Don’t worry, if you already have a bunch of old R Markdown documents, there isn’t much to be gained by converting them to quarto files, but I’d recommend that any new files you create be quarto files rather than R Markdown. If you’ve never heard of R Markdown, feel free to forget I even mentioned it.

Quarto documents allow you to combine text and code, so that rather than having your R code be lonely on its own, your code (and its output) can instead live nestled in between narrative text that describes the analysis that you’re conducting and summarizes the results you obtain.

Quarto documents are mind-blowingly versatile, and while they are mostly used to create simple html or pdf documents, they can also be used to make websites (like mine) and books (like this one!)

Whenever I am conducting an analysis using R, I conduct my analysis in a quarto document. In fact, every chapter in this book is just a quarto document!

Since we want to practice reproducible data science, it is important that we keep detailed records of the code we wrote that led us to our data-driven answers. Quarto provides us with an easy way of doing that. Moreover, because you can surround your code with text narrative, it can be used to communicate your analysis and results to other people: quarto lets us feed two birds with one seed!

To start a new quarto document inside RStudio:

  • Hit the “New file” icon, New file icon, in the top-left-hand corner of RStudio (or File > New File > Quarto Document) and select “Quarto document”. The following window should pop up:

Then

  • Choose a title (e.g., “An analysis of global life expectancy”), and make yourself the author.

  • Hit the “Create” button to create your file.

This will open up a new quarto template document in the documents panel.

The template text in your new quarto document contains a very summary of how quarto documents work. Take a moment to read it.

The regular white-background text is just text like in a Word Document. The text at the top of the quarto document surrounded by dashes --- is called the YAML header, provides several parameters and options for your quarto document, such as the title, the author, and the rendered output format (more on this in a moment).

The grey boxes with {r} are called “code chunks” and these are where we will write your R code.

1.5.1 Rendering quarto documents

Note the instructions in the template quarto document “When you click the Render button a document will be generated that includes both content and the output of embedded code.” Take the document’s advice: click the “Render” button with a blue arrow, which is circled in orange below and save your document when prompted as “analysis.qmd” or something like that:

Hopefully, what happened when you hit “Render” is that some text appeared very quickly in your bottom-left “console” panel and your web browser opened up with a new (html) webpage with whatever title you provided that looks something like this:

If you’re using RStudio in the cloud (or you have different settings to me), you may have instead found that the window opened in the “Viewer” panel of your RStudio application. If no window opened anywhere, navigate to the location on your computer where you saved your quarto document (“analysis.qmd”) and see if a new file “analysis.html” has appeared. If it has, open it in your web browser.

What happened when we hit the “Render” button? Hitting “Render” renders your interactive quarto (analysis.qmd) document as a static HTML (analysis.html) file. This is like saving your interactive word document file that you can edit as a static pdf file that you cannot edit.

Compare the original quarto (analysis.qmd) document in RStudio with the rendered page (analysis.html) in your web browser. What differences do you notice? Which one can you modify?

Rendering quarto as PDF and Microsoft Word documents

By default, quarto documents will be rendered as HTML files, but they can also be rendered to PDF and Microsoft Word files! You can do this by changing the format: html in the “yaml” text at the top of your quarto document (right underneath the “title” and “author” definitions) to format: pdf or format: docx, respectively.

However, note that to render a quarto document to a PDF file, you will need to have an application called LaTeX installed on your computer (see the exercise below).

If you switched to format: pdf, we recommend switching back to format: html for the rest of this lesson.

1.5.2 “Visual” mode versus “Source” mode

There are currently two modes that your interactive quarto document (i.e., the version in RStudio, not the rendered HTML document) can be in.

The “analysis.qmd” file in “visual mode” looks like this:

If your quarto document is in visual mode, it will be a lot more like a Word Document, where you will see boldface text, headings, italics, links, etc.

In this visual mode format, much of the underlying quarto syntax is hidden from you.

Alternatively, if you view this same “analysis.qmd” quarto document in “Source” mode, you will be looking at the underlying quarto (markdown) syntax. The “analysis.qmd” file in source mode looks like this:

Notice that there is no boldface text, or headings, etc. Instead there are raw text symbols to represent these things. For instance, surrounding text by two asterisks (**) creates boldface text and preceding some text with pound symbols (##) creates headings (the more pound symbols used, the more lower-level the heading). This text syntax is called “Markdown” syntax.

Can you identify whether your quarto document is currently in source or visual mode?

You can toggle your quarto document between source and visual mode using the “Source” and “Visual” buttons in the top-left corner of your quarto document in RStudio.

Whether you prefer source or visual mode will come down to a personal preference. I personally prefer working with the source mode where I can see the underlying Markdown syntax that defines the text formatting and R code chunks, but I know many people who prefer the visual mode.

1.5.2.1 Markdown syntax

In case you’re interested in learning a little bit more about Markdown syntax, switch your document over to the “Source” mode by hitting “Source” in the top left corner of your document.

Re-render your document by hitting the “Render” button, and based on the rendered static html page that will open in your web browser, let’s try to make some sense of the Markdown syntax used in the original quarto (.qmd) document in RStudio.

Can you see what the ## syntax is doing (if you can’t see the ## syntax in your quarto document in RStudio, ensure that you are viewing the quarto document using “Source” rather than “Visual” in the top-right corner of the document)? The pound symbols are markdown syntax for creating headers: # will create a top-level header (this is the same level as the overall document title), ## will create a level-2 header, ### will create a level-3 header, etc.

Notice that the word “Render” is shown in bold in the rendered html file. By looking at the .qmd file, can you figure out what the markdown syntax is for creating bold-face text?

To learn more about markdown syntax, see https://www.markdownguide.org/basic-syntax/.

If you want to play around with the Markdown formatting syntax, add some additional markdown features to your analysis.qmd file (E.g., a sub-section heading, some italics, or extra bold text), and re-render your quarto html output by hitting the “Render” button. Take note of how the changes you made were rendered in the static HTML version of your document.

1.6 Where to write your code

1.6.1 Writing R code in code chunks in a quarto document

I recommend that you write 99% of your R code in a quarto document, specifically, your R code should live in the grey boxes with {r}–these are called “code chunks”.

Hopefully when you were comparing your interactive quarto document with the rendered HTML document, one difference that you noticed that the rendered document also showed the “output” of the two R code chunks, which contained the R code 1 + 1 and 2 + 2 (the output of these two code chunks were 2 and 4, respectively.)

The image below shows how this interactive code chunk looks in the quarto document (in source mode):

```{r}
1 + 1
```

Which you can compare with the following image that shows how the static code chunk looks in the rendered HTML document:

When a quarto document is rendered into a HTML document, the code in each of the code chunks is compiled and the result or “output” (in this case the result is 2) is printed underneath the chunk.

You can provide “options” to code chunks, which are specified with a point symbol followed by a vertical bar on the first line(s) of the quarto chunk #|, such as #| echo: false, which will tell quarto that it should hide the R code, but still show the output. So if I have this in my interactive quarto document in RStudio:

```{r}
#| echo: false
1 + 1
```

I will only see this in the rendered HTML document (i.e., the 1 + 1 code is hidden):

[1] 2

1.6.2 Writing and running R code in the console

Admittedly, it would be really annoying if every time you wanted to see the output of your code, you had to render the entire quarto document and look at the output in the HTML document. No one has that much patience.

Fortunately, you can check the output of your code by running it in the console panel, which is usually directly underneath your quarto document.

If you click on the console panel (and scroll down to the bottom), you should see an arrow > with a blinking cursor | symbol. This means that the console is waiting for you to type your code.

In your console in RStudio, after the > symbol, type 2 + 2 and then hit return (Enter).

Your console should have produced the output/result of your code (4) underneath, and a new arrow > with a blinking cursor should have appeared underneath, indicating that R is ready for some more code like this:

So if you can just write your code directly in the console, why bother with the quarto document at all?

The problem with only ever writing your R code in the R console is that once you quit RStudio, there will be no record of the code that you ran.

1.6.3 Best practices: quarto vs the console

Best practices for writing and saving R code involves writing your code in R chunks within a quarto document, and then running that code in the console (and then later rendering your quarto document after you’ve written a bunch of code).

This might sound convoluted, but fortunately, you don’t have to write your code in two places. Once you write some code in a code chunk in your quarto document, you will notice a green “play” (right arrow) button at the right end of the code chunk. If you hit that, you will see one of two things happen:

  1. Chunk output inline: The output of your code will appear directly underneath your code chunk inside your interactive quarto document.

  1. Chunk output in console: Your code be magically transported to the console, where it will be automatically run and the output will be shown in the console.

There is also a nice keyboard shortcut for running the line of code on which your cursor is: Command+Enter on a Mac or Ctrl+Enter on Windows and Linux.

You can change the settings for where your output appears by selecting the settings button (circled in the image below) and choosing “Chunk output inline” for the first option or “chunk output in console” for the second option.

I strongly prefer the second option of “chunk output in console”, but R users are notoriously divided on where they like their output to appear (inline or in the console).

My suggestion is try both options for a few hours and see which one sparks joy.

Regardless of where your output appears (inline or in the console), I strongly suggest that you write all your code in a quarto document (rather than directly in the console). Writing your code in a quarto document will save your code and results in a reproducible way AND let you create a document that you can use to communicate your findings to other people.

1.6.4 Creating new code chunks in quarto

In a quarto document you can open a new R code chunk by writing three backticks followed by the letter “r” inside some curly parentheses and you need to “close” the code chunk with three backticks. Anything between these two lines (inside the grey box) is treated as R code, and anything outside these two lines is regular text.

```{r}


```

Since it’s a pain to type all that every time you want to create a new code chunk, you can use one of two shortcuts to create a new code chunk:

  1. Hit the “New Chunk” button, New file icon in RStudio, or

  2. Use the Option+Command+i keyboard shortcut.

Give it a go yourself: In your quarto document, create three new code chunks, one in each of the three ways described above. Conduct some basic mathematical calculations in each code chunk and run the code using any of the approaches described in the previous section (your output should appear either inline or in the console, depending on the settings). Then render your quarto document and look at the html output.

Next, add a chunk option to hide the code #| echo: false to one of your chunks (this echo: false chunk option must go on the first line of the chunk) and re-run your code (nothing should be different) and re-render your document and look at the output. What has changed?

Common issue: I can’t find my rendered document

If you can’t find your rendered HTML document, find the location where you saved your quarto document and find the .html file with the same name. Open this file in your web browser – this is your rendered document. If it doesn’t update, when you render your quarto document, there might be an error in your quarto document (check the “Background Jobs” tab in the console panel to see if there are any error messages).

Common issue: I can’t run my code

R is very particular about the syntax for the code chunks. There must be no spaces in front of the backticks that define the code chunk and your chunk options that start with #| must be on the first line of the code chunk directly under the backticks and not have any spaces before it.

If your code shortcuts don’t work, start super hard at your code chunk and see if there is anything wrong!