Intro to Visualization with R
Associating the Data

Before we go any further, let’s stop to understand when the data gets bound to the visualization:

  • Data is bound to a ggplot2 visualization by passing a data frame as the first argument in the ggplot() function call. You can name the argument, as in ggplot(data=df_variable), or simply pass the data frame positionally, as in ggplot(df_variable).
  • Because the data is bound at this step, the rest of our layers, which are function calls we add with a + sign, all have access to the data frame and can use its column names as variables.
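
As a small illustration of the first point, the sketch below binds the same data frame both ways. The data frame df_variable and its columns a and b are made up for this sketch, and reading the bound data back through $data is an assumption about how the plot object exposes it, not something this lesson covers.

```r
library(ggplot2)

# Hypothetical data frame; the columns a and b are made up for this sketch
df_variable <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Named argument and positional argument bind the same data frame
named <- ggplot(data = df_variable)
positional <- ggplot(df_variable)

# Compare the data bound to each plot object
# (assumes the bound data is reachable as $data)
same_data <- identical(named$data, positional$data)
same_data
```

Naming the argument tends to read more clearly in longer calls, while the positional form is shorter; both should produce the same canvas.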

For example, assume we have a data frame sales with the columns cost and profit. In this example, we bind the data frame sales to the ggplot object as it is initialized:

viz <- ggplot(data=sales) +
  geom_point(aes(x=cost, y=profit))
viz # renders plot

In the example above:

  • The ggplot object, or canvas, was initialized with the data frame sales bound to it.
  • The subsequent geom_point layer mapped the cost and profit columns to the x and y axes for that particular geom. Notice that it simply referred to those columns by their column names.
  • We state the variable name of the ggplot object, viz, on its own line so we can see the plot.
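
To try the example end to end, here is a minimal self-contained sketch. The lesson only names the columns cost and profit, so the values in sales below are invented purely so the code can run.

```r
library(ggplot2)

# Made-up values; only the column names cost and profit come from the lesson
sales <- data.frame(
  cost = c(10, 20, 30, 40),
  profit = c(3, 8, 12, 21)
)

# Bind sales at the ggplot() step, then add a point layer that
# refers to the columns by name inside aes()
viz <- ggplot(data = sales) +
  geom_point(aes(x = cost, y = profit))

viz # renders plot
```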

Note: There are other ways to bind data if you want each layer to have a different dataset, but the most readable and popular approach is to bind the data frame at the ggplot() step and have your layers use data from that data frame.
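
One such alternative is passing a data argument directly to a layer. In the sketch below, the second data frame refunds and all of its values are hypothetical; it only illustrates that a geom can be given its own data frame while the other layers keep using the one bound at ggplot().

```r
library(ggplot2)

# Made-up data; refunds is a hypothetical second data frame
sales <- data.frame(cost = c(10, 20, 30), profit = c(3, 8, 12))
refunds <- data.frame(cost = c(15, 25), profit = c(-2, -5))

viz <- ggplot(data = sales) +
  # This layer uses sales, the data frame bound at the ggplot() step
  geom_point(aes(x = cost, y = profit)) +
  # This layer overrides the data with its own data frame
  geom_point(data = refunds, aes(x = cost, y = profit), color = "red")

viz # renders plot
```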

Instructions

1.

Create a new variable named viz and assign it a new ggplot object, created by calling ggplot() with the data frame movies as the data argument. After you’ve defined viz, state the variable name on a new line in order to see it.

Click run and watch your code render an empty canvas. Even though no data is displayed, the data is bound to the viz ggplot object!
