Data Cleaning in R

We want our data to have:

• Each variable as a separate column
• Each row as a separate observation

So we would want to reshape a table like:

| Account | Checking | Savings |
|---|---|---|
| “12456543” | 8500 | 8900 |
| “12283942” | 6410 | 8020 |
| “12839485” | 78000 | 92000 |

Into a table that looks more like:

| Account | Account Type | Amount |
|---|---|---|
| “12456543” | “Checking” | 8500 |
| “12456543” | “Savings” | 8900 |
| “12283942” | “Checking” | 6410 |
| “12283942” | “Savings” | 8020 |
| “12839485” | “Checking” | 78000 |
| “12839485” | “Savings” | 92000 |

We can use tidyr’s `gather()` function to do this transformation. `gather()` takes a data frame and the columns to unpack:

```r
library(tidyr)  # provides gather() and re-exports the %>% pipe
df %>%
  gather('Checking', 'Savings', key = 'Account Type', value = 'Amount')
```

The arguments you provide are:

• `df`: the data frame you want to gather, which can be piped into `gather()`
• `Checking` and `Savings`: the columns of the old data frame that you want to collapse; their names become the values of the new `key` column
• `key`: what to call the column of the new data frame that stores the old column names (here, `Checking` and `Savings`)
• `value`: what to call the column of the new data frame that stores the values from those columns
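As a minimal, self-contained sketch of the call described above, the following builds the example accounts table by hand and gathers it. The names `accounts` and `accounts_long` are assumptions made for illustration; they do not come from the lesson's data.

```r
library(tidyr)  # gather(); also re-exports the %>% pipe

# Hypothetical wide table matching the example above (one row per account)
accounts <- data.frame(
  Account  = c("12456543", "12283942", "12839485"),
  Checking = c(8500, 6410, 78000),
  Savings  = c(8900, 8020, 92000),
  stringsAsFactors = FALSE
)

# Collapse the Checking and Savings columns into key/value pairs
accounts_long <- accounts %>%
  gather('Checking', 'Savings', key = 'Account Type', value = 'Amount')

print(accounts_long)
```

Note that `gather()` stacks the gathered columns one after another, so all of the `Checking` rows come before the `Savings` rows. The result holds the same six observations as the target table, just in a different row order.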

### Instructions

1. The `students` data frame from the previous exercise has been loaded into the notebook for you. Save the column names to `original_col_names` and print it.

2. There is a column for the scores on the `fractions` exam, and a column for the scores on the `probability` exam.

We want to make each row an observation, so we want to transform this table to look like:

“First Student” “fractions” score%
“First Student” “probability” score%
“Second Student” “fractions” score%
“Second Student” “probability” score%

Use `gather()` to create a new table (still called `students`) that follows this structure. Then view the `head()` of `students`.

3. Save the column names of the updated `students` data frame to `gathered_col_names` and print it.

4. The dplyr function `count()` takes a data frame and a column as arguments and returns a table with counts of the unique values in the named column.

Find the count of each unique value in the `exam` column. Save the result to `exam_counts` and view `exam_counts`.
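To illustrate how `count()` behaves, here is a small sketch on a hand-built long-format accounts table rather than the exercise's `students` data. The data frame `accounts_long`, its column name `Account_Type`, and the result name `type_counts` are all assumptions made for this example.

```r
library(dplyr)  # count() and the %>% pipe

# Hypothetical long table: two rows (one per account type) for each account
accounts_long <- data.frame(
  Account = c("12456543", "12456543", "12283942",
              "12283942", "12839485", "12839485"),
  Account_Type = c("Checking", "Savings", "Checking",
                   "Savings", "Checking", "Savings"),
  Amount = c(8500, 8900, 6410, 8020, 78000, 92000),
  stringsAsFactors = FALSE
)

# One row per unique Account_Type; the counts are stored in a column named n
type_counts <- accounts_long %>%
  count(Account_Type)

print(type_counts)
```

Because each of the three accounts contributes one `Checking` row and one `Savings` row, both account types should show a count of 3.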