Functions in R

This article is a short introduction to how functions work in R and how we can apply them across a set of values. We will explain what parameters and return statements are, how functions are called, and, most importantly, how to define our own functions. We also introduce the apply() family of functions, which let us call a function on every element of a data structure.

A function is a named block of code that performs a specific task. By calling the function by its name, we execute the code inside it. We can call the function over and over again with different arguments (inputs). The point of a function is to make code reusable, modular, and easier to read.

R comes with many built-in functions that make data analysis easier. For example, the function mean() takes a numeric R object and returns the average of its values. If we had a numeric vector called measurements and wanted the average of all those measurements, we could call mean(measurements).

measurements <- c(30, 67, 40, 44, 25, 31)
mean(measurements)
# output: [1] 39.5

Other basic built-in R functions include seq(), c(), and matrix(). There are also functions that come with add-on packages such as dplyr. Each function has its own name and expected arguments. Each function also has a definition, which we can look up in the R documentation, for example by typing ?mean or help(mean) in the R console.
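To make the built-in functions named above a little more concrete, here is a minimal sketch of what each one does. The variable names (evens, scores, grid, matrix_params) are made up for illustration and do not come from any package.

```r
# seq() builds a regular sequence of numbers: from 2 to 10 in steps of 2
evens <- seq(2, 10, by = 2)

# c() combines individual values into a single vector
scores <- c(90, 85, 72)

# matrix() arranges values into rows and columns;
# by default the values are filled in column by column
grid <- matrix(1:6, nrow = 2, ncol = 3)

# formals() lists the parameters a function expects;
# here we keep only the parameter names of matrix()
matrix_params <- names(formals(matrix))

print(evens)
print(scores)
print(grid)
print(matrix_params)

# In an interactive R session, ?matrix or help(matrix) opens the help page
```

Printing each object is a quick way to check what a function returned before using the result in further calculations.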

Defining a Function

We can define our own functions like so:

function_name <- function(parameter_1, parameter_2, ...) {
  # do something with parameter_1 and parameter_2
  return(some_value)
}

A parameter is the name of an expected input in the function’s definition. Inside the function body, the parameter name acts as a placeholder for whatever value is passed in. The return statement ends the function and hands a value back to the caller as the function’s output. If there is no return statement, R returns the value of the last expression evaluated in the function. See the example below, where we define a function that converts feet to meters.

First, let’s choose a descriptive name for the function, such as feet_to_meters. We expect the function to take one number representing a length in feet, so let’s call that parameter feet. Inside the function, we take the value of feet and multiply it by 0.3048 to get the length in meters. The function returns that value, meters.

feet_to_meters <- function(feet) {
  meters <- feet * 0.3048
  return(meters)
}

Once the function has been defined, we can call it as many times as we like from the same file. The values that we pass into a function are called arguments. Below, 30 and 40 are arguments.

feet_to_meters(30)
# output: [1] 9.144
feet_to_meters(40)
# output: [1] 12.192
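The distinction between parameters and arguments can be explored a little further. The sketch below repeats the feet_to_meters definition so it runs on its own, then passes the same argument by position and by name, and finally passes a whole vector. The variable names by_position, by_name, and heights_m are made up for illustration.

```r
# Definition repeated from above so this snippet runs on its own
feet_to_meters <- function(feet) {
  meters <- feet * 0.3048
  return(meters)
}

# Positional argument: 30 is matched to the parameter `feet` by its position
by_position <- feet_to_meters(30)

# Named argument: the same call with the parameter name spelled out
by_name <- feet_to_meters(feet = 30)

# Vector argument: `*` works element by element in R,
# so every value in the vector is converted in one call
heights_m <- feet_to_meters(c(30, 40))

print(by_position)
print(by_name)
print(heights_m)
```

Because arithmetic in R is element-wise, a simple function like this one already handles several inputs at once, without a loop.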