Chapter 1 ■ Introduction to R Programming
The function(x) x**2 expression defines the function, and anywhere you need a function, you can
write the function explicitly like this. Assigning the function to a name lets you use the name to refer to the
function, just like assigning any other value, like a number or a string to a name, will let you use the name for
the value.
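As a small sketch of that point, the same function expression can be bound to a name or written inline where a function is expected. The use of sapply here is an assumption for illustration; it is a base R function that applies a function to each element of a vector and has not been introduced in the text so far.

```r
# Bind the function to a name and call it through the name.
square <- function(x) x ** 2
square(1:5)
## [1]  1  4  9 16 25

# Write the function expression directly where a function is expected.
sapply(1:5, function(x) x ** 2)
## [1]  1  4  9 16 25

# A function expression can also be called on the spot.
(function(x) x ** 2)(3)
## [1] 9
```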
Functions you write yourself work just like any function already part of R or part of an R package, with
one exception: you will not have documentation for your own functions unless you write it, and that
is beyond the scope of this chapter (but covered in Chapter 11).
The square function just does a simple arithmetic operation on its input. Sometimes you want the
function to do more than a single thing. If you want the function to do several operations on its input, you
need to give it a “body” of several statements, and such a body has to go in curly brackets.
square_and_subtract <- function(x, y) {
  squared <- x ** 2
  squared - y
}
square_and_subtract(1:5, rev(1:5))
## [1] -4 0 6 14 24
(Check the documentation for rev to see what is going on here. Make sure you understand what this
example is doing.)
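One way to check your understanding is to evaluate the pieces one at a time. The sketch below repeats the function definition so that it runs on its own; the variable names x and y are chosen here only for illustration.

```r
square_and_subtract <- function(x, y) {
  squared <- x ** 2
  squared - y
}

x <- 1:5
y <- rev(x)        # rev reverses the order of the elements
y
## [1] 5 4 3 2 1

squared <- x ** 2  # element-wise squaring
squared
## [1]  1  4  9 16 25

squared - y        # element-wise subtraction, pairing elements by position
## [1] -4  0  6 14 24
```

The last line matches what square_and_subtract(1:5, rev(1:5)) returns.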
In this simple example, we didn’t really need several statements. We could just have written the
function as:
square_and_subtract <- function(x, y) x ** 2 - y
As long as there is only a single expression in the function, we don’t need the curly brackets. For more
complex functions you will need them, though.
The result of a function, that is, what it returns as its value when you call it, is the value of the last
statement or expression it evaluates (there really isn’t any difference between statements and expressions in
R; they are the same thing). You can make the return value explicit, though, using the return() expression.
square_and_subtract <- function(x, y) return(x ** 2 - y)
This is usually only used when you want to return a value before the end of the function, so it isn’t used
as much as in many other languages. To see examples of this, you really need control structures, so you will
have to wait a little bit.
One important point here, though, if you are used to programming in other languages: return is a function
in R, so the return() expression needs to include the parentheses. In many programming languages, you could
just write:
square_and_subtract <- function(x, y) return x ** 2 - y
This doesn’t work in R. Try it, and you will get an error.
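If you want to see the failure without interrupting a script, one option is to hand the code to the parser as text and catch the error. This is only a sketch: the variable names bad, good, and result are made up for the example, and the string "syntax error" stands in for the actual error message R reports.

```r
# Parse the code as text so the syntax error can be caught and inspected.
bad <- "square_and_subtract <- function(x, y) return x ** 2 - y"
result <- tryCatch(
  parse(text = bad),
  error = function(e) "syntax error"
)
result
## [1] "syntax error"

# With the parentheses in place, the same line parses without complaint.
good <- "square_and_subtract <- function(x, y) return(x ** 2 - y)"
is.expression(parse(text = good))
## [1] TRUE
```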
Vectorized Expressions and Functions
Many functions work with vectorized expressions just as arithmetic expressions do. In fact, any function you
write that is defined just using such expressions will work on vectors, just like the square function.
This doesn’t always work. Not all functions take a single value and return a single value, and in those
cases, you cannot use them in vectorized expressions. Take for example the function sum, which adds all the
values in a vector you give it as an argument (check ?sum now to see the documentation).
sum(1:4)
## [1] 10
This function summarizes its input into a single value. There are many similar functions, and naturally,
these cannot be used element-wise on vectors.
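The difference is easy to see by comparing the length of the results. This sketch redefines square so that it runs on its own.

```r
square <- function(x) x ** 2

# An element-wise function returns one value per input element.
square(1:4)
## [1]  1  4  9 16
length(square(1:4))
## [1] 4

# A summarizing function collapses the whole vector into one value.
sum(1:4)
## [1] 10
length(sum(1:4))
## [1] 1
```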
Whether a function works on vector expressions or not depends on how it is defined. Most functions in
R either work on vectors or summarize vectors like sum. When you write your own functions, whether the
function works element-wise on vectors or not depends on what you put in the body of the function. If you
write a function that just does arithmetic on the input, like square, it will work in vectorized expressions. If
you write a function that does some summary of the data, it will not. For example, if we write a function to
compute the average of its input like this:
average <- function(x) {
  n <- length(x)
  sum(x) / n
}
average(1:5)
## [1] 3
This function will not give you values element-wise. Pretty obvious, really. It gets a little more
complicated when the function you write contains control structures, which we will get to in the next
section. In any case, this would be a nicer implementation since it only involves one expression:
average <- function(x) sum(x) / length(x)
Oh, one more thing: don’t use this average function to compute the mean value of a vector. R already
has a function for that, mean, that deals much better with special cases like missing data and vectors of length
zero. Check out ?mean.
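To illustrate the missing-data case, here is a short sketch comparing the two on a vector containing NA. The na.rm argument is part of mean (and of sum); the vector x is made up for the example.

```r
average <- function(x) sum(x) / length(x)

x <- c(1, NA, 3)

# A single missing value makes the home-made average missing as well.
average(x)
## [1] NA

# mean gives the same result by default, but can be told to drop missing values.
mean(x)
## [1] NA
mean(x, na.rm = TRUE)
## [1] 2

# Dropping NA only inside sum is not enough: length(x) still counts the NA.
sum(x, na.rm = TRUE) / length(x)
## [1] 1.333333
```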
A Quick Look at Control Structures
While you get very far just using expressions, for many computations you need more complex programming.
Not that it is particularly complex, but you do need to be able to choose what to do based on data (selection,
or if statements) and to iterate through data (looping, or for statements).
If statements work like this:
if (<boolean expression>) <expression>
If the Boolean expression evaluates to true, the expression is evaluated; if not, it is skipped.
# this won't do anything
if (2 > 3) "false"
# this will
if (3 > 2) "true"
## [1] "true"
For expressions like these, where evaluating the expression does not alter the program state, there
isn’t much of an effect in evaluating the if expression. If the expression is, for example, an assignment to a
variable, there will be an effect.
x <- "foo"
if (2 > 3) x <- "bar"
x
## [1] "foo"
if (3 > 2) x <- "baz"
x
## [1] "baz"
If you want something evaluated both when the condition is true and when it is false, add an else part:
if (<boolean expression>) <true expression> else <false expression>
if (2 > 3) "bar" else "baz"
## [1] "baz"
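If you want newlines in if statements, it is safest to use curly brackets. R treats a statement as finished as
soon as it can be, so at the top level an else that starts a new line is no longer attached to its if. (A single
statement on the line after the condition still parses, because the if is incomplete without it.) This won’t work:
if (2 > 3) x <- "bar"
else x <- "baz"
But this will:
if (2 > 3) {
  x <- "bar"
} else {
  x <- "baz"
}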
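How line breaks interact with else can be checked by parsing code as text. The following is only a sketch: the names top_level, outcome, and pick are hypothetical, and the string "syntax error" stands in for the real error message.

```r
# At the top level, the if on the first line is already a complete statement,
# so the parser does not accept an else at the start of the next line.
top_level <- 'if (2 > 3) "bar"\nelse "baz"'
outcome <- tryCatch(parse(text = top_level), error = function(e) "syntax error")
outcome
## [1] "syntax error"

# Inside curly brackets, such as a function body, the parser keeps reading
# until the closing bracket, so else may start a new line there.
pick <- function(condition) {
  if (condition) "bar"
  else "baz"
}
pick(FALSE)
## [1] "baz"
```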
An if statement works like an expression.
if (2 > 3) "bar" else "baz"
This evaluates to the result of the expression in the if or the else part.
x <- if (2 > 3) "bar" else "baz"
x
## [1] "baz"
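Because an if statement has a value, it can serve directly as the body of a function. The sketch below uses a hypothetical helper, describe_sign, and also shows what the value is when the condition is false and there is no else part.

```r
# The if-else expression is the only expression in the body,
# so its value is what the function returns.
describe_sign <- function(x) if (x < 0) "negative" else "non-negative"
describe_sign(-2)
## [1] "negative"
describe_sign(5)
## [1] "non-negative"

# Without an else part, a false condition makes the if expression NULL.
y <- if (2 > 3) "bar"
is.null(y)
## [1] TRUE
```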