This section of my R manual introduces the most basic of the basics. It covers basic calculations, which remain the foundation of every refined statistical analysis, and then shows how to store data and how to use stored data in functions.
Calculations in R
R can be used as a fully functional calculator. When R is started, it shows some licensing information, followed by a prompt ( > ). When a command is typed and ENTER is pressed, R evaluates it and returns the outcome. Probably the most basic command that can be entered is a single number. Since it is not stated what to do with this number, R simply returns it. Some special numbers have names; when such a name is entered, the corresponding number is returned. Finally, besides numbers, R can handle text as well.
3
3 * 2
3 ^ 2
pi
apple
"apple"
In the box above, six commands were entered at the R prompt. These commands can be entered one by one, or pasted into the R console all at once. After they are entered one by one, the screen looks like this:
> 3
[1] 3
> 3 * 2
[1] 6
> 3 ^ 2
[1] 9
>
> pi
[1] 3.141593
> apple
Error: object "apple" not found
> "apple"
[1] "apple"
On the first row, after the prompt ( > ), we see our first ‘command’. R uses the row below to give us the result. The indicator [1] means that the first value printed on that row is the first element of the result. This may seem quite obvious (which it is), but it becomes very useful when working with larger sets of data that are printed over several rows.
The next few rows show the results of a few basic calculations. Nothing unexpected here. When pi is entered (in lower case, since R is case-sensitive), the expected number appears. This is because R has a set of these numbers available, called constants. When an unknown name is entered, an error message is given: R does not know a number, or anything else for that matter, called apple. When the word apple is enclosed in quotation marks (" "), it is treated as a character string and returned just like the numbers are.
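The points above about the [1] indicator, constants and character strings can be tried out with a few extra commands. This is only a small sketch: it assumes a fresh R session in which no objects named PI or apple have been created, and the exact row breaks of the first command depend on the width of your console.

```r
# A longer result: every new row starts with the index of its first value.
1:30

# pi is a built-in constant; the upper-case name PI is not defined in base R.
pi
exists("PI")      # FALSE, assuming you did not create PI yourself

# Quoted text is a character string; an unquoted name is looked up as an object.
is.character("apple")
exists("apple")   # FALSE, assuming no object called apple exists yet
```

The function exists() only checks whether a name is known to R, so it avoids the error message that typing an unknown name directly would produce.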
Combining values
In statistics, we tend to use more than just single numbers. R is able to perform calculations on sets of data in exactly the same way as with single numbers. This is shown below. Several values can be combined into one ‘unit’ by using the c() command. The c stands for concatenate. So, as the first two commands below show, we can combine numbers as well as character strings. As said, we can use these ranges of data in our calculations. When we do so, shorter ranges of data are repeated (R calls this recycling) to match the length of the longest range of data.
c(3,4,3,2)
c("apple", "pear", "banana")
3 * c(3,4,3,2)
c(1,2) * c(3,4,3,2)
c(1,2,1,2) * c(3,4,3,2)
In the output below, we see that the first two commands simply return the created units of combined data. As above, we see the [1] indicator, even though four or three items are returned. This is because R shows the index of the first item on a row only. When we multiply the range of four numbers by a single number (3), each individual number is multiplied by that number. In the second-to-last command, two numbers are multiplied by four numbers, so the two numbers are repeated twice. The final command writes out that repetition explicitly, so the results of the last two commands are the same.
> c(3,4,3,2)
[1] 3 4 3 2
> c("apple", "pear", "banana")
[1] "apple" "pear" "banana"
>
> 3 * c(3,4,3,2)
[1] 9 12 9 6
> c(1,2) * c(3,4,3,2)
[1] 3 8 3 4
> c(1,2,1,2) * c(3,4,3,2)
[1] 3 8 3 4
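The repetition of the shorter range described above (often called recycling) works most cleanly when the longer length is a multiple of the shorter one. The sketch below, using made-up numbers in the same style as the examples above, shows what happens when it is not: R still repeats the shorter range, but adds a warning.

```r
# Length 2 against length 4: the shorter range is repeated exactly twice.
c(1, 2) * c(3, 4, 3, 2)

# Length 3 against length 4: the shorter range is repeated as 1, 2, 3, 1.
# R returns 3 8 9 2 and warns that the longer length is not a
# multiple of the shorter length.
c(1, 2, 3) * c(3, 4, 3, 2)
```

Such a warning is usually a sign that the two ranges were not meant to be combined in this way.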
Storing data
The results of our calculations can be stored in objects, often called variables. Using this capability saves us a lot of time typing in our data. It also allows for more complex calculations, as we will see later. We can assign single or multiple values to an object with the assignment operator: <- . This operator can be used in the opposite direction (->) as well. When only the name of an object that has a value assigned to it is entered at the console, its contents are shown.
x <- 3
x
x -> z
z
2*x
y <- c(3,4,3,2)
y
x*y
z <- c("apple", "pear", "banana")
z
The syntax above leads to the output shown below. In the first row, the value '3' is assigned to object x, which was unknown to R before. So, in many cases, objects do not have to be defined before data is assigned to them. When the object 'x' is entered at the console, the value that was assigned to it is returned. Next, the value of object 'x' is assigned to object 'z'. Note that the assignment operator points in the opposite direction here, but it functions in exactly the same way (except for the direction of assignment, of course).
Next, it is shown that not only single values can be assigned to objects, but ranges of values as well. When we have more than one object with values assigned to it, these objects can be used to perform calculations, as is shown by multiplying x by y.
The final example shows us two things. First of all, not only numbers can be assigned to objects, but character strings as well. Secondly, we assign these character strings to an object that already contained another value. We see that the old value is overwritten by the new values.
> x <- 3
> x
[1] 3
> x -> z
> z
[1] 3
> 2*x
[1] 6
> y <- c(3,4,3,2)
> y
[1] 3 4 3 2
> x*y
[1] 9 12 9 6
>
> z <- c("apple", "pear", "banana")
> z
[1] "apple" "pear" "banana"
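To see what overwriting an object does, it can help to ask R what kind of data the object holds before and after. The sketch below repeats the assignments to z from above and adds the base functions class() and length(), which are not used elsewhere in this section.

```r
x <- 3
x -> z
class(z)    # "numeric": z holds a single number

# Assigning again replaces the old contents completely, including their type.
z <- c("apple", "pear", "banana")
class(z)    # "character"
length(z)   # 3

# x itself is untouched by what happened to z.
x
```

R gives no warning when an existing object is overwritten, so it pays to choose object names with some care.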
Functions and stored data
Many of the objects we create in R can be entered into the multitude of functions that are available. A very straightforward function is mean(). As we can see in the syntax and the output below, this function behaves exactly the same whether a range of values or an object holding that range of values is entered. We also learn from these examples that the results of functions can be stored in objects as well.
mean(c(3,4,3,2))
y <- c(3,4,3,2)
mean(y)
m <- mean(y)
m
> mean(c(3,4,3,2))
[1] 3
> mean(y)
[1] 3
> m <- mean(y)
> m
[1] 3
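Because the results of functions can be stored and reused, the mean can also be rebuilt step by step from two other base functions, sum() and length(). The object names total and n below are chosen only for this illustration.

```r
y <- c(3, 4, 3, 2)

total <- sum(y)     # 12: all values added together
n <- length(y)      # 4: the number of values in y

total / n           # 3
mean(y)             # 3, the same result in a single step
```

Storing intermediate results in this way makes longer calculations easier to check, since every step can be inspected by entering the name of the object.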