# Chapter 1 Introduction

This tutorial offers a first introduction to R. R is free, open-source software available from https://www.r-project.org, where one can also find a wealth of information and documentation.

This tutorial assumes that R is already properly installed on your computer. It is further assumed that the reader has some basic knowledge about statistics, equivalent to an introductory course in statistics. This tutorial introduces the R software for statistical analyses, and not the statistical analyses themselves. This tutorial occasionally mentions differences with SPSS, but the tutorial is also intended for novice users of statistical software.

## 1.1 What is R?

Perhaps surprisingly, R is several things at once:

- a program for statistical analyses

- a calculator

`## [1] 24`

- a programming language

```
# function to convert hertz to semitones, relative to `base`, by Mark Liberman
h2st <- function( h, base=110 ) {
  semi1 <- log(2^(1/12)) # log of frequency ratio of 1 semitone
  return( ( log(h)-log(base) ) / semi1 ) # compare with above
}
```

The assignment operator `<-` is further explained in section 3.1 below.

The hash `#` indicates a comment, which is not processed.
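As a minimal sketch of the calculator and programming-language roles, the lines below evaluate an arithmetic expression and then call the `h2st` function defined above. The input values (`2 * 3 * 4`, 440 Hz) are illustrative choices, not taken from the original text.

```r
# A calculator: an expression typed at the prompt is evaluated immediately
2 * 3 * 4                  # illustrative arithmetic; the product is 24

# A programming language: define a function once, then reuse it
h2st <- function(h, base = 110) {
  semi1 <- log(2^(1/12))   # log of the frequency ratio of 1 semitone
  return((log(h) - log(base)) / semi1)
}

st <- h2st(440)            # 440 Hz is two octaves (about 24 semitones) above 110 Hz
st                         # typing the name prints the stored value

# Functions operate on whole vectors at once
steps <- h2st(c(110, 220, 440))
steps
```

Because `base` has a default value, it can be omitted in the call; writing `h2st(440, base = 220)` would instead express the distance relative to 220 Hz.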

## 1.2 What is RStudio?

Most users will use R in combination with the program RStudio (https://www.rstudio.com). RStudio can be regarded as a wrapper around R that actively assists you with many housekeeping tasks. (For users familiar with SPSS, this is somewhat similar to the SPSS graphical user interface wrapped around the SPSS “engine”.) After opening, RStudio displays three or four panels or panes, with multiple tabs (in bold) within each pane.

The left panel, or lower left panel, has a tab named **Console**, which contains the current R session. You can enter R commands there (try typing `date()` and pressing Enter) and see the output (warning and error messages are displayed in red).
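A few commands to try at the Console prompt are sketched below. Apart from `date()`, the specific inputs are illustrative assumptions chosen to show ordinary output, a warning, and an error.

```r
date()              # current date and time, returned as a character string
x <- 7 / 2          # an assignment produces no output
x                   # typing an object's name prints its value
sqrt(-1)            # the result is NaN, accompanied by a warning message
try(log("a"))       # an error message is reported; try() lets a script continue
```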

In the top right panel, the tab **Environment** lists all objects in the workspace (see explanation below), and the tab **History** lists your previously entered R commands.

In the bottom right panel, the tab **Files** shows files in your current folder or directory, **Plots** contains plots produced by R/RStudio, and **Help** gives you access to help information.
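Much of what these tabs display can also be requested from the Console. The following is a small sketch using standard base R functions; the object `x` is a hypothetical example.

```r
x <- c(2, 4, 6)       # create an object in the workspace
objs <- ls()          # names of workspace objects, as listed in the Environment tab
objs
wd <- getwd()         # current working directory, whose contents the Files tab shows
list.files()          # names of the files in that directory

# These two only make sense in an interactive session such as RStudio
if (interactive()) {
  plot(x)             # the figure appears in the Plots tab
  help("mean")        # the documentation opens in the Help tab
}
```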

You could work your way through most of this booklet using only the **Console** tab of RStudio, but most users find R+RStudio far easier to work with than R by itself.

## 1.3 Object-oriented philosophy

R works in an object-oriented way. This means that
*objects* are the most important things in R, and *not*
the actions we perform with these objects. Let’s use a culinary example
to illustrate this. In order to obtain pancakes, a cook needs flour,
milk, eggs, some mixing utensils, a pan, oil, and a fire. An
object-oriented approach places primary focus on these objects. If
the relations between them are properly specified, then a good pancake
will result. Provided that the necessary objects (ingredients) are
available, the quasi-R syntax could be as follows:

```
batter <- mixed( flour, milk/2 ) # mix flour and half of milk
batter <- mixed( batter, egg*2 ) # add 2 eggs
batter <- mixed( batter, milk/2, use=whisk) # add other half of milk
while (enough(batter)) # FALSE if insufficient for next
  pancake <- baked( batter, in=oil, with=pan, temp=max(fire) )
```

This example illustrates that R is indeed a full
programming language (but see footnote 1).
In fact, there is no recipe, in the
traditional sense. This “pancake” script merely specifies the relations
between the ingredients and the result. Note that some relations are
recursive: batter can be both input and output of the mixing operation.
Also note that the `mixed` relation takes an
optional argument `use=whisk`, which will produce
a fatal error message if there is no whisk in the kitchen. Such
arguments, however, allow for greater flexibility of the
`mixed` relation. Likewise, we might specify
`baked(in=grease)` if there is no oil in the
kitchen. The only requirement for the object supplied as
`in` argument is that one can bake in it, so this
object must have some attribute `goodforbaking==TRUE`.
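The behavior of an optional argument can be tried out in real R with a toy version of the `mixed` relation. This is only a sketch under stated assumptions: the `kitchen` argument, the quantities in grams, and the modeling of "mixing" as addition are all hypothetical and serve only to make the example runnable.

```r
# Toy 'mixed' relation; 'use' is optional, 'kitchen' lists the available utensils
mixed <- function(a, b, use = NULL, kitchen = c("spoon", "pan")) {
  # If a utensil is requested, it must be present in the kitchen
  if (!is.null(use) && !(use %in% kitchen)) {
    stop("no ", use, " in the kitchen")
  }
  a + b                              # "mixing" is modeled as adding quantities
}

batter <- mixed(250, 500 / 2)        # flour plus half of the milk
batter <- mixed(batter, 2 * 50)      # batter is both input and output

# Requesting a missing utensil raises an error; try() lets the script continue
result <- try(mixed(batter, 500 / 2, use = "whisk"), silent = TRUE)
failed <- inherits(result, "try-error")
failed
```

Omitting `use` works without complaint, whereas `use = "whisk"` fails here because the default `kitchen` contains no whisk.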

For contrast, we might imagine how the pancake recipe would be formulated in a more traditional, procedure-oriented approach. Ingredients and a spoon are again assumed to be provided.

```
MIX batter = flour + milk/2 . # what utensil?
MIX batter = batter + eggs .
MIX batter = batter + milk/2 .
BAKE batter IN oil .
BAKE batter IN water . # garbage in garbage out
```

The programmer of this recipe has defined the key procedures `MIX` and
`BAKE`, and has stipulated boundary conditions such as utensils and
temperatures. Optional arguments are allowed for the `BAKE` command, but
only within the limits set by the programmer (see footnote 2).

So far, you may have thought that the difference between the two recipes was semantic rather than pragmatic. To demonstrate the greater flexibility of an object-oriented approach, let us consider the following variant of the recipe, again in quasi-R syntax:

```
# batter is done
while (number(pancakes)<2) # first bake 2 pancakes
  pancake <- baked(batter,in=oil,with=pan,temp=max(fire))
feed(pancake,child) # feed one to hungry spectator
# define new function, data 'x' split into 'n' pieces
chopped <- function( x, n=1000 ) { return( split(x,n) ) }
pieces <- chopped(pancake) # new data object, array of 1000 pieces
batter <- mixed(batter,pieces) # mix pancake pieces into batter
# etc
```

Such complex relations between objects are quite difficult to specify
if there are strong a priori limits on what one can `MIX` or `BAKE`.
Thus, object-oriented programs such as R allow for
greater flexibility than procedure-oriented programs.
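The `chopped` function in the quasi-R recipe can be approximated in real R. Note that the actual `split()` function groups a vector by a grouping index rather than by a number of pieces, so the sketch below first builds such an index. The numeric `pancake` vector and the choice of `n = 3` are assumptions made purely for illustration.

```r
# Split data 'x' into 'n' consecutive pieces of (nearly) equal size
chopped <- function(x, n = 1000) {
  group <- sort(rep_len(seq_len(n), length(x)))  # piece number for each element
  split(x, group)                                # list with one element per piece
}

pancake <- 1:12                     # stand-in for a pancake made of 12 units
pieces <- chopped(pancake, n = 3)   # new data object: a list of 3 pieces
lengths(pieces)                     # size of each piece

# The output of one function can serve as input to the next
batter <- c(100, unlist(pieces, use.names = FALSE))
length(batter)
```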

If you are a user of the `Praat` software (http://www.praat.org) then you are already familiar with this basic idea.
`Praat` has an object window, listing the known objects.
These objects are the output of previous operations (e.g. Create, Read,
ToSpectrum), as well as input for subsequent operations (e.g. Write,
Draw). The classes or types of these objects are pre-defined (Sound,
Spectrum, Periodicity, etc). R takes the same idea even
further: users may create their own *classes* of data objects (e.g. a new
class `SuperData`) and may create their own methods or relations to work with
such objects.^{2}
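As a minimal sketch of a user-defined class, the code below creates `SuperData` objects and gives them their own `summary` method. It uses S3, one of R's class systems, which is an assumption here since the text does not say which mechanism is meant; the constructor and the contents of the summary are hypothetical.

```r
# Constructor: wraps a numeric vector and tags it with the new class
SuperData <- function(values) {
  structure(list(values = values), class = "SuperData")
}

# Method for the generic summary(): used automatically for SuperData objects
summary.SuperData <- function(object, ...) {
  c(n = length(object$values), mean = mean(object$values))
}

sd1 <- SuperData(c(2, 4, 6))
class(sd1)            # the class attribute set by the constructor
s <- summary(sd1)     # dispatches to summary.SuperData
s
```

The call `summary(sd1)` looks the same as for any other object; R selects the appropriate method from the class of its argument.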

This object-oriented philosophy results in behavior that differs from that of procedure-oriented software:

> There is an important difference in philosophy between S (and hence R) and the other main statistical systems. In S a statistical analysis is normally done as a series of steps, with intermediate results being stored in objects. Thus whereas SAS and SPSS will give copious output from a regression or discriminant analysis, R will give minimal output and store the results in a fit object for subsequent interrogation by further R functions.

from: https://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics
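The idea of a fit object can be illustrated with a small regression. The sketch below uses the `cars` data set that ships with R; the choice of data and model is an illustrative assumption, not part of the quoted passage.

```r
fit <- lm(dist ~ speed, data = cars)   # fit the model and store the result
fit                    # printing shows only the call and the coefficients

# The stored object can be interrogated by further functions
class(fit)             # the class of the fit object
b <- coef(fit)         # estimated intercept and slope
b
r2 <- summary(fit)$r.squared   # one element of the more detailed summary
r2
```

Nothing beyond the brief printout appears unless it is asked for; the details remain available in `fit` for later steps of the analysis.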