# For Loops

## Introduction

In `R`, `For Loops` (hereafter `Loops`) allow the repeated execution of specified commands. This is useful when you are handling a large amount of data and wish to apply the same function, adaptation or change across all or some of it without coding the process line by line, which can save you significant amounts of time in the long run!

Because `Loops` can be difficult to grasp, this additional tab and tutorial is intended to support your learning beyond the lectures and practicals, so that you can handle, use and engage positively with `Loops`. For more information on `Loops`, see Chapter 6 of A Beginner’s Guide to R, a Springer textbook that is free to download, covers a wide variety of foundation topics and provides the foundation for this tutorial.

## Conceptual Example

Let's start with a basic conceptual example (from R Bloggers). Say you would like to print the phrase “The year is 20XX”, with XX replaced by each year from 2010 to 2020. This could be achieved by writing eleven individual lines of code:

``print(paste("The year is", 2010))``
``## [1] "The year is 2010"``
``print(paste("The year is", 2011))``
``## [1] "The year is 2011"``
``print(paste("The year is", 2012))``
``## [1] "The year is 2012"``
``print(paste("The year is", 2013))``
``## [1] "The year is 2013"``
``print(paste("The year is", 2014))``
``## [1] "The year is 2014"``
``print(paste("The year is", 2015))``
``## [1] "The year is 2015"``
``print(paste("The year is", 2016))``
``## [1] "The year is 2016"``
``print(paste("The year is", 2017))``
``## [1] "The year is 2017"``
``print(paste("The year is", 2018))``
``## [1] "The year is 2018"``
``print(paste("The year is", 2019))``
``## [1] "The year is 2019"``
``print(paste("The year is", 2020))``
``## [1] "The year is 2020"``

Alternatively, the same result can be produced with the `for()` function, which executes the same command once for each year:

```r
for (year in 2010:2020) {
  print(paste("The year is", year))
}
```

```
## [1] "The year is 2010"
## [1] "The year is 2011"
## [1] "The year is 2012"
## [1] "The year is 2013"
## [1] "The year is 2014"
## [1] "The year is 2015"
## [1] "The year is 2016"
## [1] "The year is 2017"
## [1] "The year is 2018"
## [1] "The year is 2019"
## [1] "The year is 2020"
```

This basic conceptual example shows that a `for()` loop is made up of two sections.

Let us consider the first section:

``for (year in 2010:2020)``

This section contains three components:

• The function: `for()`,
• The parameter: `in`,
• The values: `year` & `2010:2020`

Together, these three components can be read as: for value in values. In this case: for each `year` in the sequence of years `2010:2020`.
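The values that follow `in` do not have to be a run of numbers. As a minimal sketch, assuming a hypothetical character vector called `seasons` (not part of the original example), the same three components apply:

```r
# Hypothetical vector of labels; any vector can follow `in`
seasons <- c("spring", "summer", "autumn", "winter")

# `season` takes each element of `seasons` in turn
for (season in seasons) {
  print(paste("The season is", season))
}
```

On each pass the loop variable holds one element of the vector, exactly as `year` held one year at a time.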

Let us now consider the second section:

```r
{
  print(paste("The year is", year))
}
```

This section is more general: it can contain any code, and will usually make use of the value defined in the `for()` function itself.

In this case, we can see that `year` is used inside the `paste()` call.
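The second section is not limited to a single command. The sketch below is only an illustration, using the hypothetical names `total` and `years_since_2010`, of a body that holds two statements and carries a result from one pass to the next:

```r
# Running total, created before the loop so it survives between passes
total <- 0

for (year in 2010:2020) {
  # First statement: derive a value from the loop variable
  years_since_2010 <- year - 2010
  # Second statement: add it to the running total
  total <- total + years_since_2010
}

print(total)
## [1] 55
```

Anything created before the loop, such as `total`, can be updated inside the braces and inspected once the loop has finished.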

## Practical Example

Let us now consider a more practical example.

Consider the `mpg` dataset from `ggplot2`. Say we would like to know how many of the cars have a number of cylinders under, over or equal to a specific value. One method (although not necessarily the most straightforward) is to use a `loop`.

```r
library(ggplot2)  # provides the mpg dataset

# Check each car's cylinder count in turn
for (i in mpg$cyl) {
  print(i == 5)
}
```

Breaking this `loop` down: once it has been told what to look at (the cylinder values, `cyl`, within the dataset `mpg`), it compares each observation to the condition set (in this case, whether the number of cylinders is exactly 5) and prints TRUE or FALSE depending on the result.

If you run this code, you will be able to observe the distribution in the printed values.
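Reading a long run of TRUE and FALSE lines by eye is error-prone. One possible way to summarise it, sketched here with a hypothetical counter named `n_five`, is to add one to a running count each time the condition holds:

```r
library(ggplot2)  # provides the mpg dataset

# Hypothetical counter: number of cars with exactly 5 cylinders
n_five <- 0

for (i in mpg$cyl) {
  if (i == 5) {
    n_five <- n_five + 1
  }
}

print(n_five)

# For comparison: a tally of every cylinder value, without a loop
table(mpg$cyl)
```

The `table()` call is included only as a cross-check; it reports the count for each distinct cylinder value in one step.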

However, printing alone is of limited value, since as the researcher you can only interact with the output in a limited way. It is therefore possible to extend the `loop` so that it assigns these results to a new object.

So, let us extend the previous example so that the results are saved in a separate vector.

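```r
library(ggplot2)  # provides the mpg dataset

# Pre-allocate a vector with one NA slot per car
cyl.out <- rep(NA, nrow(mpg))

# Loop over the positions and store each comparison result
for (i in seq_along(cyl.out)) {
  cyl.out[i] <- (mpg$cyl[i] == 4)
}
```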

As you can observe, the syntax changes: for each blank position in the pre-allocated vector (`cyl.out`), the outcome of the statement (whether that car's cylinder value is equal to 4) replaces the NA value. This is more complex, but it means that you can complete this repetitive task with ease.
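As noted earlier, a `loop` is not necessarily the most straightforward route. The sketch below is an illustration rather than part of the original tutorial, and the names `cyl.loop` and `cyl.vec` are hypothetical. It fills a vector one element per pass and then compares the result with R's vectorised comparison, which tests every element at once:

```r
library(ggplot2)  # provides the mpg dataset

# Loop version: fill one element per pass
cyl.loop <- rep(NA, nrow(mpg))
for (i in seq_along(cyl.loop)) {
  cyl.loop[i] <- (mpg$cyl[i] == 4)
}

# Vectorised version: the comparison is applied to every element at once
cyl.vec <- mpg$cyl == 4

# Check that both approaches agree
identical(cyl.loop, cyl.vec)

# Number of cars with exactly 4 cylinders
sum(cyl.vec)
```

Because TRUE counts as 1 and FALSE as 0, `sum()` on the logical vector gives the number of cars that meet the condition.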