Master [R]
1. Data types, data structures and indexing
1.1 Basics
Object, assignment, functions, how to comment and get help
x
x <- 2
x + 2
x * 3 # we can use R as a calculator!
log(1) # Functions help us execute things; we usually have to provide arguments
?log # Don't know how a function works? Ask for help!
1.2 Data types
Numeric (integers or doubles)
my_numeric <- c(x, x + 2, x * 3) # Naming objects tips
Character
my_character <- c("blue", "bleu", "azul", "hyacinthum")
Logical (Boolean)
is_french <- c(FALSE, TRUE, FALSE, FALSE)
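To check which type R assigned to an object, `class()` and `typeof()` can be used. A minimal sketch; the values in `my_numeric` are written out by hand here (assuming `x <- 2`):

```r
# Vectors mirroring the ones defined above (values assume x <- 2)
my_numeric <- c(2, 4, 6)
my_character <- c("blue", "bleu", "azul", "hyacinthum")
is_french <- c(FALSE, TRUE, FALSE, FALSE)

class(my_numeric)    # "numeric"
typeof(my_numeric)   # "double": numbers are doubles unless stated otherwise
typeof(2L)           # "integer": the L suffix requests an integer
class(my_character)  # "character"
class(is_french)     # "logical"
```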
1.3 Data classes
Vectors
Ordered collection of elements.
my_character
What happens when we add 1 to a logical vector?
is_french + 1
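In arithmetic, R silently converts logical values to numbers: `TRUE` becomes 1 and `FALSE` becomes 0. A short sketch of what that implies:

```r
is_french <- c(FALSE, TRUE, FALSE, FALSE)

is_french + 1   # 1 2 1 1
sum(is_french)  # 1, the number of TRUE values
mean(is_french) # 0.25, the proportion of TRUE values
```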
Matrix
dim(my_vector) <- c(rows, columns) # reshapes my_vector
my_character2 <- my_character
dim(my_character2) <- c(2,2)
my_character2
Lists
list(name_element1 = element1, name_element2 = element2, name_element3 = element3)
my_list <- list(word = my_character, french = is_french)
my_list
Data frames
data.frame(vector1, vector2) # bind vectors with the same length
my_df <- data.frame(word = my_character, french = is_french)
my_df
1.4 Dimensions
dim(matrix)
length(object)
Can you guess the length of my_df?
dim(my_df)
length(my_df)
dim(my_list)
length(my_list)
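A data frame is stored as a list of columns, so `length()` counts columns rather than rows. The sketch below rebuilds the objects from the previous sections to show the difference:

```r
my_character <- c("blue", "bleu", "azul", "hyacinthum")
is_french <- c(FALSE, TRUE, FALSE, FALSE)
my_df <- data.frame(word = my_character, french = is_french)
my_list <- list(word = my_character, french = is_french)

dim(my_df)       # 4 2 (rows, columns)
length(my_df)    # 2, the number of columns
nrow(my_df)      # 4
dim(my_list)     # NULL: a plain list has no dimensions
length(my_list)  # 2, the number of elements
```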
1.5 Indexing
vector[index]
matrix[row, column]
list[[element]]
my_character[2]
my_df[2,1]
my_list[[1]]
my_list[[1]][2]
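Named elements can also be reached by name instead of position. A minimal sketch with the same objects; `stringsAsFactors = FALSE` is added only so the result is the same on R versions before 4.0:

```r
my_character <- c("blue", "bleu", "azul", "hyacinthum")
is_french <- c(FALSE, TRUE, FALSE, FALSE)
my_list <- list(word = my_character, french = is_french)
my_df <- data.frame(word = my_character, french = is_french,
                    stringsAsFactors = FALSE)

my_list$word            # same as my_list[[1]]
my_list[["french"]][2]  # TRUE
my_df$word[2]           # "bleu"
my_df[2, "word"]        # "bleu", same as my_df[2, 1]
my_list[1]              # single brackets return a list of length 1
```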
1.6 Slicing
my_matrix <- matrix(c(34, 9, 6, 5, 3, 50, 43, 27, 98, 100), nrow=5)
my_matrix[my_matrix[,1]>5,]
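The slicing expression above can be read in two steps: first build a logical vector from a condition on column 1, then use it to select rows. A sketch that separates the steps (`keep` is a name introduced here for illustration):

```r
my_matrix <- matrix(c(34, 9, 6, 5, 3, 50, 43, 27, 98, 100), nrow = 5)

# Step 1: test each value of the first column
keep <- my_matrix[, 1] > 5
keep  # TRUE TRUE TRUE FALSE FALSE

# Step 2: keep the rows where the test is TRUE, with all columns
my_matrix[keep, ]  # rows 1 to 3
```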
2. Files
Absolute paths
C:/Users/RonBumblefootThal/Documents/RFolder/MyFirstProject/Draft/IDon'tKnowWhatI'mDoing/etc.R
Relative paths
I_love_my_project/CoolCode.R
2.1 Working directories
dir()
setwd("data-trek-2020")
getwd()
list.files()
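When changing the working directory inside a script, it can help to remember where you started. `setwd()` returns the previous directory, so it can be restored afterwards. A small sketch that uses the session's temporary folder as a stand-in for a project folder:

```r
start_wd <- getwd()

# setwd() invisibly returns the directory we are leaving
old_wd <- setwd(tempdir())
getwd()  # now the temporary folder

# file.path() builds paths with the right separators
file.path("data", "clean", "soa_tour.csv")  # "data/clean/soa_tour.csv"

# Go back to where we started
setwd(old_wd)
```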
2.2 Save/write files
Data frame example
soa_tour <- data.frame(country = c("USA", "UK", "FRA", "GER", "BRA"),
                       frequency = c(34, 9, 6, 5, 3),
                       continents = c("north_america", "europe", "europe",
                                      "europe", "south_america"))
write.csv(object, path)
write.csv(soa_tour, file="data/clean/soa_tour.csv")
list.files("data/clean")
2.3 Load/read files
From your PC
object <- read.csv(path)
soa_tour <- read.csv(file="data/clean/soa_tour.csv")
From url
object <- read.csv(url("http://remote.repo/data/file.csv"))
Metabolic rates data: http://sciencecomputing.io/data/metabolicrates.csv
metabolic_rates <- read.csv(url("http://sciencecomputing.io/data/metabolicrates.csv"))
From url to your PC, then read
download.file(url, destfile)
download.file(url = "http://sciencecomputing.io/data/metabolicrates.csv",
              destfile = "data/raw/metabolicrates.csv")
metabolic_rates <- read.csv("data/raw/metabolicrates.csv")
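To try writing and reading without touching your project folders, a round trip through a temporary file works well. This is a sketch with a shortened data frame and a path in `tempdir()`, both chosen for illustration. Note that `write.csv()` also writes row names unless `row.names = FALSE` is set, which would show up as an extra column when reading the file back.

```r
soa_tour <- data.frame(country = c("USA", "UK"),
                       frequency = c(34, 9))

# Hypothetical location inside the session's temporary folder
path <- file.path(tempdir(), "soa_tour.csv")

write.csv(soa_tour, file = path, row.names = FALSE)
file.exists(path)  # TRUE

soa_back <- read.csv(path)
names(soa_back)    # "country" "frequency"
```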
3. Control Flow
You already apply control flow when you decide how to go to work during winter.
For example:
You take the metro if it's snowy
You take the metro if it's cold
You walk every other time
Now, let's put that into code!
3.1 Conditional evaluation
Simple if statement
Structure:
if (condition is true) {
  do expression
}
Example:
weather <- "snowy"
if (weather == "snowy") {
  print("Take the metro!")
}
weather <- "clear"
if (weather == "snowy") {
  print("Take the metro!")
}
If/else statement
Structure:
if (condition) {
  expression 1
} else {
  expression 2
}
Example:
if (weather == "snowy") {
  print("Take the metro!")
} else {
  print("Let's walk!")
}
Nested if/else statement
if (condition 1) {
  expression 1
  if (condition 2) {
    expression 2
  }
}
Example:
temperature <- -15
if (weather == "snowy") {
  print("Take the metro!")
} else {
  if (temperature < -20) {
    print("Take the metro!")
  } else {
    print("Let's walk!")
  }
}
Best practice:
if (weather == "snowy") {
  print("Take the metro!")
} else if (temperature < -20) {
  print("Take the metro!")
} else {
  print("Let's walk!")
}
Adding a condition:
if (weather == "snowy" | temperature < -20) {
  print("Take the metro!")
} else {
  print("Let's walk")
}
3.2 For loops
Simple for loops
Using for loops, you can then plan your schedule for a few days.
What we had:
weather <- "snowy"
temperature <- -15
But what about this?
weather_vec <- c("snowy", "cloudy", "snowy", "clear", "rainy")
temperature_vec <- c(-15, -23, -2, -40, 5)
Does the same code work?
if (weather_vec == "snowy" | temperature_vec < -20) {
  print("Take the metro!")
} else {
  print("Let's walk")
}
Iterations
for (i in iterations) {
  content of the for loop
}
for (i in 1:5) {
  print(temperature_vec[i] + 2)
}
More generally:
length(temperature_vec)
for (i in 1:length(temperature_vec)) {
  print(temperature_vec[i] + 2)
}
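One caveat with `1:length(x)`: if `x` is empty, `1:0` counts down and the loop runs twice on elements that do not exist. `seq_along()` is a common alternative that yields an empty sequence in that case. A minimal sketch (`empty` is a name introduced here for illustration):

```r
temperature_vec <- c(-15, -23, -2, -40, 5)

# Same loop as above, written with seq_along()
for (i in seq_along(temperature_vec)) {
  print(temperature_vec[i] + 2)
}

# The difference only shows with an empty vector
empty <- numeric(0)
1:length(empty)   # 1 0: a loop over this would run twice
seq_along(empty)  # integer(0): a loop over this would not run
```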
If statement inside for loops
Structure:
for (i in iterations) {
  if (condition) {
    expression1
  } else {
    expression2
  }
}
Example:
# Previous statement
if (weather_vec == "snowy" | temperature_vec < -20) {
  print("Take the metro!")
} else {
  print("Just walk")
}
for (i in 1:length(weather_vec)) {
  # Previous statement
  if (weather_vec == "snowy" | temperature_vec < -20) {
    print("Take the metro!")
  } else {
    print("Just walk")
  }
}
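# Will this work? No: the condition still compares the whole vectors.
# Index each element with [i] so that the condition has length one:
for (i in 1:length(weather_vec)) {
  if (weather_vec[i] == "snowy" | temperature_vec[i] < -20) {
    print("Take the metro!")
  } else {
    print("Just walk")
  }
}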
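As an alternative to a loop, the same decision can be computed for all days at once with `ifelse()`, which works element by element on vectors. A minimal sketch reusing the vectors from this section (`plan` is a name introduced here):

```r
weather_vec <- c("snowy", "cloudy", "snowy", "clear", "rainy")
temperature_vec <- c(-15, -23, -2, -40, 5)

# One decision per day, no explicit loop
plan <- ifelse(weather_vec == "snowy" | temperature_vec < -20,
               "Take the metro!",
               "Just walk")
plan
```

Only the last day is neither snowy nor colder than -20, so it is the only one where `plan` says to walk.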
3.3 Extras
Some logical operators
Comparisons:
less than (<)
more than (>)
less than or equal to (<=)
more than or equal to (>=)
equal to (==)
not equal to (!=)
Logic:
not x (! x)
x or y (x | y)
x and y (x & y)
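These operators work element by element on vectors. The sketch below uses two made-up logical vectors, and also shows `any()` and `all()`, which collapse a logical vector into the single value that `if` expects:

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

!x     # FALSE  TRUE FALSE
x | y  # TRUE  TRUE  TRUE
x & y  # TRUE FALSE FALSE

any(x & y)  # TRUE: at least one element is TRUE
all(x | y)  # TRUE: every element is TRUE

# || and && are meant for single values, as in if conditions
TRUE || FALSE  # TRUE
TRUE && FALSE  # FALSE
```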
4. Functions
4.1 Syntax and arguments
Basic syntax
functionname <- function(argument1, argument2) { # Name and arguments
  result <- expression # What the function does
  return(result) # What the function returns
}
temp_difference <- function(temp1, temp2) { # Name and arguments
  result <- temp2 - temp1 # What the function does
  return(result) # What the function returns
}
# Apply on values
temp_difference(-5, -15)
# Apply on variables
temperature <- c(-15, -23)
temp_difference(temperature[1], temperature[2])
Calling (personal) functions within functions
absolute_temp_difference <- function(var1, var2) {
  result <- temp_difference(var1, var2)
  abs_result <- abs(result)
  return(abs_result)
}
absolute_temp_difference(temperature[1], temperature[2])
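Because subtraction is vectorised in R, the same function also accepts whole vectors and returns one difference per pair. A short sketch; the second pair of temperatures is made up for illustration:

```r
temp_difference <- function(temp1, temp2) {
  result <- temp2 - temp1
  return(result)
}

temp_difference(-5, -15)                # -10
temp_difference(c(-15, -2), c(-23, 5))  # -8 7
```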
4.2 Scope
Variables can exist either in global or local scope.
Remember, the element to return in our function was called `abs_result`.
# What will this return, outside the function?
result
Here is a second example for ecologists who like to count living things:
# global variables
trees <- 4
squirrels <- 10
count_living_things <- function() {
  birds <- 5 # local variable
  squirrels <- 20
  return(c(birds, squirrels, trees))
}
count_living_things() # global and local variables returned
birds # does not exist in global scope
squirrels
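The outcome can be checked without triggering an error by asking whether a name exists. A sketch of the same example; `exists()` is used here only to avoid the "object not found" error:

```r
# global variables
trees <- 4
squirrels <- 10

count_living_things <- function() {
  birds <- 5       # local variable
  squirrels <- 20  # local copy; the global squirrels is untouched
  return(c(birds, squirrels, trees))
}

count_living_things()  # 5 20 4
exists("birds")        # FALSE: birds only lived inside the function
squirrels              # 10: the global value did not change
```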
4.3 Integration
Combining functions and control flow
Let's come back to our previous example about transportation according to the weather.
Here is the forecast for the week and the weekend:
# Week forecast
weather_week <- c("snowy", "cloudy", "snowy", "clear", "rainy")
temperature_week <- c(-15.0, -23.0, -2.0, -40.0, 5.0)
# Weekend forecast
weather_weekend <- c("snowy", "rainy")
temperature_weekend <- c(-3.0, 2.0)
Now, let's build a function that will work with either the week or weekend forecasts.
It will look like:
transportation <- function(weather, temperature) {
  for (each day of the week/weekend) {
    if (snowy or cold) {
      take the metro
    } else {
      walk
    }
  }
}
for (i in 1:length(weather_vec)) {
  if (weather_vec[i] == "snowy" || temperature_vec[i] < -15.0) {
    print("Take the metro")
  } else {
    print("Just walk")
  }
}
choose_transportation <- function(weather, temperature) {
  for (i in 1:length(weather)) {
    if (weather[i] == "snowy" || temperature[i] < -15.0) {
      print("Take the metro!")
    } else {
      print("Just walk")
    }
  }
}
# Plan for the week
choose_transportation(weather_week, temperature_week)
# Plan for the weekend
choose_transportation(weather_weekend, temperature_weekend)
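Printing is fine for reading the plan on screen, but the result cannot be reused afterwards. One possible variant, sketched here under the hypothetical name `plan_transportation`, collects the decisions in a vector and returns it:

```r
# Hypothetical variant that returns the plan instead of printing it
plan_transportation <- function(weather, temperature) {
  plan <- character(length(weather))
  for (i in seq_along(weather)) {
    if (weather[i] == "snowy" || temperature[i] < -15.0) {
      plan[i] <- "Take the metro!"
    } else {
      plan[i] <- "Just walk"
    }
  }
  return(plan)
}

weather_weekend <- c("snowy", "rainy")
temperature_weekend <- c(-3.0, 2.0)
weekend_plan <- plan_transportation(weather_weekend, temperature_weekend)
weekend_plan  # "Take the metro!" "Just walk"
```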
4.4 Exercise - Planning the week
Exercise to integrate the following:
Functions
Control flow
Files
Write a function to read a file if it exists, downloading it first if it does not exist.
Apply the `choose_transportation` function to the data in the file.
# Pseudocode
function (file, url)
  if (file exists)
    read file
  else
    download file
    read file
library("R.utils", quietly = TRUE)
# Useful functions
?file.exists # base R; R.utils is not required for it
?read.csv
?download.file
Forecast data url: http://bit.ly/dt-forecast
read_if_exists <- function(filename, url) {
  if (file.exists(filename)) {
    read.csv(filename)
  } else {
    # Download first, then call the function again to read the new file
    download.file(url, filename)
    read_if_exists(filename, url)
  }
}
filename <- "forecast.csv"
url <- "http://bit.ly/dt-forecast"
# Keep the returned data frame so it can be used below
forecast <- read_if_exists(filename, url)
choose_transportation(forecast$weather, forecast$temperature)
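To try the "file exists" branch without an internet connection, the function can be pointed at a small file created beforehand. This is a sketch: the demo file, its contents and the column names `weather` and `temperature` are assumptions made for illustration, not the real forecast data.

```r
read_if_exists <- function(filename, url) {
  # Download only when the file is not already on disk
  if (!file.exists(filename)) {
    download.file(url, filename)
  }
  read.csv(filename)
}

# Hypothetical demo file standing in for the downloaded forecast
demo_file <- file.path(tempdir(), "forecast_demo.csv")
demo_data <- data.frame(weather = c("snowy", "clear"),
                        temperature = c(-3, 2))
write.csv(demo_data, demo_file, row.names = FALSE)

# The file exists, so the url is never contacted
forecast <- read_if_exists(demo_file, url = "http://bit.ly/dt-forecast")
names(forecast)  # "weather" "temperature"
```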
4.5 Extras
Default values
# Define function
add_and_multiply <- function(var1, var2, var3 = 1) {
  result <- (var1 + var2) * var3
  return(result)
}
add_and_multiply(1.0, 2.0, 2.0)
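An argument with a default value can be left out of the call, and arguments can be passed by name in any order. A short sketch of both:

```r
add_and_multiply <- function(var1, var2, var3 = 1) {
  result <- (var1 + var2) * var3
  return(result)
}

add_and_multiply(1, 2)                          # 3: var3 falls back to 1
add_and_multiply(1, 2, var3 = 2)                # 6
add_and_multiply(var3 = 2, var2 = 2, var1 = 1)  # 6: names fix the order
```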