\(MOSAIC_{data}\)

Block 0: Introduction to R, Part 1

Oliver Nakoinz, Lizzie Scholtus, Néhémie Strupler

2022-07-11
License: CC BY-SA 4.0

Program

Block 0: Introduction to R

Very short introduction

What is R?

R is a language and environment for statistical computing and graphics

Three dimensions of R

Why R?

Seven levels of using R

  1. Asking a colleague to run R for you
  2. Applying built-in functions
  3. Combining built-in functions
  4. Writing your own algorithms
  5. Writing your own functions and objects
  6. Developing efficient code
  7. Writing your own packages
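
The middle levels of this list can be illustrated with a minimal sketch. The vector `sizes` and the function `spread()` are made-up names for illustration and are not part of the course material.

```r
sizes <- c(12, 7, 9, 15)          # hypothetical example data

# Level 2: applying a built-in function
mean(sizes)

# Level 3: combining built-in functions
round(sqrt(mean(sizes)), 2)

# Level 5: writing your own function
spread <- function(v) {
  max(v) - min(v)                 # difference between largest and smallest value
}
spread(sizes)
```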

Three styles of using R

  1. Freehand
  2. Script based
  3. Literate Programming

Freehand

- flexible
- impressive
- badly documented

Terminal

Script based

Text editor

Scripts: Some rules

 ##################################################################
 ## Didactic R-Script for Modelling Summer School
 ## ===============================================================
 ## Project: Modelling Summer School
 ## Author: O. Nakoinz, C. Filet & F. Faupel
 ## Version: 01
 ## Date of last changes: 10.09.2018
 ## Data: some.data
 ## Author of data: author.data
 ## Purpose: didactic
 ## Content: 1. preparation, 2. data import, ...
 ## Description: The script includes ...
 ## Licence data: -
 ## Licence Script: GPL (http://www.gnu.org/licenses/gpl-3.0.html)
 ##################################################################

Scripts: Some rules

# This is a comment
a <- c(3,6,7,4,9,7,3,3,3)       # This is a vector
median(a)                       # This is the "median" function
## [1] 4

Folder Structure

Literate Programming

RMarkdown

Rule No 1

Do not try to remember every detail; focus on the most important aspects.

Knowledge means knowing where it is written. (attributed to Albert Einstein)

Educated is whoever knows where to find what they do not know. (attributed to Georg Simmel)

Reading Code

Calculating with R: operators

5 + 3
## [1] 8
5 - 3
## [1] 2
5 * 3
## [1] 15
5 / 3
## [1] 1.666667
5 ^ 3
## [1] 125
5 %% 3
## [1] 2
5 %/% 3
## [1] 1

| Operator | Description |
|:--|:--|
| + | addition |
| - | subtraction |
| * | multiplication |
| / | division |
| ^ or ** | exponentiation |
| x %% y | modulus (x mod y); 5 %% 2 is 1 |
| x %/% y | integer division; 5 %/% 2 is 2 |
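
As a small addition to the operator overview, the following sketch shows how the operators interact when they are combined. The variable names `n`, `q` and `r` are chosen for illustration only.

```r
# Operator precedence: ^ binds tighter than * and /, which bind tighter than + and -
2 + 3 * 4^2        # evaluated as 2 + (3 * (4^2))
(2 + 3) * 4^2      # parentheses change the order

# Integer division and modulus together recover the original number
n <- 17
q <- n %/% 5       # whole-number quotient
r <- n %% 5        # remainder
q * 5 + r == n

# A common use of %%: testing whether numbers are even
c(1, 2, 3, 4) %% 2 == 0
```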

read and analyze code!

Calculating with R: operators

4 == 6
## [1] FALSE
4 != 6
## [1] TRUE
4 < 6
## [1] TRUE
4 <= 6
## [1] TRUE
4 > 6
## [1] FALSE
4 >= 6
## [1] FALSE
!TRUE
## [1] FALSE
(4 < 6) | (4 > 6)
## [1] TRUE
(4 < 6) & (4 > 6)
## [1] FALSE
isTRUE(26*2>4)
## [1] TRUE

| Operator | Description |
|:--|:--|
| == | exactly equal to |
| != | not equal to |
| < | less than |
| <= | less than or equal to |
| > | greater than |
| >= | greater than or equal to |
| !x | NOT x |
| x \| y | x OR y |
| x & y | x AND y |
| isTRUE(x) | test if x is TRUE |
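
The comparison and logical operators also work on whole vectors, element by element. A minimal sketch, with `v` as a made-up example vector:

```r
v <- c(2, 5, 8)             # hypothetical example values

v > 4                       # comparison is applied element by element
(v > 4) & (v < 8)           # element-wise AND
(v < 3) | (v > 7)           # element-wise OR

any(v > 4)                  # TRUE if at least one element matches
all(v > 4)                  # TRUE only if every element matches

isTRUE(v > 4)               # FALSE: the argument is not a single TRUE value
```

`any()` and `all()` are usually the safer choice when a single TRUE or FALSE is needed from a vector comparison.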

Calculating with R: functions

sqrt(25)
## [1] 5
sin(3.14)
## [1] 0.001592653
log(25)
## [1] 3.218876
log10(25)
## [1] 1.39794
abs(-345.3356)
## [1] 345.3356
round(2.43135)
## [1] 2
ceiling(2.5)
## [1] 3
floor(2.5)
## [1] 2

Calculating with R: functions

sin(3.14)
## [1] 0.001592653
# sin(3, 14)   # error: a comma separates arguments, it is not a decimal mark
sin(c(3,14))
## [1] 0.1411200 0.9906074
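
Because a comma separates arguments, many functions use a second argument to control their behaviour. The following sketch uses only base R functions; the example values are arbitrary.

```r
# A comma separates arguments, so a second value is read as a second argument
round(2.43135, 2)             # second argument: number of decimal places
round(2.43135, digits = 2)    # the same call with the argument named
log(8, base = 2)              # log() takes an optional base

# To apply a function to several values, pass them as one vector
sqrt(c(4, 9, 16))
```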

R-objects: variables

x <- 5
x
## [1] 5
sqrt(x)
## [1] 2.236068
y <- 7
5 -> x     # right-hand assignment also works, but <- is the usual style

Working directory

wd_lin <- "/home/xxxx/xxxx"
wd_win <- "D:\\xxx\\xxxx"
# setwd(wd_lin)
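
A minimal sketch of how the working directory can be inspected and changed. It uses `tempdir()` as a stand-in for a real project folder, and the file name is hypothetical.

```r
old_wd <- getwd()                 # remember the current working directory

demo_dir <- tempdir()             # a directory that exists on every system
setwd(demo_dir)                   # change the working directory
getwd()                           # check where R is now looking for files

# file.path() joins path parts, so separators need not be typed by hand
file.path("data", "sites.csv")    # hypothetical file name, for illustration only

setwd(old_wd)                     # restore the original working directory
```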

RStudio

Exercise

Part I ‐ Markdown

Markdown

- numbered list item: `1. numbered list item`
- block quote: `> block quotes`

a b c
1 2 3
2 3 4

| a| b| c|
|--:|--:|--:|
| 1| 2| 3|
| 2| 3| 4|

Markdown and RStudio

Loading the remedy package makes its Markdown helpers available in the RStudio Addins menu.

library(remedy)

YAML Header

An R Markdown file starts with a YAML header.

---
title: "Untitled"
output: html_document
---

YAML Header

---
title: "Untitled"
author: '[Prof. Dr. Oliver Nakoinz](http://oliver.nakoinz.gitlab.io/OliverNakoinz/)'
date: "WS 2020/2021"
bibliography: ../7lib/71citations/lit.bib
csl: ../7lib/72csl/iso690-author-date-cs.csl
lang: de-DE
otherlangs: en-GB
output:
  slidy_presentation:
    highlight: tango
    pandoc_args:
    - --css
    - stycss/styles.css
    footer: "Oliver Nakoinz"
    df_print: kable
fontsize: 14pt
font-family: 'Helvetica'
widescreen: true
---

Bibtex

  @book{xie2018r,
  title     = {R Markdown: The Definitive Guide},
  author    = {Xie, Y. and Allaire, J.J. and Grolemund, G.},
  isbn      = {9781138359338},
  series    = {Chapman and Hall/CRC the R Series},
  url       = {https://bookdown.org/yihui/rmarkdown/},
  year      = {2018},
  publisher = {Taylor \& Francis, CRC Press}
}

Bibtex

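library(bibtex)
# packages whose citations should be exported
plist_v <- c("base",
             "sp",
             "sf")
# write the BibTeX entries of these packages to a file
write.bib(plist_v,
          file = 'lit_packages')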
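
If the bibtex package is not installed, a similar result can be sketched with base R alone. This is an alternative route, not the method used above; `bib_base` and `bib_file` are made-up names, and a temporary file stands in for a real bibliography file.

```r
# citation() returns the recommended reference for a package;
# toBibtex() converts it to BibTeX text
bib_base <- toBibtex(citation("base"))
bib_base

# Write the entry to a file (a temporary file here, to avoid clutter)
bib_file <- tempfile(fileext = ".bib")
writeLines(bib_base, bib_file)
readLines(bib_file)[1]            # first line of the BibTeX entry
```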

RMarkdown

```{r code chunks name}
a <- c(2, 3, 4)
sum(a)
```

a <- c(2, 3, 4)
sum(a)
## [1] 9

Tables

Markdown

| a| b| c|
|--:|--:|--:|
| 1| 2| 3|
| 2| 3| 4|

a b c
1 2 3
2 3 4

R

df <- data.frame(a = c(1, 2),
                 b = c(2, 3),
                 c = c(3, 4))
knitr::kable(df)
a b c
1 2 3
2 3 4

RMarkdown

Exercise

Reading code

Objects & Functions

Data types

22
## [1] 22
TRUE
## [1] TRUE
2i + 2
## [1] 2+2i
"Text"
## [1] "Text"
as.raw(65)
## [1] 41
as.Date("2018-11-15")
## [1] "2018-11-15"
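
The type of each of these values can be checked with `class()`. The following sketch repeats the values from above and adds two conversions with the `as.*()` functions.

```r
class(22)                     # "numeric"
class(22L)                    # "integer": the L suffix marks a whole number
class(TRUE)                   # "logical"
class(2i + 2)                 # "complex"
class("Text")                 # "character"
class(as.raw(65))             # "raw"
class(as.Date("2018-11-15"))  # "Date"

# Conversion between types with the as.*() family
as.numeric("3.5") + 1
as.character(22)
```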

Data structures

Data structures: vector

A vector contains an ordered set of elements of the same type.

Vectors are created with the function c(), which combines (concatenates) its arguments.

x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)
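
Because all elements of a vector share one type, R converts mixed input to the most general type. A short sketch; the names in `counts` are invented for illustration.

```r
c(1, 2, TRUE)        # logical is converted to numeric: 1 2 1
c(1, "a", TRUE)      # everything is converted to character: "1" "a" "TRUE"

# Elements can be named, and vectors can be combined into longer vectors
counts <- c(pits = 4, graves = 7)    # hypothetical names and values
c(counts, hearths = 2)
```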

Data structures: vector

Vectors can be defined as sequences, and many functions are available for manipulating vectors.

read and analyze code!

x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)

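z  <- 1:5                      # integer sequence 1, 2, 3, 4, 5
z2 <- seq_along(y)             # one index per element of y
z3 <- rep(3, 5)                # the value 3, repeated five times
x_sort <- sort(x)              # ascending order
x_rev  <- rev(x)               # reversed order
x_revsort <- rev(sort(x))      # sorted, then reversed
x_sortdec <- sort(x,
                  decreasing = TRUE)   # descending order in one step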
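
A few further ways to generate sequences and repetitions, as a sketch with arbitrary example values:

```r
1:5                            # integer sequence from 1 to 5
seq(2, 10, by = 2)             # fixed step width
seq(0, 1, length.out = 5)      # fixed number of values
rep(c(1, 2), times = 3)        # repeat the whole vector
rep(c(1, 2), each = 3)         # repeat each element
order(c(3, 5, 2))              # positions that would sort the vector
```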

Data structures: vector

Parts of vectors (and other structures) can be addressed with indices.

x <- c(3, 5, 2, 2, 5, 8, 5, 2)

x[4]
## [1] 2
x[c(2, 4)]
## [1] 5 2
x[3:6]
## [1] 2 2 5 8
x[x>4]
## [1] 5 5 8 5
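
Indices can also exclude elements or select the elements to overwrite. A minimal sketch using the same vector; `x2` is a made-up name for a working copy.

```r
x <- c(3, 5, 2, 2, 5, 8, 5, 2)

x[-1]                # all elements except the first
x[-c(1, 2)]          # all elements except the first two
x[length(x)]         # the last element
x[x > 4 & x < 8]     # conditions can be combined

x2 <- x              # work on a copy so that x stays unchanged
x2[x2 == 2] <- 0     # indices also select the elements to overwrite
x2
```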

Data structures: vector

Vectors can be passed as arguments to functions. Some functions return summary information about a vector.

x <- c(3, 5, 2, 2, 5, 8, 5, 2)

min(x)
## [1] 2
range(x)
## [1] 2 8
length(x)
## [1] 8

Data structures: matrix

x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)

zx <- matrix(x, 2, 4)
zy <- matrix(y, 2, 4)
zx + zy
##      [,1] [,2] [,3] [,4]
## [1,]    9    4   10   14
## [2,]   12    6   10    3
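
The following sketch shows how a matrix is filled and addressed. It reuses the vector `x`; the name `zr` is an assumption introduced here for the row-wise variant.

```r
x <- c(3, 5, 2, 2, 5, 8, 5, 2)

zx <- matrix(x, nrow = 2, ncol = 4)                  # filled column by column
zr <- matrix(x, nrow = 2, ncol = 4, byrow = TRUE)    # filled row by row

dim(zx)          # number of rows and columns
zx[2, 3]         # single element: row 2, column 3
zx[, 1]          # the whole first column
t(zx)            # transpose: rows become columns
zx * 2           # arithmetic is applied element by element
zx %*% t(zx)     # %*% is true matrix multiplication (2 x 4 times 4 x 2)
```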

Data structures: array

(rarely used)

a <- array(data = 1:12,
           dim = c(3, 2, 2)
           )
a
## , , 1
## 
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
## 
## , , 2
## 
##      [,1] [,2]
## [1,]    7   10
## [2,]    8   11
## [3,]    9   12
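
An array is indexed like a matrix, with one index per dimension. A short sketch using the same array:

```r
a <- array(data = 1:12, dim = c(3, 2, 2))

dim(a)           # 3 rows, 2 columns, 2 layers
a[2, 1, 2]       # row 2, column 1, layer 2
a[, , 1]         # the whole first layer, returned as a matrix
```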

Data structures: dataframe

x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)

df <- data.frame(x, y)
df$x
## [1] 3 5 2 2 5 8 5 2
df[, 1]
## [1] 3 5 2 2 5 8 5 2
df[1,]
##   x y
## 1 3 6
which(x == 2)
## [1] 3 4 8
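
Some functions that are often used to inspect and extend a data frame, as a sketch. The column name `total` is made up for illustration.

```r
x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)
df <- data.frame(x, y)

str(df)                  # compact overview: column names and types
nrow(df)                 # number of rows
names(df)                # column names
df$total <- df$x + df$y  # add a new column (the name "total" is made up)
head(df, 3)              # first three rows
df[which(df$x == 2), ]   # rows in which x equals 2
```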

Data structures: factor

f <- factor(c("Nicole", "Sabire",
              "Clemens", "Jos", "Nicole"))
f
## [1] Nicole  Sabire  Clemens Jos     Nicole 
## Levels: Clemens Jos Nicole Sabire
unclass(f)
## [1] 3 4 1 2 3
## attr(,"levels")
## [1] "Clemens" "Jos"     "Nicole"  "Sabire"
levels(f)
## [1] "Clemens" "Jos"     "Nicole"  "Sabire"
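
Factors are mostly used for counting and grouping. The sketch below counts the levels of `f` and then defines a second factor `f2` with a custom level order; the size classes are hypothetical.

```r
f <- factor(c("Nicole", "Sabire", "Clemens", "Jos", "Nicole"))

table(f)             # number of occurrences per level
nlevels(f)           # number of distinct levels
as.integer(f)        # the integer codes behind the labels

# Levels are sorted alphabetically by default; the levels argument sets another order
f2 <- factor(c("small", "large", "medium", "small"),   # hypothetical size classes
             levels = c("small", "medium", "large"))
levels(f2)
```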

Data structures: list

l <- list(c("Nicole",
            "Sabire",
            "Clemens",
            "Jos"),
          as.Date("2018-11-15"),
          x)
l[[1]][2]
## [1] "Sabire"
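
List elements can be named, which often makes access easier to read than positional indices. A minimal sketch; the list `site` and its contents are invented for illustration.

```r
site <- list(name   = "Example site",             # hypothetical values
             date   = as.Date("2018-11-15"),
             counts = c(3, 5, 2))

site$name            # access by name with $
site[["counts"]]     # the same kind of access with [[ ]]
site["counts"]       # single brackets return a list with one element
site$counts[2]       # second element of the vector stored in the list
names(site)
```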

Data structures

http://venus.ifca.unican.es/Rintro/dataStruct.html

Indices etc.

Column

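df$x                    # column x by name
df[, 2]                 # second column by position
df[, c(1, 2)]           # several columns by position
df[, c("x", "y")]       # several columns by name
dplyr::select(df, x)    # the same with dplyr (package must be installed)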

Row

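df[4, ]                   # fourth row
df[4, ]$x                 # value of x in the fourth row
df[4:7, ]                 # rows 4 to 7
df[c(2, 4, 6), ]          # rows 2, 4 and 6
df[df$x > 2, ]            # rows in which x is greater than 2
dplyr::filter(df, y > 2)  # the same idea with dplyr, here for y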
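
Row and column selection can also be combined without additional packages. The following sketch uses base R only and rebuilds `df` so that it runs on its own.

```r
x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)
df <- data.frame(x, y)

subset(df, x > 2 & y > 2)        # rows that meet both conditions
subset(df, y > 2, select = x)    # filter rows and keep only column x
df[df$y > 2, "x"]                # the same values as a plain vector
df[order(df$y), ]                # rows sorted by column y
```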

Functions

x <- c(3, 5, 2, 2, 5, 8, 5, 2)
y <- c(6, 7, 2, 4, 5, 2, 9, 1)

sin(3)
## [1] 0.14112
mean(x)
## [1] 4
rbind(x, y)
##   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## x    3    5    2    2    5    8    5    2
## y    6    7    2    4    5    2    9    1
cbind(x, y)
##      x y
## [1,] 3 6
## [2,] 5 7
## [3,] 2 2
## [4,] 2 4
## [5,] 5 5
## [6,] 8 2
## [7,] 5 9
## [8,] 2 1
rnorm(5,
      mean = 7,
      sd = 1)
## [1] 7.605434 6.291699 8.035295 5.382489 5.543796
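
Random numbers differ on every run, so the output above will not be reproduced exactly. A sketch of how `set.seed()` makes such results repeatable; the seed value 42 is arbitrary.

```r
set.seed(42)                          # fix the random number generator state
r1 <- rnorm(5, mean = 7, sd = 1)

set.seed(42)                          # same seed, same random numbers
r2 <- rnorm(5, mean = 7, sd = 1)

identical(r1, r2)                     # TRUE: both draws are the same
```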

Write your own functions

add5 <- function(a){
    b <- a + 5
    return(b)
}
add5(7)
## [1] 12
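
Functions can take several arguments, and arguments can have default values. The function `add_n()` below is a hypothetical generalisation of `add5()`, written as a sketch.

```r
# add_n() is a hypothetical generalisation of add5()
add_n <- function(a, n = 5) {
  if (!is.numeric(a)) {
    stop("a must be numeric")   # fail early with a clear message
  }
  a + n                         # the last evaluated expression is returned
}

add_n(7)                 # uses the default n = 5
add_n(7, n = 10)         # overrides the default
add_n(c(1, 2, 3))        # works element by element on vectors
```

An explicit `return()` as in `add5()` is equally valid; leaving it out is only a matter of style.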

The most important functions in R

help(sin)
?rbind     # shortcut for help(rbind)
??read     # shortcut for help.search("read")
str(x)
##  num [1:8] 3 5 2 2 5 8 5 2
summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       2       2       4       4       5       8
ls()
##  [1] "a"         "add5"      "df"        "f"         "l"         "wd_lin"   
##  [7] "wd_win"    "x"         "x_rev"     "x_revsort" "x_sort"    "x_sortdec"
## [13] "y"         "z"         "z2"        "z3"        "zx"        "zy"
dir()
##  [1] "MOSAICdata_block0_1.html"            
##  [2] "MOSAICdata_block0_1.rmd"             
##  [3] "MOSAICdata_block0_1b.html"           
##  [4] "MOSAICdata_block0_1b.rmd"            
##  [5] "MOSAICdata_block0_2.html"            
##  [6] "MOSAICdata_block0_2.rmd"             
##  [7] "MOSAICdata_block0_2b.html"           
##  [8] "MOSAICdata_block0_2b.rmd"            
##  [9] "MOSAICdata_block0_3.html"            
## [10] "MOSAICdata_block0_3.rmd"             
## [11] "MOSAICdata_block0_3b.html"           
## [12] "MOSAICdata_block0_3b.rmd"            
## [13] "Rintro1.Rmd"                         
## [14] "Rintro2.Rmd"                         
## [15] "s01.html"                            
## [16] "s01.Rmd"                             
## [17] "s02.html"                            
## [18] "s02.Rmd"                             
## [19] "s03.html"                            
## [20] "s03.Rmd"                             
## [21] "s04.html"                            
## [22] "s04.Rmd"                             
## [23] "s05.html"                            
## [24] "s05.Rmd"                             
## [25] "s06.html"                            
## [26] "s06.Rmd"                             
## [27] "s07.html"                            
## [28] "s07.Rmd"                             
## [29] "s08.html"                            
## [30] "s08.Rmd"                             
## [31] "s09.html"                            
## [32] "s09.Rmd"                             
## [33] "s1_Kursfolienvorlage_Einfuehrung.Rmd"
## [34] "s1_Kursfolienvorlage.Rmd"            
## [35] "s10.html"                            
## [36] "s10.Rmd"                             
## [37] "s11.html"                            
## [38] "s11.Rmd"                             
## [39] "s12.Rmd"                             
## [40] "stycss"                              
## [41] "titelfolie.html"                     
## [42] "titelfolie.pdf"                      
## [43] "titelfolie.Rmd"

Exercise

Break

