
In the DS-6030 Statistical Learning course, we will use

  • the tidyverse packages for data loading and processing
  • the tidymodels packages for model building and validation

Compared to the base-R approach used in An Introduction to Statistical Learning (James et al. 2021), these packages offer a more consistent interface that makes working with data easier and more streamlined.


The tidyverse is a collection of packages that share a common design philosophy and are designed to work together. To load the tidyverse, run library(tidyverse), which prints the following startup message:

## ── Attaching core tidyverse packages ─────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package to force all conflicts to become errors

You will see that this loads a number of packages. The most important ones are:

  • ggplot2 for plotting
  • dplyr for data manipulation
  • readr for data import
  • tibble for improved data frames
  • tidyr for getting data into tidy form
  • purrr for functional programming
  • stringr for string manipulation
  • forcats for categorical/factor data
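
As a minimal sketch of how these packages fit together, the following example uses the built-in mtcars dataset (chosen here purely for illustration; it is not part of the course material) to convert a data frame to a tibble, summarise it with dplyr, and plot it with ggplot2. The object names cars, summary_tbl, and p are hypothetical.

```r
library(tidyverse)

# mtcars ships with R; move the row names into a column and convert to a tibble
cars <- mtcars %>%
  rownames_to_column("model") %>%
  as_tibble()

# dplyr: filter rows, derive a new column, and summarise by group
summary_tbl <- cars %>%
  filter(hp > 100) %>%
  mutate(kpl = mpg * 0.425144) %>%  # miles per US gallon -> km per litre
  group_by(cyl) %>%
  summarise(n = n(), mean_kpl = mean(kpl), .groups = "drop")

print(summary_tbl)

# ggplot2: weight against fuel economy, coloured by number of cylinders
p <- ggplot(cars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
print(p)
```

Each step takes a data frame as its first argument and returns a data frame, which is what allows the steps to be chained with the pipe operator.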


The tidymodels package is developed by Max Kuhn, who now works at Posit (formerly RStudio). It was first released in 2018 and is still under active development. Like the tidyverse, it is an ecosystem of packages that share a common design philosophy and are designed to work together. The packages include

  • parsnip for model specification
  • recipes for data preprocessing
  • rsample for resampling
  • yardstick for model evaluation
  • tune for hyperparameter tuning
  • workflows for modeling workflows
  • tidyposterior for Bayesian comparison of models based on resampling results

The tidymodels packages are designed to work with the tidyverse and follow tidy data principles. They are also modular and extensible.
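
The roles of the individual packages can be sketched with a small end-to-end example. It assumes the built-in mtcars dataset and a plain linear regression, both chosen only for illustration; the object names (car_split, car_recipe, lm_spec, car_fit, car_metrics) and the seed are hypothetical. Hyperparameter tuning with tune is left out because a linear model fitted with lm has nothing to tune.

```r
library(tidymodels)

set.seed(6030)  # arbitrary seed so the split is reproducible

# rsample: split the data into training and test sets
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# recipes: declare outcome, predictors, and preprocessing steps
car_recipe <- recipe(mpg ~ wt + hp, data = car_train) %>%
  step_normalize(all_numeric_predictors())

# parsnip: specify the model type and the computational engine
lm_spec <- linear_reg() %>%
  set_engine("lm")

# workflows: bundle preprocessing and model, then fit on the training data
car_fit <- workflow() %>%
  add_recipe(car_recipe) %>%
  add_model(lm_spec) %>%
  fit(data = car_train)

# yardstick: compute default regression metrics on the held-out test set
car_metrics <- augment(car_fit, new_data = car_test) %>%
  metrics(truth = mpg, estimate = .pred)

print(car_metrics)
```

Because the model specification is separate from the recipe, swapping linear_reg() for another parsnip model should leave the rest of the workflow unchanged, which is one way the modular design shows up in practice.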

Getting Help


  • Install R and RStudio
  • Make use of Projects in RStudio


James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics. New York, NY: Springer US.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. “R for Data Science (2e).”