Chapter 9 Measuring performance of regression models

Figure 9.1: Measuring regression performance using yardstick

The yardstick package from tidymodels contains a comprehensive collection of performance metrics for regression and classification models, including 16 metrics for regression models. Because the metric functions only need the true and the predicted values, you can use them even if you fit your models with other packages.

The package is loaded automatically when you load tidymodels.

Code
library(tidymodels)
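
For example (a minimal sketch, with the model and column names chosen for illustration), a metric function such as rmse() only needs a data frame that contains the true and the predicted values, so it also works for a model fitted with base R's lm():

Code
# a sketch: evaluate predictions from a base-R lm() model with yardstick
fit_lm <- lm(mpg ~ wt + hp, data=mtcars)
results <- tibble(
    truth = mtcars$mpg,
    estimate = predict(fit_lm, newdata=mtcars)
)
rmse(results, truth=truth, estimate=estimate)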

9.1 Build a regression model

As described in Chapter 8, we build a linear regression model for the mtcars dataset.

Code
# prepare data: convert row names to a column and encode vs/am as factors
data <- datasets::mtcars %>% 
    as_tibble(rownames="car") %>%
    mutate(
        vs = factor(vs, labels=c("V-shaped", "straight")),
        am = factor(am, labels=c("automatic", "manual"))
    )

# fit model
formula <- mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
model <- linear_reg() %>% 
    set_engine("lm") %>% 
    fit(formula, data=data)

9.2 Calculate performance metrics

The metrics() function calculates a default set of the most common regression metrics from the true and the predicted values.

Code
metrics(augment(model, data), truth=mpg, estimate=.pred)
## # A tibble: 3 × 3
##   .metric .estimator .estimate
##   <chr>   <chr>          <dbl>
## 1 rmse    standard       2.15 
## 2 rsq     standard       0.869
## 3 mae     standard       1.72

We use the parsnip::augment() function to add the predictions to the original dataset. For a regression model, this adds the predictions (.pred) and the residuals (.resid) as new columns. The truth argument specifies the name of the column with the true values, here mpg. The estimate argument specifies the name of the column with the predicted values, here .pred.
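
Each metric is also available as a standalone function with the same truth/estimate interface, so a single metric can be computed directly (a short sketch using the model from above):

Code
# compute a single metric with the same truth/estimate interface
augment(model, data) %>% 
    rmse(truth=mpg, estimate=.pred)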

The metrics() function calculates the following metrics (their standard definitions are sketched after the list):

  • rmse: root mean squared error (RMSE)
  • rsq: R-squared (R²)
  • mae: mean absolute error (MAE)
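
For reference, a sketch of the standard definitions, where $y_i$ are the observed values, $\hat{y}_i$ the predictions, and $n$ the number of observations. Note that yardstick's rsq() computes the squared correlation between truth and estimate; the traditional sum-of-squares definition is available as rsq_trad().

$$
\mathrm{RMSE} = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}, \qquad
\mathrm{MAE} = \tfrac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i\rvert, \qquad
R^2 = \operatorname{cor}(y, \hat{y})^2
$$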

These metrics should be enough for most purposes. If you want to calculate other metrics, you can use the yardstick::metric_set() function. It takes any number of metric functions and returns a new function that calculates all of them at once. For example, if your dataset contains a few outliers, it can be useful to look at metrics that are less sensitive to them, such as the MAE or the Huber loss. Here, we combine mae and huber_loss into a custom metric set.

Code
robust_metric <- metric_set(mae, huber_loss)
robust_metric(augment(model, data), truth=mpg, estimate=.pred)
## # A tibble: 2 × 3
##   .metric    .estimator .estimate
##   <chr>      <chr>          <dbl>
## 1 mae        standard        1.72
## 2 huber_loss standard        1.30
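
The Huber loss behaves like the squared error for small residuals and like the absolute error for large ones; the switch point is set by the delta argument of huber_loss() (default 1). A sketch with an arbitrarily chosen delta:

Code
# Huber loss with a larger switch point between squared and absolute error
huber_loss(augment(model, data), truth=mpg, estimate=.pred, delta=2)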

Code

The code of this chapter is summarized here.

Code
knitr::opts_chunk$set(echo=TRUE, cache=TRUE, autodep=TRUE, fig.align="center")
knitr::include_graphics("images/model_workflow_validate.png")
library(tidymodels)
# prepare data: convert row names to a column and encode vs/am as factors
data <- datasets::mtcars %>% 
    as_tibble(rownames="car") %>%
    mutate(
        vs = factor(vs, labels=c("V-shaped", "straight")),
        am = factor(am, labels=c("automatic", "manual"))
    )

# fit model
formula <- mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
model <- linear_reg() %>% 
    set_engine("lm") %>% 
    fit(formula, data=data)
metrics(augment(model, data), truth=mpg, estimate=.pred)
robust_metric <- metric_set(mae, huber_loss)
robust_metric(augment(model, data), truth=mpg, estimate=.pred)