There has been code provided throughout the course in live code blocks, where you get snippets of code to perform particular tasks by directly leveraging individual packages for particular methods. Therefore, we will not repeat such exercises in the labs and instead we will focus on the full software pipeline which makes running machine learning analyses much easier. Arguably the two top ML pipelines in R (R Core Team, 2021) today are tidymodels (Kuhn and Wickham, 2020) and mlr3 (Lang et al., 2019). The first lab focused on tidymodels and this second lab now looks at mlr3.
Before beginning with mlr3, we will ensure you can get the data set that will be used for the examples here, which is suitable for basic binary classification models.

The data is a credit scoring dataset: an important application in banking. It consists of 4454 observations of 14 variables, each a loan that was given together with the status of whether the loan was "good" (i.e. repaid) or went "bad" (i.e. defaulted).
# Uncomment and run the following command first if you do not have the modeldata package
# install.packages("modeldata")
data("credit_data", package = "modeldata")
Use some of the exploration techniques from Lab 1 to explore the dataset before you start modelling.
# Try some plotting functions to explore the data here
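For example, a minimal sketch of some quick checks you could start with (the boxplot assumes the ggplot2 package is installed; any of the approaches from Lab 1 work equally well):
# Class balance of the response
table(credit_data$Status)
# Overall summary, which also reveals the missing values
summary(credit_data)
# Distribution of loan amount by loan status
library("ggplot2")
ggplot(credit_data, aes(x = Status, y = Amount)) +
  geom_boxplot()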
Due to technical issues, the alternative dataset has been removed; apologies for the reduced variety in the lab. A solution will be sought for future APTS courses.
MLR takes a quite different approach to building pipelines than Tidymodels did. Neither is "right" and you will probably find you prefer one style over the other, which is totally fine and quite a personal choice. In order to make that choice, it's good to see both, so today it's MLR's turn! Note that MLR 3 is a huge ecosystem and we barely scratch the surface today: there is a fantastic online book by Becker et al. (2021) available, which is highly recommended reading if you decide to use MLR 3 for your applications. You may also find the three "cheatsheets" useful for at-a-glance reference: the MLR3 cheatsheet, MLR3 tuning cheatsheet, and MLR3 pipelines cheatsheet.
mlr3 is another meta-package for building machine learning software pipelines, with the aim of automating a lot of the repetitive tasks we commonly need to do. This both saves time and makes them less error-prone, because we're relying on software stacks that have been extensively tested. Note that MLR is slightly more mature software at this stage, having been around for quite a while and undergone a big revamp with version 3, but Tidymodels is rapidly catching up and is backed by the fabulous RStudio/Posit team, so in the medium to longer term both will have similar levels of code maturity and stability.
# Uncomment and run the following command first if you do not have the mlr3 package
# install.packages("mlr3")
# Load mlr3, which loads a suite of other packages for us
library("mlr3")
MLR breaks machine learning problems up into steps. The most basic ones are where you:

- define a task: the dataset together with the target variable you want to predict;
- define a learner: the model or algorithm to fit;
- train the learner on the task;
- predict, either on new data or in-sample;
- evaluate the predictions with one or more measures.

An MLR task defines the dataset and identifies what the response is (all other variables being taken to be the features).
To define a task for the credit data, we therefore provide the data frame and specify that we are predicting the Status variable.
# Define a task, which is a dataset together with target variable for prediction
# We wrap the data in an na.omit to avoid issues with missingness, see later for
# better options
task_credit <- TaskClassif$new(id = "credit",
backend = na.omit(credit_data),
target = "Status")
task_credit
##
## ── <TaskClassif> (4039x14) ─────────────────────────────────────────────────────
## • Target: Status
## • Target classes: bad (positive class, 25%), good (75%)
## • Properties: twoclass
## • Features (13):
## • int (9): Age, Amount, Assets, Debt, Expenses, Income, Price, Seniority,
## Time
## • fct (4): Home, Job, Marital, Records
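The task object exposes some useful metadata; for example, the following are all standard fields and methods of mlr3 tasks:
# Inspect the task: dimensions, feature/target names, and a preview of the data
task_credit$nrow
task_credit$ncol
task_credit$feature_names
task_credit$target_names
task_credit$head()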
Then, we can see what learners MLR has built in.
# This variable shows available learning algorithms
as.data.table(mlr_learners)
## Key: <key>
## key label task_type
## <char> <char> <char>
## 1: classif.debug Debug Learner for Classification classif
## 2: classif.featureless Featureless Classification Learner classif
## 3: classif.rpart Classification Tree classif
## 4: regr.debug Debug Learner for Regression regr
## 5: regr.featureless Featureless Regression Learner regr
## 6: regr.rpart Regression Tree regr
## feature_types packages
## <list> <list>
## 1: logical,integer,numeric,character,factor,ordered mlr3
## 2: logical,integer,numeric,character,factor,ordered,... mlr3
## 3: logical,integer,numeric,factor,ordered mlr3,rpart
## 4: logical,integer,numeric,character,factor,ordered mlr3,stats
## 5: logical,integer,numeric,character,factor,ordered,... mlr3,stats
## 6: logical,integer,numeric,factor,ordered mlr3,rpart
## properties
## <list>
## 1: hotstart_forward,internal_tuning,marshal,missings,multiclass,twoclass,...
## 2: featureless,importance,missings,multiclass,selected_features,twoclass,...
## 3: importance,missings,multiclass,selected_features,twoclass,weights
## 4: missings,weights
## 5: featureless,importance,missings,selected_features,weights
## 6: importance,missings,selected_features,weights
## predict_types
## <list>
## 1: response,prob
## 2: response,prob
## 3: response,prob
## 4: response,se,quantiles
## 5: response,se,quantiles
## 6: response
This seems rather few! This is because additional packages are used to add features such as new learners. Many staple learners are in the mlr3learners add-on package and further probabilistic learners are in mlr3proba. These update mlr_learners with a lot of additional options.
# Load more learners in supporting packages
library("mlr3learners")
as.data.table(mlr_learners)
## Key: <key>
## key label task_type
## <char> <char> <char>
## 1: classif.cv_glmnet GLM with Elastic Net Regularization classif
## 2: classif.debug Debug Learner for Classification classif
## 3: classif.featureless Featureless Classification Learner classif
## 4: classif.glmnet GLM with Elastic Net Regularization classif
## 5: classif.kknn k-Nearest-Neighbor classif
## 6: classif.lda Linear Discriminant Analysis classif
## 7: classif.log_reg Logistic Regression classif
## 8: classif.multinom Multinomial Log-Linear Model classif
## 9: classif.naive_bayes Naive Bayes classif
## 10: classif.nnet Single Layer Neural Network classif
## 11: classif.qda Quadratic Discriminant Analysis classif
## 12: classif.ranger Random Forest classif
## 13: classif.rpart Classification Tree classif
## 14: classif.svm Support Vector Machine classif
## 15: classif.xgboost Extreme Gradient Boosting classif
## 16: regr.cv_glmnet GLM with Elastic Net Regularization regr
## 17: regr.debug Debug Learner for Regression regr
## 18: regr.featureless Featureless Regression Learner regr
## 19: regr.glmnet GLM with Elastic Net Regularization regr
## 20: regr.kknn k-Nearest-Neighbor regr
## 21: regr.km Kriging regr
## 22: regr.lm Linear Model regr
## 23: regr.nnet Single Layer Neural Network regr
## 24: regr.ranger Random Forest regr
## 25: regr.rpart Regression Tree regr
## 26: regr.svm Support Vector Machine regr
## 27: regr.xgboost Extreme Gradient Boosting regr
## key label task_type
## feature_types
## <list>
## 1: logical,integer,numeric
## 2: logical,integer,numeric,character,factor,ordered
## 3: logical,integer,numeric,character,factor,ordered,...
## 4: logical,integer,numeric
## 5: logical,integer,numeric,factor,ordered
## 6: logical,integer,numeric,factor,ordered
## 7: logical,integer,numeric,character,factor,ordered
## 8: logical,integer,numeric,factor
## 9: logical,integer,numeric,factor
## 10: logical,integer,numeric,factor,ordered
## 11: logical,integer,numeric,factor,ordered
## 12: logical,integer,numeric,character,factor,ordered
## 13: logical,integer,numeric,factor,ordered
## 14: logical,integer,numeric
## 15: logical,integer,numeric
## 16: logical,integer,numeric
## 17: logical,integer,numeric,character,factor,ordered
## 18: logical,integer,numeric,character,factor,ordered,...
## 19: logical,integer,numeric
## 20: logical,integer,numeric,factor,ordered
## 21: logical,integer,numeric
## 22: logical,integer,numeric,character,factor
## 23: logical,integer,numeric,factor,ordered
## 24: logical,integer,numeric,character,factor,ordered
## 25: logical,integer,numeric,factor,ordered
## 26: logical,integer,numeric
## 27: logical,integer,numeric
## feature_types
## packages
## <list>
## 1: mlr3,mlr3learners,glmnet
## 2: mlr3
## 3: mlr3
## 4: mlr3,mlr3learners,glmnet
## 5: mlr3,mlr3learners,kknn
## 6: mlr3,mlr3learners,MASS
## 7: mlr3,mlr3learners,stats
## 8: mlr3,mlr3learners,nnet
## 9: mlr3,mlr3learners,e1071
## 10: mlr3,mlr3learners,nnet
## 11: mlr3,mlr3learners,MASS
## 12: mlr3,mlr3learners,ranger
## 13: mlr3,rpart
## 14: mlr3,mlr3learners,e1071
## 15: mlr3,mlr3learners,xgboost
## 16: mlr3,mlr3learners,glmnet
## 17: mlr3,stats
## 18: mlr3,stats
## 19: mlr3,mlr3learners,glmnet
## 20: mlr3,mlr3learners,kknn
## 21: mlr3,mlr3learners,DiceKriging
## 22: mlr3,mlr3learners,stats
## 23: mlr3,mlr3learners,nnet
## 24: mlr3,mlr3learners,ranger
## 25: mlr3,rpart
## 26: mlr3,mlr3learners,e1071
## 27: mlr3,mlr3learners,xgboost
## packages
## properties
## <list>
## 1: multiclass,offset,selected_features,twoclass,weights
## 2: hotstart_forward,internal_tuning,marshal,missings,multiclass,twoclass,...
## 3: featureless,importance,missings,multiclass,selected_features,twoclass,...
## 4: multiclass,offset,twoclass,weights
## 5: multiclass,twoclass
## 6: multiclass,twoclass
## 7: offset,twoclass,weights
## 8: multiclass,twoclass,weights
## 9: multiclass,twoclass
## 10: multiclass,twoclass,weights
## 11: multiclass,twoclass
## 12: hotstart_backward,importance,missings,multiclass,oob_error,selected_features,...
## 13: importance,missings,multiclass,selected_features,twoclass,weights
## 14: multiclass,twoclass
## 15: hotstart_forward,importance,internal_tuning,missings,multiclass,offset,...
## 16: offset,selected_features,weights
## 17: missings,weights
## 18: featureless,importance,missings,selected_features,weights
## 19: offset,weights
## 20:
## 21:
## 22: offset,weights
## 23: weights
## 24: hotstart_backward,importance,missings,oob_error,selected_features,weights
## 25: importance,missings,selected_features,weights
## 26:
## 27: hotstart_forward,importance,internal_tuning,missings,offset,validation,...
## properties
## predict_types
## <list>
## 1: response,prob
## 2: response,prob
## 3: response,prob
## 4: response,prob
## 5: response,prob
## 6: response,prob
## 7: response,prob
## 8: response,prob
## 9: response,prob
## 10: response,prob
## 11: response,prob
## 12: response,prob
## 13: response,prob
## 14: response,prob
## 15: response,prob
## 16: response
## 17: response,se,quantiles
## 18: response,se,quantiles
## 19: response
## 20: response
## 21: response,se
## 22: response,se
## 23: response
## 24: response,se,quantiles
## 25: response
## 26: response
## 27: response
## predict_types
# ... whilst this just gives the names
mlr_learners
## <DictionaryLearner> with 27 stored values
## Keys: classif.cv_glmnet, classif.debug, classif.featureless,
## classif.glmnet, classif.kknn, classif.lda, classif.log_reg,
## classif.multinom, classif.naive_bayes, classif.nnet, classif.qda,
## classif.ranger, classif.rpart, classif.svm, classif.xgboost,
## regr.cv_glmnet, regr.debug, regr.featureless, regr.glmnet, regr.kknn,
## regr.km, regr.lm, regr.nnet, regr.ranger, regr.rpart, regr.svm,
## regr.xgboost
There are a variety of learners now. The start of their name indicates their functionality, including one additional category we did not separate out in our definition on the course:

- classif for classification problems
- regr for regression problems
- dens for density estimation
- surv for survival analysis (time to event)

You can get a nicer detailed view in RStudio by wrapping this in the View() command:
# Get more details on learners
View(as.data.table(mlr_learners))
In the View tab, pay particular attention to the feature_types and properties columns. These give us information about the capabilities of different learners. For example, we know that we have numeric and factor data from the exploration above, so only those learners listing these under feature_types can handle our data natively; for other learners we have some work to do. Likewise, we know that we have missing values, so learners with missings listed under properties could handle our data without using na.omit or any other additional work.
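You can also query these capabilities programmatically rather than by eye; as a sketch, the following lists the learners that natively handle both factor features and missing values:
# feature_types and properties are list columns, hence the [[i]] indexing
learners_dt <- as.data.table(mlr_learners)
supported <- sapply(seq_len(nrow(learners_dt)), function(i) {
  "factor" %in% learners_dt$feature_types[[i]] &&
    "missings" %in% learners_dt$properties[[i]]
})
learners_dt$key[supported]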
Having defined the task, the next step is to define the learner; let's say logistic regression here. We pass the lrn function the name from the table to access that learner.
# Define a logistic regression model
learner_lr <- lrn("classif.log_reg")
learner_lr
##
## ── <LearnerClassifLogReg> (classif.log_reg): Logistic Regression ───────────────
## • Model: -
## • Parameters: use_pred_offset=TRUE
## • Packages: mlr3, mlr3learners, and stats
## • Predict Types: [response] and prob
## • Feature Types: logical, integer, numeric, character, factor, and ordered
## • Encapsulation: none (fallback: -)
## • Properties: offset, twoclass, and weights
## • Other settings: use_weights = 'use'
We now proceed to train this learner on the credit data task.
# Train the model
learner_lr$train(task_credit)
Once the learner is trained, we can use the same object to predict on the same data (i.e. in-sample training prediction).
# Perform prediction
pred <- learner_lr$predict(task_credit)
pred
##
## ── <PredictionClassif> for 4039 observations: ──────────────────────────────────
## row_ids truth response
## 1 good good
## 2 good good
## 3 bad bad
## --- --- ---
## 4037 bad good
## 4038 good good
## 4039 good good
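If you want to work with the raw predictions, the prediction object can be converted to a data table:
# Convert the prediction object to a table of row ids, truth and response
head(as.data.table(pred))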
Finally, we can, for example, assess the training/apparent accuracy and confusion matrix by accessing this object. The confusion matrix is special and accessed directly in the prediction object that was returned. For other measures, we pass a measure object to the score function and that measure will be computed on the predictions. Just as we accessed learners via lrn, we access measure objects via msr.
# Evaluate some measures of error
pred$score(msr("classif.acc"))
## classif.acc
## 0.8120822
pred$confusion
## truth
## response bad good
## bad 474 207
## good 552 2806
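Several measures can also be scored in a single call by passing a list of measure objects; msrs() is a convenience constructor for such a list:
# Score accuracy and classification error together
pred$score(msrs(c("classif.acc", "classif.ce")))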
All the measures supported by MLR3 can be found by querying the mlr_measures object. A detailed reference of what these are is available at https://mlr3measures.mlr-org.com/reference/index.html
# This variable shows available error measures
mlr_measures
## <DictionaryMeasure> with 63 stored values
## Keys: aic, bic, classif.acc, classif.auc, classif.bacc, classif.bbrier,
## classif.ce, classif.costs, classif.dor, classif.fbeta, classif.fdr,
## classif.fn, classif.fnr, classif.fomr, classif.fp, classif.fpr,
## classif.logloss, classif.mauc_au1p, classif.mauc_au1u,
## classif.mauc_aunp, classif.mauc_aunu, classif.mauc_mu,
## classif.mbrier, classif.mcc, classif.npv, classif.ppv, classif.prauc,
## classif.precision, classif.recall, classif.sensitivity,
## classif.specificity, classif.tn, classif.tnr, classif.tp,
## classif.tpr, debug_classif, internal_valid_score, oob_error,
## regr.bias, regr.ktau, regr.mae, regr.mape, regr.maxae, regr.medae,
## regr.medse, regr.mse, regr.msle, regr.pbias, regr.pinball, regr.rmse,
## regr.rmsle, regr.rqr, regr.rsq, regr.sae, regr.smape, regr.srho,
## regr.sse, selected_features, sim.jaccard, sim.phi, time_both,
## time_predict, time_train
What happens if you try to get the apparent error using the Brier score loss? By looking at the help file for Learner or otherwise, can you see how to compute this error?
# Try computing the Brier score loss ... what is wrong?
# Look at the help file for Learners and see how to rectify this
?Learner
# SOLUTION
# We need to set the prediction type to probabilistic
learner_lr$predict_type <- "prob"
# The prediction function automatically gathers everything necessary: ground
# truth, predicted response label, and probability
pred <- learner_lr$predict(task_credit)
pred
##
## ── <PredictionClassif> for 4039 observations: ──────────────────────────────────
## row_ids truth response prob.bad prob.good
## 1 good good 0.2486010 0.7513990
## 2 good good 0.1132317 0.8867683
## 3 bad bad 0.5613161 0.4386839
## --- --- --- --- ---
## 4037 bad good 0.3711673 0.6288327
## 4038 good good 0.3149378 0.6850622
## 4039 good good 0.2173202 0.7826798
# Then compute the Brier score
pred$score(msr("classif.bbrier"))
## classif.bbrier
## 0.1329752
We can do the same for a random forest using ranger (Wright and Ziegler, 2017), this time with all the code at once for readability (we do not need to recreate the task).
# Uncomment and run the following command first if you do not have the ranger package
# install.packages("ranger")
# Redo everything for a random forest model
learner_rf <- lrn("classif.ranger")
learner_rf$train(task_credit)
pred_rf <- learner_rf$predict(task_credit)
pred_rf$score(msr("classif.acc"))
## classif.acc
## 0.9997524
pred_rf$confusion
## truth
## response bad good
## bad 1025 0
## good 1 3013
Above we did the minimal fitting run and just estimated the apparent error to get a feel for the design of MLR 3. We now go into full detail, doing all steps properly.
First, we will redefine the task. This time, note two changes: we do not wrap the data in an na.omit call (MLR can handle missingness as part of the pipeline, see later) and we also specifically identify what we mean by a "positive" case (for later calculation of true positives, etc.; note that tidymodels uses the order of the factors to determine this).
# Redefine the task, this time not getting rid of missing data and
# specifying what constitutes a positive case
credit_task <- TaskClassif$new(id = "BankCredit",
backend = credit_data,
target = "Status",
positive = "bad")
Next, let's examine the data splitting techniques supported by MLR:
# Let's see what resampling strategies MLR supports
# The final column shows the defaults
as.data.table(mlr_resamplings)
## Key: <key>
## key label params iters
## <char> <char> <list> <int>
## 1: bootstrap Bootstrap ratio,repeats 30
## 2: custom Custom Splits NA
## 3: custom_cv Custom Split Cross-Validation NA
## 4: cv Cross-Validation folds 10
## 5: holdout Holdout ratio 1
## 6: insample Insample Resampling 1
## 7: loo Leave-One-Out NA
## 8: repeated_cv Repeated Cross-Validation folds,repeats 100
## 9: subsampling Subsampling ratio,repeats 30
# ... whilst this just gives the names
mlr_resamplings
## <DictionaryResampling> with 9 stored values
## Keys: bootstrap, custom, custom_cv, cv, holdout, insample, loo,
## repeated_cv, subsampling
# To see help on any of them, prefix the key name with mlr_resamplings_
?mlr_resamplings_cv
To then use one of these resampling methods: just as we accessed learners via lrn and measure objects via msr, we access resampling schemes via rsmp.
You can also see from the cross-validation help file that we can access the individual training and testing folds via the $train_set() and $test_set() methods:
# The rsmp function constructs a resampling strategy, taking the name given
# above and allowing any options listed there to be chosen
cv <- rsmp("cv", folds = 3)
# We then instantiate this resampling scheme on the particular task we're
# working on
cv$instantiate(credit_task)
# You can see from the documentation that you could access individual folds
# training and testing data via:
cv$train_set(1)
cv$test_set(1)
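If you just want a simple one-off holdout split rather than a full resampling scheme, mlr3 also provides the partition() helper; a quick sketch (not used further in this lab):
# Split the task into train/test row ids with an 80/20 ratio
splits <- partition(credit_task, ratio = 0.8)
length(splits$train)
length(splits$test)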
Again, as with tidymodels, we don't want to create for loops etc., but will instead rely on the capabilities of the package to automate repetitive tasks.
Let's now create a few different models. Recall that among the learners we listed above, we saw two models that support both factors and missings, so we could try these first:

- classif.featureless is a so-called baseline classifier … it basically just predicts the most common response all the time, ignoring the features! It is often a good idea to include this, because if you don't beat it then there is something very wrong!
- classif.rpart does classification trees using the methodology we saw in the lectures.

In both cases we're going to ask for probabilistic prediction (the default just predicts the label). Also, since we're now running multiple models it can be helpful to give them an id so that we can more easily identify them.
lrn_baseline <- lrn("classif.featureless", predict_type = "prob", id = "baseline")
lrn_cart <- lrn("classif.rpart", predict_type = "prob", id = "tree")
We can see if these have any options or hyperparameters we can change too:
# Have a look at what options and hyperparameters the model possesses
lrn_baseline$param_set
## <ParamSet(1)>
## id class lower upper nlevels default value
## <char> <char> <num> <num> <num> <list> <list>
## 1: method ParamFct NA NA 3 mode mode
lrn_cart$param_set
## <ParamSet(10)>
## id class lower upper nlevels default value
## <char> <char> <num> <num> <num> <list> <list>
## 1: cp ParamDbl 0 1 Inf 0.01 [NULL]
## 2: keep_model ParamLgl NA NA 2 FALSE [NULL]
## 3: maxcompete ParamInt 0 Inf Inf 4 [NULL]
## 4: maxdepth ParamInt 1 30 30 30 [NULL]
## 5: maxsurrogate ParamInt 0 Inf Inf 5 [NULL]
## 6: minbucket ParamInt 1 Inf Inf <NoDefault[0]> [NULL]
## 7: minsplit ParamInt 1 Inf Inf 20 [NULL]
## 8: surrogatestyle ParamInt 0 1 2 0 [NULL]
## 9: usesurrogate ParamInt 0 2 3 2 [NULL]
## 10: xval ParamInt 0 Inf Inf 10 0
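Hyperparameters can be set when constructing the learner or changed afterwards via the param_set; here is a small sketch using a throwaway learner (lrn_cart_demo is just an illustrative name) so we do not alter the learners defined above:
# Set hyperparameters at construction time ...
lrn_cart_demo <- lrn("classif.rpart", predict_type = "prob", maxdepth = 5)
# ... or modify them afterwards through the param_set
lrn_cart_demo$param_set$values$minsplit <- 10
lrn_cart_demo$param_set$values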
Let’s now fit these learners using cross-validation to determine the accuracy.
# Fit models with cross validation
res_baseline <- resample(credit_task, lrn_baseline, cv, store_models = TRUE)
res_cart <- resample(credit_task, lrn_cart, cv, store_models = TRUE)
# Calculate and aggregate performance values
res_baseline$aggregate()
## classif.ce
## 0.281542
res_cart$aggregate()
## classif.ce
## 0.2355163
What is classif.ce? It is the mean misclassification error, or 0-1 loss generalisation error. Note that the error of the baseline classifier is only 0.282! So this gives us an idea of what we need to beat (we might have expected 0.5, but this is an imbalanced dataset, as is often the case with real data).
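As a quick sanity check, the featureless learner always predicts the majority class ("good"), so its error should match the proportion of "bad" loans in the data:
# The "bad" share should be close to the baseline error of 0.282
prop.table(table(credit_data$Status))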
We can get many other error measures:
# Remember the error measures we can get ... (or use View() in RStudio for
# nicer format)
as.data.table(mlr_measures)
## Key: <key>
## key label
## <char> <char>
## 1: aic Akaike Information Criterion
## 2: bic Bayesian Information Criterion
## 3: classif.acc Classification Accuracy
## 4: classif.auc Area Under the ROC Curve
## 5: classif.bacc Balanced Accuracy
## 6: classif.bbrier Binary Brier Score
## 7: classif.ce Classification Error
## 8: classif.costs Cost-sensitive Classification
## 9: classif.dor Diagnostic Odds Ratio
## 10: classif.fbeta F-beta score
## 11: classif.fdr False Discovery Rate
## 12: classif.fn False Negatives
## 13: classif.fnr False Negative Rate
## 14: classif.fomr False Omission Rate
## 15: classif.fp False Positives
## 16: classif.fpr False Positive Rate
## 17: classif.logloss Log Loss
## 18: classif.mauc_au1p Weighted average 1 vs. 1 multiclass AUC
## 19: classif.mauc_au1u Average 1 vs. 1 multiclass AUC
## 20: classif.mauc_aunp Weighted average 1 vs. rest multiclass AUC
## 21: classif.mauc_aunu Average 1 vs. rest multiclass AUC
## 22: classif.mauc_mu Multiclass mu AUC
## 23: classif.mbrier Multiclass Brier Score
## 24: classif.mcc Matthews Correlation Coefficient
## 25: classif.npv Negative Predictive Value
## 26: classif.ppv Positive Predictive Value
## 27: classif.prauc Precision-Recall Curve
## 28: classif.precision Precision
## 29: classif.recall Recall
## 30: classif.sensitivity Sensitivity
## 31: classif.specificity Specificity
## 32: classif.tn True Negatives
## 33: classif.tnr True Negative Rate
## 34: classif.tp True Positives
## 35: classif.tpr True Positive Rate
## 36: debug_classif Debug Classification Measure
## 37: internal_valid_score Internal Validation Score
## 38: oob_error Out-of-bag Error
## 39: regr.bias Bias
## 40: regr.ktau Kendall's tau
## 41: regr.mae Mean Absolute Error
## 42: regr.mape Mean Absolute Percent Error
## 43: regr.maxae Max Absolute Error
## 44: regr.medae Median Absolute Error
## 45: regr.medse Median Squared Error
## 46: regr.mse Mean Squared Error
## 47: regr.msle Mean Squared Log Error
## 48: regr.pbias Percent Bias
## 49: regr.pinball <NA>
## 50: regr.rmse Root Mean Squared Error
## 51: regr.rmsle Root Mean Squared Log Error
## 52: regr.rqr <NA>
## 53: regr.rsq <NA>
## 54: regr.sae Sum of Absolute Errors
## 55: regr.smape Symmetric Mean Absolute Percent Error
## 56: regr.srho Spearman's rho
## 57: regr.sse Sum of Squared Errors
## 58: selected_features Absolute or Relative Frequency of Selected Features
## 59: sim.jaccard Jaccard Similarity Index
## 60: sim.phi Phi Coefficient Similarity
## 61: time_both Elapsed Time
## 62: time_predict Elapsed Time
## 63: time_train Elapsed Time
## key label
## task_type packages predict_type
## <char> <list> <char>
## 1: <NA> mlr3 <NA>
## 2: <NA> mlr3 <NA>
## 3: classif mlr3,mlr3measures response
## 4: classif mlr3,mlr3measures prob
## 5: classif mlr3,mlr3measures response
## 6: classif mlr3,mlr3measures prob
## 7: classif mlr3,mlr3measures response
## 8: classif mlr3 response
## 9: classif mlr3,mlr3measures response
## 10: classif mlr3,mlr3measures response
## 11: classif mlr3,mlr3measures response
## 12: classif mlr3,mlr3measures response
## 13: classif mlr3,mlr3measures response
## 14: classif mlr3,mlr3measures response
## 15: classif mlr3,mlr3measures response
## 16: classif mlr3,mlr3measures response
## 17: classif mlr3,mlr3measures prob
## 18: classif mlr3,mlr3measures prob
## 19: classif mlr3,mlr3measures prob
## 20: classif mlr3,mlr3measures prob
## 21: classif mlr3,mlr3measures prob
## 22: classif mlr3,mlr3measures prob
## 23: classif mlr3,mlr3measures prob
## 24: classif mlr3,mlr3measures response
## 25: classif mlr3,mlr3measures response
## 26: classif mlr3,mlr3measures response
## 27: classif mlr3,mlr3measures prob
## 28: classif mlr3,mlr3measures response
## 29: classif mlr3,mlr3measures response
## 30: classif mlr3,mlr3measures response
## 31: classif mlr3,mlr3measures response
## 32: classif mlr3,mlr3measures response
## 33: classif mlr3,mlr3measures response
## 34: classif mlr3,mlr3measures response
## 35: classif mlr3,mlr3measures response
## 36: <NA> mlr3 response
## 37: <NA> mlr3 <NA>
## 38: <NA> mlr3 <NA>
## 39: regr mlr3,mlr3measures response
## 40: regr mlr3,mlr3measures response
## 41: regr mlr3,mlr3measures response
## 42: regr mlr3,mlr3measures response
## 43: regr mlr3,mlr3measures response
## 44: regr mlr3,mlr3measures response
## 45: regr mlr3,mlr3measures response
## 46: regr mlr3,mlr3measures response
## 47: regr mlr3,mlr3measures response
## 48: regr mlr3,mlr3measures response
## 49: regr mlr3 quantiles
## 50: regr mlr3,mlr3measures response
## 51: regr mlr3,mlr3measures response
## 52: regr mlr3 quantiles
## 53: regr mlr3 response
## 54: regr mlr3,mlr3measures response
## 55: regr mlr3,mlr3measures response
## 56: regr mlr3,mlr3measures response
## 57: regr mlr3,mlr3measures response
## 58: <NA> mlr3 <NA>
## 59: <NA> mlr3,mlr3measures <NA>
## 60: <NA> mlr3,mlr3measures <NA>
## 61: <NA> mlr3 <NA>
## 62: <NA> mlr3 <NA>
## 63: <NA> mlr3 <NA>
## task_type packages predict_type
## properties
## <list>
## 1: na_score,requires_learner,requires_model,requires_no_prediction
## 2: na_score,requires_learner,requires_model,requires_no_prediction
## 3: weights
## 4:
## 5: weights
## 6: weights
## 7: weights
## 8: weights
## 9:
## 10:
## 11:
## 12:
## 13:
## 14:
## 15:
## 16:
## 17: weights
## 18:
## 19:
## 20:
## 21:
## 22:
## 23:
## 24:
## 25:
## 26:
## 27:
## 28:
## 29:
## 30:
## 31:
## 32:
## 33:
## 34:
## 35:
## 36: na_score
## 37: na_score,requires_learner,requires_no_prediction
## 38: na_score,requires_learner,requires_no_prediction
## 39: weights
## 40:
## 41: weights
## 42: weights
## 43:
## 44:
## 45:
## 46: weights
## 47: weights
## 48: weights
## 49:
## 50: weights
## 51: weights
## 52: [NULL]
## 53: weights
## 54:
## 55:
## 56:
## 57:
## 58: requires_task,requires_learner,requires_model,requires_no_prediction
## 59: requires_model,requires_no_prediction
## 60: requires_model,requires_no_prediction
## 61: requires_learner,requires_no_prediction
## 62: requires_learner,requires_no_prediction
## 63: requires_learner,requires_no_prediction
## properties
## task_properties
## <list>
## 1:
## 2:
## 3:
## 4: twoclass
## 5:
## 6: twoclass
## 7:
## 8:
## 9: twoclass
## 10: twoclass
## 11: twoclass
## 12: twoclass
## 13: twoclass
## 14: twoclass
## 15: twoclass
## 16: twoclass
## 17:
## 18:
## 19:
## 20:
## 21:
## 22:
## 23:
## 24:
## 25: twoclass
## 26: twoclass
## 27: twoclass
## 28: twoclass
## 29: twoclass
## 30: twoclass
## 31: twoclass
## 32: twoclass
## 33: twoclass
## 34: twoclass
## 35: twoclass
## 36:
## 37:
## 38:
## 39:
## 40:
## 41:
## 42:
## 43:
## 44:
## 45:
## 46:
## 47:
## 48:
## 49:
## 50:
## 51:
## 52:
## 53:
## 54:
## 55:
## 56:
## 57:
## 58:
## 59:
## 60:
## 61:
## 62:
## 63:
## task_properties
# ... whilst this just gives the names
mlr_measures
## <DictionaryMeasure> with 63 stored values
## Keys: aic, bic, classif.acc, classif.auc, classif.bacc, classif.bbrier,
## classif.ce, classif.costs, classif.dor, classif.fbeta, classif.fdr,
## classif.fn, classif.fnr, classif.fomr, classif.fp, classif.fpr,
## classif.logloss, classif.mauc_au1p, classif.mauc_au1u,
## classif.mauc_aunp, classif.mauc_aunu, classif.mauc_mu,
## classif.mbrier, classif.mcc, classif.npv, classif.ppv, classif.prauc,
## classif.precision, classif.recall, classif.sensitivity,
## classif.specificity, classif.tn, classif.tnr, classif.tp,
## classif.tpr, debug_classif, internal_valid_score, oob_error,
## regr.bias, regr.ktau, regr.mae, regr.mape, regr.maxae, regr.medae,
## regr.medse, regr.mse, regr.msle, regr.pbias, regr.pinball, regr.rmse,
## regr.rmsle, regr.rqr, regr.rsq, regr.sae, regr.smape, regr.srho,
## regr.sse, selected_features, sim.jaccard, sim.phi, time_both,
## time_predict, time_train
# Again, to see help on any of them, prefix the key name with mlr_measures_
?mlr_measures_classif.ce
We can request multiple of these measures at once from the resampling results:
res_baseline$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## classif.ce classif.acc classif.auc classif.fpr classif.fnr
## 0.281542 0.718458 0.500000 0.000000 1.000000
res_cart$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## classif.ce classif.acc classif.auc classif.fpr classif.fnr
## 0.23551629 0.76448371 0.73213136 0.09192862 0.60204032
When we want to fit multiple models, it is more convenient to use the benchmark function to run them all on a grid we define (indeed, we can run multiple learners on multiple tasks using multiple resampling strategies, as it takes the Cartesian product of all options):
# Use the benchmark functionality to do everything at once, ensuring identical
# settings such as task, folds, etc
res <- benchmark(
benchmark_grid(
task = list(credit_task),
learners = list(lrn_baseline,
lrn_cart),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res
##
## ── <BenchmarkResult> of 6 rows with 2 resampling run ───────────────────────────
## nr task_id learner_id resampling_id iters warnings errors
## 1 BankCredit baseline cv 3 0 0
## 2 BankCredit tree cv 3 0 0
res$aggregate()
## nr task_id learner_id resampling_id iters classif.ce
## <int> <char> <char> <char> <int> <num>
## 1: 1 BankCredit baseline cv 3 0.2815420
## 2: 2 BankCredit tree cv 3 0.2330479
## Hidden columns: resample_result
Let’s request the more interesting suite of measures from the benchmark (a lot simpler and neater than doing this call repeatedly for each model ourselves):
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id learner_id resampling_id iters classif.ce classif.acc
## <int> <char> <char> <char> <int> <num> <num>
## 1: 1 BankCredit baseline cv 3 0.2815420 0.7184580
## 2: 2 BankCredit tree cv 3 0.2330479 0.7669521
## classif.auc classif.fpr classif.fnr
## <num> <num> <num>
## 1: 0.500000 0.00000000 1.0000000
## 2: 0.711091 0.09392361 0.5870528
## Hidden columns: resample_result
We can examine the results in depth by extracting the models fitted in each fold:
# Get the trees (2nd model fitted), by asking for second set of resample
# results
trees <- res$resample_result(2)
# Then, let's look at the tree from first CV iteration, for example:
tree1 <- trees$learners[[1]]
# This is a fitted rpart object, so we can look at the model within
tree1_rpart <- tree1$model
# If you look in the rpart package documentation, it tells us how to plot the
# tree that was fitted
plot(tree1_rpart, compress = TRUE, margin = 0.1)
text(tree1_rpart, use.n = TRUE, cex = 0.8)
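If you prefer a prettier tree, the rpart.plot package offers an alternative plotting function; a quick sketch, assuming the package is installed:
# Uncomment and run the following command first if you do not have rpart.plot
# install.packages("rpart.plot")
rpart.plot::rpart.plot(tree1_rpart)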
We can see the other trees too. Change the 3 in double brackets [[]] below to other values from 1 to 3 to see the model from each round of cross validation.
# Looking at other rounds from CV
plot(res$resample_result(2)$learners[[3]]$model, compress = TRUE, margin = 0.1)
text(res$resample_result(2)$learners[[3]]$model, use.n = TRUE, cex = 0.8)
It may be that these trees need to be pruned. To do this, we would need to enable the cross-validation option of rpart in the learner. We can fit this individually and make a selection for the cost penalty (see alpha in lectures), before then setting this value when benchmarking (NOTE: this is not quite optimal, but MLR3 doesn't yet have the option for us to select this within folds … coming soon hopefully).

In particular, note we are now doing nested cross validation, which is the correct way to do parameter selection without biasing the test error. Change the 3 in double brackets [[]] to other values from 1 to 3 to see the cross validation plot from each round.
# Enable nested cross validation
lrn_cart_cv <- lrn("classif.rpart", predict_type = "prob", xval = 10, id = "tree.cv")
res_cart_cv <- resample(credit_task, lrn_cart_cv, cv, store_models = TRUE)
rpart::plotcp(res_cart_cv$learners[[3]]$model)
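The underlying complexity parameter table can also be printed, which may help when choosing a concrete value of cp:
# Print the CP table from the same fitted tree used in the plot above
rpart::printcp(res_cart_cv$learners[[3]]$model)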
Now, choose a cost penalty and add this as a model to our benchmark set:
# Try refitting with a chosen complexity parameter for pruning
lrn_cart_cp <- lrn("classif.rpart", predict_type = "prob", cp = 0.016, id = "tree.pruned")
# Then run this in the benchmark with other options
res <- benchmark(benchmark_grid(
task = list(credit_task),
learners = list(baseline = lrn_baseline,
cart = lrn_cart,
cart_prune = lrn_cart_cp),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id learner_id resampling_id iters classif.ce classif.acc
## <int> <char> <char> <char> <int> <num> <num>
## 1: 1 BankCredit baseline cv 3 0.2815433 0.7184567
## 2: 2 BankCredit tree cv 3 0.2301289 0.7698711
## 3: 3 BankCredit tree.pruned cv 3 0.2395621 0.7604379
## classif.auc classif.fpr classif.fnr
## <num> <num> <num>
## 1: 0.5000000 0.00000000 1.0000000
## 2: 0.7156031 0.09086107 0.5846605
## 3: 0.6562473 0.06504538 0.6858471
## Hidden columns: resample_result
In this case we see a slight improvement in the false-positive rate at the cost of higher errors elsewhere. These might be trade-offs you need to make in the real world.
To handle missing data and factors, we will need to introduce a modelling pipeline. In this pipeline we need to impute missing values and dummy/one-hot code factors.
Pipelines allow us to create a sophisticated workflow without having to manually code how everything ties together. To see what pipeline operations are available:
# Trying out pipelines
library("mlr3pipelines")
# Pipelines available (or use View() in RStudio for a nicer look) ...
as.data.table(mlr_pipeops)
## Key: <key>
## key
## <char>
## 1: adas
## 2: blsmote
## 3: boxcox
## 4: branch
## 5: chunk
## 6: classbalancing
## 7: classifavg
## 8: classweights
## 9: colapply
## 10: collapsefactors
## 11: colroles
## 12: copy
## 13: datefeatures
## 14: decode
## 15: encode
## 16: encodeimpact
## 17: encodelmer
## 18: encodeplquantiles
## 19: encodepltree
## 20: featureunion
## 21: filter
## 22: fixfactors
## 23: histbin
## 24: ica
## 25: imputeconstant
## 26: imputehist
## 27: imputelearner
## 28: imputemean
## 29: imputemedian
## 30: imputemode
## 31: imputeoor
## 32: imputesample
## 33: kernelpca
## 34: learner
## 35: learner_cv
## 36: learner_pi_cvplus
## 37: learner_quantiles
## 38: missind
## 39: modelmatrix
## 40: multiplicityexply
## 41: multiplicityimply
## 42: mutate
## 43: nearmiss
## 44: nmf
## 45: nop
## 46: ovrsplit
## 47: ovrunite
## 48: pca
## 49: proxy
## 50: quantilebin
## 51: randomprojection
## 52: randomresponse
## 53: regravg
## 54: removeconstants
## 55: renamecolumns
## 56: replicate
## 57: rowapply
## 58: scale
## 59: scalemaxabs
## 60: scalerange
## 61: select
## 62: smote
## 63: smotenc
## 64: spatialsign
## 65: subsample
## 66: targetinvert
## 67: targetmutate
## 68: targettrafoscalerange
## 69: textvectorizer
## 70: threshold
## 71: tomek
## 72: tunethreshold
## 73: unbranch
## 74: vtreat
## 75: yeojohnson
## key
## label
## <char>
## 1: ADAS Balancing
## 2: BLSMOTE Balancing
## 3: Box-Cox Transformation of Numeric Features
## 4: Path Branching
## 5: Chunk Input into Multiple Outputs
## 6: Class Balancing
## 7: Majority Vote Prediction
## 8: Class Weights for Sample Weighting
## 9: Apply a Function to each Column of a Task
## 10: Collapse Factors
## 11: Change Column Roles of a Task
## 12: Copy Input Multiple Times
## 13: Preprocess Date Features
## 14: Reverse Factor Encoding
## 15: Factor Encoding
## 16: Conditional Target Value Impact Encoding
## 17: Impact Encoding with Random Intercept Models
## 18: Piecewise Linear Encoding using Quantiles
## 19: Piecewise Linear Encoding using Decision Trees
## 20: Aggregate Features from Multiple Inputs
## 21: Feature Filtering
## 22: Fix Factor Levels
## 23: Split Numeric Features into Equally Spaced Bins
## 24: Independent Component Analysis
## 25: Impute Features by a Constant
## 26: Impute Numerical Features by Histogram
## 27: Impute Features by Fitting a Learner
## 28: Impute Numerical Features by their Mean
## 29: Impute Numerical Features by their Median
## 30: Impute Features by their Mode
## 31: Out of Range Imputation
## 32: Impute Features by Sampling
## 33: Kernelized Principal Component Analysis
## 34: Wrap a Learner into a PipeOp
## 35: Wrap a Learner into a PipeOp with Cross-validated Predictions as Features
## 36: Wrap a Learner into a PipeOp with Cross-validation Plus Confidence Intervals as Predictions
## 37: Wrap a Learner into a PipeOp to to predict multiple Quantiles
## 38: Add Missing Indicator Columns
## 39: Transform Columns by Constructing a Model Matrix
## 40: Explicate a Multiplicity
## 41: Implicate a Multiplicity
## 42: Add Features According to Expressions
## 43: Nearmiss Down-Sampling
## 44: Non-negative Matrix Factorization
## 45: Simply Push Input Forward
## 46: Split a Classification Task into Binary Classification Tasks
## 47: Unite Binary Classification Tasks
## 48: Principle Component Analysis
## 49: Wrap another PipeOp or Graph as a Hyperparameter
## 50: Split Numeric Features into Quantile Bins
## 51: Project Numeric Features onto a Randomly Sampled Subspace
## 52: Generate a Randomized Response Prediction
## 53: Weighted Prediction Averaging
## 54: Remove Constant Features
## 55: Rename Columns
## 56: Replicate the Input as a Multiplicity
## 57: Apply a Function to each Row of a Task
## 58: Center and Scale Numeric Features
## 59: Scale Numeric Features with Respect to their Maximum Absolute Value
## 60: Linearly Transform Numeric Features to Match Given Boundaries
## 61: Remove Features Depending on a Selector
## 62: SMOTE Balancing
## 63: SMOTENC Balancing
## 64: Normalize Data Row-wise
## 65: Subsampling
## 66: Invert Target Transformations
## 67: Transform a Target by a Function
## 68: Linearly Transform a Numeric Target to Match Given Boundaries
## 69: Bag-of-word Representation of Character Features
## 70: Change the Threshold of a Classification Prediction
## 71: Tomek Down-Sampling
## 72: Tune the Threshold of a Classification Prediction
## 73: Unbranch Different Paths
## 74: Interface to the vtreat Package
## 75: Yeo-Johnson Transformation of Numeric Features
## label
## packages tags
## <list> <list>
## 1: mlr3pipelines,smotefamily imbalanced data,data transform
## 2: mlr3pipelines,smotefamily imbalanced data,data transform
## 3: mlr3pipelines,bestNormalize data transform
## 4: mlr3pipelines meta
## 5: mlr3pipelines meta
## 6: mlr3pipelines imbalanced data,data transform
## 7: mlr3pipelines,stats ensemble
## 8: mlr3pipelines imbalanced data,data transform
## 9: mlr3pipelines data transform
## 10: mlr3pipelines data transform
## 11: mlr3pipelines data transform
## 12: mlr3pipelines meta
## 13: mlr3pipelines data transform
## 14: mlr3pipelines encode,data transform
## 15: mlr3pipelines,stats encode,data transform
## 16: mlr3pipelines encode,data transform
## 17: mlr3pipelines,lme4,nloptr encode,data transform
## 18: mlr3pipelines,stats encode,data transform
## 19: mlr3pipelines,mlr3,rpart encode,data transform
## 20: mlr3pipelines ensemble
## 21: mlr3pipelines feature selection,data transform
## 22: mlr3pipelines robustify,data transform
## 23: mlr3pipelines,graphics data transform
## 24: mlr3pipelines,fastICA data transform
## 25: mlr3pipelines missings
## 26: mlr3pipelines,graphics missings
## 27: mlr3pipelines missings
## 28: mlr3pipelines missings
## 29: mlr3pipelines,stats missings
## 30: mlr3pipelines missings
## 31: mlr3pipelines missings
## 32: mlr3pipelines missings
## 33: mlr3pipelines,kernlab data transform
## 34: mlr3pipelines learner
## 35: mlr3pipelines learner,ensemble,data transform
## 36: mlr3pipelines learner,ensemble
## 37: mlr3pipelines learner,ensemble
## 38: mlr3pipelines missings,data transform
## 39: mlr3pipelines,stats data transform
## 40: mlr3pipelines multiplicity
## 41: mlr3pipelines multiplicity
## 42: mlr3pipelines data transform
## 43: mlr3pipelines,themis imbalanced data,data transform
## 44: mlr3pipelines,MASS,NMF data transform
## 45: mlr3pipelines meta
## 46: mlr3pipelines target transform,multiplicity
## 47: mlr3pipelines multiplicity,ensemble
## 48: mlr3pipelines data transform
## 49: mlr3pipelines meta
## 50: mlr3pipelines,stats data transform
## 51: mlr3pipelines data transform
## 52: mlr3pipelines abstract
## 53: mlr3pipelines ensemble
## 54: mlr3pipelines robustify,data transform
## 55: mlr3pipelines data transform
## 56: mlr3pipelines multiplicity
## 57: mlr3pipelines data transform
## 58: mlr3pipelines data transform
## 59: mlr3pipelines data transform
## 60: mlr3pipelines data transform
## 61: mlr3pipelines feature selection,data transform
## 62: mlr3pipelines,smotefamily imbalanced data,data transform
## 63: mlr3pipelines,themis imbalanced data,data transform
## 64: mlr3pipelines data transform
## 65: mlr3pipelines data transform
## 66: mlr3pipelines target transform
## 67: mlr3pipelines target transform
## 68: mlr3pipelines target transform
## 69: mlr3pipelines,quanteda,stopwords data transform
## 70: mlr3pipelines target transform
## 71: mlr3pipelines,themis imbalanced data,data transform
## 72: mlr3pipelines,bbotk target transform
## 73: mlr3pipelines meta
## 74: mlr3pipelines,vtreat encode,missings,data transform
## 75: mlr3pipelines,bestNormalize data transform
## packages tags
## feature_types input.num output.num
## <list> <int> <int>
## 1: logical,integer,numeric,character,factor,ordered,... 1 1
## 2: logical,integer,numeric,character,factor,ordered,... 1 1
## 3: numeric,integer 1 1
## 4: NA 1 NA
## 5: NA 1 NA
## 6: logical,integer,numeric,character,factor,ordered,... 1 1
## 7: NA NA 1
## 8: logical,integer,numeric,character,factor,ordered,... 1 1
## 9: logical,integer,numeric,character,factor,ordered,... 1 1
## 10: factor,ordered 1 1
## 11: logical,integer,numeric,character,factor,ordered,... 1 1
## 12: NA 1 NA
## 13: POSIXct 1 1
## 14: integer,numeric 1 1
## 15: factor,ordered 1 1
## 16: factor,ordered 1 1
## 17: factor,ordered 1 1
## 18: numeric,integer 1 1
## 19: numeric,integer 1 1
## 20: NA NA 1
## 21: logical,integer,numeric,character,factor,ordered,... 1 1
## 22: factor,ordered 1 1
## 23: numeric,integer 1 1
## 24: numeric,integer 1 1
## 25: logical,integer,numeric,character,factor,ordered,... 1 1
## 26: integer,numeric 1 1
## 27: logical,factor,ordered 1 1
## 28: numeric,integer 1 1
## 29: numeric,integer 1 1
## 30: factor,integer,logical,numeric,ordered 1 1
## 31: character,factor,integer,numeric,ordered 1 1
## 32: factor,integer,logical,numeric,ordered 1 1
## 33: numeric,integer 1 1
## 34: NA 1 1
## 35: logical,integer,numeric,character,factor,ordered,... 1 1
## 36: NA 1 1
## 37: NA 1 1
## 38: logical,integer,numeric,character,factor,ordered,... 1 1
## 39: logical,integer,numeric,character,factor,ordered,... 1 1
## 40: NA 1 NA
## 41: NA NA 1
## 42: logical,integer,numeric,character,factor,ordered,... 1 1
## 43: logical,integer,numeric,character,factor,ordered,... 1 1
## 44: numeric,integer 1 1
## 45: NA 1 1
## 46: NA 1 1
## 47: NA 1 1
## 48: numeric,integer 1 1
## 49: NA NA 1
## 50: numeric,integer 1 1
## 51: numeric,integer 1 1
## 52: NA 1 1
## 53: NA NA 1
## 54: logical,integer,numeric,character,factor,ordered,... 1 1
## 55: logical,integer,numeric,character,factor,ordered,... 1 1
## 56: NA 1 1
## 57: numeric,integer 1 1
## 58: numeric,integer 1 1
## 59: numeric,integer 1 1
## 60: numeric,integer 1 1
## 61: logical,integer,numeric,character,factor,ordered,... 1 1
## 62: logical,integer,numeric,character,factor,ordered,... 1 1
## 63: logical,integer,numeric,character,factor,ordered,... 1 1
## 64: numeric,integer 1 1
## 65: logical,integer,numeric,character,factor,ordered,... 1 1
## 66: NA 2 1
## 67: NA 1 2
## 68: NA 1 2
## 69: character 1 1
## 70: NA 1 1
## 71: logical,integer,numeric,character,factor,ordered,... 1 1
## 72: NA 1 1
## 73: NA NA 1
## 74: logical,integer,numeric,character,factor,ordered,... 1 1
## 75: numeric,integer 1 1
## feature_types input.num output.num
## input.type.train input.type.predict output.type.train output.type.predict
## <list> <list> <list> <list>
## 1: TaskClassif TaskClassif TaskClassif TaskClassif
## 2: TaskClassif TaskClassif TaskClassif TaskClassif
## 3: Task Task Task Task
## 4: * * * *
## 5: Task Task Task Task
## 6: TaskClassif TaskClassif TaskClassif TaskClassif
## 7: NULL PredictionClassif NULL PredictionClassif
## 8: TaskClassif TaskClassif TaskClassif TaskClassif
## 9: Task Task Task Task
## 10: Task Task Task Task
## 11: Task Task Task Task
## 12: * * * *
## 13: Task Task Task Task
## 14: Task Task Task Task
## 15: Task Task Task Task
## 16: TaskSupervised TaskSupervised TaskSupervised TaskSupervised
## 17: TaskSupervised TaskSupervised TaskSupervised TaskSupervised
## 18: Task Task Task Task
## 19: TaskClassif TaskClassif TaskClassif TaskClassif
## 20: Task Task Task Task
## 21: Task Task Task Task
## 22: Task Task Task Task
## 23: Task Task Task Task
## 24: Task Task Task Task
## 25: Task Task Task Task
## 26: Task Task Task Task
## 27: Task Task Task Task
## 28: Task Task Task Task
## 29: Task Task Task Task
## 30: Task Task Task Task
## 31: Task Task Task Task
## 32: Task Task Task Task
## 33: Task Task Task Task
## 34: TaskClassif TaskClassif NULL PredictionClassif
## 35: TaskClassif TaskClassif TaskClassif TaskClassif
## 36: TaskRegr TaskRegr NULL PredictionRegr
## 37: TaskRegr TaskRegr NULL PredictionRegr
## 38: Task Task Task Task
## 39: Task Task Task Task
## 40: [*] [*] * *
## 41: * * [*] [*]
## 42: Task Task Task Task
## 43: TaskClassif TaskClassif TaskClassif TaskClassif
## 44: Task Task Task Task
## 45: * * * *
## 46: TaskClassif TaskClassif [TaskClassif] [TaskClassif]
## 47: [NULL] [PredictionClassif] NULL PredictionClassif
## 48: Task Task Task Task
## 49: * * * *
## 50: Task Task Task Task
## 51: Task Task Task Task
## 52: NULL Prediction NULL Prediction
## 53: NULL PredictionRegr NULL PredictionRegr
## 54: Task Task Task Task
## 55: Task Task Task Task
## 56: * * [*] [*]
## 57: Task Task Task Task
## 58: Task Task Task Task
## 59: Task Task Task Task
## 60: Task Task Task Task
## 61: Task Task Task Task
## 62: TaskClassif TaskClassif TaskClassif TaskClassif
## 63: TaskClassif TaskClassif TaskClassif TaskClassif
## 64: Task Task Task Task
## 65: Task Task Task Task
## 66: NULL,NULL function,Prediction NULL Prediction
## 67: Task Task NULL,Task function,Task
## 68: TaskRegr TaskRegr NULL,TaskRegr function,TaskRegr
## 69: Task Task Task Task
## 70: NULL PredictionClassif NULL PredictionClassif
## 71: TaskClassif TaskClassif TaskClassif TaskClassif
## 72: Task Task NULL Prediction
## 73: * * * *
## 74: TaskSupervised TaskSupervised TaskSupervised TaskSupervised
## 75: Task Task Task Task
## input.type.train input.type.predict output.type.train output.type.predict
# ... whilst this just gives the names
mlr_pipeops
## <DictionaryPipeOp> with 75 stored values
## Keys: adas, blsmote, boxcox, branch, chunk, classbalancing, classifavg,
## classweights, colapply, collapsefactors, colroles, copy,
## datefeatures, decode, encode, encodeimpact, encodelmer,
## encodeplquantiles, encodepltree, featureunion, filter, fixfactors,
## histbin, ica, imputeconstant, imputehist, imputelearner, imputemean,
## imputemedian, imputemode, imputeoor, imputesample, kernelpca,
## learner, learner_cv, learner_pi_cvplus, learner_quantiles, missind,
## modelmatrix, multiplicityexply, multiplicityimply, mutate, nearmiss,
## nmf, nop, ovrsplit, ovrunite, pca, proxy, quantilebin,
## randomprojection, randomresponse, regravg, removeconstants,
## renamecolumns, replicate, rowapply, scale, scalemaxabs, scalerange,
## select, smote, smotenc, spatialsign, subsample, targetinvert,
## targetmutate, targettrafoscalerange, textvectorizer, threshold,
## tomek, tunethreshold, unbranch, vtreat, yeojohnson
# Again, to see help on any of them, prefix the key name with mlr_pipeops_
?mlr_pipeops_encode
So we can see the encode pipeline operation can do one-hot encoding of factors. We'll do this first. XGBoost, which does gradient boosting, doesn't accept factors (look back at the learners table earlier), so we now create a pipeline operation to encode them before passing on to the learner. The function po() creates pipeline operations and %>>% connects the steps.
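Before building the full pipeline, here is a minimal sketch of what the encode operation does when applied on its own to our task (PipeOps are trained on a list of inputs and return a list of outputs):
# Train the encode PipeOp alone and inspect the resulting feature names:
# each factor is expanded into one indicator column per level
poe <- po("encode", method = "one-hot")
task_encoded <- poe$train(list(credit_task))[[1]]
task_encoded$feature_names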
# Uncomment and run the following command first if you do not have the xgboost package
# install.packages("xgboost")
# Create a pipeline which encodes and then fits an XGBoost model
lrn_xgboost <- lrn("classif.xgboost", predict_type = "prob", id = "gradient.boosting")
pl_xgb <- po("encode") %>>%
po(lrn_xgboost)
# Now fit as normal ... we can just add it to our benchmark set
res <- benchmark(benchmark_grid(
task = list(credit_task),
learners = list(lrn_baseline,
lrn_cart,
lrn_cart_cp,
pl_xgb),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id learner_id resampling_id iters classif.ce
## <int> <char> <char> <char> <int> <num>
## 1: 1 BankCredit baseline cv 3 0.2815447
## 2: 2 BankCredit tree cv 3 0.2287847
## 3: 3 BankCredit tree.pruned cv 3 0.2343980
## 4: 4 BankCredit encode.gradient.boosting cv 3 0.2231697
## classif.acc classif.fpr classif.fnr
## <num> <num> <num>
## 1: 0.7184553 0.00000000 1.0000000
## 2: 0.7712153 0.08656565 0.5918438
## 3: 0.7656020 0.08156215 0.6246723
## 4: 0.7768303 0.12716639 0.4680833
## Hidden columns: resample_result
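The pipeline we built is a Graph object, so its structure can be printed or plotted as a check (plotting may require supporting packages such as igraph to render):
# Print the graph, showing the encode step feeding the learner
pl_xgb
# Visualise the pipeline as a graph
pl_xgb$plot()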
Handling missingness is slightly more involved. We provide a pipeline recipe here which is quite robust … read the documentation of each step to understand more.
We then apply this to logistic regression.
# First create a pipeline of just missing fixes we can later use with models
pl_missing <- po("fixfactors") %>>%
po("removeconstants") %>>%
po("imputesample", affect_columns = selector_type(c("ordered", "factor"))) %>>%
po("imputemean")
# Now try with a model that needs no missingness
lrn_log_reg <- lrn("classif.log_reg", predict_type = "prob", id = "logistic.regression")
pl_log_reg <- pl_missing %>>%
po(lrn_log_reg)
# Now fit as normal ... we can just add it to our benchmark set
res <- benchmark(benchmark_grid(
task = list(credit_task),
learners = list(lrn_baseline,
lrn_cart,
lrn_cart_cp,
pl_xgb,
pl_log_reg),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id
## <int> <char>
## 1: 1 BankCredit
## 2: 2 BankCredit
## 3: 3 BankCredit
## 4: 4 BankCredit
## 5: 5 BankCredit
## learner_id
## <char>
## 1: baseline
## 2: tree
## 3: tree.pruned
## 4: encode.gradient.boosting
## 5: fixfactors.removeconstants.imputesample.imputemean.logistic.regression
## resampling_id iters classif.ce classif.acc classif.fpr classif.fnr
## <char> <int> <num> <num> <num> <num>
## 1: cv 3 0.2815420 0.7184580 0.00000000 1.0000000
## 2: cv 3 0.2341678 0.7658322 0.10479843 0.5649245
## 3: cv 3 0.2393319 0.7606681 0.08139514 0.6434153
## 4: cv 3 0.2296776 0.7703224 0.12315002 0.5011726
## 5: cv 3 0.2056540 0.7943460 0.08663037 0.5089951
## Hidden columns: resample_result
Rather than having to choose among the models that we fitted above, we could instead fit all of them and then fit a final "super learner" which automatically learns how best to combine the predictions of the available base learners. We can do this using the pipelines in MLR3 …
We start from scratch to make this more advanced example self-contained.
library("mlr3verse")
set.seed(212) # set seed for reproducibility
# Load data
data("credit_data", package = "modeldata")
# Define task
credit_task <- TaskClassif$new(id = "BankCredit",
backend = credit_data,
target = "Status",
positive = "bad")
# Cross validation resampling strategy
cv5 <- rsmp("cv", folds = 5)
cv5$instantiate(credit_task)
# Define a collection of base learners
lrn_baseline <- lrn("classif.featureless", predict_type = "prob", id = "baseline")
lrn_cart <- lrn("classif.rpart", predict_type = "prob", id = "tree")
lrn_cart_cp <- lrn("classif.rpart", predict_type = "prob", cp = 0.016, id = "tree.pruned")
lrn_ranger <- lrn("classif.ranger", predict_type = "prob", id = "random.forest")
lrn_xgboost <- lrn("classif.xgboost", predict_type = "prob", id = "gradient.boosting")
lrn_log_reg <- lrn("classif.log_reg", predict_type = "prob", id = "logistic.regression")
# Define a super learner
lrnsp_log_reg <- lrn("classif.log_reg", predict_type = "prob", id = "super.learner")
# Missingness imputation pipeline
pl_missing <- po("fixfactors") %>>%
po("removeconstants") %>>%
po("imputesample", affect_columns = selector_type(c("ordered", "factor"))) %>>%
po("imputemean")
# Factors coding pipeline
pl_factor <- po("encode")
# Now define the full pipeline
spr_lrn <- gunion(list(
# First group of learners requiring no modification to input
gunion(list(
po("learner_cv", lrn_baseline),
po("learner_cv", lrn_cart),
po("learner_cv", lrn_cart_cp)
)),
# Next group of learners requiring special treatment of missingness
pl_missing %>>%
gunion(list(
po("learner_cv", lrn_ranger),
po("learner_cv", lrn_log_reg),
po("nop") # This passes through the original features adjusted for
# missingness to the super learner
)),
# Last group needing factor encoding
pl_factor %>>%
po("learner_cv", lrn_xgboost)
)) %>>%
po("featureunion") %>>%
po(lrnsp_log_reg)
# This plot shows a graph of the learning pipeline
spr_lrn$plot()
# Finally fit the base learners and super learner and evaluate
res_spr <- resample(credit_task, spr_lrn, cv5, store_models = TRUE)
res_spr$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.fpr"),
msr("classif.fnr")))
## classif.ce classif.acc classif.fpr classif.fnr
## 0.20251706 0.79748294 0.09811216 0.46966500
You will note these are the best results achieved of all the learners (except on the false-positive rate), albeit that this is by far the most complicated model.
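If you are curious how the super learner combines the base learners, the fitted GraphLearner in each fold stores the state of every PipeOp keyed by its id. A sketch for the first fold follows; the exact structure of $model can vary between mlr3pipelines versions, so treat this as an assumption to verify against the documentation:
# "super.learner" matches the id we gave lrnsp_log_reg above; its state
# contains the fitted logistic regression combining the base learner predictions
fold1 <- res_spr$learners[[1]]
summary(fold1$model$super.learner$model)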