There has been code provided throughout the course in live code blocks, where you get snippets of code to perform particular tasks by directly leveraging individual packages for particular methods. Therefore, we will not repeat such exercises in the labs and instead we will focus on the full software pipeline which makes running machine learning analyses much easier. Arguably the two top ML pipelines in R (R Core Team, 2021) today are tidymodels (Kuhn and Wickham, 2020) and mlr3 (Lang et al., 2019). The first lab focused on tidymodels and this second lab now looks at mlr3.
Before beginning with mlr3, we will ensure you can get the data set that will be used for the examples here, which is suitable for basic binary classification models.

The data is a credit scoring dataset: an important application in banking. It consists of 4454 observations of 14 variables, each a loan that was given together with the status of whether the loan was "good" (i.e. repaid) or went "bad" (i.e. defaulted).
# Uncomment and run the following command first if you do not have the modeldata package
# install.packages("modeldata")
data("credit_data", package = "modeldata")
Use some of the exploration techniques from Lab 1 to explore the dataset before you start modelling.
# Try some plotting functions to explore the data here
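For example, a minimal sketch of some quick checks you could start with (the boxplot assumes the ggplot2 package is installed; any of the approaches from Lab 1 work equally well):
# Class balance of the response
table(credit_data$Status)
# Overall summary, which also reveals the missing values
summary(credit_data)
# Distribution of loan amount by loan status
library("ggplot2")
ggplot(credit_data, aes(x = Status, y = Amount)) +
  geom_boxplot()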
Due to technical issues, the alternative dataset has been removed; apologies for the reduced variety in the lab. A solution will be sought for future APTS courses.
MLR takes a quite different approach to building pipelines than Tidymodels did. Neither is "right" and you will probably find you prefer one style over the other, which is totally fine and quite a personal choice. In order to make that choice, it's good to see both, so today it's MLR's turn! Note that MLR 3 is a huge ecosystem and we barely scratch the surface today: there is a fantastic online book by Becker et al. (2021) available, which is highly recommended reading if you decide to use MLR 3 for your applications. You may also find the three "cheatsheets" useful for at-a-glance reference: the MLR3 cheatsheet, MLR3 tuning cheatsheet, and MLR3 pipelines cheatsheet.
mlr3 is another meta-package for building machine learning software pipelines, with the aim of automating a lot of the repetitive tasks we commonly need to do. This both saves time and makes them less error-prone, because we're relying on software stacks that have been extensively tested. Note that MLR is slightly more mature software at this stage, having been around for quite a while and undergone a big revamp with version 3, but Tidymodels is rapidly catching up and is backed by the fabulous RStudio/Posit team, so in the medium to longer term both will have similar levels of code maturity and stability.
# Uncomment and run the following command first if you do not have the mlr3 package
# install.packages("mlr3")
# Load mlr3, which loads a suite of other packages for us
library("mlr3")
MLR breaks machine learning problems up into steps. The most basic ones are where you:

- define a task: the dataset together with the target variable you want to predict;
- define a learner: the model or algorithm to fit;
- train the learner on the task;
- predict, either on new data or in-sample;
- evaluate the predictions with one or more measures.

An MLR task defines the dataset and identifies what the response is (all other variables being taken to be the features).
To define a task for the credit data, we therefore provide the data frame and specify that we are predicting the Status variable.
# Define a task, which is a dataset together with target variable for prediction
# We wrap the data in an na.omit to avoid issues with missingness, see later for
# better options
task_credit <- TaskClassif$new(id = "credit",
backend = na.omit(credit_data),
target = "Status")
task_credit
##
## ── <TaskClassif> (4039x14) ─────────────────────────────────────────────────────
## • Target: Status
## • Target classes: bad (positive class, 25%), good (75%)
## • Properties: twoclass
## • Features (13):
## • int (9): Age, Amount, Assets, Debt, Expenses, Income, Price, Seniority,
## Time
## • fct (4): Home, Job, Marital, Records
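The task object exposes some useful metadata; for example, the following are all standard fields and methods of mlr3 tasks:
# Inspect the task: dimensions, feature/target names, and a preview of the data
task_credit$nrow
task_credit$ncol
task_credit$feature_names
task_credit$target_names
task_credit$head()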
Then, we can see what learners MLR has built in.
# This variable shows available learning algorithms
as.data.table(mlr_learners)
## Key: <key>
## key label task_type
## <char> <char> <char>
## 1: classif.debug Debug Learner for Classification classif
## 2: classif.featureless Featureless Classification Learner classif
## 3: classif.rpart Classification Tree classif
## 4: regr.debug Debug Learner for Regression regr
## 5: regr.featureless Featureless Regression Learner regr
## 6: regr.rpart Regression Tree regr
## feature_types packages
## <list> <list>
## 1: logical,integer,numeric,character,factor,ordered mlr3
## 2: logical,integer,numeric,character,factor,ordered,... mlr3
## 3: logical,integer,numeric,factor,ordered mlr3,rpart
## 4: logical,integer,numeric,character,factor,ordered mlr3,stats
## 5: logical,integer,numeric,character,factor,ordered,... mlr3,stats
## 6: logical,integer,numeric,factor,ordered mlr3,rpart
## properties
## <list>
## 1: hotstart_forward,internal_tuning,marshal,missings,multiclass,twoclass,...
## 2: featureless,importance,missings,multiclass,selected_features,twoclass,...
## 3: importance,missings,multiclass,selected_features,twoclass,weights
## 4: missings,weights
## 5: featureless,importance,missings,selected_features,weights
## 6: importance,missings,selected_features,weights
## predict_types
## <list>
## 1: response,prob
## 2: response,prob
## 3: response,prob
## 4: response,se,quantiles
## 5: response,se,quantiles
## 6: response
This seems rather few! This is because additional packages are used to add features such as new learners. Many staple learners are in the mlr3learners add-on package and further probabilistic learners are in mlr3proba. These update mlr_learners with a lot of additional options.
# Load more learners in supporting packages
library("mlr3learners")
as.data.table(mlr_learners)
## Key: <key>
## key label task_type
## <char> <char> <char>
## 1: classif.cv_glmnet GLM with Elastic Net Regularization classif
## 2: classif.debug Debug Learner for Classification classif
## 3: classif.featureless Featureless Classification Learner classif
## 4: classif.glmnet GLM with Elastic Net Regularization classif
## 5: classif.kknn k-Nearest-Neighbor classif
## 6: classif.lda Linear Discriminant Analysis classif
## 7: classif.log_reg Logistic Regression classif
## 8: classif.multinom Multinomial Log-Linear Model classif
## 9: classif.naive_bayes Naive Bayes classif
## 10: classif.nnet Single Layer Neural Network classif
## 11: classif.qda Quadratic Discriminant Analysis classif
## 12: classif.ranger Random Forest classif
## 13: classif.rpart Classification Tree classif
## 14: classif.svm Support Vector Machine classif
## 15: classif.xgboost Extreme Gradient Boosting classif
## 16: regr.cv_glmnet GLM with Elastic Net Regularization regr
## 17: regr.debug Debug Learner for Regression regr
## 18: regr.featureless Featureless Regression Learner regr
## 19: regr.glmnet GLM with Elastic Net Regularization regr
## 20: regr.kknn k-Nearest-Neighbor regr
## 21: regr.km Kriging regr
## 22: regr.lm Linear Model regr
## 23: regr.nnet Single Layer Neural Network regr
## 24: regr.ranger Random Forest regr
## 25: regr.rpart Regression Tree regr
## 26: regr.svm Support Vector Machine regr
## 27: regr.xgboost Extreme Gradient Boosting regr
## key label task_type
## feature_types
## <list>
## 1: logical,integer,numeric
## 2: logical,integer,numeric,character,factor,ordered
## 3: logical,integer,numeric,character,factor,ordered,...
## 4: logical,integer,numeric
## 5: logical,integer,numeric,factor,ordered
## 6: logical,integer,numeric,factor,ordered
## 7: logical,integer,numeric,character,factor,ordered
## 8: logical,integer,numeric,factor
## 9: logical,integer,numeric,factor
## 10: logical,integer,numeric,factor,ordered
## 11: logical,integer,numeric,factor,ordered
## 12: logical,integer,numeric,character,factor,ordered
## 13: logical,integer,numeric,factor,ordered
## 14: logical,integer,numeric
## 15: logical,integer,numeric
## 16: logical,integer,numeric
## 17: logical,integer,numeric,character,factor,ordered
## 18: logical,integer,numeric,character,factor,ordered,...
## 19: logical,integer,numeric
## 20: logical,integer,numeric,factor,ordered
## 21: logical,integer,numeric
## 22: logical,integer,numeric,character,factor
## 23: logical,integer,numeric,factor,ordered
## 24: logical,integer,numeric,character,factor,ordered
## 25: logical,integer,numeric,factor,ordered
## 26: logical,integer,numeric
## 27: logical,integer,numeric
## feature_types
## packages
## <list>
## 1: mlr3,mlr3learners,glmnet
## 2: mlr3
## 3: mlr3
## 4: mlr3,mlr3learners,glmnet
## 5: mlr3,mlr3learners,kknn
## 6: mlr3,mlr3learners,MASS
## 7: mlr3,mlr3learners,stats
## 8: mlr3,mlr3learners,nnet
## 9: mlr3,mlr3learners,e1071
## 10: mlr3,mlr3learners,nnet
## 11: mlr3,mlr3learners,MASS
## 12: mlr3,mlr3learners,ranger
## 13: mlr3,rpart
## 14: mlr3,mlr3learners,e1071
## 15: mlr3,mlr3learners,xgboost
## 16: mlr3,mlr3learners,glmnet
## 17: mlr3,stats
## 18: mlr3,stats
## 19: mlr3,mlr3learners,glmnet
## 20: mlr3,mlr3learners,kknn
## 21: mlr3,mlr3learners,DiceKriging
## 22: mlr3,mlr3learners,stats
## 23: mlr3,mlr3learners,nnet
## 24: mlr3,mlr3learners,ranger
## 25: mlr3,rpart
## 26: mlr3,mlr3learners,e1071
## 27: mlr3,mlr3learners,xgboost
## packages
## properties
## <list>
## 1: multiclass,offset,selected_features,twoclass,weights
## 2: hotstart_forward,internal_tuning,marshal,missings,multiclass,twoclass,...
## 3: featureless,importance,missings,multiclass,selected_features,twoclass,...
## 4: multiclass,offset,twoclass,weights
## 5: multiclass,twoclass
## 6: multiclass,twoclass
## 7: offset,twoclass,weights
## 8: multiclass,twoclass,weights
## 9: multiclass,twoclass
## 10: multiclass,twoclass,weights
## 11: multiclass,twoclass
## 12: hotstart_backward,importance,missings,multiclass,oob_error,selected_features,...
## 13: importance,missings,multiclass,selected_features,twoclass,weights
## 14: multiclass,twoclass
## 15: hotstart_forward,importance,internal_tuning,missings,multiclass,offset,...
## 16: offset,selected_features,weights
## 17: missings,weights
## 18: featureless,importance,missings,selected_features,weights
## 19: offset,weights
## 20:
## 21:
## 22: offset,weights
## 23: weights
## 24: hotstart_backward,importance,missings,oob_error,selected_features,weights
## 25: importance,missings,selected_features,weights
## 26:
## 27: hotstart_forward,importance,internal_tuning,missings,offset,validation,...
## properties
## predict_types
## <list>
## 1: response,prob
## 2: response,prob
## 3: response,prob
## 4: response,prob
## 5: response,prob
## 6: response,prob
## 7: response,prob
## 8: response,prob
## 9: response,prob
## 10: response,prob
## 11: response,prob
## 12: response,prob
## 13: response,prob
## 14: response,prob
## 15: response,prob
## 16: response
## 17: response,se,quantiles
## 18: response,se,quantiles
## 19: response
## 20: response
## 21: response,se
## 22: response,se
## 23: response
## 24: response,se,quantiles
## 25: response
## 26: response
## 27: response
## predict_types
# ... whilst this just gives the names
mlr_learners
## <DictionaryLearner> with 27 stored values
## Keys: classif.cv_glmnet, classif.debug, classif.featureless,
## classif.glmnet, classif.kknn, classif.lda, classif.log_reg,
## classif.multinom, classif.naive_bayes, classif.nnet, classif.qda,
## classif.ranger, classif.rpart, classif.svm, classif.xgboost,
## regr.cv_glmnet, regr.debug, regr.featureless, regr.glmnet, regr.kknn,
## regr.km, regr.lm, regr.nnet, regr.ranger, regr.rpart, regr.svm,
## regr.xgboost
There are a variety of learners now. The start of their name indicates their functionality, including one additional category we did not separate out in our definition on the course:

- classif for classification problems
- regr for regression problems
- dens for density estimation
- surv for survival analysis (time to event)

You can get a nicer detailed view in RStudio by wrapping this in the View() command:
# Get more details on learners
View(as.data.table(mlr_learners))
In the View tab, pay particular attention to the feature_types and properties columns. These give us information about the capabilities of different learners. For example, we know that we have numeric and factor data from the exploration above, so only those learners listing these under feature_types can handle our data natively; for other learners we have some work to do. Likewise, we know that we have missing values, so learners with missings listed under properties could handle our data without using na.omit or any other additional work.
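You can also query these capabilities programmatically rather than by eye; as a sketch, the following lists the learners that natively handle both factor features and missing values:
# feature_types and properties are list columns, hence the [[i]] indexing
learners_dt <- as.data.table(mlr_learners)
supported <- sapply(seq_len(nrow(learners_dt)), function(i) {
  "factor" %in% learners_dt$feature_types[[i]] &&
    "missings" %in% learners_dt$properties[[i]]
})
learners_dt$key[supported]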
Having defined the task, the next step is to define the learner; let's say logistic regression here. We pass the lrn function the name from the table to access that learner.
# Define a logistic regression model
learner_lr <- lrn("classif.log_reg")
learner_lr
##
## ── <LearnerClassifLogReg> (classif.log_reg): Logistic Regression ───────────────
## • Model: -
## • Parameters: use_pred_offset=TRUE
## • Packages: mlr3, mlr3learners, and stats
## • Predict Types: [response] and prob
## • Feature Types: logical, integer, numeric, character, factor, and ordered
## • Encapsulation: none (fallback: -)
## • Properties: offset, twoclass, and weights
## • Other settings: use_weights = 'use'
We now proceed to train this learner on the credit data task.
# Train the model
learner_lr$train(task_credit)
Once the learner is trained, we can use the same object to predict on the same data (i.e. in-sample training prediction).
# Perform prediction
pred <- learner_lr$predict(task_credit)
pred
##
## ── <PredictionClassif> for 4039 observations: ──────────────────────────────────
## row_ids truth response
## 1 good good
## 2 good good
## 3 bad bad
## --- --- ---
## 4037 bad good
## 4038 good good
## 4039 good good
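If you want to work with the raw predictions, the prediction object can be converted to a data table:
# Convert the prediction object to a table of row ids, truth and response
head(as.data.table(pred))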
Finally, we can, for example, assess the training/apparent accuracy and confusion matrix by accessing this object. The confusion matrix is special and accessed directly in the prediction object that was returned. For other measures, we pass a measure object to the score function and that measure will be computed on the predictions. Just as we accessed learners via lrn, we access measure objects via msr.
# Evaluate some measures of error
pred$score(msr("classif.acc"))
## classif.acc
## 0.8120822
pred$confusion
## truth
## response bad good
## bad 474 207
## good 552 2806
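Several measures can also be scored in a single call by passing a list of measure objects; msrs() is a convenience constructor for such a list:
# Score accuracy and classification error together
pred$score(msrs(c("classif.acc", "classif.ce")))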
All the measures supported by MLR3 can be found by querying the mlr_measures object. A detailed reference of what these are is available at https://mlr3measures.mlr-org.com/reference/index.html
# This variable shows available error measures
mlr_measures
## <DictionaryMeasure> with 63 stored values
## Keys: aic, bic, classif.acc, classif.auc, classif.bacc, classif.bbrier,
## classif.ce, classif.costs, classif.dor, classif.fbeta, classif.fdr,
## classif.fn, classif.fnr, classif.fomr, classif.fp, classif.fpr,
## classif.logloss, classif.mauc_au1p, classif.mauc_au1u,
## classif.mauc_aunp, classif.mauc_aunu, classif.mauc_mu,
## classif.mbrier, classif.mcc, classif.npv, classif.ppv, classif.prauc,
## classif.precision, classif.recall, classif.sensitivity,
## classif.specificity, classif.tn, classif.tnr, classif.tp,
## classif.tpr, debug_classif, internal_valid_score, oob_error,
## regr.bias, regr.ktau, regr.mae, regr.mape, regr.maxae, regr.medae,
## regr.medse, regr.mse, regr.msle, regr.pbias, regr.pinball, regr.rmse,
## regr.rmsle, regr.rqr, regr.rsq, regr.sae, regr.smape, regr.srho,
## regr.sse, selected_features, sim.jaccard, sim.phi, time_both,
## time_predict, time_train
What happens if you try to get the apparent error using the Brier score loss? By looking at the help file for Learner or otherwise, can you see how to compute this error?
# Try computing the Brier score loss ... what is wrong?
# Look at the help file for Learners and see how to rectify this
?Learner
# SOLUTION
# We need to set the prediction type to probabilistic
learner_lr$predict_type <- "prob"
# The prediction function automatically gathers everything necessary: ground
# truth, predicted response label, and probability
pred <- learner_lr$predict(task_credit)
pred
##
## ── <PredictionClassif> for 4039 observations: ──────────────────────────────────
## row_ids truth response prob.bad prob.good
## 1 good good 0.2486010 0.7513990
## 2 good good 0.1132317 0.8867683
## 3 bad bad 0.5613161 0.4386839
## --- --- --- --- ---
## 4037 bad good 0.3711673 0.6288327
## 4038 good good 0.3149378 0.6850622
## 4039 good good 0.2173202 0.7826798
# Then compute the Brier score
pred$score(msr("classif.bbrier"))
## classif.bbrier
## 0.1329752
We can do the same for a random forest using ranger (Wright and Ziegler, 2017), this time with all the code at once for readability (we do not need to recreate the task).
# Uncomment and run the following command first if you do not have the ranger package
# install.packages("ranger")
# Redo everything for a random forest model
learner_rf <- lrn("classif.ranger")
learner_rf$train(task_credit)
pred_rf <- learner_rf$predict(task_credit)
pred_rf$score(msr("classif.acc"))
## classif.acc
## 0.9997524
pred_rf$confusion
## truth
## response bad good
## bad 1025 0
## good 1 3013
Above we did the minimal fitting run and just estimated the apparent error to get a feel for the design of MLR 3. We now go into full detail, doing all steps properly.
First, we will redefine the task. This time, note two changes: we do not wrap the data in an na.omit call (MLR can handle missingness as part of the pipeline, see later) and we also specifically identify what we mean by a "positive" case (for later calculation of true positives, etc.; note that tidymodels uses the order of the factors to determine this).
# Redefine the task, this time not getting rid of missing data and
# specifying what constitutes a positive case
credit_task <- TaskClassif$new(id = "BankCredit",
backend = credit_data,
target = "Status",
positive = "bad")
Next, let's examine the data splitting techniques supported by MLR:
# Let's see what resampling strategies MLR supports
# The final column shows the defaults
as.data.table(mlr_resamplings)
## Key: <key>
## key label params iters
## <char> <char> <list> <int>
## 1: bootstrap Bootstrap ratio,repeats 30
## 2: custom Custom Splits NA
## 3: custom_cv Custom Split Cross-Validation NA
## 4: cv Cross-Validation folds 10
## 5: holdout Holdout ratio 1
## 6: insample Insample Resampling 1
## 7: loo Leave-One-Out NA
## 8: repeated_cv Repeated Cross-Validation folds,repeats 100
## 9: subsampling Subsampling ratio,repeats 30
# ... whilst this just gives the names
mlr_resamplings
## <DictionaryResampling> with 9 stored values
## Keys: bootstrap, custom, custom_cv, cv, holdout, insample, loo,
## repeated_cv, subsampling
# To see help on any of them, prefix the key name with mlr_resamplings_
?mlr_resamplings_cv
To then use one of these resampling methods: just as we accessed learners via lrn and measure objects via msr, we access resampling schemes via rsmp.
You can also see from the cross-validation help file that we can access the individual training and testing folds via the $train_set() and $test_set() methods:
# The rsmp function constructs a resampling strategy, taking the name given
# above and allowing any options listed there to be chosen
cv <- rsmp("cv", folds = 3)
# We then instantiate this resampling scheme on the particular task we're
# working on
cv$instantiate(credit_task)
# You can see from the documentation that you could access individual folds
# training and testing data via:
cv$train_set(1)
cv$test_set(1)
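If you just want a simple one-off holdout split rather than a full resampling scheme, mlr3 also provides the partition() helper; a quick sketch (not used further in this lab):
# Split the task into train/test row ids with an 80/20 ratio
splits <- partition(credit_task, ratio = 0.8)
length(splits$train)
length(splits$test)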
Again, as with tidymodels, we don't want to create for loops etc., but will instead rely on the capabilities of the package to automate repetitive tasks.
Let's now create a few different models. Recall that among the learners we listed above, we saw two models that support both factors and missings, so we could try these first:

- classif.featureless is a so-called baseline classifier … it basically just predicts the most common response all the time, ignoring the features! It is often a good idea to include this, because if you don't beat it then there is something very wrong!
- classif.rpart does classification trees using the methodology we saw in the lectures.

In both cases we're going to ask for probabilistic prediction (the default just predicts the label). Also, since we're now running multiple models it can be helpful to give them an id so that we can more easily identify them.
lrn_baseline <- lrn("classif.featureless", predict_type = "prob", id = "baseline")
lrn_cart <- lrn("classif.rpart", predict_type = "prob", id = "tree")
We can see if these have any options or hyperparameters we can change too:
# Have a look at what options and hyperparameters the model possesses
lrn_baseline$param_set
## <ParamSet(1)>
## id class lower upper nlevels default value
## <char> <char> <num> <num> <num> <list> <list>
## 1: method ParamFct NA NA 3 mode mode
lrn_cart$param_set
## <ParamSet(10)>
## id class lower upper nlevels default value
## <char> <char> <num> <num> <num> <list> <list>
## 1: cp ParamDbl 0 1 Inf 0.01 [NULL]
## 2: keep_model ParamLgl NA NA 2 FALSE [NULL]
## 3: maxcompete ParamInt 0 Inf Inf 4 [NULL]
## 4: maxdepth ParamInt 1 30 30 30 [NULL]
## 5: maxsurrogate ParamInt 0 Inf Inf 5 [NULL]
## 6: minbucket ParamInt 1 Inf Inf <NoDefault[0]> [NULL]
## 7: minsplit ParamInt 1 Inf Inf 20 [NULL]
## 8: surrogatestyle ParamInt 0 1 2 0 [NULL]
## 9: usesurrogate ParamInt 0 2 3 2 [NULL]
## 10: xval ParamInt 0 Inf Inf 10 0
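Hyperparameters can be set when constructing the learner or changed afterwards via the param_set; here is a small sketch using a throwaway learner (lrn_cart_demo is just an illustrative name) so we do not alter the learners defined above:
# Set hyperparameters at construction time ...
lrn_cart_demo <- lrn("classif.rpart", predict_type = "prob", maxdepth = 5)
# ... or modify them afterwards through the param_set
lrn_cart_demo$param_set$values$minsplit <- 10
lrn_cart_demo$param_set$values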
Let’s now fit these learners using cross-validation to determine the accuracy.
# Fit models with cross validation
res_baseline <- resample(credit_task, lrn_baseline, cv, store_models = TRUE)
res_cart <- resample(credit_task, lrn_cart, cv, store_models = TRUE)
# Calculate and aggregate performance values
res_baseline$aggregate()
## classif.ce
## 0.281542
res_cart$aggregate()
## classif.ce
## 0.2355163
What is classif.ce? It is the mean misclassification error, or 0-1 loss generalisation error. Note that the error of the baseline classifier is only 0.282! So this gives us an idea of what we need to beat (we might have expected 0.5, but this is an imbalanced dataset, as is often the case with real data).
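As a quick sanity check, the featureless learner always predicts the majority class ("good"), so its error should match the proportion of "bad" loans in the data:
# The "bad" share should be close to the baseline error of 0.282
prop.table(table(credit_data$Status))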
We can get many other error measures:
# Remember the error measures we can get ... (or use View() in RStudio for
# nicer format)
as.data.table(mlr_measures)
## Key: <key>
## key label
## <char> <char>
## 1: aic Akaike Information Criterion
## 2: bic Bayesian Information Criterion
## 3: classif.acc Classification Accuracy
## 4: classif.auc Area Under the ROC Curve
## 5: classif.bacc Balanced Accuracy
## 6: classif.bbrier Binary Brier Score
## 7: classif.ce Classification Error
## 8: classif.costs Cost-sensitive Classification
## 9: classif.dor Diagnostic Odds Ratio
## 10: classif.fbeta F-beta score
## 11: classif.fdr False Discovery Rate
## 12: classif.fn False Negatives
## 13: classif.fnr False Negative Rate
## 14: classif.fomr False Omission Rate
## 15: classif.fp False Positives
## 16: classif.fpr False Positive Rate
## 17: classif.logloss Log Loss
## 18: classif.mauc_au1p Weighted average 1 vs. 1 multiclass AUC
## 19: classif.mauc_au1u Average 1 vs. 1 multiclass AUC
## 20: classif.mauc_aunp Weighted average 1 vs. rest multiclass AUC
## 21: classif.mauc_aunu Average 1 vs. rest multiclass AUC
## 22: classif.mauc_mu Multiclass mu AUC
## 23: classif.mbrier Multiclass Brier Score
## 24: classif.mcc Matthews Correlation Coefficient
## 25: classif.npv Negative Predictive Value
## 26: classif.ppv Positive Predictive Value
## 27: classif.prauc Precision-Recall Curve
## 28: classif.precision Precision
## 29: classif.recall Recall
## 30: classif.sensitivity Sensitivity
## 31: classif.specificity Specificity
## 32: classif.tn True Negatives
## 33: classif.tnr True Negative Rate
## 34: classif.tp True Positives
## 35: classif.tpr True Positive Rate
## 36: debug_classif Debug Classification Measure
## 37: internal_valid_score Internal Validation Score
## 38: oob_error Out-of-bag Error
## 39: regr.bias Bias
## 40: regr.ktau Kendall's tau
## 41: regr.mae Mean Absolute Error
## 42: regr.mape Mean Absolute Percent Error
## 43: regr.maxae Max Absolute Error
## 44: regr.medae Median Absolute Error
## 45: regr.medse Median Squared Error
## 46: regr.mse Mean Squared Error
## 47: regr.msle Mean Squared Log Error
## 48: regr.pbias Percent Bias
## 49: regr.pinball <NA>
## 50: regr.rmse Root Mean Squared Error
## 51: regr.rmsle Root Mean Squared Log Error
## 52: regr.rqr <NA>
## 53: regr.rsq <NA>
## 54: regr.sae Sum of Absolute Errors
## 55: regr.smape Symmetric Mean Absolute Percent Error
## 56: regr.srho Spearman's rho
## 57: regr.sse Sum of Squared Errors
## 58: selected_features Absolute or Relative Frequency of Selected Features
## 59: sim.jaccard Jaccard Similarity Index
## 60: sim.phi Phi Coefficient Similarity
## 61: time_both Elapsed Time
## 62: time_predict Elapsed Time
## 63: time_train Elapsed Time
## key label
## task_type packages predict_type
## <char> <list> <char>
## 1: <NA> mlr3 <NA>
## 2: <NA> mlr3 <NA>
## 3: classif mlr3,mlr3measures response
## 4: classif mlr3,mlr3measures prob
## 5: classif mlr3,mlr3measures response
## 6: classif mlr3,mlr3measures prob
## 7: classif mlr3,mlr3measures response
## 8: classif mlr3 response
## 9: classif mlr3,mlr3measures response
## 10: classif mlr3,mlr3measures response
## 11: classif mlr3,mlr3measures response
## 12: classif mlr3,mlr3measures response
## 13: classif mlr3,mlr3measures response
## 14: classif mlr3,mlr3measures response
## 15: classif mlr3,mlr3measures response
## 16: classif mlr3,mlr3measures response
## 17: classif mlr3,mlr3measures prob
## 18: classif mlr3,mlr3measures prob
## 19: classif mlr3,mlr3measures prob
## 20: classif mlr3,mlr3measures prob
## 21: classif mlr3,mlr3measures prob
## 22: classif mlr3,mlr3measures prob
## 23: classif mlr3,mlr3measures prob
## 24: classif mlr3,mlr3measures response
## 25: classif mlr3,mlr3measures response
## 26: classif mlr3,mlr3measures response
## 27: classif mlr3,mlr3measures prob
## 28: classif mlr3,mlr3measures response
## 29: classif mlr3,mlr3measures response
## 30: classif mlr3,mlr3measures response
## 31: classif mlr3,mlr3measures response
## 32: classif mlr3,mlr3measures response
## 33: classif mlr3,mlr3measures response
## 34: classif mlr3,mlr3measures response
## 35: classif mlr3,mlr3measures response
## 36: <NA> mlr3 response
## 37: <NA> mlr3 <NA>
## 38: <NA> mlr3 <NA>
## 39: regr mlr3,mlr3measures response
## 40: regr mlr3,mlr3measures response
## 41: regr mlr3,mlr3measures response
## 42: regr mlr3,mlr3measures response
## 43: regr mlr3,mlr3measures response
## 44: regr mlr3,mlr3measures response
## 45: regr mlr3,mlr3measures response
## 46: regr mlr3,mlr3measures response
## 47: regr mlr3,mlr3measures response
## 48: regr mlr3,mlr3measures response
## 49: regr mlr3 quantiles
## 50: regr mlr3,mlr3measures response
## 51: regr mlr3,mlr3measures response
## 52: regr mlr3 quantiles
## 53: regr mlr3 response
## 54: regr mlr3,mlr3measures response
## 55: regr mlr3,mlr3measures response
## 56: regr mlr3,mlr3measures response
## 57: regr mlr3,mlr3measures response
## 58: <NA> mlr3 <NA>
## 59: <NA> mlr3,mlr3measures <NA>
## 60: <NA> mlr3,mlr3measures <NA>
## 61: <NA> mlr3 <NA>
## 62: <NA> mlr3 <NA>
## 63: <NA> mlr3 <NA>
## task_type packages predict_type
## properties
## <list>
## 1: na_score,requires_learner,requires_model,requires_no_prediction
## 2: na_score,requires_learner,requires_model,requires_no_prediction
## 3: weights
## 4:
## 5: weights
## 6: weights
## 7: weights
## 8: weights
## 9:
## 10:
## 11:
## 12:
## 13:
## 14:
## 15:
## 16:
## 17: weights
## 18:
## 19:
## 20:
## 21:
## 22:
## 23:
## 24:
## 25:
## 26:
## 27:
## 28:
## 29:
## 30:
## 31:
## 32:
## 33:
## 34:
## 35:
## 36: na_score
## 37: na_score,requires_learner,requires_no_prediction
## 38: na_score,requires_learner,requires_no_prediction
## 39: weights
## 40:
## 41: weights
## 42: weights
## 43:
## 44:
## 45:
## 46: weights
## 47: weights
## 48: weights
## 49:
## 50: weights
## 51: weights
## 52: [NULL]
## 53: weights
## 54:
## 55:
## 56:
## 57:
## 58: requires_task,requires_learner,requires_model,requires_no_prediction
## 59: requires_model,requires_no_prediction
## 60: requires_model,requires_no_prediction
## 61: requires_learner,requires_no_prediction
## 62: requires_learner,requires_no_prediction
## 63: requires_learner,requires_no_prediction
## properties
## task_properties
## <list>
## 1:
## 2:
## 3:
## 4: twoclass
## 5:
## 6: twoclass
## 7:
## 8:
## 9: twoclass
## 10: twoclass
## 11: twoclass
## 12: twoclass
## 13: twoclass
## 14: twoclass
## 15: twoclass
## 16: twoclass
## 17:
## 18:
## 19:
## 20:
## 21:
## 22:
## 23:
## 24:
## 25: twoclass
## 26: twoclass
## 27: twoclass
## 28: twoclass
## 29: twoclass
## 30: twoclass
## 31: twoclass
## 32: twoclass
## 33: twoclass
## 34: twoclass
## 35: twoclass
## 36:
## 37:
## 38:
## 39:
## 40:
## 41:
## 42:
## 43:
## 44:
## 45:
## 46:
## 47:
## 48:
## 49:
## 50:
## 51:
## 52:
## 53:
## 54:
## 55:
## 56:
## 57:
## 58:
## 59:
## 60:
## 61:
## 62:
## 63:
## task_properties
# ... whilst this just gives the names
mlr_measures
## <DictionaryMeasure> with 63 stored values
## Keys: aic, bic, classif.acc, classif.auc, classif.bacc, classif.bbrier,
## classif.ce, classif.costs, classif.dor, classif.fbeta, classif.fdr,
## classif.fn, classif.fnr, classif.fomr, classif.fp, classif.fpr,
## classif.logloss, classif.mauc_au1p, classif.mauc_au1u,
## classif.mauc_aunp, classif.mauc_aunu, classif.mauc_mu,
## classif.mbrier, classif.mcc, classif.npv, classif.ppv, classif.prauc,
## classif.precision, classif.recall, classif.sensitivity,
## classif.specificity, classif.tn, classif.tnr, classif.tp,
## classif.tpr, debug_classif, internal_valid_score, oob_error,
## regr.bias, regr.ktau, regr.mae, regr.mape, regr.maxae, regr.medae,
## regr.medse, regr.mse, regr.msle, regr.pbias, regr.pinball, regr.rmse,
## regr.rmsle, regr.rqr, regr.rsq, regr.sae, regr.smape, regr.srho,
## regr.sse, selected_features, sim.jaccard, sim.phi, time_both,
## time_predict, time_train
# Again, to see help on any of them, prefix the key name with mlr_measures_
?mlr_measures_classif.ce
We can request multiple of these measures at once from the resampling results:
res_baseline$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## classif.ce classif.acc classif.auc classif.fpr classif.fnr
## 0.281542 0.718458 0.500000 0.000000 1.000000
res_cart$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## classif.ce classif.acc classif.auc classif.fpr classif.fnr
## 0.23551629 0.76448371 0.73213136 0.09192862 0.60204032
When we want to fit multiple models, it is more convenient to use the benchmark function to run them all on a grid we define (indeed, we can run multiple learners on multiple tasks using multiple resampling strategies, as it takes the Cartesian product of all options):
# Use the benchmark functionality to do everything at once, ensuring identical
# settings such as task, folds, etc
res <- benchmark(
benchmark_grid(
task = list(credit_task),
learners = list(lrn_baseline,
lrn_cart),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res
##
## ── <BenchmarkResult> of 6 rows with 2 resampling run ───────────────────────────
## nr task_id learner_id resampling_id iters warnings errors
## 1 BankCredit baseline cv 3 0 0
## 2 BankCredit tree cv 3 0 0
res$aggregate()
## nr task_id learner_id resampling_id iters classif.ce
## <int> <char> <char> <char> <int> <num>
## 1: 1 BankCredit baseline cv 3 0.2815420
## 2: 2 BankCredit tree cv 3 0.2330479
## Hidden columns: resample_result
Let’s request the more interesting suite of measures from the benchmark (a lot simpler and neater than doing this call repeatedly for each model ourselves):
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id learner_id resampling_id iters classif.ce classif.acc
## <int> <char> <char> <char> <int> <num> <num>
## 1: 1 BankCredit baseline cv 3 0.2815420 0.7184580
## 2: 2 BankCredit tree cv 3 0.2330479 0.7669521
## classif.auc classif.fpr classif.fnr
## <num> <num> <num>
## 1: 0.500000 0.00000000 1.0000000
## 2: 0.711091 0.09392361 0.5870528
## Hidden columns: resample_result
We can examine the results in depth by extracting the models fitted in each fold:
# Get the trees (2nd model fitted), by asking for second set of resample
# results
trees <- res$resample_result(2)
# Then, let's look at the tree from first CV iteration, for example:
tree1 <- trees$learners[[1]]
# This is a fitted rpart object, so we can look at the model within
tree1_rpart <- tree1$model
# If you look in the rpart package documentation, it tells us how to plot the
# tree that was fitted
plot(tree1_rpart, compress = TRUE, margin = 0.1)
text(tree1_rpart, use.n = TRUE, cex = 0.8)
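If you prefer a prettier tree, the rpart.plot package offers an alternative plotting function; a quick sketch, assuming the package is installed:
# Uncomment and run the following command first if you do not have rpart.plot
# install.packages("rpart.plot")
rpart.plot::rpart.plot(tree1_rpart)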
We can see the other trees too. Change the 3 in double brackets [[]] below to other values from 1 to 3 to see the model from each round of cross validation.
# Looking at other rounds from CV
plot(res$resample_result(2)$learners[[3]]$model, compress = TRUE, margin = 0.1)
text(res$resample_result(2)$learners[[3]]$model, use.n = TRUE, cex = 0.8)
It may be that these trees need to be pruned. To do this, we would need to enable the cross-validation option of rpart in the learner. We can fit this individually and make a selection for the cost penalty (see alpha in lectures), before then setting this value when benchmarking (NOTE: this is not quite optimal, but MLR3 doesn't yet have the option for us to select this within folds … coming soon hopefully).

In particular, note we are now doing nested cross validation, which is the correct way to do parameter selection without biasing the test error. Change the 3 in double brackets [[]] to other values from 1 to 3 to see the cross validation plot from each round.
# Enable nested cross validation
lrn_cart_cv <- lrn("classif.rpart", predict_type = "prob", xval = 10, id = "tree.cv")
res_cart_cv <- resample(credit_task, lrn_cart_cv, cv, store_models = TRUE)
rpart::plotcp(res_cart_cv$learners[[3]]$model)
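The underlying complexity parameter table can also be printed, which may help when choosing a concrete value of cp:
# Print the CP table from the same fitted tree used in the plot above
rpart::printcp(res_cart_cv$learners[[3]]$model)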
Now, choose a cost penalty and add this as a model to our benchmark set:
# Try refitting with a chosen complexity parameter for pruning
lrn_cart_cp <- lrn("classif.rpart", predict_type = "prob", cp = 0.016, id = "tree.pruned")
# Then run this in the benchmark with other options
res <- benchmark(benchmark_grid(
task = list(credit_task),
learners = list(baseline = lrn_baseline,
cart = lrn_cart,
cart_prune = lrn_cart_cp),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.auc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id learner_id resampling_id iters classif.ce classif.acc
## <int> <char> <char> <char> <int> <num> <num>
## 1: 1 BankCredit baseline cv 3 0.2815433 0.7184567
## 2: 2 BankCredit tree cv 3 0.2301289 0.7698711
## 3: 3 BankCredit tree.pruned cv 3 0.2395621 0.7604379
## classif.auc classif.fpr classif.fnr
## <num> <num> <num>
## 1: 0.5000000 0.00000000 1.0000000
## 2: 0.7156031 0.09086107 0.5846605
## 3: 0.6562473 0.06504538 0.6858471
## Hidden columns: resample_result
In this case we see a slight improvement in the false-positive rate at the cost of higher errors elsewhere. These might be trade-offs you need to make in the real world.
To handle missing data and factors, we will need to introduce a modelling pipeline. In this pipeline we need to impute missing values and dummy/one-hot code factors.
Pipelines allow us to create a sophisticated workflow without having to manually code how everything ties together. To see what pipeline operations are available:
# Trying out pipelines
library("mlr3pipelines")
# Pipelines available (or use View() in RStudio for a nicer look) ...
as.data.table(mlr_pipeops)
## Key: <key>
## key
## <char>
## 1: adas
## 2: blsmote
## 3: boxcox
## 4: branch
## 5: chunk
## 6: classbalancing
## 7: classifavg
## 8: classweights
## 9: colapply
## 10: collapsefactors
## 11: colroles
## 12: copy
## 13: datefeatures
## 14: decode
## 15: encode
## 16: encodeimpact
## 17: encodelmer
## 18: encodeplquantiles
## 19: encodepltree
## 20: featureunion
## 21: filter
## 22: fixfactors
## 23: histbin
## 24: ica
## 25: imputeconstant
## 26: imputehist
## 27: imputelearner
## 28: imputemean
## 29: imputemedian
## 30: imputemode
## 31: imputeoor
## 32: imputesample
## 33: kernelpca
## 34: learner
## 35: learner_cv
## 36: learner_pi_cvplus
## 37: learner_quantiles
## 38: missind
## 39: modelmatrix
## 40: multiplicityexply
## 41: multiplicityimply
## 42: mutate
## 43: nearmiss
## 44: nmf
## 45: nop
## 46: ovrsplit
## 47: ovrunite
## 48: pca
## 49: proxy
## 50: quantilebin
## 51: randomprojection
## 52: randomresponse
## 53: regravg
## 54: removeconstants
## 55: renamecolumns
## 56: replicate
## 57: rowapply
## 58: scale
## 59: scalemaxabs
## 60: scalerange
## 61: select
## 62: smote
## 63: smotenc
## 64: spatialsign
## 65: subsample
## 66: targetinvert
## 67: targetmutate
## 68: targettrafoscalerange
## 69: textvectorizer
## 70: threshold
## 71: tomek
## 72: tunethreshold
## 73: unbranch
## 74: vtreat
## 75: yeojohnson
## key
## label
## <char>
## 1: ADAS Balancing
## 2: BLSMOTE Balancing
## 3: Box-Cox Transformation of Numeric Features
## 4: Path Branching
## 5: Chunk Input into Multiple Outputs
## 6: Class Balancing
## 7: Majority Vote Prediction
## 8: Class Weights for Sample Weighting
## 9: Apply a Function to each Column of a Task
## 10: Collapse Factors
## 11: Change Column Roles of a Task
## 12: Copy Input Multiple Times
## 13: Preprocess Date Features
## 14: Reverse Factor Encoding
## 15: Factor Encoding
## 16: Conditional Target Value Impact Encoding
## 17: Impact Encoding with Random Intercept Models
## 18: Piecewise Linear Encoding using Quantiles
## 19: Piecewise Linear Encoding using Decision Trees
## 20: Aggregate Features from Multiple Inputs
## 21: Feature Filtering
## 22: Fix Factor Levels
## 23: Split Numeric Features into Equally Spaced Bins
## 24: Independent Component Analysis
## 25: Impute Features by a Constant
## 26: Impute Numerical Features by Histogram
## 27: Impute Features by Fitting a Learner
## 28: Impute Numerical Features by their Mean
## 29: Impute Numerical Features by their Median
## 30: Impute Features by their Mode
## 31: Out of Range Imputation
## 32: Impute Features by Sampling
## 33: Kernelized Principal Component Analysis
## 34: Wrap a Learner into a PipeOp
## 35: Wrap a Learner into a PipeOp with Cross-validated Predictions as Features
## 36: Wrap a Learner into a PipeOp with Cross-validation Plus Confidence Intervals as Predictions
## 37: Wrap a Learner into a PipeOp to to predict multiple Quantiles
## 38: Add Missing Indicator Columns
## 39: Transform Columns by Constructing a Model Matrix
## 40: Explicate a Multiplicity
## 41: Implicate a Multiplicity
## 42: Add Features According to Expressions
## 43: Nearmiss Down-Sampling
## 44: Non-negative Matrix Factorization
## 45: Simply Push Input Forward
## 46: Split a Classification Task into Binary Classification Tasks
## 47: Unite Binary Classification Tasks
## 48: Principle Component Analysis
## 49: Wrap another PipeOp or Graph as a Hyperparameter
## 50: Split Numeric Features into Quantile Bins
## 51: Project Numeric Features onto a Randomly Sampled Subspace
## 52: Generate a Randomized Response Prediction
## 53: Weighted Prediction Averaging
## 54: Remove Constant Features
## 55: Rename Columns
## 56: Replicate the Input as a Multiplicity
## 57: Apply a Function to each Row of a Task
## 58: Center and Scale Numeric Features
## 59: Scale Numeric Features with Respect to their Maximum Absolute Value
## 60: Linearly Transform Numeric Features to Match Given Boundaries
## 61: Remove Features Depending on a Selector
## 62: SMOTE Balancing
## 63: SMOTENC Balancing
## 64: Normalize Data Row-wise
## 65: Subsampling
## 66: Invert Target Transformations
## 67: Transform a Target by a Function
## 68: Linearly Transform a Numeric Target to Match Given Boundaries
## 69: Bag-of-word Representation of Character Features
## 70: Change the Threshold of a Classification Prediction
## 71: Tomek Down-Sampling
## 72: Tune the Threshold of a Classification Prediction
## 73: Unbranch Different Paths
## 74: Interface to the vtreat Package
## 75: Yeo-Johnson Transformation of Numeric Features
## label
## packages tags
## <list> <list>
## 1: mlr3pipelines,smotefamily imbalanced data,data transform
## 2: mlr3pipelines,smotefamily imbalanced data,data transform
## 3: mlr3pipelines,bestNormalize data transform
## 4: mlr3pipelines meta
## 5: mlr3pipelines meta
## 6: mlr3pipelines imbalanced data,data transform
## 7: mlr3pipelines,stats ensemble
## 8: mlr3pipelines imbalanced data,data transform
## 9: mlr3pipelines data transform
## 10: mlr3pipelines data transform
## 11: mlr3pipelines data transform
## 12: mlr3pipelines meta
## 13: mlr3pipelines data transform
## 14: mlr3pipelines encode,data transform
## 15: mlr3pipelines,stats encode,data transform
## 16: mlr3pipelines encode,data transform
## 17: mlr3pipelines,lme4,nloptr encode,data transform
## 18: mlr3pipelines,stats encode,data transform
## 19: mlr3pipelines,mlr3,rpart encode,data transform
## 20: mlr3pipelines ensemble
## 21: mlr3pipelines feature selection,data transform
## 22: mlr3pipelines robustify,data transform
## 23: mlr3pipelines,graphics data transform
## 24: mlr3pipelines,fastICA data transform
## 25: mlr3pipelines missings
## 26: mlr3pipelines,graphics missings
## 27: mlr3pipelines missings
## 28: mlr3pipelines missings
## 29: mlr3pipelines,stats missings
## 30: mlr3pipelines missings
## 31: mlr3pipelines missings
## 32: mlr3pipelines missings
## 33: mlr3pipelines,kernlab data transform
## 34: mlr3pipelines learner
## 35: mlr3pipelines learner,ensemble,data transform
## 36: mlr3pipelines learner,ensemble
## 37: mlr3pipelines learner,ensemble
## 38: mlr3pipelines missings,data transform
## 39: mlr3pipelines,stats data transform
## 40: mlr3pipelines multiplicity
## 41: mlr3pipelines multiplicity
## 42: mlr3pipelines data transform
## 43: mlr3pipelines,themis imbalanced data,data transform
## 44: mlr3pipelines,MASS,NMF data transform
## 45: mlr3pipelines meta
## 46: mlr3pipelines target transform,multiplicity
## 47: mlr3pipelines multiplicity,ensemble
## 48: mlr3pipelines data transform
## 49: mlr3pipelines meta
## 50: mlr3pipelines,stats data transform
## 51: mlr3pipelines data transform
## 52: mlr3pipelines abstract
## 53: mlr3pipelines ensemble
## 54: mlr3pipelines robustify,data transform
## 55: mlr3pipelines data transform
## 56: mlr3pipelines multiplicity
## 57: mlr3pipelines data transform
## 58: mlr3pipelines data transform
## 59: mlr3pipelines data transform
## 60: mlr3pipelines data transform
## 61: mlr3pipelines feature selection,data transform
## 62: mlr3pipelines,smotefamily imbalanced data,data transform
## 63: mlr3pipelines,themis imbalanced data,data transform
## 64: mlr3pipelines data transform
## 65: mlr3pipelines data transform
## 66: mlr3pipelines target transform
## 67: mlr3pipelines target transform
## 68: mlr3pipelines target transform
## 69: mlr3pipelines,quanteda,stopwords data transform
## 70: mlr3pipelines target transform
## 71: mlr3pipelines,themis imbalanced data,data transform
## 72: mlr3pipelines,bbotk target transform
## 73: mlr3pipelines meta
## 74: mlr3pipelines,vtreat encode,missings,data transform
## 75: mlr3pipelines,bestNormalize data transform
## packages tags
## feature_types input.num output.num
## <list> <int> <int>
## 1: logical,integer,numeric,character,factor,ordered,... 1 1
## 2: logical,integer,numeric,character,factor,ordered,... 1 1
## 3: numeric,integer 1 1
## 4: NA 1 NA
## 5: NA 1 NA
## 6: logical,integer,numeric,character,factor,ordered,... 1 1
## 7: NA NA 1
## 8: logical,integer,numeric,character,factor,ordered,... 1 1
## 9: logical,integer,numeric,character,factor,ordered,... 1 1
## 10: factor,ordered 1 1
## 11: logical,integer,numeric,character,factor,ordered,... 1 1
## 12: NA 1 NA
## 13: POSIXct 1 1
## 14: integer,numeric 1 1
## 15: factor,ordered 1 1
## 16: factor,ordered 1 1
## 17: factor,ordered 1 1
## 18: numeric,integer 1 1
## 19: numeric,integer 1 1
## 20: NA NA 1
## 21: logical,integer,numeric,character,factor,ordered,... 1 1
## 22: factor,ordered 1 1
## 23: numeric,integer 1 1
## 24: numeric,integer 1 1
## 25: logical,integer,numeric,character,factor,ordered,... 1 1
## 26: integer,numeric 1 1
## 27: logical,factor,ordered 1 1
## 28: numeric,integer 1 1
## 29: numeric,integer 1 1
## 30: factor,integer,logical,numeric,ordered 1 1
## 31: character,factor,integer,numeric,ordered 1 1
## 32: factor,integer,logical,numeric,ordered 1 1
## 33: numeric,integer 1 1
## 34: NA 1 1
## 35: logical,integer,numeric,character,factor,ordered,... 1 1
## 36: NA 1 1
## 37: NA 1 1
## 38: logical,integer,numeric,character,factor,ordered,... 1 1
## 39: logical,integer,numeric,character,factor,ordered,... 1 1
## 40: NA 1 NA
## 41: NA NA 1
## 42: logical,integer,numeric,character,factor,ordered,... 1 1
## 43: logical,integer,numeric,character,factor,ordered,... 1 1
## 44: numeric,integer 1 1
## 45: NA 1 1
## 46: NA 1 1
## 47: NA 1 1
## 48: numeric,integer 1 1
## 49: NA NA 1
## 50: numeric,integer 1 1
## 51: numeric,integer 1 1
## 52: NA 1 1
## 53: NA NA 1
## 54: logical,integer,numeric,character,factor,ordered,... 1 1
## 55: logical,integer,numeric,character,factor,ordered,... 1 1
## 56: NA 1 1
## 57: numeric,integer 1 1
## 58: numeric,integer 1 1
## 59: numeric,integer 1 1
## 60: numeric,integer 1 1
## 61: logical,integer,numeric,character,factor,ordered,... 1 1
## 62: logical,integer,numeric,character,factor,ordered,... 1 1
## 63: logical,integer,numeric,character,factor,ordered,... 1 1
## 64: numeric,integer 1 1
## 65: logical,integer,numeric,character,factor,ordered,... 1 1
## 66: NA 2 1
## 67: NA 1 2
## 68: NA 1 2
## 69: character 1 1
## 70: NA 1 1
## 71: logical,integer,numeric,character,factor,ordered,... 1 1
## 72: NA 1 1
## 73: NA NA 1
## 74: logical,integer,numeric,character,factor,ordered,... 1 1
## 75: numeric,integer 1 1
## feature_types input.num output.num
## input.type.train input.type.predict output.type.train output.type.predict
## <list> <list> <list> <list>
## 1: TaskClassif TaskClassif TaskClassif TaskClassif
## 2: TaskClassif TaskClassif TaskClassif TaskClassif
## 3: Task Task Task Task
## 4: * * * *
## 5: Task Task Task Task
## 6: TaskClassif TaskClassif TaskClassif TaskClassif
## 7: NULL PredictionClassif NULL PredictionClassif
## 8: TaskClassif TaskClassif TaskClassif TaskClassif
## 9: Task Task Task Task
## 10: Task Task Task Task
## 11: Task Task Task Task
## 12: * * * *
## 13: Task Task Task Task
## 14: Task Task Task Task
## 15: Task Task Task Task
## 16: TaskSupervised TaskSupervised TaskSupervised TaskSupervised
## 17: TaskSupervised TaskSupervised TaskSupervised TaskSupervised
## 18: Task Task Task Task
## 19: TaskClassif TaskClassif TaskClassif TaskClassif
## 20: Task Task Task Task
## 21: Task Task Task Task
## 22: Task Task Task Task
## 23: Task Task Task Task
## 24: Task Task Task Task
## 25: Task Task Task Task
## 26: Task Task Task Task
## 27: Task Task Task Task
## 28: Task Task Task Task
## 29: Task Task Task Task
## 30: Task Task Task Task
## 31: Task Task Task Task
## 32: Task Task Task Task
## 33: Task Task Task Task
## 34: TaskClassif TaskClassif NULL PredictionClassif
## 35: TaskClassif TaskClassif TaskClassif TaskClassif
## 36: TaskRegr TaskRegr NULL PredictionRegr
## 37: TaskRegr TaskRegr NULL PredictionRegr
## 38: Task Task Task Task
## 39: Task Task Task Task
## 40: [*] [*] * *
## 41: * * [*] [*]
## 42: Task Task Task Task
## 43: TaskClassif TaskClassif TaskClassif TaskClassif
## 44: Task Task Task Task
## 45: * * * *
## 46: TaskClassif TaskClassif [TaskClassif] [TaskClassif]
## 47: [NULL] [PredictionClassif] NULL PredictionClassif
## 48: Task Task Task Task
## 49: * * * *
## 50: Task Task Task Task
## 51: Task Task Task Task
## 52: NULL Prediction NULL Prediction
## 53: NULL PredictionRegr NULL PredictionRegr
## 54: Task Task Task Task
## 55: Task Task Task Task
## 56: * * [*] [*]
## 57: Task Task Task Task
## 58: Task Task Task Task
## 59: Task Task Task Task
## 60: Task Task Task Task
## 61: Task Task Task Task
## 62: TaskClassif TaskClassif TaskClassif TaskClassif
## 63: TaskClassif TaskClassif TaskClassif TaskClassif
## 64: Task Task Task Task
## 65: Task Task Task Task
## 66: NULL,NULL function,Prediction NULL Prediction
## 67: Task Task NULL,Task function,Task
## 68: TaskRegr TaskRegr NULL,TaskRegr function,TaskRegr
## 69: Task Task Task Task
## 70: NULL PredictionClassif NULL PredictionClassif
## 71: TaskClassif TaskClassif TaskClassif TaskClassif
## 72: Task Task NULL Prediction
## 73: * * * *
## 74: TaskSupervised TaskSupervised TaskSupervised TaskSupervised
## 75: Task Task Task Task
## input.type.train input.type.predict output.type.train output.type.predict
# ... whilst this just gives the names
mlr_pipeops
## <DictionaryPipeOp> with 75 stored values
## Keys: adas, blsmote, boxcox, branch, chunk, classbalancing, classifavg,
## classweights, colapply, collapsefactors, colroles, copy,
## datefeatures, decode, encode, encodeimpact, encodelmer,
## encodeplquantiles, encodepltree, featureunion, filter, fixfactors,
## histbin, ica, imputeconstant, imputehist, imputelearner, imputemean,
## imputemedian, imputemode, imputeoor, imputesample, kernelpca,
## learner, learner_cv, learner_pi_cvplus, learner_quantiles, missind,
## modelmatrix, multiplicityexply, multiplicityimply, mutate, nearmiss,
## nmf, nop, ovrsplit, ovrunite, pca, proxy, quantilebin,
## randomprojection, randomresponse, regravg, removeconstants,
## renamecolumns, replicate, rowapply, scale, scalemaxabs, scalerange,
## select, smote, smotenc, spatialsign, subsample, targetinvert,
## targetmutate, targettrafoscalerange, textvectorizer, threshold,
## tomek, tunethreshold, unbranch, vtreat, yeojohnson
# Again, to see help on any of them, prefix the key name with mlr_pipeops_
?mlr_pipeops_encode
So we can see the encode pipeline operation can do one-hot encoding of factors. We'll do this first. XGBoost, which does gradient boosting, doesn't accept factors (look back at the learners table earlier), so we now create a pipeline operation to encode them before passing on to the learner. The function po() creates pipeline operations and %>>% connects the steps.
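Before building the full pipeline, here is a minimal sketch of what the encode operation does when applied on its own to our task (PipeOps are trained on a list of inputs and return a list of outputs):
# Train the encode PipeOp alone and inspect the resulting feature names:
# each factor is expanded into one indicator column per level
poe <- po("encode", method = "one-hot")
task_encoded <- poe$train(list(credit_task))[[1]]
task_encoded$feature_names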
# Uncomment and run the following command first if you do not have the xgboost package
# install.packages("xgboost")
# Create a pipeline which encodes and then fits an XGBoost model
lrn_xgboost <- lrn("classif.xgboost", predict_type = "prob", id = "gradient.boosting")
pl_xgb <- po("encode") %>>%
po(lrn_xgboost)
# Now fit as normal ... we can just add it to our benchmark set
res <- benchmark(benchmark_grid(
task = list(credit_task),
learners = list(lrn_baseline,
lrn_cart,
lrn_cart_cp,
pl_xgb),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id learner_id resampling_id iters classif.ce
## <int> <char> <char> <char> <int> <num>
## 1: 1 BankCredit baseline cv 3 0.2815447
## 2: 2 BankCredit tree cv 3 0.2287847
## 3: 3 BankCredit tree.pruned cv 3 0.2343980
## 4: 4 BankCredit encode.gradient.boosting cv 3 0.2231697
## classif.acc classif.fpr classif.fnr
## <num> <num> <num>
## 1: 0.7184553 0.00000000 1.0000000
## 2: 0.7712153 0.08656565 0.5918438
## 3: 0.7656020 0.08156215 0.6246723
## 4: 0.7768303 0.12716639 0.4680833
## Hidden columns: resample_result
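The pipeline we built is a Graph object, so its structure can be printed or plotted as a check (plotting may require supporting packages such as igraph to render):
# Print the graph, showing the encode step feeding the learner
pl_xgb
# Visualise the pipeline as a graph
pl_xgb$plot()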
Handling missingness is slightly more involved. We provide a pipeline recipe here which is quite robust … read the documentation of each step to understand more.
We then apply this to logistic regression.
# First create a pipeline of just missing fixes we can later use with models
pl_missing <- po("fixfactors") %>>%
po("removeconstants") %>>%
po("imputesample", affect_columns = selector_type(c("ordered", "factor"))) %>>%
po("imputemean")
# Now try with a model that needs no missingness
lrn_log_reg <- lrn("classif.log_reg", predict_type = "prob", id = "logistic.regression")
pl_log_reg <- pl_missing %>>%
po(lrn_log_reg)
# Now fit as normal ... we can just add it to our benchmark set
res <- benchmark(benchmark_grid(
task = list(credit_task),
learners = list(lrn_baseline,
lrn_cart,
lrn_cart_cp,
pl_xgb,
pl_log_reg),
resamplings = list(rsmp("cv", folds = 3))
), store_models = TRUE)
res$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.fpr"),
msr("classif.fnr")))
## nr task_id
## <int> <char>
## 1: 1 BankCredit
## 2: 2 BankCredit
## 3: 3 BankCredit
## 4: 4 BankCredit
## 5: 5 BankCredit
## learner_id
## <char>
## 1: baseline
## 2: tree
## 3: tree.pruned
## 4: encode.gradient.boosting
## 5: fixfactors.removeconstants.imputesample.imputemean.logistic.regression
## resampling_id iters classif.ce classif.acc classif.fpr classif.fnr
## <char> <int> <num> <num> <num> <num>
## 1: cv 3 0.2815420 0.7184580 0.00000000 1.0000000
## 2: cv 3 0.2341678 0.7658322 0.10479843 0.5649245
## 3: cv 3 0.2393319 0.7606681 0.08139514 0.6434153
## 4: cv 3 0.2296776 0.7703224 0.12315002 0.5011726
## 5: cv 3 0.2056540 0.7943460 0.08663037 0.5089951
## Hidden columns: resample_result
Rather than having to choose among the models that we fitted above, we could instead fit all of them and then fit a final "super learner" which automatically learns how best to combine the predictions of the available base learners. We can do this using the pipelines in MLR3 …
We start from scratch to make this more advanced example self-contained.
library("mlr3verse")
set.seed(212) # set seed for reproducibility
# Load data
data("credit_data", package = "modeldata")
# Define task
credit_task <- TaskClassif$new(id = "BankCredit",
backend = credit_data,
target = "Status",
positive = "bad")
# Cross validation resampling strategy
cv5 <- rsmp("cv", folds = 5)
cv5$instantiate(credit_task)
# Define a collection of base learners
lrn_baseline <- lrn("classif.featureless", predict_type = "prob", id = "baseline")
lrn_cart <- lrn("classif.rpart", predict_type = "prob", id = "tree")
lrn_cart_cp <- lrn("classif.rpart", predict_type = "prob", cp = 0.016, id = "tree.pruned")
lrn_ranger <- lrn("classif.ranger", predict_type = "prob", id = "random.forest")
lrn_xgboost <- lrn("classif.xgboost", predict_type = "prob", id = "gradient.boosting")
lrn_log_reg <- lrn("classif.log_reg", predict_type = "prob", id = "logistic.regression")
# Define a super learner
lrnsp_log_reg <- lrn("classif.log_reg", predict_type = "prob", id = "super.learner")
# Missingness imputation pipeline
pl_missing <- po("fixfactors") %>>%
po("removeconstants") %>>%
po("imputesample", affect_columns = selector_type(c("ordered", "factor"))) %>>%
po("imputemean")
# Factors coding pipeline
pl_factor <- po("encode")
# Now define the full pipeline
spr_lrn <- gunion(list(
# First group of learners requiring no modification to input
gunion(list(
po("learner_cv", lrn_baseline),
po("learner_cv", lrn_cart),
po("learner_cv", lrn_cart_cp)
)),
# Next group of learners requiring special treatment of missingness
pl_missing %>>%
gunion(list(
po("learner_cv", lrn_ranger),
po("learner_cv", lrn_log_reg),
po("nop") # This passes through the original features adjusted for
# missingness to the super learner
)),
# Last group needing factor encoding
pl_factor %>>%
po("learner_cv", lrn_xgboost)
)) %>>%
po("featureunion") %>>%
po(lrnsp_log_reg)
# This plot shows a graph of the learning pipeline
spr_lrn$plot()
# Finally fit the base learners and super learner and evaluate
res_spr <- resample(credit_task, spr_lrn, cv5, store_models = TRUE)
res_spr$aggregate(list(msr("classif.ce"),
msr("classif.acc"),
msr("classif.fpr"),
msr("classif.fnr")))
## classif.ce classif.acc classif.fpr classif.fnr
## 0.20251706 0.79748294 0.09811216 0.46966500
You will note these are the best results achieved of all the learners (except on the false-positive rate), albeit that this is by far the most complicated model.
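If you are curious how the super learner combines the base learners, the fitted GraphLearner in each fold stores the state of every PipeOp keyed by its id. A sketch for the first fold follows; the exact structure of $model can vary between mlr3pipelines versions, so treat this as an assumption to verify against the documentation:
# "super.learner" matches the id we gave lrnsp_log_reg above; its state
# contains the fitted logistic regression combining the base learner predictions
fold1 <- res_spr$learners[[1]]
summary(fold1$model$super.learner$model)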