
This is a helper function that constructs a default Surrogate based on properties of the bbotk::OptimInstance.

For numeric-only (including integers) parameter spaces without any dependencies, a Gaussian Process is constructed via default_gp(). For mixed numeric-categorical parameter spaces, or spaces with conditional parameters, a random forest is constructed via default_rf().

In any case, learners are encapsulated using "evaluate", and a fallback learner is set in case the surrogate learner errors. Currently, the following learner is used as a fallback: lrn("regr.ranger", num.trees = 10L, keep.inbag = TRUE, se.method = "jack").
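The encapsulation and fallback described above can be set up by hand roughly as follows. This is a minimal sketch, not the package's internal code: it assumes a recent mlr3 release in which learners have an `$encapsulate()` method (older releases set the `encapsulate` and `fallback` fields directly), that mlr3learners is installed, and it uses a ranger learner with 100 trees as a stand-in for the surrogate learner.

```r
library(mlr3)
library(mlr3learners)  # provides "regr.ranger"

# Stand-in surrogate learner (an assumption for illustration only).
learner = lrn("regr.ranger", num.trees = 100L, keep.inbag = TRUE,
  se.method = "jack", predict_type = "se")

# Fallback learner with the settings quoted in the text above.
fallback = lrn("regr.ranger", num.trees = 10L, keep.inbag = TRUE,
  se.method = "jack", predict_type = "se")

# Errors raised during train/predict are caught via "evaluate";
# the fallback learner is then used to produce predictions instead.
learner$encapsulate("evaluate", fallback = fallback)

print(learner$encapsulation)
```

Both learners are given the same predict type so that predictions from the fallback have the same shape as those of the primary learner.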

If dependencies are additionally present in the parameter space, inactive conditional parameters are represented by missing (NA) values in the training design data. These are handled by imputation steps added in front of the random forest: po("imputesample") (for logicals) and po("imputeoor") (for anything else) from package mlr3pipelines. Characters are always encoded as factors via po("colapply"). Out of range imputation makes sense for tree-based methods and is usually hard to beat, see Ding and Simonoff (2010). In the case of dependencies, the following learner is used as a fallback: lrn("regr.featureless").
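A pipeline of this shape could be assembled with mlr3pipelines as sketched below. The choice of `affect_columns` selectors and the number of trees are assumptions made for illustration; the learner actually built by default_surrogate() may differ in its details.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3learners)  # provides "regr.ranger"

# Random forest with jackknife standard errors (tree count is an assumption).
rf = lrn("regr.ranger", num.trees = 100L, keep.inbag = TRUE,
  se.method = "jack", predict_type = "se")

graph =
  # Logicals: impute by sampling from the observed values.
  po("imputesample", affect_columns = selector_type("logical")) %>>%
  # Everything else: out of range imputation.
  po("imputeoor", affect_columns = selector_type(
    c("integer", "numeric", "character", "factor", "ordered"))) %>>%
  # Characters are encoded as factors.
  po("colapply", applicator = as.factor,
    affect_columns = selector_type("character")) %>>%
  rf

# Wrap the graph so it can be used like any other learner.
learner = as_learner(graph)
```

The resulting GraphLearner can be passed to default_surrogate() through its learner argument.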

If n_learner is 1, the learner is wrapped as a SurrogateLearner. Otherwise, if n_learner is larger than 1, multiple deep clones of the learner are wrapped as a SurrogateLearnerCollection.
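A short usage sketch of the behaviour described above. It assumes bbotk 1.0.0 or later (for the class name OptimInstanceBatchSingleCrit) and that mlr3learners is installed. The objective function, the search space, and the choice `n_learner = 2L` are made up for illustration.

```r
library(bbotk)
library(paradox)
library(mlr3mbo)

# Hypothetical objective on a mixed numeric-categorical space.
fun = function(xs) list(y = xs$x^2 + as.numeric(xs$z == "a"))
domain = ps(x = p_dbl(lower = -5, upper = 5), z = p_fct(c("a", "b")))
codomain = ps(y = p_dbl(tags = "minimize"))
objective = ObjectiveRFun$new(fun = fun, domain = domain, codomain = codomain)

instance = OptimInstanceBatchSingleCrit$new(
  objective = objective,
  terminator = trm("evals", n_evals = 5L)
)

# One objective, so a single learner is wrapped as a SurrogateLearner.
surrogate = default_surrogate(instance)
print(class(surrogate)[1L])

# Requesting more than one learner yields a SurrogateLearnerCollection.
collection = default_surrogate(instance, n_learner = 2L)
print(class(collection)[1L])
```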

Usage

default_surrogate(
  instance,
  learner = NULL,
  n_learner = NULL,
  force_random_forest = FALSE
)

Arguments

instance

(bbotk::OptimInstance)
An object that inherits from bbotk::OptimInstance.

learner

(NULL | mlr3::Learner). If specified, this learner will be used instead of the defaults described above.

n_learner

(NULL | integer(1)). Number of learners to be considered in the construction of the Surrogate. If not specified, it is based on the number of objectives as stated by the instance.

force_random_forest

(logical(1)). If TRUE, a random forest is constructed even if the parameter space is numeric-only.

Value

Surrogate

References

  • Ding, Yufeng, Simonoff, Jeffrey S. (2010). “An Investigation of Missing Data Methods for Classification Trees Applied to Binary Response Data.” Journal of Machine Learning Research, 11(1), 131–170.