Train a classification tree model with a cross-training algorithm.

Usage

X.crosstrain.tree(
  data = NULL,
  standard.set = NULL,
  train.cycles = 25,
  train.split = 4,
  labels.col = NA,
  tree.type = "RF",
  evaluate = FALSE,
  eval.metric = "prc",
  plot = TRUE,
  train.pars = list(),
  save.dir = "models",
  save.output = c("models", "plots")
)

Arguments

data

Data frame with all data from an experiment. Each row corresponds to a unique pair of proteins, and the columns must include all predictors that are in the standard set.

standard.set

Data frame with predictors for pairs of proteins and with labels.
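Since data must contain every predictor found in the standard set, it may help to check the column names before calling the function. The following is a minimal base-R sketch on made-up data frames; the column names pred1, pred2 and label are hypothetical, and it assumes the label column is the last column of the standard set.

```r
# Toy data frames for illustration only; column names are hypothetical.
standard.set <- data.frame(pred1 = c(0.1, 0.9), pred2 = c(0.4, 0.7), label = c(0, 1))
data <- data.frame(pred1 = c(0.2, 0.5, 0.8), pred2 = c(0.3, 0.6, 0.9))

# Assumption: the last column of the standard set holds the labels.
predictor.cols <- names(standard.set)[-ncol(standard.set)]

# Predictors present in the standard set but absent from data.
missing.cols <- setdiff(predictor.cols, names(data))
if (length(missing.cols) > 0) {
  stop("data lacks predictor columns: ", paste(missing.cols, collapse = ", "))
}
```

With the toy inputs above no column is missing, so the check passes silently.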

train.cycles

Integer: How many times should the training be repeated? Default is 25.

train.split

Integer: Into how many individual training sets should the data be split in each training cycle? Default is 4.
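One way to picture how train.cycles and train.split interact is sketched below in base R. This is only an illustration under the assumption that each cycle reshuffles the rows and deals them into train.split groups; the package's actual splitting procedure is not described on this page and may differ. The row count n.rows is a made-up value.

```r
set.seed(1)            # reproducible toy example
n.rows <- 12           # hypothetical number of labelled protein pairs
train.cycles <- 2
train.split <- 4

# For each cycle, shuffle the row indices and deal them into train.split groups.
splits <- lapply(seq_len(train.cycles), function(cycle) {
  shuffled <- sample(n.rows)
  groups <- rep(seq_len(train.split), length.out = n.rows)
  split(shuffled, groups)
})

# Each cycle yields train.split index sets, so under this reading there are
# train.cycles * train.split training sets in total.
n.sets <- sum(lengths(splits))
```

Under this reading, the defaults (25 cycles, 4 splits) would produce 100 training sets, which is worth keeping in mind when choosing slower tree types.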

labels.col

Character string: 'data' column name with labels (1 for complex-forming, 0 for others). Default is NA, in which case the last column will be considered as the column with labels.
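The fallback to the last column can be sketched as follows. resolve.labels.col is a hypothetical helper written for illustration and is not part of the package; the toy column names are likewise made up.

```r
# Hypothetical helper, not part of the package.
resolve.labels.col <- function(df, labels.col = NA) {
  if (is.na(labels.col)) {
    return(names(df)[ncol(df)])   # fall back to the last column
  }
  if (!labels.col %in% names(df)) {
    stop("column '", labels.col, "' not found")
  }
  labels.col
}

toy <- data.frame(pred1 = c(0.1, 0.9), complex = c(0, 1))
default.col <- resolve.labels.col(toy)              # uses the last column
explicit.col <- resolve.labels.col(toy, "complex")  # uses the named column
```

Passing the column name explicitly is the safer choice when the label column is not guaranteed to be last.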

tree.type

Character string: What type of classification tree should be used for the model training? Options are "J48", "CART", "PART", "C5.0", "RF". Default is "RF".

evaluate

Logical: Should each model be evaluated? Default is FALSE. If TRUE, evaluation plots will be saved in the save.dir and a table with evaluation metrics will be outputted.

eval.metric

Character string: What type of evaluation metric should be used for evaluation? Options as in X.evaluate. Default is "prc".

plot

Logical: If evaluate is TRUE, should the plots be saved? Default is TRUE.

train.pars

Named list: Tuning parameters for classification tree training. Default is list(), in which case the default tuning parameters as specified in the arguments of the function X.model.tree will be used.

save.dir

Character string: Name of the folder to be used to save the models. Default is "models".

Value

A list with three elements: $data with all prediction scores, $metric.data with the underlying data used to calculate the evaluation metrics, and $metric.plots with the metric plots.

Examples

# all.predictors, GS_specific, best.train.pars, tp and rep must already exist
cross.model <- X.crosstrain.tree(data = all.predictors[[tp]][[rep]],
        standard.set = GS_specific, train.pars = best.train.pars,
        evaluate = TRUE, plot = TRUE,
        train.split = 3, train.cycles = 10)
#> Error: object 'all.predictors' not found