C5.0 MAP-X models.
Usage
Xmodel.tree.C5.0(
data = NULL,
costs = NA,
winnowing = FALSE,
noGlobalPruning = FALSE,
CF = 0.3,
boost = 1,
downsample = NA
)
Arguments
- data
Data frame with pairwise differential values. Must contain columns "protein1" and "protein2", any number of columns with predictors for modelling and a labels column with values 1 (for protein pairs that form a complex) and 0 (for protein pairs that do not form a complex).
- costs
Integer: How many times higher the cost of falsely predicting a non-interacting protein pair as interacting is than the cost of the reverse error. Default is NA.
- CF
Numeric vector: Complexity parameters to be cross-validated. Default is 0.3.
- downsample
Integer: Factor by which the number of non-interacting protein pairs used for training is reduced. Default is NA. Applicable for C5.0 models.
- eval.metric
Character string: How should the model be evaluated in cross-validation? Default is "prc" for area under the precision-recall curve. Other options are "roc" for area under the receiver operating characteristic curve and "kappa" for Cohen's kappa.
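Examples
A minimal sketch of preparing input in the layout described for data and passing it to the function. The predictor names (diff_a, diff_b), the column name "labels", the toy values and the choice costs = 2 are assumptions made for illustration only; they do not come from the package. The call is guarded so that the snippet also runs when the function is not loaded.

```r
# Toy input in the layout described for `data` (all values are made up).
set.seed(1)
n <- 40
toy <- data.frame(
  protein1 = paste0("P", sample(1:20, n, replace = TRUE)),
  protein2 = paste0("Q", sample(1:20, n, replace = TRUE)),
  diff_a   = rnorm(n),                         # hypothetical predictor
  diff_b   = rnorm(n),                         # hypothetical predictor
  labels   = rep(c(1, 0), times = c(10, 30)),  # 1 = complex, 0 = no complex
  stringsAsFactors = FALSE
)

# Check the documented requirements before fitting.
required <- c("protein1", "protein2", "labels")
stopifnot(all(required %in% names(toy)), all(toy$labels %in% c(0, 1)))

# Fit only if the function is available in the current session.
if (exists("Xmodel.tree.C5.0", mode = "function")) {
  fit <- Xmodel.tree.C5.0(
    data = toy,
    costs = 2,        # assumed value: false positives cost twice as much
    CF = 0.3,         # default from the Usage signature
    boost = 1,        # default from the Usage signature
    downsample = NA   # default from the Usage signature
  )
}
```

Checking the required columns up front is a cheap way to catch a malformed data frame before the model fit is attempted.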