h2o.psvm: Trains a Support Vector Machine model on an H2O dataset

Description

Alpha version. Supports only binomial classification problems.

Usage

h2o.psvm(
  x,
  y,
  training_frame,
  model_id = NULL,
  validation_frame = NULL,
  ignore_const_cols = TRUE,
  hyper_param = 1,
  kernel_type = c("gaussian"),
  gamma = -1,
  rank_ratio = -1,
  positive_weight = 1,
  negative_weight = 1,
  disable_training_metrics = TRUE,
  sv_threshold = 1e-04,
  fact_threshold = 1e-05,
  feasible_threshold = 0.001,
  surrogate_gap_threshold = 0.001,
  mu_factor = 10,
  max_iterations = 200,
  seed = -1
)

Arguments

x: (Optional) A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except y are used.
y: The name or column index of the response variable in the data. The response must be either a binary categorical/factor variable or a numeric variable with values -1/1 (for compatibility with SVMlight format).
training_frame: Id of the training data frame.
model_id: Destination id for this model; auto-generated if not specified.
validation_frame: Id of the validation data frame.
ignore_const_cols: Logical. Ignore constant columns. Defaults to TRUE.
hyper_param: Penalty parameter C of the error term. Defaults to 1.
kernel_type: Type of kernel to use. Must be one of: "gaussian". Defaults to gaussian.
gamma: Coefficient of the kernel (currently the RBF gamma for the gaussian kernel; -1 means 1/#features). Defaults to -1.
rank_ratio: Desired rank of the ICF matrix, expressed as a ratio of the number of input rows (-1 means use sqrt(#rows)). Defaults to -1.
positive_weight: Weight of the positive (+1) class of observations. Defaults to 1.
negative_weight: Weight of the negative (-1) class of observations. Defaults to 1.
disable_training_metrics: Logical. Disable calculating training metrics (expensive on large datasets). Defaults to TRUE.
sv_threshold: Threshold for accepting a candidate observation into the set of support vectors. Defaults to 0.0001.
fact_threshold: Convergence threshold of the Incomplete Cholesky Factorization (ICF). Defaults to 1e-05.
feasible_threshold: Convergence threshold for primal-dual residuals in the IPM iteration. Defaults to 0.001.
surrogate_gap_threshold: Feasibility criterion of the surrogate duality gap (eta). Defaults to 0.001.
mu_factor: Increasing factor mu. Defaults to 10.
max_iterations: Maximum number of iterations of the algorithm. Defaults to 200.
seed: Seed for the random number generator. It affects only the stochastic parts of the algorithm, which may or may not be enabled by default. Defaults to -1 (time-based random number).
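
The -1 sentinels for gamma and rank_ratio can be made concrete with a small base R sketch. The helpers resolve_gamma and resolve_rank are hypothetical names used only for illustration; they are not part of the h2o package. Rounding the rank to the nearest integer is an assumption, not documented H2O behavior, and the feature and row counts are made-up numbers.

```r
# Hypothetical helpers (NOT part of h2o): illustrate what the -1 defaults mean.

# gamma = -1 is documented as "1/#features"; any other value is used as given.
resolve_gamma <- function(gamma, n_features) {
  if (gamma == -1) 1 / n_features else gamma
}

# rank_ratio = -1 is documented as "use sqrt(#rows)"; otherwise the rank is a
# fraction of the number of rows. Rounding to an integer is an assumption.
resolve_rank <- function(rank_ratio, n_rows) {
  if (rank_ratio == -1) round(sqrt(n_rows)) else round(rank_ratio * n_rows)
}

# Illustrative sizes only: 60 predictors, 1000 rows.
resolve_gamma(-1, n_features = 60)
resolve_rank(-1, n_rows = 1000)
resolve_rank(0.1, n_rows = 1000)
```

Under these assumptions, a larger rank_ratio gives a higher-rank ICF approximation of the kernel matrix, which generally costs more memory and time.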

Examples

if (FALSE) {
library(h2o)
h2o.init()

# Import the splice dataset
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/splice/splice.svm"
splice <- h2o.importFile(f)

# Train the Support Vector Machine model
svm_model <- h2o.psvm(gamma = 0.01, rank_ratio = 0.1,
                      y = "C1", training_frame = splice,
                      disable_training_metrics = FALSE)
}
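
As a possible follow-up, the sketch below holds out part of the data, passes it as validation_frame, and scores it with the trained model. It assumes a running H2O cluster and network access to the same splice file. The 80/20 split, the seed value, and the variable names are illustrative choices, not recommendations from the package authors.

```r
if (FALSE) {
library(h2o)
h2o.init()

# Same public dataset as in the example above
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/splice/splice.svm"
splice <- h2o.importFile(f)

# Illustrative 80/20 split into training and validation frames
parts <- h2o.splitFrame(splice, ratios = 0.8, seed = 42)
train <- parts[[1]]
valid <- parts[[2]]

# Train with a validation frame and a fixed seed
svm_model <- h2o.psvm(gamma = 0.01, rank_ratio = 0.1,
                      y = "C1", training_frame = train,
                      validation_frame = valid, seed = 42)

# Score the held-out rows and inspect validation metrics
pred <- h2o.predict(svm_model, newdata = valid)
head(pred)
h2o.performance(svm_model, newdata = valid)
}
```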