Principal components analysis of an H2O data frame using the power method to calculate the singular value decomposition of the Gram matrix.
h2o.prcomp(training_frame, x, model_id = NULL, validation_frame = NULL,
  ignore_const_cols = TRUE, score_each_iteration = FALSE,
  transform = c("NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE"),
  pca_method = c("GramSVD", "Power", "Randomized", "GLRM"), k = 1,
  max_iterations = 1000, use_all_factor_levels = FALSE,
  compute_metrics = TRUE, impute_missing = FALSE, seed = -1,
  max_runtime_secs = 0)
training_frame: Id of the training data frame (not required, to allow initial validation of model parameters).
x: A vector containing the character names of the predictors in the model.
model_id: Destination id for this model; auto-generated if not specified.
validation_frame: Id of the validation data frame.
ignore_const_cols: Logical. Ignore constant columns. Defaults to TRUE.
score_each_iteration: Logical. Whether to score during each iteration of model training. Defaults to FALSE.
transform: Transformation of training data. Must be one of: "NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE". Defaults to NONE.
pca_method: Method for computing PCA (caution: GLRM is currently experimental and unstable). Must be one of: "GramSVD", "Power", "Randomized", "GLRM". Defaults to GramSVD.
k: Rank of matrix approximation. Defaults to 1.
max_iterations: Maximum training iterations. Defaults to 1000.
use_all_factor_levels: Logical. Whether first factor level is included in each categorical expansion. Defaults to FALSE.
compute_metrics: Logical. Whether to compute metrics on the training data. Defaults to TRUE.
impute_missing: Logical. Whether to impute missing entries with the column mean. Defaults to FALSE.
seed: Seed for random numbers (affects the stochastic parts of the algorithm, which may or may not be enabled by default). Defaults to -1 (time-based random number).
max_runtime_secs: Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0.
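The transformation options listed above can be illustrated with a small base-R sketch. The semantics used here are an assumption based on the option names (DEMEAN subtracts the column mean, DESCALE divides by the column standard deviation, STANDARDIZE does both, NORMALIZE subtracts the mean and divides by the column range); `transform_columns` is a hypothetical helper, not part of the h2o package, and H2O's own implementation may differ in details such as how the standard deviation is computed.

```r
# Illustrative base-R versions of the column transforms (assumed semantics,
# not H2O's implementation). transform_columns is a hypothetical helper.
transform_columns <- function(X, transform = c("NONE", "STANDARDIZE", "NORMALIZE",
                                               "DEMEAN", "DESCALE")) {
  transform <- match.arg(transform)
  X <- as.matrix(X)
  mu  <- colMeans(X)                                      # column means
  sdv <- apply(X, 2, sd)                                  # column standard deviations
  rng <- apply(X, 2, function(col) max(col) - min(col))   # column ranges
  switch(transform,
    NONE        = X,
    DEMEAN      = sweep(X, 2, mu, "-"),
    DESCALE     = sweep(X, 2, sdv, "/"),
    STANDARDIZE = sweep(sweep(X, 2, mu, "-"), 2, sdv, "/"),
    NORMALIZE   = sweep(sweep(X, 2, mu, "-"), 2, rng, "/"))
}

X <- as.matrix(USArrests)             # built-in numeric data set
Z <- transform_columns(X, "STANDARDIZE")
colMeans(Z)                           # numerically zero for every column
apply(Z, 2, sd)                       # one for every column
```

Standardizing is usually the relevant choice when columns are measured in different units, since otherwise the columns with the largest variance dominate the leading components.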
N. Halko, P.G. Martinsson, J.A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions (http://arxiv.org/abs/0909.4061). SIAM Rev., Survey and Review section, Vol. 53, No. 2, pp. 217-288, June 2011.
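library(h2o)
h2o.init()  # start or connect to a local H2O cluster
# Load the sample data set shipped with the h2o package
ausPath <- system.file("extdata", "australia.csv", package = "h2o")
australia.hex <- h2o.uploadFile(path = ausPath)
# Fit 8 principal components on standardized columns
h2o.prcomp(training_frame = australia.hex, k = 8, transform = "STANDARDIZE")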
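To make the idea of applying the power method to the Gram matrix more concrete, here is a minimal base-R sketch that recovers only the first principal component and checks it against `stats::prcomp`. It does not use H2O and is not H2O's implementation; `power_pc1` and its `tol` argument are hypothetical names introduced for illustration.

```r
# Leading principal component via power iteration on the Gram matrix t(X) %*% X.
# Hypothetical helper for illustration; not part of the h2o package.
power_pc1 <- function(X, max_iterations = 1000, tol = 1e-10, seed = 42) {
  set.seed(seed)
  G <- crossprod(X)                       # Gram matrix t(X) %*% X
  v <- rnorm(ncol(X))                     # random starting vector
  v <- v / sqrt(sum(v^2))
  for (i in seq_len(max_iterations)) {
    w <- drop(G %*% v)                    # multiply by the Gram matrix
    w <- w / sqrt(sum(w^2))               # renormalize to unit length
    converged <- sqrt(sum((w - v)^2)) < tol
    v <- w
    if (converged) break
  }
  lambda <- drop(crossprod(v, G %*% v))   # Rayleigh quotient: leading eigenvalue of G
  list(vector = unname(v), sdev = sqrt(lambda / (nrow(X) - 1)))
}

X   <- scale(as.matrix(USArrests))        # centered and scaled columns
fit <- power_pc1(X)
ref <- prcomp(USArrests, scale. = TRUE)

# Eigenvectors are only defined up to sign, so compare absolute values.
all.equal(abs(fit$vector), abs(unname(ref$rotation[, 1])), tolerance = 1e-6)
all.equal(fit$sdev, unname(ref$sdev[1]), tolerance = 1e-6)
```

Further components could be obtained by deflating the Gram matrix (subtracting the contribution of each component already found) and repeating the iteration, which is one reason an iteration cap such as `max_iterations` is useful.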