lavaan (version 0.6-19)

lavPredictY_cv: Determine an optimal lambda penalty value through cross-validation


This function can be used to determine an optimal lambda value for the lavPredictY function, based on cross-validation.


lavPredictY_cv(object, data = NULL,
              xnames = lavNames(object, "ov.x"),
              ynames = lavNames(object, "ov.y"),
              n.folds = 10L,
              lambda.seq = seq(0, 1, 0.1))



object: An object of class lavaan.


data: A data.frame, containing the same variables as the data.frame that was used when fitting the model in object.


xnames: The names of the observed variables that should be treated as the x-variables. Can also be a list to allow for a separate set of variable names per group (or block).


ynames: The names of the observed variables that should be treated as the y-variables. It is for these variables that the function will predict the (model-based) values for each observation. Can also be a list to allow for a separate set of variable names per group (or block).


n.folds: Integer. The number of folds to be used during cross-validation.


lambda.seq: A numeric vector, typically generated with seq(), containing the lambda penalty values to be tested during cross-validation.


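For each candidate value in lambda.seq, this function evaluates the prediction error of lavPredictY using cross-validation with n.folds folds. The selected value is available as the lambda.min element of the returned object and can be passed to lavPredictY as its lambda argument to improve prediction accuracy.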
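
The general idea of choosing a penalty by cross-validation can be sketched in base R. This is not lavaan's implementation: ridge regression stands in for the model-based predictor, and the simulated data and the helper ridge_coef are hypothetical names introduced only for illustration. Only n.folds, lambda.seq, and lambda.min mirror names used on this page.

```r
# Illustrative sketch only: ridge regression replaces the SEM-based predictor.
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), n, 3)                  # simulated predictors
y <- drop(X %*% c(1, -0.5, 0.25)) + rnorm(n)     # simulated outcome

n.folds <- 10L
lambda.seq <- seq(0, 1, 0.1)

# Randomly assign every row to one of n.folds folds of equal size
fold.id <- sample(rep(seq_len(n.folds), length.out = n))

# Hypothetical helper: ridge coefficients without an intercept
ridge_coef <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

# mse[k, l] holds the held-out mean squared error for fold k and lambda.seq[l]
mse <- matrix(NA_real_, n.folds, length(lambda.seq))
for (k in seq_len(n.folds)) {
  test <- fold.id == k
  for (l in seq_along(lambda.seq)) {
    b <- ridge_coef(X[!test, , drop = FALSE], y[!test], lambda.seq[l])
    pred <- drop(X[test, , drop = FALSE] %*% b)
    mse[k, l] <- mean((y[test] - pred)^2)
  }
}

# Average over folds and keep the penalty with the smallest error
cv.mse <- colMeans(mse)
lambda.min <- lambda.seq[which.min(cv.mse)]
```

Averaging the error over folds before taking the minimum means that each candidate penalty is judged on every observation exactly once as held-out data.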


de Rooij, M., Karch, J.D., Fokkema, M., Bakk, Z., Pratiwi, B.C., and Kelderman, H. (2022). SEM-Based Out-of-Sample Predictions. Structural Equation Modeling: A Multidisciplinary Journal. DOI:10.1080/10705511.2022.2061494

Molina, M. D., Molina, L., & Zappaterra, M. W. (2024). Aspects of Higher Consciousness: A Psychometric Validation and Analysis of a New Model of Mystical Experience. DOI:10.31219/osf.io/cgb6e

See Also

lavPredictY to predict the values of (observed) y-variables given the values of (observed) x-variables in a structural equation model.


colnames(PoliticalDemocracy) <- c("z1", "z2", "z3", "z4", 
                                  "y1", "y2", "y3", "y4", 
                                  "x1", "x2", "x3")

model <- '
  # latent variable definitions
  ind60 =~ x1 + x2 + x3
  dem60 =~ z1 + z2 + z3 + z4
  dem65 =~ y1 + y2 + y3 + y4
  # regressions
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
  # residual correlations
  z1 ~~ y1
  z2 ~~ z4 + y2
  z3 ~~ y3
  z4 ~~ y4
  y2 ~~ y4
'

# fit the model with a mean structure
fit <- sem(model, data = PoliticalDemocracy, meanstructure = TRUE)

percent <- 0.5
nobs <- lavInspect(fit, "ntotal")
idx <- sort(sample(x = nobs, size = floor(percent*nobs)))

xnames <- c("z1", "z2", "z3", "z4", "x1", "x2", "x3")
ynames <- c("y1", "y2", "y3", "y4")

# cross-validate the lambda penalty on the rows not reserved for prediction
reg.results <- lavPredictY_cv(
    fit,
    PoliticalDemocracy[-idx, ],
    xnames = xnames,
    ynames = ynames,
    n.folds = 10L,
    lambda.seq = seq(from = .6, to = 2.5, by = .1)
)
lam <- reg.results$lambda.min

lavPredictY(fit, newdata = PoliticalDemocracy[idx,],
                 ynames  = ynames,
                 xnames  = xnames,
                 lambda  = lam)

