model.syntax: The Lavaan Model Syntax

Description

The lavaan model syntax describes a latent variable model. The function lavaanify turns it into a list that represents the full model as specified by the user. We refer to this list as the parameter table.
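
As a rough illustration of this step, the sketch below passes a one-factor model to lavaanify and inspects a few columns of the resulting parameter table. The model and the variable names y1 to y3 are made up for the example, and the exact set of columns may differ between lavaan versions.

```r
library(lavaan)

# Hypothetical one-factor model, used only for illustration
model <- ' f1 =~ y1 + y2 + y3 '

# Fix the first loading and add (residual) variances explicitly,
# since the auto.* arguments default to FALSE in lavaanify
pt <- lavaanify(model, auto.fix.first = TRUE, auto.var = TRUE)

# One row per parameter; free == 0 marks a fixed parameter
print(pt[, c("lhs", "op", "rhs", "free", "ustart")])
```

Each row of the table corresponds to one model parameter, identified by its left-hand side, operator, and right-hand side.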

Usage

lavaanify(model = NULL, meanstructure = FALSE, int.ov.free = FALSE, 
    int.lv.free = FALSE, orthogonal = FALSE, std.lv = FALSE, 
    fixed.x = TRUE, constraints = NULL, auto = FALSE, model.type = "sem", 
    auto.fix.first = FALSE, auto.fix.single = FALSE, auto.var = FALSE, 
    auto.cov.lv.x = FALSE, auto.cov.y = FALSE, auto.th = FALSE, 
    auto.delta = FALSE, varTable = NULL, ngroups = 1L, group.equal = NULL, 
    group.partial = NULL, debug = FALSE, warn = TRUE, as.data.frame. = TRUE)
parseModelString(model.syntax = '', as.data.frame.=FALSE, warn=TRUE, debug=FALSE)
lavaanNames(object, type = "ov", group = NULL)

Arguments

model

A description of the user-specified model. Typically, the model is described using the lavaan model syntax; see details for more information. Alternatively, a parameter table (e.g., the output of parseModelString) is also accepted.

model.syntax

The model syntax specifying the model. Must be a literal string.

meanstructure

If TRUE, intercepts/means will be added to the model for both observed and latent variables.

int.ov.free

If FALSE, the intercepts of the observed variables are fixed to zero.

int.lv.free

If FALSE, the intercepts of the latent variables are fixed to zero.

orthogonal

If TRUE, the exogenous latent variables are assumed to be uncorrelated.

std.lv

If TRUE, the metric of each latent variable is determined by fixing its variance to 1.0. If FALSE, the metric of each latent variable is determined by fixing the factor loading of its first indicator to 1.0.

fixed.x

If TRUE, the exogenous `x' covariates are considered fixed variables and the means, variances and covariances of these variables are fixed to their sample values. If FALSE, they are considered random, and the means, v

constraints

Additional (in)equality constraints. See details for more information.

auto

If TRUE, the default values are used for the auto.* arguments, depending on the value of model.type.

model.type

Either "sem" or "growth"; only used if auto=TRUE.

auto.fix.first

If TRUE, the factor loading of the first indicator is set to 1.0 for every latent variable.

auto.fix.single

If TRUE, the residual variance (if included) of an observed indicator is set to zero if it is the only indicator of a latent variable.

auto.var

If TRUE, the residual variances and the variances of exogenous latent variables are included in the model and set free.

auto.cov.lv.x

If TRUE, the covariances of exogenous latent variables are included in the model and set free.

auto.cov.y

If TRUE, the covariances of dependent variables (both observed and latent) are included in the model and set free.

auto.th

If TRUE, thresholds for limited dependent variables are included in the model and set free.

auto.delta

If TRUE, response scaling parameters for limited dependent variables are included in the model and set free.

varTable

The variable table containing information about the observed variables in the model.

ngroups

The number of (independent) groups.

group.equal

A vector of character strings. Only used in a multiple group analysis. Can be one or more of the following: "loadings", "intercepts", "means", "regressions", "residuals" or <

group.partial

A vector of character strings containing the labels of the parameters which should be free in all groups (thereby overriding the group.equal argument for some specific parameters).

warn

If TRUE, some (possibly harmless) warnings are printed out.

as.data.frame.

If TRUE, return the list of model parameters as a data.frame.

debug

If TRUE, debugging information is printed out.

object

Either a list containing the parameter table, as returned by lavaanify or parseModelString, or an object of class lavaan.

type

Only used in the function lavaanNames. If type contains "ov", only observed variable names are returned. If type contains "lv", only latent variable names are returned. The "ov.x" and "lv.x"

group

Only used in the function lavaanNames. If NULL, all groups (if any) are used. If an integer (vector), only names from those groups are extracted. The group numbers are found in the group column of the parameter table
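
A minimal sketch of how a few of these arguments might be used together. It assumes a made-up one-factor model, sets std.lv = TRUE so that the latent variance is fixed to 1.0, and then extracts variable names with lavaanNames. The column names of the parameter table are those of recent lavaan versions and may differ in older ones.

```r
library(lavaan)

# Hypothetical model, used only for illustration
model <- ' f1 =~ y1 + y2 + y3 '

# std.lv = TRUE: scale the factor by fixing its variance to 1.0;
# auto.var = TRUE: include the variances in the parameter table
pt <- lavaanify(model, std.lv = TRUE, auto.var = TRUE)

# The latent variance row should be fixed (free == 0) at 1.0
lv.var <- pt[pt$lhs == "f1" & pt$op == "~~" & pt$rhs == "f1", ]
print(lv.var[, c("lhs", "op", "rhs", "free", "ustart")])

# Extract observed and latent variable names from the table
ov <- lavaanNames(pt, type = "ov")
lv <- lavaanNames(pt, type = "lv")
print(ov)
print(lv)
```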

Fixing parameters

It is often desirable to fix a model parameter that is otherwise (by default) free. Any parameter in a model can be fixed by using a modifier resulting in a numerical constant. Here are some examples:

Fixing the regression coefficient of the predictor x2:

    y ~ x1 + 2.4*x2 + x3

Specifying an orthogonal (zero) covariance between two latent variables:

    f1 ~~ 0*f2

Specifying an intercept and a linear slope in a growth model:

    i =~ 1*y11 + 1*y12 + 1*y13 + 1*y14
    s =~ 0*y11 + 1*y12 + 2*y13 + 3*y14

Instead of a numeric constant, one can use a mathematical function that returns a numeric constant, for example sqrt(10). Multiplying by NA will force the corresponding parameter to be free.
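
The effect of a numeric modifier can be checked in the parameter table. The sketch below assumes recent lavaan column names (free, ustart) and uses made-up variable names. It fixes one regression coefficient to 2.4 and frees a first loading with NA.

```r
library(lavaan)

# Fix the coefficient of x2 to 2.4; x1 and x3 stay free
pt.reg <- lavaanify(' y ~ x1 + 2.4*x2 + x3 ')
reg <- pt.reg[pt.reg$op == "~", c("lhs", "op", "rhs", "free", "ustart")]
print(reg)

# NA* frees the first loading even though auto.fix.first = TRUE
pt.cfa <- lavaanify(' f =~ NA*y1 + y2 + y3 ', auto.fix.first = TRUE)
load <- pt.cfa[pt.cfa$op == "=~", c("lhs", "op", "rhs", "free", "ustart")]
print(load)
```

A value of 0 in the free column marks a fixed parameter, and the fixed value is stored in ustart.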

Starting values

User-provided starting values can be given by using the special function start(), containing a numeric constant. For example:

    y ~ x1 + start(1.0)*x2 + x3

Note that if a starting value is provided, the parameter is not automatically considered to be free.
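
As a small check, the sketch below parses the formula with a start() modifier and looks at the corresponding row. It assumes the starting value is stored in the ustart column, as in recent lavaan versions.

```r
library(lavaan)

# start(1.0) supplies a starting value for the x2 coefficient
pt <- lavaanify(' y ~ x1 + start(1.0)*x2 + x3 ')

x2.row <- pt[pt$lhs == "y" & pt$op == "~" & pt$rhs == "x2", ]
print(x2.row[, c("lhs", "op", "rhs", "free", "ustart")])
```

Here the regression coefficient is free by default, so the row has a nonzero free index and a ustart of 1.0.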

Parameter labels and equality constraints

Each free parameter in a model is automatically given a name (or label). The name given to a model parameter consists of three parts, coerced to a single character vector. The first part is the name of the variable in the left-hand side of the formula where the parameter was implied. The middle part is based on the special 'operator' used in the formula. This can be one of "=~", "~" or "~~". The third part is the name of the variable in the right-hand side of the formula where the parameter was implied, or "1" if it is an intercept. The three parts are pasted together in a single string. For example, the name of the fixed regression coefficient in the regression formula y ~ x1 + 2.4*x2 + x3 is the string "y~x2". The name of the parameter corresponding to the covariance between two latent variables in the formula f1 ~~ f2 is the string "f1~~f2".

Although this automatic labeling of parameters is convenient, users may specify their own labels for specific parameters by using the label() modifier in a formula. For example, in the formula f1 =~ x1 + x2 + label("mylabel")*x3, the parameter corresponding to the factor loading of x3 will be named "mylabel" instead of the default name "f1=~x3". Since version 0.4-8, a more convenient way to specify the label is as follows: f1 =~ x1 + x2 + mylabel*x3. Simply multiplying with a string literal is equivalent to using the label() modifier.
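
A quick way to confirm that a label was picked up is to look at the label column of the parameter table. The sketch below assumes that column name (as in recent lavaan versions) and uses the shorthand form of the modifier.

```r
library(lavaan)

# Shorthand labeling: mylabel*x3 names the loading of x3
pt <- lavaanify(' f1 =~ x1 + x2 + mylabel*x3 ')

x3.row <- pt[pt$lhs == "f1" & pt$op == "=~" & pt$rhs == "x3", ]
print(x3.row[, c("lhs", "op", "rhs", "label")])
```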

To constrain a parameter to be equal to another target parameter, there are two ways. If you have specified your own labels, you can use the fact that equal labels imply equal parameter values. If you rely on automatic parameter labels, you can use the special function equal(). The argument of equal() is the (automatic or user-specified) name of the target parameter. For example, in the confirmatory factor analysis example below, the intercepts of the three indicators of each latent variable are constrained to be equal to each other. For the first three, we have used the default names. For the last three, we have provided a custom label for the y2a intercept.

    model <- '
      # two latent variables with fixed loadings
      f1 =~ 1*y1a + 1*y1b + 1*y1c
      f2 =~ 1*y2a + 1*y2b + 1*y2c

      # intercepts constrained to be equal
      # using the default names
      y1a ~ 1
      y1b ~ equal("y1a~1") * 1
      y1c ~ equal("y1a~1") * 1

      # intercepts constrained to be equal
      # using a custom label
      y2a ~ int2 * 1
      y2b ~ int2 * 1
      y2c ~ int2 * 1
    '
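
The label-based approach can be inspected in the parameter table. The sketch below covers only the shared-label variant, with a reduced version of the model, and assumes the label column of recent lavaan versions. How the equality itself is encoded (shared free indices or separate constraint rows) depends on the lavaan version, so the example checks the labels only.

```r
library(lavaan)

model <- '
  # latent variable with fixed loadings
  f2 =~ 1*y2a + 1*y2b + 1*y2c

  # three intercepts sharing the label int2
  y2a ~ int2 * 1
  y2b ~ int2 * 1
  y2c ~ int2 * 1
'
pt <- lavaanify(model)

# Rows carrying the shared label
int2.rows <- pt[pt$label == "int2", c("lhs", "op", "rhs", "label")]
print(int2.rows)
```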

Multiple groups

In a multiple group analysis, modifiers that contain a single constant must be replaced by a vector, having the same length as the number of groups. The only exception is numerical constants (for fixing values): if you provide only a single number, the same number will be used for all groups. However, it is safer (and cleaner) to specify the same number of elements as you have groups. For example, if there are two groups:

    HS.model <- '
      visual  =~ x1 + 0.5*x2 + c(0.6, 0.8)*x3

      textual =~ x4 + start(c(1.2, 0.6))*x5 + x6

      speed   =~ x7 + x8 + c(x9.group1, x9.group2) * x9
    '

In this example, the factor loading of the 'x2' indicator is fixed to the value 0.5 for all groups. However, the factor loadings of the 'x3' indicator are fixed to 0.6 and 0.8 for group 1 and group 2 respectively. The same logic is used for all modifiers. Note that character vectors can contain unquoted strings.
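
To see how group-specific modifiers end up in the parameter table, the sketch below parses the 'visual' part of the model with ngroups = 2. It assumes the group and ustart columns of recent lavaan versions, and that rows for group 1 precede those for group 2.

```r
library(lavaan)

model <- ' visual =~ x1 + 0.5*x2 + c(0.6, 0.8)*x3 '

# Two groups: every parameter appears once per group
pt <- lavaanify(model, ngroups = 2L)

load <- pt[pt$op == "=~", c("lhs", "op", "rhs", "group", "free", "ustart")]
print(load)

# Fixed values per group for the x2 and x3 loadings
x2.fixed <- load$ustart[load$rhs == "x2"]
x3.fixed <- load$ustart[load$rhs == "x3"]
```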

Multiple modifiers

In the model syntax, you can specify a variable more than once on the right-hand side of an operator; therefore, several 'modifiers' can be applied simultaneously. For example, if you want to fix the value of a parameter and also label that parameter, you can use something like:

    f1 =~ x1 + x2 + 4*x3 + x3.loading*x3
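
A sketch of how the combined modifiers might appear in the parameter table, assuming the label, free and ustart columns of recent lavaan versions. The repeated x3 terms are expected to be merged into a single row.

```r
library(lavaan)

# x3 appears twice: once with a fixed value, once with a label
pt <- lavaanify(' f1 =~ x1 + x2 + 4*x3 + x3.loading*x3 ')

x3.row <- pt[pt$lhs == "f1" & pt$op == "=~" & pt$rhs == "x3", ]
print(x3.row[, c("lhs", "op", "rhs", "free", "ustart", "label")])
```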

Details

The model syntax consists of one or more formula-like expressions, each one describing a specific part of the model. The model syntax can be read from a file (using readLines), or can be specified as a literal string enclosed by single quotes as in the example below.

    myModel <- '
      # latent variable definitions
      f1 =~ y1 + y2 + y3
      f2 =~ y4 + y5 + y6
      f3 =~ y7 + y8 + y9 + y10
      f4 =~ y11 + y12 + y13

      ! this is also a comment

      # regressions
      f1 ~ f3 + f4
      f2 ~ f4
      y1 + y2 ~ x1 + x2 + x3

      # (co)variances
      y1 ~~ y1
      y2 ~~ y4 + y5
      f1 ~~ f2

      # intercepts
      f1 ~ 1; y5 ~ 1
    '

Blank lines and comments can be used in between the formulas, and formulas can be split over multiple lines. Both the sharp (#) and the exclamation (!) characters can be used to start a comment. Multiple formulas can be placed on a single line if they are separated by a semicolon (;).
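
Reading the syntax from a file could look like the sketch below. The temporary file and its contents are made up for the example, and collapsing the lines into a single string before parsing is a cautious choice here, not a documented requirement.

```r
library(lavaan)

# Write a small, made-up model to a temporary file
syntax.file <- tempfile(fileext = ".lav")
writeLines(c("# latent variable definition",
             "f1 =~ y1 + y2 + y3",
             "",
             "# two intercept formulas on one line",
             "y1 ~ 1; y2 ~ 1"), syntax.file)

# Collapse the lines into a single string before parsing
model <- paste(readLines(syntax.file), collapse = "\n")
pt <- lavaanify(model)

# Intercepts are stored with the operator "~1"
int.rows <- pt[pt$op == "~1", c("lhs", "op", "rhs", "free")]
print(int.rows)
```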

There can be four types of formula-like expressions in the model syntax:

Latent variable definitions: The "=~" operator can be used to define (continuous) latent variables. The name of the latent variable is on the left of the "=~" operator, while the terms on the right, separated by "+" operators, are the indicators of the latent variable.

The operator "=~" can be read as "is manifested by".

Regressions: The "~" operator specifies a regression. The dependent variable is on the left of a "~" operator and the independent variables, separated by "+" operators, are on the right. These regression formulas are similar to the way ordinary linear regression formulas are used in R, but they may include latent variables. Interaction terms are currently not supported.

Variance-covariances: The "~~" ('double tilde') operator specifies (residual) variances of an observed or latent variable, or a set of covariances between one variable and several other variables (either observed or latent). Several variables, separated by "+" operators, can appear on the right. This way, several pairwise (co)variances involving the same left-hand variable can be expressed in a single expression. The distinction between variances and residual variances is made automatically.

Intercepts: A special case of a regression formula can be used to specify an intercept (or a mean) of either an observed or a latent variable. The variable name is on the left of a "~" operator. On the right is only the number "1" representing the intercept. Including an intercept formula in the model automatically implies meanstructure = TRUE. The distinction between intercepts and means is made automatically.
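
The four formula types can be told apart in the parameter table by the op column. The sketch below uses a made-up model with one formula of each type. It assumes, as in recent lavaan versions, that intercepts are stored with the operator "~1".

```r
library(lavaan)

model <- '
  f1 =~ y1 + y2 + y3   # latent variable definition
  f1 ~ x1              # regression
  y1 ~~ y2             # (residual) covariance
  y3 ~ 1               # intercept
'
pt <- lavaanify(model)

# Operators that occur in the parameter table
ops <- unique(pt$op)
print(ops)
```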

Usually, only a single variable name appears on the left side of an operator. However, if multiple variable names are specified, separated by the "+" operator, the formula is repeated for each element on the left side (as for example in the third regression formula in the example above).
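
For instance, a formula with two variables on the left and three on the right should expand to six regression parameters. A minimal sketch, with made-up variable names:

```r
library(lavaan)

# Two dependent variables, three predictors
pt <- lavaanify(' y1 + y2 ~ x1 + x2 + x3 ')

reg <- pt[pt$op == "~", c("lhs", "op", "rhs")]
print(reg)
```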

In the right-hand side of these formula-like expressions, each element can be modified (using the "*" operator) by either a numeric constant, an expression resulting in a numeric constant, an expression resulting in a character vector, or one of three special functions: start(), label() and equal(). This provides the user with a mechanism to fix parameters, to provide alternative starting values, to label the parameters, and to define equality constraints among model parameters. This is explained in more detail in the following sections.

References

Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL http://www.jstatsoft.org/v48/i02/.