umxCP: umxCP: Build and run a Common pathway twin model

Description

Make a 2-group Common Pathway twin model (Common-factor common-pathway multivariate model).

The common-pathway model provides a powerful tool for theory-based decomposition of genetic and environmental differences.

umxCP supports this with pairs of mono-zygotic (MZ) and di-zygotic (DZ) twins reared together to model the genetic and environmental structure of multiple phenotypes (measured behaviors).

Common-pathway path diagram:

Figure: CP.png

As can be seen, each phenotype also by default has A, C, and E influences specific to that phenotype.

Features include the ability to include more than one common pathway, to use ordinal data.

note: The function umx_set_optimization_options() allows users to see and set mvnRelEps and mvnMaxPointsA mvnRelEps defaults to .005. For ordinal models, you might find that '0.01' works better.

Usage

umxCP(
  name = "CP",
  selDVs,
  selCovs = NULL,
  dzData = NULL,
  mzData = NULL,
  sep = NULL,
  nFac = 1,
  type = c("Auto", "FIML", "cov", "cor", "WLS", "DWLS", "ULS"),
  data = NULL,
  zyg = "zygosity",
  allContinuousMethod = c("cumulants", "marginals"),
  correlatedACE = FALSE,
  dzAr = 0.5,
  dzCr = 1,
  autoRun = getOption("umx_auto_run"),
  tryHard = c("yes", "no", "ordinal", "search"),
  optimizer = NULL,
  equateMeans = TRUE,
  weightVar = NULL,
  bVector = FALSE,
  boundDiag = 0,
  addStd = TRUE,
  addCI = TRUE,
  numObsDZ = NULL,
  numObsMZ = NULL,
  freeLowerA = FALSE,
  freeLowerC = FALSE,
  freeLowerE = FALSE,
  correlatedA = "deprecated"
)

Arguments

name

The name of the model (defaults to "CP").

selDVs

The variables to include. omit sep in selDVs, i.e., just "dep" not c("dep_T1", "dep_T2").

selCovs

basenames for covariates

dzData

The DZ dataframe.

mzData

The MZ dataframe.

sep

(required) The suffix for twin 1 and twin 2, often "_T".

nFac

How many common factors (default = 1)

type

One of "Auto", "FIML", "cov", "cor", "WLS", "DWLS", "ULS"

data

If provided, dzData and mzData are treated as valid levels of zyg to select() data sets (default = NULL)

zyg

If data provided, this column is used to select rows by zygosity (Default = "zygosity")

allContinuousMethod

"cumulants" or "marginals". Used in all-continuous WLS data to determine if a means model needed.

correlatedACE

Allows correlations between the factors built by each of the a, c, and e matrices. Default = FALSE.

dzAr

The DZ genetic correlation (defaults to .5, vary to examine assortative mating).

dzCr

The DZ "C" correlation (defaults to 1: set to .25 to make an ADE model).

autoRun

Whether to run the model (default), or just to create it and return without running.

tryHard

Default ("yes") uses mxTryHard, "no" uses normal mxRun. Other options: "ordinal", "search"

optimizer

optionally set the optimizer (default NULL does nothing).

equateMeans

Whether to equate the means across twins (defaults to TRUE).

weightVar

If provided, a vector objective will be used to weight the data. (default = NULL).

bVector

Whether to compute row-wise likelihoods (defaults to FALSE).

boundDiag

= Numeric lbound for diagonal of the a_cp, c_cp, & e_cp matrices. Set = NULL to ignore.

addStd

Whether to add the algebras to compute a std model (defaults to TRUE).

addCI

Whether to add the interval requests for CIs (defaults to TRUE).

numObsDZ

= not yet implemented: Ordinal Number of DZ twins: Set this if you input covariance data.

numObsMZ

= not yet implemented: Ordinal Number of MZ twins: Set this if you input covariance data.

freeLowerA

(ignore): Whether to leave the lower triangle of A free (default = FALSE).

freeLowerC

(ignore): Whether to leave the lower triangle of C free (default = FALSE).

freeLowerE

(ignore): Whether to leave the lower triangle of E free (default = FALSE).

correlatedA

deprecated.

Value

mxModel()

Details

Like the umxACE() model, the CP model decomposes phenotypic variance into additive genetic (A), unique environmental (E) and, optionally, either common or shared-environment (C) or non-additive genetic effects (D).

Unlike the Cholesky, these factors do not act directly on the phenotype. Instead latent A, C, and E influences impact on one or more latent factors which in turn account for variance in the phenotypes (see Figure).

Data Input Currently, the umxCP function accepts only raw data. This may change in future versions.

Ordinal Data

In an important capability, the model transparently handles ordinal (binary or multi-level ordered factor data) inputs, and can handle mixtures of continuous, binary, and ordinal data in any combination.

Additional features

The umxCP function supports varying the DZ genetic association (defaulting to .5) to allow exploring assortative mating effects, as well as varying the DZ “C” factor from 1 (the default for modeling family-level effects shared 100% by twins in a pair), to .25 to model dominance effects.

Matrices and Labels in CP model

A good way to see which matrices are used in umxCP is to run an example model and plot it.

All the shared matrices are in the model "top".

Matrices top$as, top$cs, and top$es contain the path loadings specific to each variable on their diagonals.

So, to see the 'as' values, labels, or free states, you can say:

m1$top$as$values

m1$top$as$free

m1$top$as$labels

Labels relevant to modifying the specific loadings take the form "as_r1c1", "as_r2c2" etc.

The common-pathway loadings on the factors are in matrices top$a_cp, top$c_cp, top$e_cp.

The common factors themselves are in the matrix top$cp_loadings (an nVar * 1 matrix)

Less commonly-modified matrices are the mean matrix expMean. This has 1 row, and the columns are laid out for each variable for twin 1, followed by each variable for twin 2. So, in a model where the means for twin 1 and twin 2 had been equated (set = to T1), you could make them independent again with this line:

m1$top$expMean$labels[1,4:6] = c("expMean_r1c4", "expMean_r1c5", "expMean_r1c6")

For a deep-dive, see xmu_make_TwinSuperModel()

References

https://github.com/tbates/umx

Examples

Run this code

# NOT RUN {
# ========================================================
# = Run a 3-factor Common pathway twin model of 6 traits =
# ========================================================
require(umx)
data(GFF)
mzData = subset(GFF, zyg_2grp == "MZ")
dzData = subset(GFF, zyg_2grp == "DZ")
selDVs = c("gff", "fc", "qol", "hap", "sat", "AD") 
m1 = umxCP(selDVs = selDVs, sep = "_T", nFac = 3, tryHard = "yes",
		dzData = dzData, mzData = mzData)

# Shortcut using "data ="
selDVs = c("gff", "fc", "qol", "hap", "sat", "AD") 
m1 = umxCP(selDVs= selDVs, nFac= 3, data=GFF, zyg="zyg_2grp")

# ===================
# = Do it using WLS =
# ===================
m2 = umxCP("new", selDVs = selDVs, sep = "_T", nFac = 3, optimizer = "SLSQP",
		dzData = dzData, mzData = mzData, tryHard = "ordinal", 
	type= "DWLS", allContinuousMethod='marginals'
)

# =================================================
# = Find and test dropping of shared environment  =
# =================================================
# Show all labels for C parameters 
umxParameters(m1, patt = "^c")
# Test dropping the 9 specific and common-factor C paths
m2 = umxModify(m1, regex = "(cs_.*$)|(c_cp_)", name = "dropC", comp = TRUE)
umxSummaryCP(m2, comparison = m1, file = NA)
umxCompare(m1, m2)

# =======================================
# = Mixed continuous and binary example =
# =======================================
data(GFF)
# Cut to form umxFactor 20% depressed  DEP
cutPoints = quantile(GFF[, "AD_T1"], probs = .2, na.rm = TRUE)
ADLevels  = c('normal', 'depressed')
GFF$DEP_T1 = cut(GFF$AD_T1, breaks = c(-Inf, cutPoints, Inf), labels = ADLevels) 
GFF$DEP_T2 = cut(GFF$AD_T2, breaks = c(-Inf, cutPoints, Inf), labels = ADLevels) 
ordDVs = c("DEP_T1", "DEP_T2")
GFF[, ordDVs] = umxFactor(GFF[, ordDVs])

selDVs = c("gff","fc","qol","hap","sat","DEP") 
mzData = subset(GFF, zyg_2grp == "MZ")
dzData = subset(GFF, zyg_2grp == "DZ")

# umx_set_optimizer("NPSOL")
# umx_set_optimization_options("mvnRelEps", .01)
m1 = umxCP(selDVs = selDVs, sep = "_T", nFac = 3, dzData = dzData, mzData = mzData)
m2 = umxModify(m1, regex = "(cs_r[3-5]|c_cp_r[12])", name = "dropC", comp= TRUE)

# Do it using WLS
m3 = umxCP(selDVs = selDVs, sep = "_T", nFac = 3, dzData = dzData, mzData = mzData,
		tryHard = "ordinal", type= "DWLS")
# TODO umxCPL fix WLS here
# label at row 1 and column 1 of matrix 'top.binLabels'' in model 'CP3fac' : object 'Vtot'

# ==============================
# = Correlated factors example =
# ==============================
data(GFF)
mzData = subset(GFF, zyg_2grp == "MZ")
dzData = subset(GFF, zyg_2grp == "DZ")
selDVs = c("gff", "fc", "qol", "hap", "sat", "AD")
m1 = umxCP("base_model", selDVs = selDVs, sep = "_T", correlatedACE = TRUE, 
	 dzData = dzData, mzData = mzData, nFac = 3, tryHard = "yes")

# What are the ace covariance labels? (two ways to get)
umx_lower.tri(m1$top$a_cp$labels)
parameters(m1, patt = "[ace]_cp")

# 1. Now allow a1 and a2 to correlate
m2=umxModify(m1,regex="a_cp_r2c1",name="a2_a1_cov",free=TRUE,tryHard="yes")
umxCompare(m2, m1)

# 2. Drop all (a|c|e) correlations from a model
tmp= namez(umx_lower.tri(m2$top$a_cp$labels), "a_cp", replace= "[ace]_cp")
m3 = umxModify(m2, regex= tmp, comparison = TRUE)
# }
# NOT RUN {
 # end dontrun

# }

Run the code above in your browser using DataLab