Learn R Programming

umx (version 4.9.0)

umxACEv: Build and run 2-group uni- or multi-variate ACE models based on VARIANCE (not paths).

Description

A common task in twin modeling involves using the genetic and environmental differences between large numbers of pairs of mono-zygotic (MZ) and di-zygotic (DZ) twins reared together to model the genetic and environmental structure of one, or, typically, several phenotypes. umxACEv directly estimates variance components (rather than paths, which are then squared to produce variance and therefore cannot be negative). It offers better power, correct Type I error and un-biased estimates (with no zero-bound for the variances) as a saturated model. (Verhulst et al, 2019).

The ACE variance-based model decomposes phenotypic variance into additive genetic (A), unique environmental (E) and, optionally, either common environment (shared-environment, C) or non-additive genetic effects (D). Scroll down to details for how to use the function, a figure and multiple examples.

The following figure shows the A components of a trivariate ACEv model:

Figure: ACEv.png

NOTE: This function does not use the Cholesky decomposition. Instead it directly models variance. This ensures unbiased type-I error rates. It means that occasionally estimates of variance may be negative. This should be used as an occasion to inspect you model choices and data. umxACEv can be used as a base model to validate the ACE Cholesky model, a core model in behavior genetics (Neale and Cardon, 1992).

Usage

umxACEv(
  name = "ACEv",
  selDVs,
  selCovs = NULL,
  sep = NULL,
  dzData,
  mzData,
  dzAr = 0.5,
  dzCr = 1,
  type = c("Auto", "FIML", "cov", "cor", "WLS", "DWLS", "ULS"),
  allContinuousMethod = c("cumulants", "marginals"),
  data = NULL,
  zyg = "zygosity",
  weightVar = NULL,
  numObsDZ = NULL,
  numObsMZ = NULL,
  addStd = TRUE,
  addCI = TRUE,
  boundDiag = NULL,
  equateMeans = TRUE,
  bVector = FALSE,
  autoRun = getOption("umx_auto_run"),
  tryHard = c("no", "yes", "ordinal", "search"),
  optimizer = NULL,
  nSib = 2
)

Arguments

name

The name of the model (defaults to"ACE").

selDVs

The variables to include from the data: preferably, just "dep" not c("dep_T1", "dep_T2").

selCovs

(optional) covariates to include from the data (do not include sep in names)

sep

The separator in twin var names, often "_T" in vars like "dep_T1". Simplifies selDVs.

dzData

The DZ dataframe.

mzData

The MZ dataframe.

dzAr

The DZ genetic correlation (defaults to .5, vary to examine assortative mating).

dzCr

The DZ "C" correlation (defaults to 1: set to .25 to make an ADE model).

type

Analysis method one of c("Auto", "FIML", "cov", "cor", "WLS", "DWLS", "ULS").

allContinuousMethod

"cumulants" or "marginals". Used in all-continuous WLS data to determine if a means model needed.

data

If provided, dzData and mzData are treated as valid levels of zyg to select() data sets (default = NULL)

zyg

If data provided, this column is used to select rows by zygosity (Default = "zygosity")

weightVar

= If provided, a vector objective will be used to weight the data. (default = NULL).

numObsDZ

= Number of DZ twins: Set this if you input covariance data.

numObsMZ

= Number of MZ twins: Set this if you input covariance data.

addStd

Whether to add the algebras to compute a std model (defaults to TRUE).

addCI

Whether to add intervals to compute CIs (defaults to TRUE).

boundDiag

= Numeric lbound for diagonal of the a, c, and e matrices. Default = NULL (no bound)

equateMeans

Whether to equate the means across twins (defaults to TRUE).

bVector

Whether to compute row-wise likelihoods (defaults to FALSE).

autoRun

Whether to run the model (default), or just to create it and return without running.

tryHard

Default ('no') uses normal mxRun. "yes" uses mxTryHard. Other options: "ordinal", "search"

optimizer

Optionally set the optimizer (default NULL does nothing).

nSib

Number of sibs, default is 2. Working on 3 :-)

Value

Details

Data Input The function flexibly accepts raw data, and also summary covariance data (in which case the user must also supple numbers of observations for the two input data sets).

Ordinal Data In an important capability, the model transparently handles ordinal (binary or multi-level ordered factor data) inputs, and can handle mixtures of continuous, binary, and ordinal data in any combination.

The function also supports weighting of individual data rows. In this case, the model is estimated for each row individually, then each row likelihood is multiplied by its weight, and these weighted likelihoods summed to form the model-likelihood, which is to be minimized. This feature is used in the non-linear GxE model functions.

Additional features The umxACEv function supports varying the DZ genetic association (defaulting to .5) to allow exploring assortative mating effects, as well as varying the DZ “C” factor from 1 (the default for modeling family-level effects shared 100% by twins in a pair), to .25 to model dominance effects.

note: Only one of C or D may be estimated simultaneously. This restriction reflects the lack of degrees of freedom to simultaneously model C and D with only MZ and DZ twin pairs (Eaves et al. 1978 p267).

References

  • Verhulst, B., Prom-Wormley, E., Keller, M., Medland, S., & Neale, M. C. (2019). Type I Error Rates and Parameter Bias in Multivariate Behavioral Genetic Models. Behav Genet, 49, 99-111. 10.1007/s10519-018-9942-y Eaves, L. J., Last, K. A., Young, P. A., & Martin, N. G. (1978). Model-fitting approaches to the analysis of human behaviour. Heredity, 41, 249-320. https://www.nature.com/articles/hdy1978101.pdf

See Also

Other Twin Modeling Functions: power.ACE.test(), umxACEcov(), umxACE(), umxCP(), umxDoCp(), umxDoC(), umxGxE_window(), umxGxEbiv(), umxGxE(), umxIP(), umxReduceACE(), umxReduceGxE(), umxReduce(), umxRotate.MxModelCP(), umxSexLim(), umxSimplex(), umxSummarizeTwinData(), umxSummaryACEv(), umxSummaryACE(), umxSummaryDoC(), umxSummaryGxEbiv(), umxSummarySexLim(), umxSummarySimplex(), umxTwinMaker(), umx

Examples

Run this code
# NOT RUN {
# ==============================
# = Univariate model of weight =
# ==============================
require(umx)
data(twinData) # ?twinData from Australian twins.

# Things to note: ACE model of weight will return a NEGATIVE variance in C.
#  This is exactly why we have ACEv! It suggests we need a different model
#  In this case: ADE.
# Other things to note:
# 1. umxACEv can figure out variable names: provide "sep", and selVars. 
#    Function generates: "wt" -> "wt1" "wt2"
# 2. umxACEv picks the variables it needs from the data.

mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
m1 = umxACEv(selDVs = "wt", sep = "", dzData = dzData, mzData = mzData)

# A short cut (which is even shorter for "_T" twin data with "MZ"/"DZ" data in zygosity column is:
m1 = umxACEv(selDVs = "wt", sep = "", dzData = "MZFF", mzData = "DZFF", data = twinData)
# ========================================================
# = Evidence for dominance ? (DZ correlation set to .25) =
# ========================================================
m2 = umxACEv("ADE", selDVs = "wt", sep = "", dzData = dzData, mzData = mzData, dzCr = .25)
# note: the underlying matrices are still called A, C, and E.
# I catch this in the summary table, so columns are labeled A, D, and E.
# However, currently, the plot will say A, C, E.

# We can modify this model, dropping dominance component (still called C), 
# and see a comparison:
m3 = umxModify(m2, update = "C_r1c1", comparison = TRUE, name="AE")
# =========================================================
# = Well done! Now you can make modify twin models in umx =
# =========================================================

# ============================
# = How heritable is height? =
# ============================
# 
# Note: Height has a small variance. umx can typically picks good starts,
#    but scaling is advisable.
# 
require(umx)
# Load data and rescale height to cm (var in m too small)
data(twinData) # ?twinData from Australian twins.
twinData[,c("ht1", "ht2")]= twinData[,c("ht1", "ht2")]*100

mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
m1 = umxACEv(selDVs = "ht", sep = "", dzData = dzData, mzData = mzData)

umxSummary(m1, std = FALSE) # unstandardized
plot(m1)

# tip: with report = "html", umxSummary can print the table to your browser!
# tip: You can turn off auto-plot with umx_set_auto_plot(FALSE)

# ========================================================
# = Evidence for dominance ? (DZ correlation set to .25) =
# ========================================================
m2 = umxACEv("ADE", selDVs = "ht", dzCr = .25, sep="", dzData = dzData, mzData = mzData)
umxCompare(m2, m1) # Is ADE better?
umxSummary(m2, comparison = m1) # nb: though this is ADE, matrices are still called A,C,E

# We can modify this model, dropping shared environment, and see a comparison:
m3 = umxModify(m2, update = "C_r1c1", comparison = TRUE, name = "AE")

# =====================================
# = Bivariate height and weight model =
# =====================================

data(twinData)
twinData[,c("ht1", "ht2")]= twinData[,c("ht1", "ht2")]*100
mzData = twinData[twinData$zygosity %in% c("MZFF", "MZMM"), ]
dzData = twinData[twinData$zygosity %in% c("DZFF", "DZMM", "DZOS"), ]
mzData = mzData[1:80, ] # Quicker run to keep CRAN happy
dzData = dzData[1:80, ]
m1 = umxACEv(selDVs = c("ht", "wt"), sep = '', dzData = dzData, mzData = mzData)

# ===================
# = Ordinal example =
# ===================
require(umx)
data(twinData)

# Cut bmi column to form ordinal obesity variables
cutPoints = quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
obesityLevels = c('normal', 'overweight', 'obese')
twinData$obese1 = cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 = cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 

# Make the ordinal variables into mxFactors (ensure ordered is TRUE, and require levels)
twinData[, c("obese1", "obese2")] = umxFactor(twinData[, c("obese1", "obese2")])
mzData = twinData[twinData$zygosity %in% "MZFF", ][1:80,] # 80 pairs for speed on CRAN
dzData = twinData[twinData$zygosity %in% "DZFF", ][1:80,]
m2 = umxACEv(selDVs = "obese", dzData = dzData, mzData = mzData, sep = '')

# FYI: Show mz, dz, and t1 and t2 have the same levels!
str(mzData)

# ============================================
# = Bivariate continuous and ordinal example =
# ============================================
data(twinData)
# Cut bmi column to form ordinal obesity variables
ordDVs = c("obese1", "obese2")
obesityLevels = c('normal', 'overweight', 'obese')
cutPoints = quantile(twinData[, "bmi1"], probs = c(.5, .2), na.rm = TRUE)
twinData$obese1 = cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 = cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 

# Make the ordinal variables into ordered mxFactors
twinData[, ordDVs] = umxFactor(twinData[, ordDVs])

# umxACEv can trim out unused variables on its own
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]

m1 = umxACEv(selDVs = c("wt", "obese"), dzData = dzData, mzData = mzData, sep = '')
plot(m1)

# =======================================
# = Mixed continuous and binary example =
# =======================================
require(umx)
data(twinData)
# Cut to form category of 20% obese subjects
# and make into mxFactors (ensure ordered is TRUE, and require levels)
cutPoints = quantile(twinData[, "bmi1"], probs = .2, na.rm = TRUE)
obesityLevels = c('normal', 'obese')
twinData$obese1 = cut(twinData$bmi1, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
twinData$obese2 = cut(twinData$bmi2, breaks = c(-Inf, cutPoints, Inf), labels = obesityLevels) 
ordDVs = c("obese1", "obese2")
twinData[, ordDVs] = umxFactor(twinData[, ordDVs])

selDVs = c("wt", "obese")
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
# }
# NOT RUN {
m1 = umxACEv(selDVs = selDVs, dzData = dzData, mzData = mzData, sep = '')
umxSummary(m1)
# }
# NOT RUN {
# ===================================
# Example with covariance data only =
# ===================================

require(umx)
data(twinData)
selDVs = c("wt")
mz = cov(twinData[twinData$zygosity %in% "MZFF", tvars(selDVs, "")], use = "complete")
dz = cov(twinData[twinData$zygosity %in% "DZFF", tvars(selDVs, "")], use = "complete")
m1 = umxACEv(selDVs = selDVs, sep= "", dzData = dz, mzData= mz, numObsDZ= 569, numObsMZ= 351)
umxSummary(m1, std = FALSE)

# }

Run the code above in your browser using DataLab