jackstraw (version 1.3.17)

jackstraw_subspace: Jackstraw for the User-Defined Dimension Reduction Methods

Description

Test association between the observed variables and their latent variables, captured by a user-defined dimension reduction method.

Usage

jackstraw_subspace(
  dat,
  r,
  FUN,
  r1 = NULL,
  s = NULL,
  B = NULL,
  covariate = NULL,
  noise = NULL,
  verbose = TRUE
)

Value

jackstraw_subspace returns a list consisting of

p.value

m p-values of association tests between variables and their latent variables

obs.stat

m observed statistics

null.stat

s*B null statistics
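
To make the relationship between these three elements concrete, here is a minimal sketch, assuming each p-value is an empirical p-value obtained by comparing an observed statistic with the pooled null statistics. The helper `empirical_pvalues` and the toy numbers are hypothetical and are not part of the package; the package's own computation may differ in detail.

```r
# Hypothetical helper (not from the package): the empirical p-value is the
# proportion of pooled null statistics at least as large as the observed one.
empirical_pvalues <- function(obs.stat, null.stat) {
  sapply(obs.stat, function(x) mean(null.stat >= x))
}

# Toy values standing in for the obs.stat and null.stat elements
obs.stat  <- c(0.5, 3.0, 12.0)                                   # m = 3
null.stat <- c(0.1, 0.4, 0.9, 1.5, 2.0, 2.5, 3.5, 4.0, 6.0, 8.0) # s*B = 10

p <- empirical_pvalues(obs.stat, null.stat)
p
```

Larger observed statistics are exceeded by fewer null statistics and therefore receive smaller p-values.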

Arguments

dat

a data matrix with m rows as variables and n columns as observations.

r

the number of significant latent variables.

FUN

a function that estimates the latent variables (LVs) from a data matrix. It must return the r estimated LVs as an n*r matrix.

r1

a numeric vector specifying the latent variables of interest, chosen from among the r significant latent variables.

s

the number of "synthetic" null variables. Out of m variables, s variables are independently permuted.

B

the number of resampling iterations.

covariate

a model matrix of covariates with n observations. Must include an intercept in the first column.

noise

specify a parametric distribution to generate a noise term. If NULL, a non-parametric jackstraw test is performed.

verbose

a logical specifying whether to print the computational progress.
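
As an illustration of the shape that FUN is expected to return, the following is a minimal sketch of one possible estimator, assuming the LVs are taken to be the top r right singular vectors of the row-centered data. The name `svd_lvs` and the toy data are hypothetical; any function that maps an m*n data matrix to an n*r matrix could be used instead.

```r
# Hypothetical FUN (not from the package): estimate r latent variables as the
# top r right singular vectors of the row-centered data.
r <- 2
svd_lvs <- function(x) {
  x_centered <- x - rowMeans(x)            # subtract each variable's mean
  svd(x_centered)$v[, 1:r, drop = FALSE]   # n x r matrix of estimated LVs
}

# Shape check on toy data with m = 100 variables and n = 20 observations
set.seed(1)
toy <- matrix(rnorm(100 * 20), nrow = 100)
lvs <- svd_lvs(toy)
dim(lvs)
```

Using `drop = FALSE` keeps the result a matrix even when r = 1, which matters because a plain vector would not satisfy the n*r requirement.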

Author

Neo Christopher Chung nchchung@gmail.com

Details

This function computes m p-values of linear association between m variables and their latent variables, captured by a user-defined dimension reduction method. Its resampling strategy accounts for the over-fitting that arises when the latent variables are computed directly from the observed data, and protects against an anti-conservative bias.
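
The resampling idea can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: it assumes an F-statistic for the regression of each variable on the estimated LVs, and the helper `fstat`, the toy data, and the small values of s and B are all hypothetical.

```r
set.seed(1)
m <- 200; n <- 20; s <- 20; B <- 5; r <- 1

# Toy data: only the first 20 variables load on a single latent variable
loadings <- c(rep(1, 20), rep(0, m - 20))
dat <- loadings %*% t(rnorm(n)) + matrix(rnorm(m * n), nrow = m)

# LV estimator in the form expected by FUN (returns an n x r matrix)
FUN <- function(x) svd(x)$v[, 1:r, drop = FALSE]

# Hypothetical helper: F-statistic for each row of y regressed on the LVs,
# comparing an intercept-only model with intercept + LVs
fstat <- function(y, lv) {
  X <- cbind(1, lv)
  H <- X %*% solve(crossprod(X)) %*% t(X)   # projection (hat) matrix
  rss1 <- rowSums((y - y %*% H)^2)          # residuals of the full model
  rss0 <- rowSums((y - rowMeans(y))^2)      # residuals of the null model
  ((rss0 - rss1) / ncol(lv)) / (rss1 / (ncol(y) - ncol(X)))
}

# Observed statistics: LVs estimated from the original data
obs.stat <- fstat(dat, FUN(dat))

# Null statistics: permute s rows, re-estimate the LVs from the modified
# data, and keep only the statistics of the permuted (synthetic null) rows
null.stat <- unlist(lapply(seq_len(B), function(b) {
  idx <- sample(m, s)
  jack <- dat
  jack[idx, ] <- t(apply(dat[idx, , drop = FALSE], 1, sample))
  fstat(jack[idx, , drop = FALSE], FUN(jack))
}))
```

Because the LVs are re-estimated from data that contain the permuted rows, the null statistics carry the same over-fitting as the observed ones, which is what makes the comparison fair.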

As an experimental feature, this function also allows you to specify a parametric distribution for the noise term. In that case, a small number s of observed variables are replaced by synthetic null variables generated from the specified distribution, instead of being permuted.
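
A minimal sketch of the parametric variant, assuming the noise term is supplied as a function that takes a count and returns that many random draws. That calling convention, the generator `noise_fun`, and the toy data are assumptions made for illustration; check the package source before relying on them.

```r
set.seed(1)
m <- 200; n <- 20; s <- 10
dat <- matrix(rnorm(m * n), nrow = m)

# Hypothetical noise generator: k draws from a standard normal distribution
noise_fun <- function(k) rnorm(k)

# Replace s randomly chosen variables with synthetic nulls drawn from the
# specified distribution, leaving the other m - s variables untouched
idx <- sample(m, s)
jack <- dat
jack[idx, ] <- matrix(noise_fun(s * n), nrow = s, ncol = n)
```

The remaining steps (re-estimating the LVs from `jack` and computing statistics for the replaced rows) would proceed as in the non-parametric case.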

References

Chung and Storey (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics, 31(4): 545-554. doi:10.1093/bioinformatics/btu674

Chung (2020) Statistical significance of cluster membership for unsupervised evaluation of cell identities. Bioinformatics, 36(10): 3107-3114. doi:10.1093/bioinformatics/btaa087

See Also

jackstraw_pca, jackstraw

Examples

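library(jackstraw)

## simulate data from a latent variable model: Y = BL + E
## (m = 1000 variables, n = 20 observations)
B = c(rep(1,50), rep(-1,50), rep(0,900))  # loadings: only the first 100 variables depend on L
L = rnorm(20)                              # latent variable
E = matrix(rnorm(1000*20), nrow=1000)      # noise
dat = B %*% t(L) + E
## center and scale each variable (row)
dat = t(scale(t(dat), center=TRUE, scale=TRUE))

## apply the jackstraw with the svd as a function;
## FUN returns the first right singular vector as an n x 1 matrix
out = jackstraw_subspace(dat, FUN = function(x) svd(x)$v[,1,drop=FALSE], r=1, s=100, B=50)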
