Synthetic

Generation of a synthetic dataset with \(n=10\) observations (samples) and \(p=100\) variables, 
 of which \(nvar=20\) differ significantly between the two sample groups. 
The design is balanced, with two sample groups (\(G=2\)) and unequal sample group variance.
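
A dataset with the shape described above can be sketched in a few lines of base R. This is only an illustration, not the procedure used to build the packaged dataset: the helper `simulate_two_groups`, the normal distribution, the mean shift `delta`, and the group standard deviations `sd1` and `sd2` are all assumptions made for the example.

```r
# Hypothetical helper (not part of the package): simulate a two-group matrix
# with the dimensions given in the description. The distribution, effect size,
# and group standard deviations are illustrative assumptions.
simulate_two_groups <- function(n = 10, p = 100, nvar = 20,
                                delta = 2, sd1 = 1, sd2 = 2, seed = 1) {
  stopifnot(n %% 2 == 0, nvar <= p)
  set.seed(seed)
  n1 <- n / 2
  n2 <- n - n1
  # Group 1: mean 0, standard deviation sd1
  x1 <- matrix(rnorm(n1 * p, mean = 0, sd = sd1), nrow = n1, ncol = p)
  # Group 2: mean 0, standard deviation sd2 (unequal group variance)
  x2 <- matrix(rnorm(n2 * p, mean = 0, sd = sd2), nrow = n2, ncol = p)
  # Shift the mean of the first nvar variables in group 2 only
  x2[, seq_len(nvar)] <- x2[, seq_len(nvar)] + delta
  x <- rbind(x1, x2)
  colnames(x) <- paste0("v", seq_len(p))
  group <- factor(rep(c("G1", "G2"), times = c(n1, n2)))
  list(x = x, group = group)
}

sim <- simulate_two_groups()
dim(sim$x)        # samples in rows, variables in columns
table(sim$group)  # balanced group sizes
```

The returned list keeps the data matrix and the group labels together, so the same object can be passed to any two-group test.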

datasets

This is a non-parametric method for joint adaptive mean-variance regularization and variance stabilization of high-dimensional data. It is suited to the difficult problems posed by high-dimensional multivariate datasets (\(p \gg n\) paradigm). Among these problems: the variance is often a function of the mean, variable-specific variance estimators are unreliable, and test statistics have low power because of the lack of degrees of freedom. Key features include:
(i) Normalization and/or variance stabilization of the data,
(ii) Computation of mean-variance-regularized t-statistics (F-statistics to follow),
(iii) Generation of diverse diagnostic plots,
(iv) Computationally efficient implementation using C/C++ interfacing, with an option for parallel computing, for faster execution in the R environment.
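
The claim that variable-specific variance estimators are unreliable when \(n\) is small can be illustrated with a short simulation. The sketch below is not the regularization implemented by this package. It uses a generic shrinkage toward the pooled variance, with an arbitrary weight of 0.5, purely to show how borrowing information across variables reduces the spread of the estimates. All names and settings are assumptions made for the example.

```r
set.seed(42)
p <- 1000  # number of variables
n <- 5     # samples per variable, so each variance estimate has only 4 df

# Every variable has the same true variance of 1
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Variable-specific sample variances: widely scattered around 1
s2 <- apply(x, 2, var)

# Generic shrinkage toward the pooled variance (illustrative, weight arbitrary)
s2_pooled <- mean(s2)
s2_shrunk <- 0.5 * s2 + 0.5 * s2_pooled

range(s2)         # spread of the raw estimates
range(s2_shrunk)  # narrower spread after shrinkage
```

With a fixed weight the shrunken estimates have exactly one quarter of the variance of the raw ones. An adaptive method would choose the amount of regularization from the data instead.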

Jean-Eudes Dazard

Mean-Variance Regularization

Hua Xu

Alberto Santana

Synthetic function

A numeric matrix containing \(n=10\) observations (samples) in rows 
 and \(p=100\) variables in columns, named \(v_{1},...,v_{p}\).
 Samples are balanced (\(n_{1}=5\), \(n_{2}=5\)) between the two groups (\(G_{1}, G_{2}\)).
 Compressed Rda data file.

Format

This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University. 
 This project was partially funded by the National Institutes of Health (P30-CA043703).

Synthetic: Multi-Groups Synthetic Dataset

Description

Usage

Arguments

Acknowledgments

References