WGCNA (version 1.25-1)

goodSamplesGenes: Iterative filtering of samples and genes with too many missing entries

Description

This function checks data for missing entries and zero-variance genes, and returns a list flagging the samples and genes that pass the criteria on the maximum number of missing values. If necessary, the filtering is iterated.

Usage

goodSamplesGenes(
  datExpr, 
  minFraction = 1/2, 
  minNSamples = ..minNSamples, 
  minNGenes = ..minNGenes, 
  verbose = 1, indent = 0)

Arguments

datExpr
expression data. A data frame in which columns are genes and rows are samples.
minFraction
minimum fraction of non-missing samples for a gene to be considered good.
minNSamples
minimum number of non-missing samples for a gene to be considered good.
minNGenes
minimum number of good genes for the data set to be considered fit for analysis. If the actual number of good genes falls below this threshold, an error will be issued.
verbose
integer level of verbosity. Zero means silent, higher values make the output progressively more and more verbose.
indent
indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces.
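
As an illustration of the arguments above, a call might look like the following minimal sketch. The toy data set (10 samples, 8 genes, with a few missing entries) is invented for this example, and the call is guarded so that it runs only when the WGCNA package is installed.

```r
# Toy data, invented for illustration: 10 samples (rows) x 8 genes (columns).
set.seed(42)
datExpr <- as.data.frame(matrix(
  rnorm(80), nrow = 10,
  dimnames = list(paste0("Sample", 1:10), paste0("Gene", 1:8))
))
datExpr[1:3, "Gene2"] <- NA  # introduce a few missing entries

# Run the check only if WGCNA is available; verbose = 0 keeps it silent.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  gsg <- WGCNA::goodSamplesGenes(datExpr, minFraction = 1/2, verbose = 0)
  str(gsg)
}
```

Raising verbose prints progress messages, and indent shifts those messages to the right by two spaces per unit.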

Value

A list with the following components:
  • goodSamples: a logical vector with one entry per sample that is TRUE if the sample is considered good and FALSE otherwise.
  • goodGenes: a logical vector with one entry per gene that is TRUE if the gene is considered good and FALSE otherwise.
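
The two logical vectors can be used directly as row and column indices to drop the flagged samples and genes. The sketch below uses a hand-built stand-in for the returned list (the name gsg and its values are hypothetical) so that it runs without WGCNA.

```r
# Toy expression data: 5 samples (rows) x 4 genes (columns).
set.seed(1)
datExpr <- as.data.frame(matrix(
  rnorm(20), nrow = 5,
  dimnames = list(paste0("S", 1:5), paste0("G", 1:4))
))

# Hypothetical stand-in for the list returned by goodSamplesGenes().
gsg <- list(
  goodSamples = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  goodGenes   = c(TRUE, FALSE, TRUE, TRUE)
)

# Only subset when something was actually flagged as bad.
if (!all(gsg$goodSamples) || !all(gsg$goodGenes)) {
  # Rows are samples, columns are genes; drop = FALSE keeps a data frame.
  datExprClean <- datExpr[gsg$goodSamples, gsg$goodGenes, drop = FALSE]
} else {
  datExprClean <- datExpr
}

dim(datExprClean)  # 4 samples and 3 genes remain
```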

Details

This function iteratively identifies samples and genes with too many missing entries and genes with zero variance. Iterations may be required since excluding samples effectively changes criteria on genes and vice versa. The process is repeated until the lists of good samples and genes are stable. The constants ..minNSamples and ..minNGenes are both set to the value 4.
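
The iteration described above can be sketched in base R as follows. This is an illustrative sketch, not the WGCNA source: the function name iterativeFilterSketch is hypothetical, and the sample criterion (at least minFraction of the retained genes non-missing) is an assumption made for the example.

```r
# Illustrative sketch only; not the WGCNA implementation.
iterativeFilterSketch <- function(datExpr, minFraction = 1/2,
                                  minNSamples = 4, minNGenes = 4) {
  x <- as.matrix(datExpr)
  goodSamples <- rep(TRUE, nrow(x))
  goodGenes <- rep(TRUE, ncol(x))

  # Variance of the non-missing values; 0 when fewer than two remain.
  safeVar <- function(v) {
    v <- v[!is.na(v)]
    if (length(v) < 2) 0 else var(v)
  }

  repeat {
    # Gene criteria, evaluated on the currently retained samples.
    sub <- x[goodSamples, goodGenes, drop = FALSE]
    nPresent <- colSums(!is.na(sub))
    geneVar <- vapply(seq_len(ncol(sub)),
                      function(j) safeVar(sub[, j]), numeric(1))
    keepGene <- nPresent >= minNSamples &
      nPresent >= minFraction * nrow(sub) &
      geneVar > 0
    newGoodGenes <- goodGenes
    newGoodGenes[goodGenes] <- keepGene
    if (sum(newGoodGenes) < minNGenes) stop("Too few good genes remain.")

    # Sample criterion (assumed), evaluated on the retained genes.
    sub <- x[goodSamples, newGoodGenes, drop = FALSE]
    keepSample <- rowSums(!is.na(sub)) >= minFraction * ncol(sub)
    newGoodSamples <- goodSamples
    newGoodSamples[goodSamples] <- keepSample

    # Stop once neither list changed during this pass.
    stable <- identical(newGoodGenes, goodGenes) &&
      identical(newGoodSamples, goodSamples)
    goodGenes <- newGoodGenes
    goodSamples <- newGoodSamples
    if (stable) break
  }
  list(goodSamples = goodSamples, goodGenes = goodGenes)
}

# Toy data: 8 samples x 6 genes with three deliberate problems.
x <- outer(1:8, 1:6)
dimnames(x) <- list(paste0("S", 1:8), paste0("G", 1:6))
x[, 6] <- 5        # G6 has zero variance
x[1:6, 5] <- NA    # G5 is observed in only 2 samples
x[8, 1:4] <- NA    # S8 is missing in G1 to G4

res <- iterativeFilterSketch(x)
res$goodGenes    # TRUE TRUE TRUE TRUE FALSE FALSE
res$goodSamples  # TRUE for S1 to S7, FALSE for S8
```

In this toy case the first pass removes G5 and G6, which leaves S8 with no observed values among the retained genes, so S8 is removed as well; the second pass changes nothing and the loop stops.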

See Also

goodSamples, goodGenes