Usage
GenePermGSEA(countMatrix, GeneScoreType, idxCase, idxControl, GenesetFile, normalization, minGenesetSize = 10, maxGenesetSize = 300, q = 1, nPerm = 1000, GSEAtype = "absFilter", FDR = 0.05, FDRfilter = 0.05, minCount = 3)
Arguments
countMatrix
RNA-seq read count matrix, either raw or already normalized (see normalization).
GeneScoreType
Type of gene score. Possible values are "moderated_t", "SNR", "FC" (log fold change score) or "RANKSUM" (zero centered).
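As a rough illustration of two of these scores, the sketch below computes a signal-to-noise ratio and a log fold change per gene on a made-up matrix. It assumes genes in rows and samples in columns, uses the common GSEA definition of SNR (difference of group means over the sum of group standard deviations), and takes log2 of the ratio of group means without a pseudocount. These are assumptions for illustration; the package's exact formulas may differ.

```r
# Toy count matrix (made-up values): 2 genes x 6 samples
countMatrix <- matrix(
  c(20, 22, 18, 10, 11, 9,
     5,  6,  7,  5,  6, 7),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("S", 1:6))
)
idxCase    <- 1:3
idxControl <- 4:6

case    <- countMatrix[, idxCase, drop = FALSE]
control <- countMatrix[, idxControl, drop = FALSE]

# Signal-to-noise ratio: (mean_case - mean_control) / (sd_case + sd_control)
snr <- (rowMeans(case) - rowMeans(control)) /
  (apply(case, 1, sd) + apply(control, 1, sd))

# Log fold change (assumed here: log2 of the ratio of group means)
log_fc <- log2(rowMeans(case) / rowMeans(control))
```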
idxCase
Indices for case samples in the count matrix. e.g., 1:3
idxControl
Indices for control samples in the count matrix. e.g., 4:6
GenesetFile
File path of the gene set file. A standard GMT file or a similar tab-delimited file is accepted. e.g., "C:/geneset.gmt"
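For reference, a GMT file stores one gene set per line: the set name, a description field, then the member genes, all separated by tabs. The sketch below reads such a file into a named list with base R only. read_gmt is a hypothetical helper written for this illustration, not a function of this package, and the package may parse the file differently.

```r
# Hypothetical helper: parse a GMT-style tab-delimited file into a named list.
read_gmt <- function(path) {
  lines  <- readLines(path, warn = FALSE)
  lines  <- lines[nzchar(lines)]                 # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  sets   <- lapply(fields, function(f) {
    genes <- f[-(1:2)]                           # columns 3 onward hold the genes
    unique(genes[nzchar(genes)])
  })
  names(sets) <- vapply(fields, `[`, character(1), 1)  # column 1 is the set name
  sets
}

# Toy file with two made-up gene sets
tmp <- tempfile(fileext = ".gmt")
writeLines(c("SET_A\tdescription\tG1\tG2\tG3",
             "SET_B\tdescription\tG2\tG4"), tmp)
gene_sets <- read_gmt(tmp)
```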
normalization
Use 'DESeq' if the input matrix contains raw read counts; the raw counts are then normalized with the DESeq method. Use 'AlreadyNormalized' if the input matrix is already normalized.
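To give an idea of what DESeq-style normalization does, the sketch below implements the median-of-ratios size factor estimate in base R on a made-up matrix, assuming genes in rows and samples in columns. size_factors is a hypothetical helper; the package presumably relies on the DESeq implementation itself, so treat this only as an illustration of the idea.

```r
# Hypothetical helper: median-of-ratios size factors (one per sample).
size_factors <- function(counts) {
  log_geo_means <- rowMeans(log(counts))   # log geometric mean of each gene
  usable <- is.finite(log_geo_means)       # genes with a zero count are skipped
  apply(counts, 2, function(col) {
    exp(median((log(col) - log_geo_means)[usable]))
  })
}

# Toy raw counts (made-up): sample S2 was sequenced twice as deeply as S1
raw <- matrix(
  c( 10,  20,
    100, 200,
     50, 100,
     30,  60),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("S1", "S2"))
)

sf         <- size_factors(raw)
normalized <- sweep(raw, 2, sf, "/")       # divide each column by its size factor
```

After normalization the two columns of this toy matrix coincide, because the only difference between the samples was sequencing depth.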
minGenesetSize
Minimum gene set size allowed. Gene sets smaller than this value are excluded from the analysis. Default = 10
maxGenesetSize
Maximum gene set size allowed. Gene sets larger than this value are excluded from the analysis. Default = 300
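The effect of the two size limits can be sketched as follows. The gene sets are made up, and the sketch measures size as the raw number of listed genes; whether the package counts only genes that are also present in the count matrix is not stated here, so that detail is an assumption.

```r
# Made-up gene sets of different sizes
gene_sets <- list(
  small  = paste0("G", 1:5),
  medium = paste0("G", 1:50),
  large  = paste0("G", 1:500)
)
minGenesetSize <- 10
maxGenesetSize <- 300

set_sizes <- lengths(gene_sets)
# Keep sets that are neither below the minimum nor above the maximum
keep <- set_sizes >= minGenesetSize & set_sizes <= maxGenesetSize
filtered_sets <- gene_sets[keep]
```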
q
Weight exponent for the gene score. If q=0, only the rank of each gene score is used in calculating the gene set score (preranked GSEA). If q=1, the gene score itself is used. If q=2, the square of the gene score is used.
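To make the role of q concrete, the sketch below computes a weighted Kolmogorov-Smirnov running sum in the style of Subramanian et al., where each gene in the set contributes |score|^q. enrichment_score is a hypothetical helper and the scores are made up; it is meant to show how q changes the weighting, not to reproduce the package's internal code.

```r
# Hypothetical helper: signed maximum deviation of the weighted running sum.
enrichment_score <- function(scores, in_set, q = 1) {
  ord     <- order(scores, decreasing = TRUE)   # rank genes by score
  scores  <- scores[ord]
  in_set  <- in_set[ord]
  w       <- abs(scores)^q                      # q = 0 gives every hit weight 1
  hit     <- cumsum(ifelse(in_set, w, 0)) / sum(w[in_set])
  miss    <- cumsum(!in_set) / sum(!in_set)
  running <- hit - miss
  running[which.max(abs(running))]
}

# Made-up gene scores and set membership
scores <- c(g1 = 3.0, g2 = 2.0, g3 = 0.5, g4 = -0.5, g5 = -1.0, g6 = -2.5)
in_set <- c(TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)

es_q0 <- enrichment_score(scores, in_set, q = 0)  # rank only
es_q1 <- enrichment_score(scores, in_set, q = 1)  # score itself
es_q2 <- enrichment_score(scores, in_set, q = 2)  # squared score
```

In this toy case the set contains the top-ranked gene, so larger q concentrates more weight on that gene and the score grows with q.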
nPerm
Number of gene permutations. Default = 1000.
GSEAtype
Type of GSEA. Possible values are "absolute", "original" or "absFilter". "absolute" runs the one-tailed absolute GSEA. "original" runs the original two-tailed GSEA. "absFilter" runs the original GSEA and filters its results by those of the one-tailed absolute GSEA.
FDR
FDR cutoff for the original or absolute GSEA. Default = 0.05
FDRfilter
FDR cutoff for the one-tailed absolute GSEA used for absolute filtering (only used when GSEAtype is "absFilter"). Default = 0.05
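The interplay of FDR and FDRfilter under "absFilter" can be sketched on made-up results. The data frame and its column names are hypothetical, and treating the cutoffs as inclusive (<=) is an assumption; the package's output format and comparison may differ.

```r
# Made-up FDR values for four gene sets from the two analyses
results <- data.frame(
  geneset      = c("SET_A", "SET_B", "SET_C", "SET_D"),
  fdr_original = c(0.01, 0.03, 0.20, 0.04),
  fdr_absolute = c(0.02, 0.30, 0.01, 0.04),
  stringsAsFactors = FALSE
)
FDR       <- 0.05
FDRfilter <- 0.05

significant_original <- results$fdr_original <= FDR        # original GSEA call
passes_abs_filter    <- results$fdr_absolute <= FDRfilter  # absolute GSEA filter

# Sets reported under "absFilter": significant in both analyses
abs_filtered <- results$geneset[significant_original & passes_abs_filter]
```

Here SET_B is significant in the original GSEA but is removed by the absolute filter, while SET_C passes the filter but is not significant in the original analysis.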
minCount
Minimum median count a gene must have to be included in the analysis. It is used to filter out genes with small read counts. Default = 3
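A minimal sketch of this gene filter on a made-up matrix is given below. It assumes genes in rows and samples in columns, takes the median across all samples, and keeps genes whose median is at least minCount; whether the package uses an inclusive comparison or a per-group median is not stated here.

```r
# Toy count matrix (made-up values): 3 genes x 6 samples
countMatrix <- matrix(
  c( 0,  1, 2, 0,  1,  0,   # low_gene:    median 0.5
    10, 12, 8, 9, 11, 10,   # mid_gene:    median 10
     3,  3, 4, 2,  5,  3),  # border_gene: median 3
  nrow = 3, byrow = TRUE,
  dimnames = list(c("low_gene", "mid_gene", "border_gene"), paste0("S", 1:6))
)
minCount <- 3

gene_medians <- apply(countMatrix, 1, median)
# Drop genes whose median count falls below the threshold
filtered <- countMatrix[gene_medians >= minCount, , drop = FALSE]
```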
Source
Nam, D. Effect of the absolute statistic on gene-sampling gene-set analysis methods. Stat Methods Med Res 2015.
Subramanian, A., et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. P Natl Acad Sci USA 2005;102(43):15545-15550.
Li, J. and Tibshirani, R. Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-Seq data. Stat Methods Med Res 2013;22(5):519-536.