rsgcc (version 1.0.6)

getsgene: identify tissue (or condition)-specific genes

Description

This function identifies tissue (or condition)-specific genes by comparing the mean expression value in one tissue with the maximal expression value in the other tissues.

Usage

getsgene(x, Log = FALSE, Base = 2, AddOne = FALSE, tsThreshold = 0.95, MeanOrMax = "Mean", Fraction = TRUE)

Arguments

x
a numeric matrix containing gene expression values. The column labels are sample names. Two samples from the same tissue T should be named T.1 and T.2, respectively.
Log
logical indicating whether the gene expression values should be log-transformed.
Base
a numeric value specifying the base of the logarithm.
AddOne
logical indicating whether to add one to the expression values before log-transformation, to avoid taking the log of zero.
tsThreshold
a numeric value giving the threshold for the tissue specificity score. The score is 1 if the gene is expressed in only one tissue; otherwise it is smaller than 1.
MeanOrMax
character, either "Mean" or "Max", indicating whether the mean or the maximal expression value is calculated for the tissue of interest.
Fraction
logical indicating whether the gene expression values should be scaled across tissues.
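
As a rough illustration of the input layout and the optional transformations described by the arguments above, the following sketch builds a small matrix by hand. The toy data, the tissue names (root, leaf), and the variable names are assumptions made for illustration only; the code mimics the documented behaviour in base R and is not taken from the package source, so the order in which getsgene applies these steps internally may differ.

```r
# Hypothetical toy data: 3 genes, 2 tissues with 2 replicates each.
# Column names follow the "T.1", "T.2" convention described for x.
x <- matrix(c(10, 12, 0, 1,
               5,  6, 5, 4,
               0,  0, 8, 9),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("g1", "g2", "g3"),
                            c("root.1", "root.2", "leaf.1", "leaf.2")))

# Recover the tissue label of each sample by stripping the replicate suffix.
tissue <- sub("\\.[0-9]+$", "", colnames(x))

# What Log = TRUE, Base = 2, AddOne = TRUE would plausibly correspond to:
# add one before taking the logarithm so that zeros map to zero.
x_log <- log(x + 1, base = 2)

# What Fraction = TRUE is described as doing: scale each gene (row)
# so that its values sum to one across samples.
x_frac <- x / rowSums(x)
```

After scaling, every row of x_frac sums to one, and a zero count in x stays at zero in x_log.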

Value

A list with the following components:
csGenes
a data matrix containing the expression values of the tissue-specific genes.
csScoreMat
a data matrix with three columns containing the gene index in x, the tissue specificity score, and the tissue to which that score refers.

Details

The tissue specificity score is calculated as 1 - min(R(1), R(2), ..., R(i), ..., R(n)), where R(i) = M(i)/E(i), E(i) is the mean or maximal expression value of tissue i (as selected by MeanOrMax), and M(i) is the maximal expression value across the other tissues. If the tissue specificity score is higher than tsThreshold, the gene is considered to be tissue-specifically expressed. If Fraction is TRUE, the expression values of a gene are scaled across the tissues with the formula e(i)/(e(1)+e(2)+...+e(n)), where e(i) is the expression value of the gene under consideration in the ith sample.
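
The formula above can be sketched in base R for a single gene. The helper name ts_score, its arguments, and the handling of E(i) = 0 (treated here as R(i) = Inf) are assumptions made for this illustration; this is not the package's implementation, only a direct reading of the formula.

```r
# Hypothetical helper: tissue specificity score for one gene.
# g      : numeric vector of expression values, one per sample
# tissue : character vector giving the tissue label of each sample
ts_score <- function(g, tissue, MeanOrMax = "Mean") {
  summarise <- if (MeanOrMax == "Mean") mean else max
  tissues <- unique(tissue)
  R <- sapply(tissues, function(t) {
    E <- summarise(g[tissue == t])   # E(i): mean or max within tissue i
    M <- max(g[tissue != t])         # M(i): max over all other tissues
    if (E == 0) Inf else M / E       # assumption: no expression gives R(i) = Inf
  })
  best <- which.min(R)               # tissue with the smallest ratio
  list(score = 1 - R[[best]], tissue = tissues[best])
}

tissue <- c("root", "root", "leaf", "leaf")

# Mostly root-expressed gene: E(root) = 11, M(root) = 1, score = 1 - 1/11.
res1 <- ts_score(c(10, 12, 0, 1), tissue)

# Gene expressed only in leaf: M(leaf) = 0, so the score is exactly 1.
res2 <- ts_score(c(0, 0, 8, 9), tissue)
```

Because R(i) is a ratio, scaling a gene's values by a common factor (as Fraction = TRUE does) leaves the score unchanged in this sketch. With tsThreshold = 0.95, the first gene (score about 0.91) would not be called tissue-specific, while the second would.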

References

[1] Chuang Ma, Xiangfeng Wang. Machine learning-based differential network analysis of transcriptomic data: a case study of stress-responsive gene expression in Arabidopsis thaliana. 2013 (Submitted).

Examples

## Not run: 
data(rsgcc)
tsRes <- getsgene(rnaseq, tsThreshold = 0.75, MeanOrMax = "Mean", Fraction = TRUE)
## End(Not run)
