safe (version 3.12.0)

getCmatrix: Generation of a C matrix

Description

This function constructs a matrix of indicator variables for category membership from keyword-indexed or gene-indexed lists. Size constraints, optional pruning of identical categories, and a vector of present genes can be used to filter categories and order genes. New in version 3.0.0, annotation can be provided so that each gene, rather than each feature, has equal weight in a category.
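
As a rough illustration of what "a matrix of indicator variables for category membership" means, the following base-R sketch builds a small dense indicator matrix by hand. The gene and category names (`g1`, `catA`, and so on) are hypothetical, and this is not the package's implementation; it only shows the shape of the result, with genes as rows and categories as columns.

```r
# Hypothetical keyword-indexed list: each element names the genes in one category.
keyword.list <- list(catA = c("g1", "g2", "g3"),
                     catB = c("g2", "g4"))

# All genes that appear in at least one category, in sorted order.
genes <- sort(unique(unlist(keyword.list)))

# One column per category; an entry is 1 if the gene belongs to it, else 0.
C <- sapply(keyword.list, function(members) as.integer(genes %in% members))
rownames(C) <- genes
C
```

getCmatrix itself returns a sparse matrix.csr by default, so a dense matrix like this corresponds most closely to the as.matrix = TRUE case.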

Usage

getCmatrix(keyword.list = NULL, gene.list = NULL, present.genes = NULL, min.size = 2, max.size = Inf, by.gene = FALSE, gene.names =  NULL, prefix = "", prune = FALSE, as.matrix = FALSE, GO.ont = NULL, ...)

Arguments

keyword.list
A list containing character vectors for each keyword that specify the gene members.
gene.list
A list containing character vectors for each gene that specify the annotated functional categories.
present.genes
An optional vector used to filter genes in the C matrix. Can be provided as an unordered character vector of gene names that match names(list), or as an ordered vector of presence (1) and absence (0) calls.
min.size
Optional minimum category size to be considered.
max.size
Optional maximum category size to be considered.
by.gene
Optional logical to build 'soft' categories at the gene level, instead of the feature level.
gene.names
Optional character vector of gene names for 'soft' categories.
prefix
Optional character string to precede category names.
prune
Optional logical to remove duplicate categories.
as.matrix
Optional logical specifying that a dense matrix is returned rather than a sparse matrix.csr.
GO.ont
One of "CC", "BP", or "MF", specifying which Gene Ontology to use.
...
Any extra arguments will be forwarded to the read.table function when category assignments are given as a file.
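
The filtering arguments can be pictured with a small base-R sketch. It assumes a hypothetical dense indicator matrix `C` and assumes that category sizes are counted after restricting to present genes; the package may apply these steps differently, so treat this as an illustration of the two accepted forms of present.genes and of the min.size and max.size bounds only.

```r
# Hypothetical indicator matrix: 4 genes (rows) by 4 categories (columns).
C <- cbind(catA = c(1, 1, 1, 0),
           catB = c(0, 1, 0, 1),
           catC = c(1, 1, 1, 0),
           catD = c(0, 0, 0, 1))
rownames(C) <- c("g1", "g2", "g3", "g4")

# Form 1: unordered character vector of gene names.
present.names <- c("g4", "g1", "g2")
# Form 2: ordered presence (1) / absence (0) calls, one per row of C.
present.calls <- c(1, 1, 0, 1)

# Both forms select the same rows here.
keep.by.name <- rownames(C) %in% present.names
keep.by.call <- present.calls == 1
C.present <- C[keep.by.name, , drop = FALSE]

# Keep categories whose size falls within [min.size, max.size].
min.size <- 2
max.size <- Inf
size <- colSums(C.present)
C.filtered <- C.present[, size >= min.size & size <= max.size, drop = FALSE]
colnames(C.filtered)
```

In this toy case `catD` is dropped because only one present gene belongs to it.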

Value

C.mat.csr
If as.matrix = FALSE, a sparse matrix is returned with rows corresponding to genes and columns corresponding to categories.
row.names
Character vector of gene names
col.names
Character vector of category names
col.synonym
Pipe-delimited character vector of matching categories when prune = TRUE.
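
One way to picture pruning and the pipe-delimited synonyms is the base-R sketch below. The matrix and category names are hypothetical, and the exact contents of col.synonym in the package may differ (for example, whether the retained category's own name is included); the sketch only shows how identical columns can be collapsed and their names joined with "|".

```r
# Hypothetical indicator matrix in which catA and catC have identical membership.
C <- cbind(catA = c(1, 1, 0),
           catB = c(0, 1, 1),
           catC = c(1, 1, 0))

# Encode each column's membership pattern as a string, e.g. "110".
key <- apply(C, 2, paste, collapse = "")

# Keep only the first category for each distinct pattern.
keep <- !duplicated(key)
C.pruned <- C[, keep, drop = FALSE]

# For each kept category, join the names of all categories sharing its pattern.
col.synonym <- vapply(key[keep],
                      function(k) paste(names(key)[key == k], collapse = "|"),
                      character(1))
col.synonym
```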

Details

Typical usages are
  getCmatrix(keyword.list, present.genes)
  getCmatrix(gene.list, present.genes)
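
The two input forms carry the same information indexed in opposite directions. The base-R sketch below, using hypothetical gene and category names, converts a gene-indexed list into the equivalent keyword-indexed list; it is meant only to clarify the relationship between keyword.list and gene.list, not to show how the package handles them internally.

```r
# Hypothetical gene-indexed list: each element names the categories of one gene.
gene.list <- list(g1 = "catA",
                  g2 = c("catA", "catB"),
                  g3 = "catA",
                  g4 = "catB")

# Long format: 'values' holds the category, 'ind' holds the gene.
long <- stack(gene.list)

# Regroup by category to obtain a keyword-indexed list.
keyword.list <- split(as.character(long$ind), long$values)
keyword.list
```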
  

References

W. T. Barry, A. B. Nobel and F. A. Wright, 2005, Significance Analysis of functional categories in gene expression studies: a structured permutation approach, Bioinformatics 21(9) 1943-9.

See also the vignette included with this package.

See Also

safe, safeplot, getPImatrix.

Examples

if(interactive()){
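 library(safe)        # provides getCmatrix
 library(hgu133a.db)  # annotation for the Affymetrix HG-U133A array
 # Named character vector: names are probe-set IDs, values are gene symbols
 genes <- unlist(as.list(hgu133aSYMBOL))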
 RS.list <- list(Genes21 = c("ACTB","RPLP0","MYBL2","BIRC5","BAG1",
                             "GUSB","CD68","BCL2","MMP11","AURKA",
                             "GSTM1","ESR1","TFRC","PGR","CTSL2",
                             "GRB7","ERBB2","MKI67","GAPDH","CCNB1",
                             "SCUBE2"),
                 Genes16 = c("MYBL2","BIRC5","BAG1","CD68","BCL2",
                             "MMP11","AURKA","GSTM1","ESR1","PGR","CTSL2",
                             "GRB7","ERBB2","MKI67","CCNB1","SCUBE2"))
 # Replace each set of gene symbols with the probe-set IDs annotated to them
 RS.list <- lapply(RS.list, function(x) names(genes)[genes %in% x])
 # Build the C matrix with one column per gene set
 C1 <- getCmatrix(keyword.list = RS.list)
}