Given a gene character vector and a universe character vector, which can be either Ensembl or HGNC symbols, find the over-representation enrichment of the gene list relative to the universe in a gene ontology category using the hypergeometric test and the GOstats R package.
findGOTermEnrichment(gene_vector, universe, pval_GO_cutoff = 1,
HGNC_switch = TRUE, HGNC_clean = TRUE, gene_ontology = "all",
conditional = TRUE, annotation = "org.Hs.eg.db", cleanNames = FALSE)
A data frame with the term enrichments of the GO enrichment analysis given the input gene set and universe.
Character vector gene symbols of interest.
Character vector of gene symbols which should be used as the background in the hypergeomtric test. If using this in the context of a ddcor experiment, this gene list most likely should be the gene set post-filtering, but prior to differential correlation analysis.
Cutoff for the p-values of gene ontology terms in the enrichment tests that should be displayed in the resulting table.
Logical indicating whether or not the input gene symbols need to be switched from HGNC to Ensembl, the latter of which is required for GOstats enrichment test. Note that this is done by selecting the first Enembl symbol that maps to a particular HGNC symbol, which is not always unique. If you need more precision, you should do this outside of the function and insert the Ensembl list to the function. Only applies if cleanNames is TRUE.
Logical indicating whether the input gene symbols should be switched to clean HGNC symbols using the checkGeneSymbols function from the R package HGNChelper. Only applies if HGNC symbols are inputted and cleanNames is TRUE.
A string specifying the branch of GO that should be used for enrichment analysis. One of "BP" (Biological Process), "MF" (Molecular Function), "CC" (Cellular Component), or "all". If "all" is chosen, then this function finds the enrichment for all of the terms and combines them into one table. Default = "all"
Logical specifying whether the GO analysis should be done conditionally to take into account the hierarchical structure of the GO database in making sense of the gene set enrichments. Default = TRUE.
The library indicating the GO annotation database from which the Go terms should be mapped to gene symbols. Default = "org.Hs.eg.db", which is the table for Homo sapiens. Other common choices include "org.Mm.eg.db", "org.Rn.eg.db". The corresponding annotation library needs to be installed.
Logical, indicating whether the gene names for the universe and gene vector should be cleaned prior to enrichment analysis.