refGenome (version 1.7.7)

ucscGenome-class: Class "ucscGenome"

Description

The ucscGenome class represents annotation data for a UCSC genome. The standard way to import data is to download a GTF file from the UCSC Genome Browser (Table Browser): select the "knownGene" table, choose the output format "GTF", and then import the file with the read.gtf function.
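
To make the import step more concrete, here is a minimal base-R sketch of what a GTF file looks like and how it can be parsed. It does not use refGenome and is not how read.gtf is implemented. It only mirrors the sep and comment.char defaults listed in the read.gtf signature. The two records, the column names and the variable names are illustrative assumptions.

```r
# Two GTF-style records (9 tab-separated columns); values are illustrative only.
gtf_lines <- c(
  "# comment line, skipped via comment.char",
  "chr1\tknownGene\texon\t11874\t12227\t0.000000\t+\t.\tgene_id \"uc001aaa.3\"; transcript_id \"uc001aaa.3\";",
  "chr1\tknownGene\texon\t12613\t12721\t0.000000\t+\t.\tgene_id \"uc001aaa.3\"; transcript_id \"uc001aaa.3\";"
)

gtf <- read.table(
  text = gtf_lines, sep = "\t", comment.char = "#", quote = "",
  col.names = c("seqid", "source", "feature", "start", "end",
                "score", "strand", "frame", "attributes"),
  colClasses = c(rep("character", 3), rep("integer", 2), rep("character", 4))
)

# Pull the transcript_id value out of the free-text attribute column
gtf$transcript_id <- sub('.*transcript_id "([^"]+)".*', "\\1", gtf$attributes)
gtf[, c("seqid", "feature", "start", "end", "transcript_id")]
```

The ninth column holds key/value attributes as free text, which is presumably why the class keeps a separate attr table next to the gtf table.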

Objects from the Class

Objects can be created by calls of the form ucscGenome().

Slots

basedir:

Object of class "character". Directory where the SQLite database is written.

ev:

Object of class "environment". Environment that contains the data structures. Optionally, it holds gtf and attr data.frames and, additionally, an xref data.frame.
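
Because the tables live in an environment, and R environments have reference semantics, a function can change an object's tables without the caller reassigning the object. This would be consistent with calls such as read.gtf(uc, "knownGene.gtf") in the examples, whose result is not assigned. The sketch below uses a hypothetical stand-in class (toyGenome) and a hypothetical helper (set_toy_gtf). It illustrates the language mechanism only and is not the refGenome implementation.

```r
library(methods)

# Hypothetical stand-in class with the same two slots; NOT the refGenome code.
setClass("toyGenome", representation(basedir = "character", ev = "environment"))

toy <- new("toyGenome", basedir = tempdir(), ev = new.env())

# Hypothetical helper that stores a table inside the object's environment
set_toy_gtf <- function(object, value) {
  assign("gtf", value, envir = object@ev)
  invisible(object)
}

set_toy_gtf(toy, data.frame(id = 1:2, feature = c("exon", "CDS")))

# The table is visible through 'toy' although 'toy' was never reassigned
exists("gtf", envir = toy@ev, inherits = FALSE)
```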

Methods

show

signature(object = "refGenome"): Creates a sensible printout.

getGtf

signature(object = "refGenome"): Returns content of gtf table.

setGtf

signature(object = "refGenome"): Writes content of gtf table.

getAttr

signature(object = "refGenome"): Returns content of attribute table.

setAttr

signature(object = "refGenome"): Writes content of attribute table.

read.gtf

signature(object, filename="transcripts.gtf", sep = "\t", useBasedir=TRUE, comment.char = "#", progress=100000L, ...): Imports content of gtf file. This is the basic mechanism for data import. It works the same way for ucscGenome and for ensemblGenome.

writeDB

signature(object = "refGenome"): Copies content of gtf, attr and xref table to database.

addEnsembl

signature(object = "ucscGenome"): Imports UCSC 'knownToEnsembl' table. It's appended to the gtf table.

addIsoforms

signature(object = "ucscGenome"): Imports UCSC 'knownIsoforms' table. It's appended to the gtf table.

addXref

signature(object = "ucscGenome"): Imports UCSC 'kgXref' table. A 'geneSymbol' column is added to gtf table. The rest is written into xref table.

extractByGeneName

signature(object="ucscGenome", geneNames="character"): Returns a ucscGenome object containing the table subsets for the given gene names. When none of the geneNames matches, the function returns NULL.

getXref

signature(object = "ucscGenome"): Returns content of xref table.

getGenePositions

signature(object="ucscGenome", by="character", force="logical"): Extracts a table with position data for whole genes (smallest exon start position and largest exon end position). A copy of the table is placed inside the internal environment. On subsequent calls, only a copy of the stored table is returned unless force=TRUE is given. With force=TRUE, new gene positions are calculated regardless of existing tables.

getGeneTable

signature(object="ucscGenome"): Returns data.frame containing gene-specific data.

loadGenome

signature(filename = "character"): Imports data from stored R-Environment Image.

loadGenomeDb

signature(filename = "character"): Imports content of object from sqlite3 database.

tableFeatures

signature(object="ucscGenome"): Tabulates the content of the "feature" column.

tableTranscript.id

signature(object="ucscGenome"): Tabulates the values in the transcript_id column.

extractTranscript

signature(object="ucscGenome", transcripts="character"): Extracts an object which contains data for subset defined by transcript names.
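
The gene position logic described for getGenePositions (smallest exon start, largest exon end, a cached result unless force=TRUE) can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the exon table, its column names, the cache environment and the gene_positions helper are all hypothetical.

```r
# Toy exon table; column names and coordinates are invented for illustration.
exons <- data.frame(
  gene_name = c("GENE_A", "GENE_A", "GENE_A", "GENE_B", "GENE_B"),
  start     = c(100L, 300L, 500L, 50L, 900L),
  end       = c(200L, 400L, 650L, 120L, 990L),
  stringsAsFactors = FALSE
)

cache <- new.env()  # plays the role of the object's internal environment

# Hypothetical helper, not part of refGenome
gene_positions <- function(exons, force = FALSE) {
  if (force || !exists("genes", envir = cache, inherits = FALSE)) {
    starts <- aggregate(start ~ gene_name, data = exons, FUN = min)
    ends   <- aggregate(end ~ gene_name, data = exons, FUN = max)
    assign("genes", merge(starts, ends, by = "gene_name"), envir = cache)
  }
  get("genes", envir = cache)
}

gene_pos <- gene_positions(exons)
gene_pos
```

A second call without force returns the stored table even if the exon data have changed in the meantime, which is the situation force=TRUE is meant for.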

References

http://genome.ucsc.edu/

Examples

# NOT RUN {
##-------------------------------------##
## Loading and saving
## From and to R-image (fast loading)
##-------------------------------------##
ucfile <- system.file("extdata", "hs.ucsc.small.RData", package="refGenome")
uc <- loadGenome(ucfile)
uc
# }
# NOT RUN {
saveGenome(uc, "uc.RData", useBasedir=FALSE)
ucr <- loadGenome("uc.RData")
# }
# NOT RUN {
##-------------------------------------##
## Extract data for Primary Assembly seqids
##-------------------------------------##
ucpa <- extractSeqids(uc, ucPrimAssembly())
# Extract data for individual genes
ddx <- extractByGeneName(uc, "DDX11L1")
ddx
# Extract range limits of entire genes
gp <- getGenePositions(uc)
gp
tableFeatures(uc)
extractByGeneName(ucpa, "DDX11L1")
tableTranscript.id(ucpa)

##-------------------------------------##
## Create object from scratch
##-------------------------------------##
# }
# NOT RUN {
uc <- ucscGenome()
basedir(uc) <- "/my/genome/basedir"
# Place all UCSC files in the basedir folder
read.gtf(uc, "knownGene.gtf")
addXref(uc, "kgXref.csv")
addEnsembl(uc, "knownToEnsembl.csv")
addIsoforms(uc, "knownisoforms.csv")
# }