A UCSCData object is
automatically exported in this format, if the targeted format is known
to be compatible. The BED and WIG import methods check for a track
line, and delegate to these functions if one is found. Thus, calling
this API directly is only necessary when importing embedded GFF
(rare), or when one wants to create the track line during the export
process.
"import"(con, format, text, subformat = "auto", drop = FALSE, genome = NA, ...)
import.ucsc(con, ...)
"export"(object, con, format, ...)
"export"(object, con, format, ...)
"export"(object, con, format, append = FALSE, index = FALSE, ...)
"export"(object, con, format, subformat = "auto", append = FALSE, index = FALSE, ...)
export.ucsc(object, con, ...)

con: A path, URL, connection or UCSCFile object. For the
functions ending in .ucsc, the file format is indicated by
the function name. For the base export and import
functions, ucsc must be passed as the format
argument.
object: The object to export, which should be a GRanges or
something coercible to a GRanges. To export multiple
tracks, pass a GenomicRangesList, or something coercible to one.
text: If con is missing, a character vector to use as the
input.
subformat: If an RTLFile
subclass other than UCSCFile is passed as con to
import.ucsc or export.ucsc, the subformat is assumed
to be the corresponding format of con. Otherwise it defaults
to auto. The auto mode works as follows. For import, the subformat is taken from
the type field in the track line; if there is none, the file
extension is consulted. For export, if object is a
UCSCData, the subformat is taken from the type
in its track line, if present. Otherwise, the subformat is chosen
based on whether object contains a score column. If
there is a score, the target is either BEDGraph or
WIG, depending on the structure of the ranges. Otherwise,
BED is the target.
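
The auto rules above can be restated as a small decision function. The following is a minimal base R sketch and not rtracklayer's implementation: the helper names resolve_import_subformat and choose_export_subformat are hypothetical, the lowercase subformat labels are illustrative, and the ranges_fit_wig flag is an assumed stand-in for the range-structure check, which this page does not spell out.

```r
# Hypothetical helpers that restate the documented "auto" rules.
# They do not call rtracklayer. track_type is the type field of a
# track line, or NA when the track line has none.

# Import: prefer the track line type, else fall back to the file extension.
resolve_import_subformat <- function(track_type = NA_character_, path) {
  if (!is.na(track_type)) {
    return(track_type)
  }
  tools::file_ext(path)
}

# Export: prefer the track line type; without a score column choose BED;
# with a score choose WIG or BEDGraph. ranges_fit_wig is an assumption
# standing in for the undocumented check on the structure of the ranges.
choose_export_subformat <- function(track_type = NA_character_,
                                    has_score = FALSE,
                                    ranges_fit_wig = FALSE) {
  if (!is.na(track_type)) {
    return(track_type)
  }
  if (!has_score) {
    return("bed")
  }
  if (ranges_fit_wig) "wig" else "bedGraph"
}

import_choice <- resolve_import_subformat(path = "tracks.bedGraph")
export_choices <- c(
  typed    = choose_export_subformat(track_type = "wig"),
  unscored = choose_export_subformat(),
  scored   = choose_export_subformat(has_score = TRUE)
)
print(import_choice)
print(export_choices)
```

In this sketch an explicit type in the track line always wins, which matches the order in which the rules are described above.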
genome: The identifier of a genome, or NA if
unknown. Typically, this is a UCSC identifier like hg19. An
attempt will be made to derive the seqinfo on the return
value using either an installed BSgenome package or UCSC, if network
access is available. This defaults to the db BED track line
parameter, if any.
drop: If TRUE, and there is only one track in the file,
return the track object directly, rather than embedding it in a list.
append: If TRUE, and con points to a file path,
the data is appended to the file. If con is a
connection, the data is always appended.
index: If TRUE, automatically compress and index the
output file with bgzf and tabix. Note that tabix indexing will
sort the data by chromosome and start. Tabix supports only a
single track per file.
A GenomicRangesList unless drop is TRUE
and there is only a single track in the file. In that case, the first and
only object is extracted from the list and returned.
The structure of that object depends on the format of the
data. The GenomicRangesList contains UCSCData objects.
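
The effect of drop on the return value can be illustrated on a plain list. This is a minimal base R sketch under stated assumptions: unwrap_single_track is a hypothetical helper, and ordinary lists and data frames stand in for the GenomicRangesList and its UCSCData elements.

```r
# Hypothetical helper restating the documented drop rule.
# A plain list stands in for the GenomicRangesList of tracks.
unwrap_single_track <- function(tracks, drop = FALSE) {
  # Only a single-track result is unwrapped, and only when drop is TRUE.
  if (drop && length(tracks) == 1L) tracks[[1L]] else tracks
}

one_track <- list(peaks = data.frame(chrom = "chr1", start = 1, end = 50))

kept    <- unwrap_single_track(one_track)                # still a list
dropped <- unwrap_single_track(one_track, drop = TRUE)   # the track itself
several <- unwrap_single_track(list(a = 1, b = 2), drop = TRUE)  # unchanged
```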
The UCSCFile class extends RTLFile and is a
formal representation of a resource in the UCSC format.
To cast a path, URL or connection to a UCSCFile, pass it to
the UCSCFile constructor.

A UCSC track line contains key=value pairs encoding metadata, mostly related to
visualization. The standard fields in a track depend on the type of
track being annotated. See TrackLine and its
derivatives for how these lines are represented in R. The
class UCSCData is an extension
of GRanges with a formal slot for a TrackLine.
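
To make the key=value layout concrete, here is a minimal base R sketch that splits a track line into named fields. It is not how rtracklayer parses track lines: parse_track_line is a hypothetical helper, the example line is made up, and the sketch assumes that the line begins with the word track and that values containing spaces are double-quoted.

```r
# Hypothetical parser for a UCSC-style track line.
# Assumptions: the line starts with "track"; values with spaces are
# wrapped in double quotes; keys consist of letters and underscores.
parse_track_line <- function(line) {
  stopifnot(startsWith(line, "track"))
  # Match key=value pairs, where value is quoted text or a run of non-spaces.
  pairs <- regmatches(
    line,
    gregexpr('[A-Za-z_]+=("[^"]*"|[^ ]+)', line, perl = TRUE)
  )[[1]]
  keys   <- sub("=.*$", "", pairs)
  values <- sub("^[^=]+=", "", pairs)
  values <- gsub('^"|"$', "", values)   # strip the surrounding quotes
  setNames(values, keys)
}

fields <- parse_track_line('track name="R Track" type=bedGraph db=hg19')
print(fields)
```

The result is a named character vector, so individual parameters can be looked up by key, for example fields[["type"]].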
Each GRanges in the returned GenomicRangesList has the
track line stored in its metadata, under the trackLine key.

For each track object to be exported, if the object is not a
UCSCData, and there is no trackLine element in the
metadata, then a new track line needs to be generated. This happens
through the coercion of object to UCSCData. The track line
is initialized to have the appropriate type parameter for the
subformat, and the required name parameter is taken from the
name of the track in the input list (if any). Otherwise, the default
is simply R Track. The db parameter (specific to BED
track lines) is taken as genome(object) if not
NA. Additional arguments passed to the export routines
override parameters in the provided track line.
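
The defaulting and override rules for a generated track line can be sketched in base R. This is an illustration under stated assumptions, not rtracklayer's code: make_track_fields and format_track_line are hypothetical helpers, the type value is simply the subformat label, and the output formatting (quoting values that contain spaces) is an assumption about how a track line is written.

```r
# Hypothetical helpers restating the documented track line defaults.
make_track_fields <- function(subformat, list_name = NULL, genome = NA, ...) {
  fields <- list(
    # name comes from the input list, else the documented default.
    name = if (is.null(list_name)) "R Track" else list_name,
    # Assumption: the type parameter is just the subformat label.
    type = subformat
  )
  # db is specific to BED track lines and is set only for a known genome.
  if (identical(subformat, "bed") && !is.na(genome)) {
    fields$db <- genome
  }
  # Additional arguments override or extend the generated parameters.
  modifyList(fields, list(...))
}

format_track_line <- function(fields) {
  values <- vapply(fields, as.character, character(1))
  quoted <- ifelse(grepl(" ", values), paste0('"', values, '"'), values)
  paste("track", paste0(names(fields), "=", quoted, collapse = " "))
}

default_line  <- format_track_line(make_track_fields("bedGraph"))
override_line <- format_track_line(
  make_track_fields("bed", list_name = "peaks", genome = "hg19",
                    name = "My peaks", visibility = "full")
)
print(default_line)
print(override_line)
```

Here the name argument passed through ... replaces the name taken from the list, mirroring the rule that additional arguments take precedence over the existing track line parameters.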
If the subformat is either WIG or BEDGraph, and the features are stranded, a separate track will be output in the file for each strand. Neither of those formats encodes the strand, and both disallow overlapping features (which might occur upon destranding).
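
The per-strand split can be illustrated with a plain data frame standing in for a stranded GRanges. This is a minimal base R sketch, not rtracklayer's implementation, and the column names and values are made up for the example. The first two features overlap on opposite strands, so writing them to a single destranded track would produce the kind of overlap that WIG and BEDGraph disallow.

```r
# Made-up stranded features; a data frame stands in for a GRanges.
features <- data.frame(
  chrom  = c("chr1", "chr1", "chr1"),
  start  = c(100, 120, 300),
  end    = c(150, 170, 350),
  strand = c("+", "-", "+"),
  score  = c(1.5, 2.0, 0.7)
)

# One track per strand; the strand column itself is dropped because
# neither WIG nor BEDGraph can encode it.
tracks <- split(
  features[, c("chrom", "start", "end", "score")],
  features$strand
)

plus_track  <- tracks[["+"]]
minus_track <- tracks[["-"]]
print(plus_track)
print(minus_track)
```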