micropan (version 1.2)

gff2fasta: Retrieving sequences from genome

Description

Retrieves from a genome the sequences specified in a gff.table.

Usage

gff2fasta(gff.table, genome)

Arguments

gff.table

A gff.table (data.frame) with genomic features information.

genome

A Fasta object with the genome sequence(s).

Value

A Fasta object with one row for each row in gff.table. The Header for each sequence is a summary of the information in the corresponding row of gff.table.

Details

Each row in gff.table (see readGFF) describes a genomic feature in the genome. The information in the columns Seqid, Start, End and Strand is used to retrieve the sequences from genome$Sequence. Every Seqid in the gff.table must match the first token in one of the genome$Header texts.
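
The matching and retrieval described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy genome is a plain data.frame standing in for a Fasta object, the helper revcomp is hypothetical, and reverse-complementing features on the negative strand is an assumption about how Strand is used.

```r
# Toy genome: a plain data.frame standing in for a Fasta object (assumption)
genome <- data.frame(
  Header   = c("contig1 some description", "contig2 another description"),
  Sequence = c("ATGAAACCCGGGTTTTAA", "TTACATGGGCCCAAACAT"),
  stringsAsFactors = FALSE
)

# Toy gff.table holding only the four columns named in Details
gff.table <- data.frame(
  Seqid  = c("contig1", "contig2"),
  Start  = c(1, 4),
  End    = c(6, 9),
  Strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# First token of each Header: the text before the first whitespace
first.token <- sub("\\s.*$", "", genome$Header)

# Match every Seqid to a genome sequence and stop if any is missing
idx <- match(gff.table$Seqid, first.token)
stopifnot(!anyNA(idx))

# Cut out positions Start..End from the matching genome sequence
seqs <- substring(genome$Sequence[idx], gff.table$Start, gff.table$End)

# Hypothetical helper: reverse-complement of DNA strings
revcomp <- function(x) {
  vapply(strsplit(chartr("ACGT", "TGCA", x), ""),
         function(ch) paste(rev(ch), collapse = ""),
         character(1))
}

# Assumption: features on the negative strand are reverse-complemented
neg <- gff.table$Strand == "-"
seqs[neg] <- revcomp(seqs[neg])

print(seqs)
```

If a Seqid has no matching first token, match returns NA and the sketch stops at that point, which mirrors the requirement that every Seqid must be found among the genome$Header texts.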

See Also

readGFF, findOrfs.

Examples

# Load the package that provides gff2fasta, readGFF and xzuncompress
library(micropan)

# Using two files in this package
xpth <- file.path(path.package("micropan"),"extdata")
gff.file <- file.path(xpth,"Example.gff.xz")
genome.file <- file.path(xpth,"Example_genome.fasta.xz")

# We need to uncompress them first...
gff.tf <- tempfile(fileext=".xz")
s <- file.copy(gff.file,gff.tf)
gff.tf <- xzuncompress(gff.tf)
genome.tf <- tempfile(fileext=".xz")
s <- file.copy(genome.file,genome.tf)
genome.tf <- xzuncompress(genome.tf)

# Reading
gff.table <- readGFF(gff.tf)
genome <- readFasta(genome.tf)

# Retrieving sequences
fasta.obj <- gff2fasta(gff.table,genome)
summary(fasta.obj)
plot(fasta.obj)

# ...and cleaning up the temporary files
s <- file.remove(gff.tf,genome.tf)
