microseq (version 2.1.6)

gff2fasta: Retrieving annotated sequences

Description

Retrieves from a genome the sequences specified in a gff.table.

Usage

gff2fasta(gff.table, genome)

Value

A fasta object with one row for each row in gff.table. The Header for each sequence is a summary of the information in the corresponding row of gff.table.

Arguments

gff.table

A gff.table (tibble) with genomic features information.

genome

A fasta object (tibble) with the genome sequence(s).

Author

Lars Snipen and Kristian Hovde Liland.

Details

Each row in gff.table (see readGFF) describes a genomic feature in genome. The genome is a fasta object, i.e. a tibble with columns Header and Sequence. The columns Seqid, Start, End and Strand of gff.table are used to retrieve the sequences from the Sequence column of genome. Every Seqid in gff.table must match the first token of one of the Header texts in genome, so that each feature is retrieved from the correct Sequence.
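The matching and retrieval described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the object names and the helper rev_comp are hypothetical, and the sketch assumes that features on the negative strand are returned reverse-complemented, which is a common convention but is not stated on this page. Start and End are treated as 1-based, inclusive coordinates, as in the GFF format.

```r
# Hypothetical toy genome: same column names as a fasta object (Header, Sequence)
genome <- data.frame(
  Header   = c("contig1 some description", "contig2 another description"),
  Sequence = c("ATGCCCGGGTTTAAA", "GGGAAACCCTTT"),
  stringsAsFactors = FALSE
)

# Hypothetical toy features: only the four columns used for retrieval
gff <- data.frame(
  Seqid  = c("contig1", "contig2"),
  Start  = c(1, 4),
  End    = c(6, 9),
  Strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# First token of each Header: everything before the first whitespace
tokens <- sub("[[:space:]].*$", "", genome$Header)

# Match every Seqid to a genome sequence; stop if any Seqid has no match
idx <- match(gff$Seqid, tokens)
stopifnot(!anyNA(idx))

# Extract the subsequences using 1-based, inclusive coordinates
seqs <- substr(genome$Sequence[idx], gff$Start, gff$End)

# Hypothetical helper: reverse-complement plain A/C/G/T strings
rev_comp <- function(x) {
  comp <- chartr("ACGT", "TGCA", x)
  vapply(strsplit(comp, ""),
         function(ch) paste(rev(ch), collapse = ""),
         character(1))
}

# Assumption: negative-strand features are reverse-complemented
neg <- gff$Strand == "-"
seqs[neg] <- rev_comp(seqs[neg])

print(seqs)
```

The stopifnot() call mirrors the requirement that every Seqid must match the first token of some Header; a Seqid without a match would otherwise silently yield NA.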

See Also

readGFF, findOrfs.

Examples

library(microseq)

# Using two files in this package
gff.file <- file.path(path.package("microseq"),"extdata","small.gff")
genome.file <- file.path(path.package("microseq"),"extdata","small.fna")

# Reading the genome first
genome <- readFasta(genome.file)

# Retrieving sequences
gff.table <- readGFF(gff.file)
fa.tbl <- gff2fasta(gff.table, genome)

# Alternative, using piping
readGFF(gff.file) %>% gff2fasta(genome) -> fa.tbl