PopGenome (version 2.7.5)

read.big.fasta: Reading large FASTA alignments

Description

This function reads FASTA alignments that are too large to fit into memory by splitting them into chunks.

Usage

read.big.fasta(filename,populations=FALSE,outgroup=FALSE,window=2000,
               SNP.DATA=FALSE,include.unknown=FALSE,
               parallized=FALSE,FAST=FALSE,big.data=TRUE)

Arguments

filename

the basepath of the FASTA alignment

outgroup

vector of outgroup sequences

populations

list of populations

window

chunk size: number of columns/nucleotide sites

SNP.DATA

should be switched to TRUE if you use SNP data in alignment format

include.unknown

include unknown positions in the biallelic.matrix

parallized

use parallel computation to speed up reading; works only on UNIX systems

FAST

fast computation; see readData()

big.data

use the ff-package

Value

The function creates an object of class "GENOME".

The following slots will be filled in the "GENOME" object:

   Slot                Description
1. n.sites             total number of sites
2. n.biallelic.sites   number of biallelic sites
3. region.names        names of regions
4. region.data         some detailed information about the data

Details

The algorithm reads the data for each individual and stores the information on disk. The data can be analyzed as regions of the defined window size, or the regions can be concatenated within the PopGenome framework via the function concatenate.regions. This function should only be used when the FASTA file does not fit into RAM; otherwise, use the function readData.
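
To make the role of the window argument concrete, the following is a minimal sketch in base R of how an alignment's columns could be divided into chunks of a given size. The helper chunk_bounds is hypothetical and not part of PopGenome, and the assumption that the final chunk is simply shorter when the number of sites is not a multiple of window is ours; the package's internal handling may differ.

```r
# Hypothetical helper (not part of PopGenome): compute the first and last
# alignment column of each chunk for an alignment with `n.sites` columns.
chunk_bounds <- function(n.sites, window = 2000) {
  stopifnot(n.sites >= 1, window >= 1)
  # Each chunk starts `window` columns after the previous one
  starts <- seq(1, n.sites, by = window)
  # Assumption: the last chunk is truncated at the end of the alignment
  ends <- pmin(starts + window - 1, n.sites)
  data.frame(chunk = seq_along(starts), start = starts, end = ends)
}

# An alignment of 4500 sites with the default window of 2000
bounds <- chunk_bounds(n.sites = 4500, window = 2000)
print(bounds)
```

Under these assumptions, an alignment of 4500 sites yields three chunks, the last covering only 500 sites. A larger window means fewer regions in the resulting object but more data held in memory per chunk.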

Examples

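# NOT RUN {
# read the alignment in chunks, storing the data with the ff package:
# GENOME.class <- read.big.fasta("Alignment.fas", big.data=TRUE)
# GENOME.class
# GENOME.class@region.names
# concatenate the chunks into a single region:
# CON <- concatenate.regions(GENOME.class)
# CON@region.data@biallelic.sites
# transform the data into sliding windows (width 100, jump 100):
# GENOME.class.slide <- sliding.window.transform(GENOME.class,100,100)
# compute neutrality statistics for each region:
# GENOME.class <- neutrality.stats(GENOME.class,FAST=TRUE)
# show the result:
# get.sum.data(GENOME.class)
# GENOME.class@region.data
# }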
