All data (alignments or SNP files) have to be stored in one folder, and that folder is the input of this
function. GFF files are optional and, if used, also have to be stored in a folder. If no GFF file is
specified, an alignment in the correct reading frame (starting at a first codon position) is expected;
otherwise synonymous and non-synonymous positions are not identified correctly.
Note:
The GFF files must have EXACTLY the same names (ignoring extensions such as .fas or .gff)
as the files storing the nucleotide data, to ensure correct matching.
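The name-matching rule in the note above can be checked before reading the data. The following is a minimal base-R sketch; check_matching_names is a hypothetical helper and not part of PopGenome, and the file names are made up for illustration.

```r
# Hypothetical helper (not a PopGenome function): compare the base names
# of alignment files and GFF files after dropping folder and extension.
check_matching_names <- function(aln_files, gff_files) {
  aln <- tools::file_path_sans_ext(basename(aln_files))
  gff <- tools::file_path_sans_ext(basename(gff_files))
  list(
    missing_gff       = setdiff(aln, gff),  # alignments without a GFF file
    missing_alignment = setdiff(gff, aln)   # GFF files without an alignment
  )
}

# Illustrative file names only.
res <- check_matching_names(
  c("alignments/geneA.fas", "alignments/geneB.fas"),
  c("gff/geneA.gff", "gff/geneC.gff")
)
print(res)
```

In practice the two arguments would typically come from list.files() applied to the alignment folder and the GFF folder; both result vectors should be empty before the data are read.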
format:
"fasta", "nexus", "phylip", "MAF", "MEGA", "HapMap", "VCF", "RData"
Valid nucleotides are T,t,U,u,G,g,A,a,C,c,N,n,-
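A sequence can be screened against the list of valid nucleotides before it is read. This is a minimal base-R sketch; find_invalid is a hypothetical helper, and how other characters are handled by the function itself is not stated here.

```r
# Valid characters as listed above.
valid_chars <- c("T", "t", "U", "u", "G", "g", "A", "a", "C", "c", "N", "n", "-")

# Hypothetical helper: return the distinct characters of a sequence
# that are not in the valid set, in order of first appearance.
find_invalid <- function(seq) {
  chars <- strsplit(seq, "", fixed = TRUE)[[1]]
  unique(chars[!chars %in% valid_chars])
}

bad <- find_invalid("ATGCRN-ty")  # contains the ambiguity code R and a stray y
print(bad)
```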
parallized:
- will speed up calculations if you use a very large number of alignments
FAST:
- will not classify synonymous/non-synonymous SNPs directly
- fast computation (via compiled C code) of biallelic matrix, biallelic sites, transversions/transitions
and biallelic substitutions
- can be switched to TRUE in case of SNP data without loss of information
big.data:
- uses the ff package
- the ff mechanism is used for the biallelic.matrix and the GFF/GTF information
- is automatically activated for readVCF or readSNP
- Note: set this to TRUE if you read big chunks of data
and want to concatenate them later in the PopGenome framework
(for example: sliding windows of the whole dataset).
SNP.DATA:
- should be switched to TRUE if you use SNP data in alignment format.
- the corresponding SNP positions can be set via set.ref.positions
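The options described above might be combined as in the following sketch. It assumes the function is PopGenome's readData and that the folder arguments are named path and gffpath; the folder names "alignments" and "gff" are hypothetical. The call is only made if the package and both folders are available.

```r
# Hypothetical folder names; replace with your own.
aln_dir <- "alignments"
gff_dir <- "gff"

# Arguments named after the options described above.
# 'path' and 'gffpath' are assumed argument names for the two folders.
read_args <- list(
  path     = aln_dir,
  format   = "fasta",
  gffpath  = gff_dir,
  FAST     = FALSE,  # keep synonymous/non-synonymous classification
  big.data = FALSE,  # set to TRUE for large chunks to be concatenated later
  SNP.DATA = FALSE   # set to TRUE for SNP data in alignment format
)

# Only run the call when PopGenome and both folders are present.
if (requireNamespace("PopGenome", quietly = TRUE) &&
    dir.exists(aln_dir) && dir.exists(gff_dir)) {
  genome <- do.call(PopGenome::readData, read_args)
}
```

Collecting the arguments in a list keeps the settings visible in one place and makes it easy to switch FAST, big.data or SNP.DATA for a different kind of input.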