dartR.base (version 1.0.5)

gl2paup.parsimony: Converts a genlight object to nexus format for parsimony phylogeny analysis in PAUP and, optionally, produces accompanying files for parallel processing.

Description

The output nexus file contains the SilicoDArT data as a single line per individual wrapped in the appropriate nexus commands. Pop Labels are used to define taxon partitions.
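To picture the layout described above, here is a minimal base R sketch that writes presence/absence scores as one line per individual inside a nexus DATA block. The matrix, the individual names and the helper to_line are all hypothetical, and the block is not claimed to match the exact commands that gl2paup.parsimony writes; taxon partitions are left out.

```r
# Hypothetical presence/absence scores: rows are individuals, columns are loci.
# NA marks a missing call. All names and values are invented for illustration.
scores <- matrix(
  c(0, 1, 1, 0, NA,
    1, 1, 0, 0, 1,
    0, 0, 1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ind1", "ind2", "ind3"), NULL)
)

# Collapse one individual's scores into a single string, "?" for missing data.
to_line <- function(v) paste(ifelse(is.na(v), "?", as.character(v)), collapse = "")
seqs <- apply(scores, 1, to_line)

# Wrap the per-individual lines in a generic nexus DATA block (an assumption,
# not the function's actual output).
nexus <- c(
  "#NEXUS",
  "BEGIN DATA;",
  sprintf("  DIMENSIONS NTAX=%d NCHAR=%d;", nrow(scores), ncol(scores)),
  "  FORMAT DATATYPE=STANDARD SYMBOLS=\"01\" MISSING=?;",
  "  MATRIX",
  sprintf("  %s %s", names(seqs), seqs),
  "  ;",
  "END;"
)
writeLines(nexus)
```

Each individual occupies exactly one matrix line, which is the property the description refers to.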

If out.type="bash", the function produces a series of files that support an analysis using multi-threading and parallel processing.

Usage

gl2paup.parsimony(
  x,
  outfileprefix = "parsimony",
  outpath = NULL,
  out.type = "standard",
  tip.labels = "ind",
  nreps = 100,
  nbootstraps = 1000,
  ncpus = 1,
  mem = 4,
  server = "gadi",
  base.dir.name = NULL,
  test = FALSE,
  verbose = NULL
)

Value

Returns no value (i.e. NULL).

Arguments

x

Name of the genlight object containing the SilicoDArT data [required].

outfileprefix

A prefix to use for file names of the output files [default 'parsimony'].

outpath

Path where the output file is saved [default global working directory or, if not specified, tempdir()].

out.type

Specify the type of output file. Can be 'standard' (consensus tree), 'newick' (final tree in newick format) or 'bash' (files for parallel processing) [default 'standard']

tip.labels

Specify whether the terminals should be labelled with the individual labels ('ind'), the population labels ('pop') or both ('indpop') [default 'ind']

nreps

Specify the number of replicate analyses to run in search of the shortest tree [default 100]

nbootstraps

Number of bootstrap replicates [default 1000]

ncpus

Number of cores to use for parallel processing [default 1]

mem

Memory to use for each process [default 4Gb per core]

server

If out.type='bash', provide the name of the linux server [default 'gadi']

base.dir.name

Name of the base directory on the server to act as the workspace [default NULL]

test

If TRUE, the analysis will run with a small subset of the data [default FALSE]

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2 or as specified using gl.set.verbosity]

Author

Custodian: Arthur Georges (Post to https://groups.google.com/d/forum/dartr)

Details

Additional details: This function applies only to SilicoDArT data. The output file name is the name of the file that PAUP will use to deliver the results of the analysis, in the directory specified by outpath.

The output type (out.type) can be 'standard', which uses default PAUP parameters to construct the boot.tre file; 'newick', which adds the parameter format=newick so that the boot.tre file contains the final tree in newick format; or 'bash', which creates a number of files to facilitate parallel processing on a supercomputer. The 'newick' option is useful for passing the results to a tree graphics program such as Mega 11 to format the tree for publication.

The parameter nreps specifies the number of replicates to run in search of the shortest tree in each bootstrap iteration. The default is 100.

The parameter nbootstraps specifies the number of bootstrap replicates to run to generate a measure of node support. The default is 1000. The companion parameter ncpus specifies how many CPUs to use for parallel processing when out.type='bash'. The default is 1. Note that the number of CPUs must divide evenly into the number of bootstrap replicates.
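The divisibility requirement can be checked before building a job. The sketch below uses only base R; reps_per_cpu is a hypothetical helper, not part of dartR.base, and the idea that each CPU receives an equal share of the replicates is an assumption about why the constraint exists.

```r
# Hypothetical helper, not part of dartR.base: number of bootstrap replicates
# each CPU would handle, assuming replicates are shared out equally.
reps_per_cpu <- function(nbootstraps, ncpus) {
  if (nbootstraps %% ncpus != 0) {
    stop("ncpus (", ncpus, ") must divide evenly into nbootstraps (",
         nbootstraps, ")")
  }
  nbootstraps %/% ncpus
}

reps_per_cpu(1000, 8)   # 125

# 48 does not divide 1000, so the helper raises an error; capture its message.
msg <- tryCatch(reps_per_cpu(1000, 48), error = function(e) conditionMessage(e))

# CPU counts up to 48 that are acceptable for 1000 replicates.
valid <- which(1000 %% 1:48 == 0)
valid                   # 1 2 4 5 8 10 20 25 40
```

If a preferred CPU count is not in the valid set, adjusting nbootstraps to a multiple of that count is one way to satisfy the constraint.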

The parameter tip.labels specifies whether the terminals in the tree should be labelled with the individual names, or the population names (multiple tips will have the same label -- which can cause problems at the point of generating a consensus tree), or a combination of the two. Including the population name in the terminal tip labels will assist in collapsing the tree to have populations as the terminals after checking fidelity of populations to supported clades. This can be done in Mega 11.
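A small base R illustration of the three labelling schemes shows why 'pop' alone can be troublesome. The individual and population names are invented, and the underscore separator used for the combined label is an assumption; the separator that gl2paup.parsimony actually uses may differ.

```r
ind <- c("AA001", "AA002", "AA003", "AA004")   # hypothetical individual names
pop <- c("north", "north", "south", "south")   # hypothetical population names

labels <- list(
  ind    = ind,
  pop    = pop,
  indpop = paste(ind, pop, sep = "_")          # separator is an assumption
)

# TRUE where a labelling scheme produces repeated tip labels.
dup <- sapply(labels, anyDuplicated) > 0
dup
```

Only the 'pop' scheme yields repeated labels here, while the combined labels stay unique and still carry the population name needed for collapsing the tree later.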

The parameter 'server' allows for future development as users modify the bash scripts to suit other multitasking environments. At present, this script works only for the Gadi server at Australia's National Computational Infrastructure (NCI).

If test=TRUE, the data will be subsetted heavily on numbers of loci, numbers of individuals, bootstrap replicates and number of replicates for branch swapping. This is used to test the job run without expending the resources required for the full job.

See Also

Other linkers: gl2paup.svdquartets()

Examples

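# Subset testset.gs to 20 individuals and 100 loci, keeping loc.metrics in step
gg <- testset.gs[1:20, 1:100]
gg@other$loc.metrics <- gg@other$loc.metrics[1:100, ]
# Default 'standard' output, written to a temporary directory
gl2paup.parsimony(gg, outfileprefix = "test.nex", outpath = tempdir(), nreps = 1, nbootstraps = 10)
# Same analysis, with the final tree requested in newick format
gl2paup.parsimony(gg, outfileprefix = "test.nex", out.type = "newick", outpath = tempdir(), nreps = 1, nbootstraps = 10)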