dartR.base (version 1.0.5)

gl.report.callrate: Reports summary of Call Rate for loci or individuals

Description

SNP datasets generated by DArT have missing values primarily arising from failure to call a SNP because of a mutation at one or both of the restriction enzyme recognition sites. P/A datasets (SilicoDArT) have missing values because it was not possible to call whether a sequence tag was amplified or not.

Usage

gl.report.callrate(
  x,
  method = "loc",
  ind.to.list = 20,
  plot.display = TRUE,
  plot.theme = theme_dartR(),
  plot.colors = NULL,
  plot.dir = NULL,
  plot.file = NULL,
  bins = 50,
  verbose = NULL,
  ...
)

Value

Returns unaltered genlight object

Arguments

x

Name of the genlight object containing the SNP or presence/absence (SilicoDArT) data [required].

method

Specify the type of report by locus (method='loc') or individual (method='ind') [default 'loc'].

ind.to.list

Number of individuals to list for call rate [default 20].

plot.display

Specify if plot is to be displayed in the graphics window [default TRUE].

plot.theme

User specified theme [default theme_dartR()].

plot.colors

Vector with two color names for the borders and fill [default c("#2171B5", "#6BAED6")].

plot.dir

Directory in which to save the plot RDS files [default as specified by the global working directory or tempdir()].

plot.file

Filename (minus extension) for the RDS plot file [required for the plot to be saved].

bins

Number of bins to display in histograms [default 50].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity].

...

Parameters passed to function ggsave, such as width and height, when the ggplot is to be saved.

Author

Author(s): Arthur Georges. Custodian: Arthur Georges -- Post to https://groups.google.com/d/forum/dartr

Details

This function expects a genlight object, containing either SNP data or SilicoDArT (=presence/absence data).

Callrate is summarized by locus or by individual to allow sensible decisions on thresholds for filtering taking into consideration consequential loss of data. The summary is in the form of a tabulation and plots.
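As a rough illustration of what "call rate by locus or by individual" means, the sketch below computes both from a toy genotype matrix in base R. The matrix `geno` and the variable names are made up for this example; this is not the code dartR.base runs internally, only the proportion of non-missing calls taken along each dimension.

```r
# Toy genotype matrix (hypothetical data): rows are individuals, columns are loci.
# NA marks a genotype that could not be called.
geno <- matrix(
  c(0, 1,  NA, 2,
    1, NA, NA, 0,
    2, 2,  1,  0,
    0, 1,  NA, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("loc", 1:4))
)

# TRUE where a genotype was called, FALSE where it is missing
called <- !is.na(geno)

# Call rate by locus: proportion of individuals scored at each locus
callrate_loc <- colMeans(called)

# Call rate by individual: proportion of loci scored in each individual
callrate_ind <- rowMeans(called)

print(callrate_loc)
print(callrate_ind)
```

Here `loc3` is scored in only one of four individuals, so it would be the first locus lost under almost any call-rate threshold.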

The table of quantiles is useful for deciding a threshold for subsequent filtering as it provides an indication of the percentages of loci that will be retained and lost.
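The idea behind the quantile table can be sketched with base R: given a vector of per-locus call rates, tabulate the quantiles and the fraction of loci that would survive a few candidate thresholds. The call-rate values and thresholds below are invented for illustration, and the layout of the table printed by gl.report.callrate may differ.

```r
# Hypothetical per-locus call rates (not taken from any real dataset)
callrate_loc <- c(1.00, 0.95, 0.90, 0.80, 0.75, 0.60, 1.00, 0.98, 0.50, 0.85)

# Quantiles of the call-rate distribution
print(quantile(callrate_loc, probs = c(0.05, 0.25, 0.50, 0.75, 0.95)))

# Fraction of loci retained at each candidate filtering threshold
thresholds <- c(0.95, 0.90, 0.80)
retained <- sapply(thresholds, function(t) mean(callrate_loc >= t))

# Side-by-side view of what each threshold keeps and discards
summary_tab <- data.frame(
  threshold = thresholds,
  retained  = retained,
  lost      = 1 - retained
)
print(summary_tab)
```

Reading across a table like this makes the trade-off explicit: a stricter threshold gives cleaner data but discards more loci.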

In the case of method='ind', a list of individuals to be deleted is provided. To manage the screen output, this list is limited to ind.to.list individuals or nInd(x) individuals, whichever is smaller.
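The capping rule can be sketched as follows. The named vector of call rates is hypothetical, and listing individuals from lowest call rate upward is an assumption about the ordering, not something stated on this page.

```r
# Hypothetical per-individual call rates
callrate_ind <- c(indA = 0.99, indB = 0.62, indC = 0.88, indD = 0.45, indE = 0.95)

# Stand-in for the ind.to.list argument
ind.to.list <- 3

# Never try to list more individuals than the dataset contains
n_listed <- min(ind.to.list, length(callrate_ind))

# Assumed ordering: individuals with the lowest call rate first
lowest <- head(sort(callrate_ind), n_listed)
print(lowest)
```

With the default of 20 and only five individuals, `n_listed` would fall back to 5 and every individual would be listed.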

To avoid issues from inadvertent use of this function in an assignment statement, the function returns the genlight object unaltered.
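The design choice of returning the input unchanged can be illustrated with a small stand-alone function. `report_sketch` is a hypothetical name, and returning the value with `invisible()` is an assumption made for the sketch rather than a documented detail of gl.report.callrate.

```r
# Hypothetical report-style function: prints a summary, returns its input untouched
report_sketch <- function(x) {
  cat("Overall call rate:", mean(!is.na(x)), "\n")
  invisible(x)  # assumption: hand the object back without printing it
}

# Toy data with two missing values out of six
x <- c(0, 1, NA, 2, NA, 1)

# An accidental assignment leaves the data intact
y <- report_sketch(x)
print(identical(x, y))
```

Because the object comes back unaltered, writing `gl <- gl.report.callrate(gl)` by mistake does not overwrite the data with a report.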

A color vector can be obtained with gl.select.colors() and then passed to the function with the plot.colors parameter.

If plot.file is given, the ggplot arising from this function is saved as an "RDS" binary file using saveRDS(); it can be reloaded with readRDS(). A file name must be specified for the plot to be saved.

If a plot directory (plot.dir) is specified, the ggplot binary is saved to that directory; otherwise it is saved to tempdir().
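The save-and-reload round trip described above can be sketched in base R. A plain list stands in for the ggplot object so the example runs without extra packages, and the ".RDS" extension and the way the path is assembled are assumptions for illustration only.

```r
# Stand-in for the ggplot object (a real ggplot is saved the same way with saveRDS)
plot_obj <- list(title = "Call rate by locus", bins = 50)

# Stand-ins for the plot.dir and plot.file arguments
plot.dir  <- NULL
plot.file <- "test"

# Fall back to the session temporary directory when no directory is given
out_dir  <- if (is.null(plot.dir)) tempdir() else plot.dir
out_path <- file.path(out_dir, paste0(plot.file, ".RDS"))  # extension is an assumption

# Write the object to disk, then read it back
saveRDS(plot_obj, out_path)
reloaded <- readRDS(out_path)

print(identical(reloaded, plot_obj))
```

Files written to tempdir() are removed when the R session ends, so set plot.dir if the plot needs to persist.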

See Also

gl.filter.callrate

Other matched report: gl.filter.excess.het(), gl.report.allna(), gl.report.hamming(), gl.report.locmetric(), gl.report.maf(), gl.report.overshoot(), gl.report.pa(), gl.report.rdepth(), gl.report.reproducibility(), gl.report.secondaries(), gl.report.taglength()

Examples

library(dartR.base)

# SNP data
test.gl <- testset.gl[1:20, ]
# Report call rate by locus (the default method)
gl.report.callrate(test.gl)
# Report call rate by individual
gl.report.callrate(test.gl, method = 'ind')
# As above, also saving the plot to an RDS file named "test"
gl.report.callrate(test.gl, method = 'ind', plot.file = "test")
gl.report.callrate(test.gl, method = 'loc', by.pop = TRUE)
gl.report.callrate(test.gl, method = 'loc', by.pop = TRUE, plot.file = "test")

# Tag P/A data
test.gs <- testset.gs[1:20, ]
gl.report.callrate(test.gs)
gl.report.callrate(test.gs, method = 'ind')
