GenoPop (version 1.0.0)

Fst: Fst

Description

This function calculates the fixation index (Fst) between two populations from a VCF file using the method of Weir and Cockerham (1984). The formula is equivalent to the one used by vcftools --weir-fst-pop (https://vcftools.sourceforge.net/man_latest.html). In batch mode, variants are processed with process_vcf_in_batches. In window mode, they are processed per genomic window with process_vcf_in_windows.

Usage

Fst(
  vcf_path,
  pop1_individuals,
  pop2_individuals,
  weighted = FALSE,
  batch_size = 10000,
  threads = 1,
  write_log = FALSE,
  logfile = "log.txt",
  window_size = NULL,
  skip_size = NULL
)

Value

In batch mode (neither window_size nor skip_size provided): a single Fst value, either the mean or the weighted estimate depending on weighted. In window mode (both window_size and skip_size provided): a data frame with columns 'Chromosome', 'Start', 'End', and 'Fst', giving the fixation index within each window.
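
As a minimal sketch of working with the window-mode result, the code below ranks windows by Fst. The data frame is a hand-made stand-in with the documented column names. Its values are invented for illustration, and the presence of NA for a window without an estimate is an assumption, not documented behaviour.

```r
# Hypothetical window-mode result; all values are made up for illustration.
fst_windows <- data.frame(
  Chromosome = c("1", "1", "1", "2"),
  Start      = c(1, 100001, 200001, 1),
  End        = c(100000, 200000, 300000, 100000),
  Fst        = c(0.05, 0.32, NA, 0.11)
)

# Drop windows without an estimate (assumed to appear as NA),
# then sort from highest to lowest Fst.
valid  <- fst_windows[!is.na(fst_windows$Fst), ]
ranked <- valid[order(valid$Fst, decreasing = TRUE), ]

# The most differentiated window in this toy table.
top_window <- ranked[1, ]
```

Base R indexing is used here so that the sketch needs no additional packages.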

Arguments

vcf_path

Path to the VCF file.

pop1_individuals

Vector of individual names belonging to the first population.

pop2_individuals

Vector of individual names belonging to the second population.

weighted

Logical. If TRUE, the weighted Fst is returned; if FALSE (the default), the mean Fst is returned.
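
The difference between the two summaries can be sketched with toy numbers. The sketch assumes the convention used by vcftools: each site contributes a numerator and a denominator variance component, the mean estimate averages the per-site ratios, and the weighted estimate divides the summed numerators by the summed denominators. The vectors num and den are invented values, not GenoPop output, and whether GenoPop organises its internals this way is an assumption.

```r
# Hypothetical per-site Weir and Cockerham components (illustrative values only).
num <- c(0.02, 0.10, 0.00, 0.05)  # numerator component per site
den <- c(0.20, 0.25, 0.10, 0.50)  # denominator component per site

# Mean Fst: average of the per-site ratios.
per_site_fst <- num / den
mean_fst <- mean(per_site_fst)

# Weighted Fst: ratio of the summed components.
weighted_fst <- sum(num) / sum(den)
```

The two estimates generally differ because the ratio of sums gives more weight to sites with larger denominators.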

batch_size

The number of variants to be processed in each batch (used in batch mode only; the default of 10,000 should be suitable for most use cases).

threads

Number of threads to use for parallel processing.

write_log

Logical, indicating whether to write progress logs.

logfile

Path to the log file where progress will be logged.

window_size

Size of the window for windowed analysis in base pairs (optional). When specified, skip_size must also be provided.

skip_size

Number of base pairs to skip between windows (optional). Used in conjunction with window_size for windowed analysis.
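
Because window_size and skip_size are only meaningful together, a caller may want to check them before starting a long run. The helper below is a hypothetical sketch and is not part of GenoPop; how Fst itself reacts when only one of the two arguments is supplied is not stated here.

```r
# Hypothetical helper (not part of GenoPop): report which mode a call would use.
fst_mode <- function(window_size = NULL, skip_size = NULL) {
  # Exactly one of the two arguments supplied is treated as a mistake.
  if (xor(is.null(window_size), is.null(skip_size))) {
    stop("window_size and skip_size must be supplied together")
  }
  if (is.null(window_size)) "batch" else "window"
}

fst_mode()                                          # neither supplied
fst_mode(window_size = 100000, skip_size = 50000)   # both supplied
```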

Examples

vcf_file <- system.file("tests/testthat/sim.vcf.gz", package = "GenoPop")
# Location of the matching tabix index; it is not passed to Fst() below.
index_file <- system.file("tests/testthat/sim.vcf.gz.tbi", package = "GenoPop")
pop1_individuals <- c("tsk_0", "tsk_1", "tsk_2")
pop2_individuals <- c("tsk_3", "tsk_4", "tsk_5")
# Batch mode example
fst_value <- Fst(vcf_file, pop1_individuals, pop2_individuals, weighted = TRUE)
# Window mode example
fst_windows <- Fst(vcf_file, pop1_individuals, pop2_individuals, weighted = TRUE,
                   window_size = 100000, skip_size = 50000)