
plinkQC

plinkQC is an R/CRAN package for genotype quality control in genetic association studies. It makes PLINK basic statistics (e.g. missing genotyping rates per individual, allele frequencies per genetic marker) and relationship functions easily accessible from within R and allows for automatic evaluation of the results.

Full documentation is available at https://meyer-lab-cshl.github.io/plinkQC/.

plinkQC generates a per-individual and per-marker quality control report. A step-by-step guide on how to run these analyses can be found in the package documentation.

Individuals and markers that fail the quality control can subsequently be removed with plinkQC to generate a new, clean dataset.
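The report-then-clean workflow described above can be sketched roughly as follows. This is a minimal sketch, not the package's reference usage: run_basic_qc is a hypothetical wrapper, the argument names (indir, qcdir, name, path2plink, dont.check_ancestry, filterAncestry) are written from memory of the package vignette and should be checked against ?perIndividualQC, ?perMarkerQC and ?cleanData, and the ancestry check is skipped here only because it needs a prepared reference dataset.

```r
# Hypothetical wrapper: per-individual QC, per-marker QC, then a cleaned dataset.
# Assumes PLINK binary files <name>.bed/.bim/.fam exist in `indir`.
# path2plink = NULL is assumed to mean "plink is available on the PATH".
run_basic_qc <- function(indir, name, qcdir = tempdir(), path2plink = NULL) {
  # Identify individuals failing sex, heterozygosity/missingness and
  # relatedness checks (ancestry check switched off in this sketch).
  fail_individuals <- plinkQC::perIndividualQC(
    indir = indir, qcdir = qcdir, name = name,
    path2plink = path2plink,
    dont.check_ancestry = TRUE,
    interactive = FALSE, verbose = TRUE
  )

  # Identify markers failing missingness, HWE and MAF checks.
  fail_markers <- plinkQC::perMarkerQC(
    indir = indir, qcdir = qcdir, name = name,
    path2plink = path2plink,
    interactive = FALSE, verbose = TRUE
  )

  # Write a new dataset without the failing individuals and markers.
  # filterAncestry = FALSE because no ancestry results were produced above.
  clean <- plinkQC::cleanData(
    indir = indir, qcdir = qcdir, name = name,
    path2plink = path2plink,
    filterAncestry = FALSE, verbose = TRUE
  )

  list(individuals = fail_individuals, markers = fail_markers, clean = clean)
}

# Example call (requires plinkQC and PLINK to be installed; not run here):
# indir <- file.path(find.package("plinkQC"), "extdata")
# qc <- run_basic_qc(indir = indir, name = "data")
```

Wrapping the three calls in one function keeps the input directory, output directory and dataset prefix consistent across steps, which matters because cleanData reads the result files that the earlier steps write to qcdir.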

plinkQC facilitates an ancestry check for study individuals based on comparison to reference datasets. The processing of the reference datasets is documented in detail in the package documentation.

Removal of individuals based on relationship status via plinkQC is optimised to retain as many individuals as possible in the study.
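To illustrate the idea of removing related individuals while keeping as many as possible, here is a small self-contained sketch. It is a simple greedy heuristic written for illustration, not the algorithm implemented in plinkQC's relatednessFilter. The function name greedy_relatedness_filter is hypothetical, the column names IID1, IID2 and PI_HAT follow PLINK's .genome output, and the threshold of 0.1875 is used only as an illustrative cutoff.

```r
# Greedy heuristic (illustration only): repeatedly drop the individual that
# appears in the most related pairs until no related pair remains.
greedy_relatedness_filter <- function(pairs, threshold = 0.1875) {
  # Keep only pairs whose estimated relatedness exceeds the threshold.
  related <- pairs[pairs$PI_HAT > threshold, c("IID1", "IID2"), drop = FALSE]
  removed <- character(0)

  while (nrow(related) > 0) {
    # Count in how many remaining related pairs each individual appears.
    degree <- table(c(related$IID1, related$IID2))
    # Ties are broken by the first name in table order.
    worst <- names(degree)[which.max(degree)]
    removed <- c(removed, worst)
    # Drop every pair that involved the removed individual.
    keep <- related$IID1 != worst & related$IID2 != worst
    related <- related[keep, , drop = FALSE]
  }
  removed
}

# Toy data: A is related to B, C and D; D and E are below the threshold.
pairs <- data.frame(
  IID1 = c("A", "A", "A", "D"),
  IID2 = c("B", "C", "D", "E"),
  PI_HAT = c(0.5, 0.25, 0.3, 0.05),
  stringsAsFactors = FALSE
)
removed <- greedy_relatedness_filter(pairs)
print(removed)
```

In the toy data, removing A alone resolves all three related pairs, whereas removing B, C and D would cost three individuals. A greedy choice like this is not guaranteed to be optimal on every pedigree, which is one reason to rely on the package's own filter for real data.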

Installation

The current GitHub version of plinkQC is 0.3.4 and can be installed via:

library(devtools)
install_github("meyer-lab-cshl/plinkQC")

The current CRAN version of plinkQC is 0.3.3 and can be installed via:

install.packages("plinkQC")

A log of version changes is available in the package documentation.

Citation

Meyer HV (2020) plinkQC: Genotype quality control in genetic association studies.


Monthly Downloads: 378

Version: 0.3.4

License: MIT + file LICENSE

Last Published: July 15th, 2021

Functions in plinkQC (0.3.4)

checkFiltering: Check and construct PLINK sample and marker filters

check_sex: Identification of individuals with discordant sex information

check_het_and_miss: Identification of individuals with outlying missing genotype or heterozygosity rates

check_snp_missingness: Identification of SNPs with high missingness rate

checkRemoveIDs: Check and construct individual IDs to be removed

check_relatedness: Identification of related individuals

check_maf: Identification of SNPs with low minor allele frequency

checkPlink: Check PLINK software access

check_hwe: Identification of SNPs showing a significant deviation from Hardy-Weinberg equilibrium (HWE)

check_ancestry: Identification of individuals of divergent ancestry

evaluate_check_het_and_miss: Evaluate results from PLINK missing genotype and heterozygosity rate check

overviewPerMarkerQC: Overview of per marker QC

evaluate_check_sex: Evaluate results from PLINK sex check

evaluate_check_ancestry: Evaluate results from PLINK PCA on combined study and reference data

cleanData: Create plink dataset with individuals and markers passing quality control

evaluate_check_relatedness: Evaluate results from PLINK IBD estimation

relatednessFilter: Remove related individuals while keeping maximum number of individuals

run_check_ancestry: Run PLINK principal component analysis

run_check_relatedness: Run PLINK IBD estimation

run_check_sex: Run PLINK sexcheck

overviewPerIndividualQC: Overview of per sample QC

plinkQC: plinkQC

perMarkerQC: Quality control for all markers in plink-dataset

testNumerics: Test lists for different properties of numerics

perIndividualQC: Quality control for all individuals in plink-dataset

run_check_heterozygosity: Run PLINK heterozygosity rate calculation

run_check_missingness: Run PLINK missingness rate calculation
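
To make concrete what the per-marker checks listed above (check_snp_missingness, check_maf) are looking for, the following self-contained sketch computes per-SNP missingness and minor allele frequency on a toy genotype matrix. It only illustrates the quantities involved: plinkQC obtains these statistics from PLINK rather than computing them in R, and the matrix, the variable names and both thresholds are made up for this example.

```r
# Toy genotype matrix: rows are individuals, columns are SNPs.
# Genotypes are coded as the count of the alternate allele (0, 1, 2); NA = missing.
geno <- matrix(
  c(0, 1, 2, NA,    # snp1
    0, 0, 0, 0,     # snp2 (monomorphic)
    1, 1, NA, NA,   # snp3 (half of the calls missing)
    2, 1, 0, 1),    # snp4
  nrow = 4,
  dimnames = list(paste0("ind", 1:4), paste0("snp", 1:4))
)

# Fraction of missing genotype calls per SNP.
missing_rate <- colMeans(is.na(geno))

# Alternate allele frequency per SNP, then fold to the minor allele frequency.
alt_freq <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(alt_freq, 1 - alt_freq)

# Illustrative thresholds (hypothetical values chosen for this toy example).
max_missing <- 0.3
min_maf <- 0.05

# SNPs failing either criterion.
failing <- colnames(geno)[missing_rate > max_missing | maf < min_maf]
print(failing)
```

Here snp2 fails because it carries no minor allele at all, and snp3 fails because half of its genotype calls are missing. With real data the thresholds would be far stricter than in this four-individual example.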