
rMVP (version 0.99.14.1)

MVP.FarmCPU: Perform GWAS using FarmCPU method

Description

Build date: February 24, 2013. Last update: May 25, 2017.

Requirement: Y, GD, and CV should have the same taxa order. GD and GM should have the same SNP order.
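The ordering requirement can be checked before the function is called. The following is a minimal base R sketch on toy data, not part of rMVP. It assumes the sample IDs of the genotype columns are kept in a separate character vector (the hypothetical `geno_ids`), since the genotype matrix has no taxa column, and that the first phenotype column holds the taxa names. All object names are illustrative.

```r
# Hypothetical toy data: 4 samples, 3 markers.
geno_ids <- c("s1", "s2", "s3", "s4")       # sample order of the genotype columns
geno <- matrix(c(0, 1, 2, 0,
                 2, 2, 0, 1,
                 1, 0, 0, 2),
               nrow = 3, byrow = TRUE)       # m by n, no taxa column
phe <- data.frame(Taxa  = c("s3", "s1", "s4", "s2"),
                  Trait = c(2.1, 3.4, 1.8, 2.9),
                  stringsAsFactors = FALSE)
map <- data.frame(SNP_ID = c("snp1", "snp2", "snp3"),
                  Chr    = c(1, 1, 2),
                  Pos    = c(1200, 56000, 3400),
                  stringsAsFactors = FALSE)

# Reorder phenotype rows so they follow the genotype column order.
phe_aligned <- phe[match(geno_ids, phe$Taxa), ]

# Fail early if a sample is missing, the orders disagree,
# or the map does not have one row per marker.
stopifnot(!anyNA(phe_aligned$Taxa),
          identical(phe_aligned$Taxa, geno_ids),
          ncol(geno) == nrow(phe_aligned),
          nrow(geno) == nrow(map))
```

Matching on IDs rather than sorting both objects keeps the genotype matrix untouched, which matters when it is a large file-backed matrix.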

Usage

MVP.FarmCPU(phe, geno, map, CV = NULL, priority = "speed", P = NULL,
  method.sub = "reward", method.sub.final = "reward",
  method.bin = "EMMA", bin.size = c(5e+05, 5e+06, 5e+07),
  bin.selection = seq(10, 100, 10), memo = "MVP.FarmCPU",
  Prior = NULL, ncpus = 2, bar = TRUE, maxLoop = 10,
  threshold.output = 0.01, converge = 1, iteration.output = FALSE,
  p.threshold = NA, QTN.threshold = NULL, bound = NULL)

Arguments

phe

phenotype, n by t matrix, n is sample size, t is number of phenotypes

geno

genotype, m by n matrix, m is marker size, n is sample size. This is a pure genotype data matrix (GD); there is no column for taxa.

map

SNP map information, m by 3 matrix, m is marker size, the three columns are SNP_ID, Chr, and Pos

CV

covariates, n by c matrix, n is sample size, c is number of covariates

priority

modes, two options: 'speed' or 'memory'

P

starting p-values for all SNPs

method.sub

method used in substitution process, five options: 'penalty', 'reward', 'mean', 'median', or 'onsite'

method.sub.final

method used in the final substitution process, five options: 'penalty', 'reward', 'mean', 'median', or 'onsite'

method.bin

method for selecting the most appropriate bins, two options: 'EMMA' or 'FaSTLMM'

bin.size

bin sizes for all iterations, a vector, the bin size is always from large to small

bin.selection

number of selected bins in each iteration, a vector

memo

a marker on output file name

Prior

prior information, four columns, which are SNP_ID, Chr, Pos, P-value

ncpus

number of threads used for parallel computation

bar

if TRUE, the progress bar will be drawn on the terminal

maxLoop

maximum number of iterations

threshold.output

only the GWAS results with p-values lower than threshold.output will be output

converge

a number from 0 to 1; if the pseudo QTNs selected in the last and second-to-last iterations overlap with at least a certain probability (the probability is converge), the loop stops

iteration.output

whether to output results of all iterations

p.threshold

if all p values generated in the first iteration are bigger than p.threshold, FarmCPU stops

QTN.threshold

in second and later iterations, only SNPs with lower p-values than QTN.threshold have chances to be selected as pseudo QTNs

bound

maximum number of SNPs selected as pseudo QTNs in each iteration
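To make the roles of QTN.threshold, bound, and converge concrete, here is a minimal base R sketch on made-up p-values. It is not rMVP's internal code: the helper names `select_candidates` and `overlap` are hypothetical, and measuring overlap as the size of the intersection divided by the size of the union is an assumption chosen for illustration.

```r
# Illustrative helper (not part of rMVP): pick candidate pseudo QTNs.
select_candidates <- function(p, qtn_threshold, bound) {
  idx <- which(p < qtn_threshold)   # only SNPs below the threshold qualify
  idx <- idx[order(p[idx])]         # most significant first
  head(idx, bound)                  # keep at most `bound` SNPs
}

# Illustrative overlap measure between two sets of pseudo QTNs
# (assumed here to be intersection size over union size).
overlap <- function(current, previous) {
  all_qtn <- union(current, previous)
  if (length(all_qtn) == 0) return(1)
  length(intersect(current, previous)) / length(all_qtn)
}

# Made-up p-values from two consecutive iterations over six SNPs.
p_prev <- c(0.20, 1e-6, 0.03, 5e-5, 0.60, 2e-4)
p_curr <- c(0.30, 2e-6, 0.04, 1e-5, 0.50, 0.02)

qtn_prev <- select_candidates(p_prev, qtn_threshold = 0.01, bound = 2)
qtn_curr <- select_candidates(p_curr, qtn_threshold = 0.01, bound = 2)

# With converge = 1 the loop would stop only when both sets are identical.
converge <- 1
stop_loop <- overlap(qtn_curr, qtn_prev) >= converge
```

In this toy case both iterations select SNPs 2 and 4, so the overlap is 1 and the stopping condition holds.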

Value

an m by 4 results matrix, m is marker size, the four columns are SNP_ID, Chr, Pos, and p-value
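A result with these four columns can be filtered for significant markers in base R. The sketch below uses a hypothetical data frame in place of real output; the column name `P` and the Bonferroni cutoff are assumptions for illustration, not something the function applies itself.

```r
# Hypothetical result with the four documented columns.
res <- data.frame(SNP_ID = paste0("snp", 1:5),
                  Chr    = c(1, 1, 2, 2, 3),
                  Pos    = c(1200, 56000, 3400, 98000, 450),
                  P      = c(0.5, 1e-8, 0.002, 3e-4, 0.9),
                  stringsAsFactors = FALSE)

# Bonferroni-corrected cutoff: alpha divided by the number of markers tested.
alpha <- 0.05
cutoff <- alpha / nrow(res)

# Keep markers below the cutoff, most significant first.
hits <- res[res$P < cutoff, ]
hits <- hits[order(hits$P), ]
print(hits)
```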

Examples

# NOT RUN {
library(rMVP)

# Load the example phenotype and drop samples with a missing trait value
phePath <- system.file("extdata", "07_other", "mvp.phe", package = "rMVP")
phenotype <- read.table(phePath, header = TRUE)
idx <- !is.na(phenotype[, 2])
phenotype <- phenotype[idx, ]
print(dim(phenotype))

# Attach the imputed genotype (m by n) and keep the same samples
genoPath <- system.file("extdata", "06_mvp-impute", "mvp.imp.geno.desc", package = "rMVP")
genotype <- bigmemory::attach.big.matrix(genoPath)
genotype <- genotype[, idx]
print(dim(genotype))

# Read the SNP map from the packaged file located above
mapPath <- system.file("extdata", "07_other", "mvp.map", package = "rMVP")
map <- read.table(mapPath, header = TRUE)

# Run FarmCPU on physical cores only, limited to 3 iterations
farmcpu <- MVP.FarmCPU(phe=phenotype, geno=genotype, map=map, method.bin="static",
  ncpus=parallel::detectCores(logical = FALSE), maxLoop=3, P=NULL, method.sub="reward",
  method.sub.final="reward", bin.size=c(5e5,5e6,5e7), bin.selection=seq(10,100,10),
  Prior=NULL, p.threshold=NA, QTN.threshold=NULL, bound=NULL)
str(farmcpu)
# }
