rMVP (version 1.1.1)

MVP.FarmCPU: Perform GWAS using FarmCPU method

Description

Date build: February 24, 2013

Last update: May 25, 2017

Requirement: Y (phe), GD (geno), and CV must list taxa in the same order. GD (geno) and GM (map) must list SNPs in the same order.
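The ordering requirement cannot be verified from geno alone, because geno carries no taxa column, but mismatched dimensions are an early sign of misaligned inputs. The sketch below is a minimal pre-flight check under that assumption. `check_farmcpu_inputs` is a hypothetical helper, not part of rMVP, and the toy data are made up for illustration.

```r
# Hypothetical helper (not part of rMVP): checks that the inputs have
# compatible shapes. It cannot confirm that the ORDER of taxa or SNPs
# matches; it only catches size mismatches.
check_farmcpu_inputs <- function(phe, geno, map, CV = NULL) {
  n <- ncol(geno)  # individuals are the columns of geno
  m <- nrow(geno)  # markers are the rows of geno
  if (nrow(phe) != n) stop("phe must have one row per individual in geno")
  if (nrow(map) != m) stop("map must have one row per marker in geno")
  if (!is.null(CV) && nrow(CV) != n) {
    stop("CV must have one row per individual in geno")
  }
  invisible(TRUE)
}

# Toy data (assumed layout): 5 markers, 3 individuals
geno <- matrix(c(0, 1, 2, 0, 1,
                 2, 2, 0, 1, 0,
                 1, 0, 1, 2, 2), nrow = 5, ncol = 3)
phe <- data.frame(Taxa = c("a", "b", "c"), Trait = c(1.2, 0.7, 2.1))
map <- data.frame(SNP = paste0("s", 1:5), Chr = 1, Pos = seq(100, 500, 100))

check_farmcpu_inputs(phe, geno, map)
```

If the phenotype file and the genotype were produced from different sample lists, reorder the phenotype by sample ID before calling MVP.FarmCPU; a dimension check alone will not catch a shuffled order.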

Usage

MVP.FarmCPU(
  phe,
  geno,
  map,
  CV = NULL,
  geno_ind_idx = NULL,
  P = NULL,
  method.sub = "reward",
  method.sub.final = "reward",
  method.bin = c("EMMA", "static", "FaST-LMM"),
  bin.size = c(5e+05, 5e+06, 5e+07),
  bin.selection = seq(10, 100, 10),
  memo = "MVP.FarmCPU",
  Prior = NULL,
  ncpus = 2,
  maxLoop = 10,
  threshold.output = 0.01,
  converge = 1,
  iteration.output = FALSE,
  p.threshold = NA,
  QTN.threshold = 0.01,
  bound = NULL,
  verbose = TRUE
)

Value

An m by 4 results matrix, where m is the number of markers. The four columns are SNP_ID, Chr, Pos, and p-value.
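A common next step is to keep only the markers that pass a multiple-testing threshold. The sketch below applies a Bonferroni cutoff to a toy table laid out as described above. The column names (`SNP_ID`, `Chr`, `Pos`, `P`) and the data are assumptions for illustration; inspect your own output with str() before indexing columns by name.

```r
# Toy stand-in for a results table; names and values are made up.
res <- data.frame(
  SNP_ID = paste0("snp", 1:6),
  Chr    = c(1, 1, 2, 2, 3, 3),
  Pos    = c(100, 900, 250, 700, 50, 400),
  P      = c(0.2, 1e-4, 0.03, 0.5, 0.008, 0.9)
)

alpha <- 0.05
bonf  <- alpha / nrow(res)   # Bonferroni threshold over m markers

sig <- res[res$P < bonf, ]   # markers passing the threshold
sig <- sig[order(sig$P), ]   # most significant first
print(sig)
```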

Arguments

phe

phenotype, n by t matrix, n is sample size, t is number of phenotypes

geno

genotype, m by n matrix, m is marker size, n is sample size. This is the pure genotype data matrix (GD); it has no column for taxa.

map

SNP map information, m by 3 matrix, m is marker size, the three columns are SNP_ID, Chr, and Pos

CV

covariates, n by c matrix, n is sample size, c is number of covariates

geno_ind_idx

the index of effective genotyped individuals

P

start p values for all SNPs

method.sub

method used in substitution process, five options: 'penalty', 'reward', 'mean', 'median', or 'onsite'

method.sub.final

method used in the final substitution process, five options: 'penalty', 'reward', 'mean', 'median', or 'onsite'

method.bin

method for selecting the most appropriate bins, three options: 'static', 'EMMA' or 'FaST-LMM'

bin.size

bin sizes for all iterations, a vector, the bin size is always from large to small

bin.selection

number of selected bins in each iteration, a vector

memo

a marker on output file name

Prior

prior information, four columns, which are SNP_ID, Chr, Pos, P-value

ncpus

number of threads used for parallel computation

maxLoop

maximum number of iterations

threshold.output

only the GWAS results with p-values lower than threshold.output will be output

converge

a number from 0 to 1. If the pseudo QTNs selected in the last and second-to-last iterations overlap with a certain probability (given by converge), the loop stops

iteration.output

whether to output results of all iterations

p.threshold

if all p values generated in the first iteration are bigger than p.threshold, FarmCPU stops

QTN.threshold

in second and later iterations, only SNPs with lower p-values than QTN.threshold have chances to be selected as pseudo QTNs

bound

maximum number of SNPs selected as pseudo QTNs in each iteration

verbose

whether to print detailed output.
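The interplay of QTN.threshold, bound, and converge can be pictured with a small sketch. It is a simplified illustration, not rMVP's internal code: it skips the binning step entirely, and it assumes the overlap between two iterations is measured as intersection size over union size. Both helper functions are hypothetical.

```r
# Hypothetical helpers for illustration only; not rMVP internals.

# Keep SNPs with p-values below QTN.threshold, most significant first,
# and cap the count at `bound` when one is given.
select_candidates <- function(p, QTN.threshold = 0.01, bound = NULL) {
  idx <- which(p < QTN.threshold)
  idx <- idx[order(p[idx])]
  if (!is.null(bound)) idx <- head(idx, bound)
  idx
}

# Overlap of two pseudo QTN sets, assumed here to be |intersection| / |union|.
qtn_overlap <- function(current, previous) {
  u <- union(current, previous)
  if (length(u) == 0) return(1)  # two empty sets count as identical
  length(intersect(current, previous)) / length(u)
}

p        <- c(0.5, 0.001, 0.02, 0.0005, 0.009, 0.3)  # toy p-values
prev_qtn <- c(2, 4)                                  # previous iteration
cur_qtn  <- select_candidates(p, QTN.threshold = 0.01, bound = 2)

converge  <- 1
overlap   <- qtn_overlap(cur_qtn, prev_qtn)
stop_loop <- overlap >= converge   # TRUE means the iterations would end
```

With converge = 1 the loop in this sketch ends only when two consecutive iterations select exactly the same set; a lower value tolerates partial overlap.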

Author

Xiaolei Liu and Zhiwu Zhang

Examples

# \donttest{
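library(rMVP)
library(bigmemory)  # provides attach.big.matrix() and deepcopy()

# Load the phenotype and keep individuals with a non-missing trait value
phePath <- system.file("extdata", "07_other", "mvp.phe", package = "rMVP")
phenotype <- read.table(phePath, header = TRUE)
idx <- !is.na(phenotype[, 2])
phenotype <- phenotype[idx, ]
print(dim(phenotype))

# Attach the genotype (markers in rows, individuals in columns)
# and keep the same individuals as in the phenotype
genoPath <- system.file("extdata", "06_mvp-impute", "mvp.imp.geno.desc", package = "rMVP")
genotype <- attach.big.matrix(genoPath)
genotype <- deepcopy(genotype, cols = idx)
print(dim(genotype))

# Load the SNP map
mapPath <- system.file("extdata", "06_mvp-impute", "mvp.imp.geno.map", package = "rMVP")
map <- read.table(mapPath, header = TRUE)

# Run FarmCPU with two iterations and static binning
farmcpu <- MVP.FarmCPU(phe = phenotype, geno = genotype, map = map,
                       maxLoop = 2, method.bin = "static")
str(farmcpu)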
# }
