snp_prodBGEN: BGEN matrix product

Description

Compute a matrix product between BGEN files and a matrix. This removes the need to first read the data into an intermediate FBM object with snp_readBGEN() before computing the product. Moreover, dosages are no longer rounded to two decimal places.

Usage

snp_prodBGEN(
  bgenfiles,
  beta,
  list_snp_id,
  ind_row = NULL,
  bgi_dir = dirname(bgenfiles),
  read_as = c("dosage", "random"),
  block_size = 1000,
  ncores = 1
)
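
As an illustration of how the arguments fit together, here is a minimal sketch of a call. The file paths, SNP IDs, and effect sizes are hypothetical, and the sketch assumes that the rows of beta follow the order of unlist(list_snp_id). The call itself is guarded so that it only runs when the package and the files are actually available.

```r
# Hypothetical inputs: two BGEN files, with the variant IDs wanted from each.
bgenfiles <- c("data/chr1.bgen", "data/chr2.bgen")
list_snp_id <- list(
  c("1_88169_C_T", "1_91234_A_G"),  # IDs to read from the first file
  "2_5021_G_A"                      # IDs to read from the second file
)

# One row of beta per requested variant (assumed order: unlist(list_snp_id)).
beta <- matrix(c(0.1, -0.2, 0.05), ncol = 1)

# Cheap sanity checks before touching any file.
stopifnot(
  length(list_snp_id) == length(bgenfiles),
  nrow(beta) == length(unlist(list_snp_id))
)

# Only attempt the call if bigsnpr, the BGEN files and their indexes exist.
have_files <- all(file.exists(bgenfiles, paste0(bgenfiles, ".bgi")))
if (requireNamespace("bigsnpr", quietly = TRUE) && have_files) {
  scores <- bigsnpr::snp_prodBGEN(
    bgenfiles, beta, list_snp_id,
    read_as = "dosage", block_size = 1000, ncores = 1
  )
}
```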

Value

The product bgen_data[ind_row, 'list_snp_id'] %*% beta.
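
To make the shape of the result concrete, the following sketch reproduces the same product on a small in-memory matrix. The matrix G is a toy stand-in for the BGEN dosages, and all names and values are made up for illustration.

```r
# Toy stand-in for BGEN dosages: 4 individuals x 3 variants (made-up values).
G <- matrix(
  c(0,   1,   2, 1,
    0.5, 1.5, 0, 2,
    2,   0,   1, 1),
  nrow = 4,
  dimnames = list(NULL, c("1_100_A_G", "1_200_C_T", "1_300_G_A"))
)

snp_ids <- c("1_300_G_A", "1_100_A_G")  # variants to use, in the order of beta's rows
ind_row <- c(2, 4)                      # individuals to use
beta <- cbind(score1 = c(0.5, -1), score2 = c(0, 2))

# Same form as bgen_data[ind_row, 'list_snp_id'] %*% beta.
res <- G[ind_row, snp_ids] %*% beta
dim(res)  # length(ind_row) rows, ncol(beta) columns
```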

Arguments

bgenfiles

Character vector of paths to files with extension ".bgen". The corresponding ".bgen.bgi" index files must exist.

beta

A matrix (or a vector), with rows corresponding to list_snp_id.

list_snp_id

List (same length as the number of BGEN files) of character vectors of SNP IDs to read. These should be in the form "<chr>_<pos>_<a1>_<a2>" (e.g. "1_88169_C_T" or "01_88169_C_T"). If you have only one BGEN file, just wrap your vector of IDs with list(). This function assumes that these IDs uniquely identify variants.
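
One possible way to build such a list is sketched below, assuming a hypothetical data frame of variants and one BGEN file per chromosome (neither is required by the function). Positions are stored as integers so that paste() does not render large values in scientific notation.

```r
# Hypothetical variant table; the column names are illustrative only.
variants <- data.frame(
  chr = c(1L, 1L, 2L),
  pos = c(88169L, 91234L, 5021L),  # integers, so paste() never yields e.g. "1e+05"
  a1  = c("C", "A", "G"),
  a2  = c("T", "G", "A")
)

# Build IDs of the form "<chr>_<pos>_<a1>_<a2>".
snp_id <- with(variants, paste(chr, pos, a1, a2, sep = "_"))

# The function assumes IDs are unique, so check this up front.
stopifnot(anyDuplicated(snp_id) == 0)

# One character vector per BGEN file (assumed here: one file per chromosome).
list_snp_id <- unname(split(snp_id, variants$chr))
```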

ind_row

An optional vector of row indices (individuals) to use. If not specified, all rows are used. Don't use negative indices.
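
Since negative indices are not accepted, individuals to exclude have to be turned into the positive indices of those to keep. A minimal base R sketch, with a hypothetical sample count and exclusion list:

```r
n_samples <- 10     # hypothetical number of individuals in the BGEN files
exclude <- c(2, 5)  # hypothetical individuals to leave out

# Positive indices of everyone except the excluded individuals.
ind_row <- setdiff(seq_len(n_samples), exclude)
```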

bgi_dir

Directory of index files. Default is the same as bgenfiles.

read_as

How to read BGEN probabilities? Currently implemented:

as dosages (not rounded to two decimal places in this function), the default,
as hard calls, randomly sampled based on those probabilities (similar to PLINK option '--hard-call-threshold random').
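
The difference between the two modes can be sketched in base R as follows. This only illustrates the general idea on made-up genotype probabilities and is not the package's internal code. In particular, which allele the dosage counts is not specified here.

```r
# Made-up genotype probabilities: one row per individual,
# columns are P(genotype = 0), P(genotype = 1), P(genotype = 2).
probs <- rbind(
  c(0.9, 0.1, 0.0),
  c(0.2, 0.5, 0.3),
  c(0.0, 0.0, 1.0)
)

# "dosage": expected genotype, i.e. 0 * P0 + 1 * P1 + 2 * P2.
dosage <- drop(probs %*% 0:2)

# "random": draw one hard call per individual according to its probabilities.
set.seed(1)
hard_call <- apply(probs, 1, function(p) sample(0:2, size = 1, prob = p))
```

Dosages are deterministic and can be fractional, whereas the random hard calls are always 0, 1 or 2 and vary from one draw to the next unless a seed is set.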

block_size

Maximum size of temporary blocks (in number of variants). Default is 1000.
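
The sketch below shows, on a toy in-memory matrix, how a product can be accumulated over blocks of at most block_size variants, so that only one block needs to be held at a time. It assumes this is the role the temporary blocks play and is not taken from the package's implementation.

```r
set.seed(42)
G <- matrix(sample(0:2, 6 * 10, replace = TRUE), nrow = 6)  # 6 individuals x 10 variants
beta <- matrix(rnorm(10 * 2), ncol = 2)                     # 10 variants x 2 columns
block_size <- 4                                             # at most 4 variants per block

res <- matrix(0, nrow(G), ncol(beta))
for (start in seq(1, ncol(G), by = block_size)) {
  ind <- start:min(start + block_size - 1, ncol(G))  # variants in this block
  # Add this block's contribution to the running product.
  res <- res + G[, ind, drop = FALSE] %*% beta[ind, , drop = FALSE]
}
```

The accumulated result equals G %*% beta whatever the block size is. A smaller block_size lowers the size of each temporary block at the cost of more iterations.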

ncores

Number of cores used. Default doesn't use parallelism. You may use nb_cores().
