Check coloc dataset inputs for errors
check_dataset(d, suffix = "", req = c("snp"), warn.minp = 1e-06)
check.dataset(...)
d: dataset to check
suffix: string to identify which dataset (1 or 2)
req: names of elements that must be present
warn.minp: print a warning if no p value < warn.minp
...: arguments passed to check_dataset()
NULL if no errors found
Coloc is flexible, requiring perhaps only p values, or z scores, or effect estimates and standard errors, but this flexibility makes it difficult to describe exactly which combinations of items are required.
pvalues: P-values for each SNP in dataset 1
N: number of samples in dataset 1
MAF: minor allele frequency of the variants
beta: regression coefficient for each SNP from dataset 1
varbeta: variance of beta
type: the type of data in dataset 1, either "quant" or "cc" to denote quantitative or case-control
s: for a case-control dataset, the proportion of samples in dataset 1 that are cases
sdY: for a quantitative trait, the population standard deviation of the trait. If not given, it can be estimated from the vectors of varbeta and MAF
snp: a character vector of SNP ids, optional. If present, it will be used to merge dataset1 and dataset2. Otherwise, the function assumes dataset1 and dataset2 contain results for the same SNPs in the same order.
Some of these items may be missing, but you must always give type.
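As a minimal sketch of what such a dataset can look like, the following builds a named list for a quantitative trait from simulated values. The SNP ids, sample size, and all numbers are made up for illustration, and the call to check_dataset() is only attempted when the coloc package is installed.

```r
# Simulated summary statistics for 5 SNPs (all values are illustrative only)
set.seed(1)
n_snps <- 5
d1 <- list(
  snp     = paste0("rs", seq_len(n_snps)),  # hypothetical SNP ids
  beta    = rnorm(n_snps, sd = 0.1),        # regression coefficients
  varbeta = rep(0.01, n_snps),              # variance of beta
  MAF     = runif(n_snps, 0.05, 0.5),       # minor allele frequencies
  N       = 1000,                           # number of samples
  type    = "quant",                        # must always be given
  sdY     = 1                               # trait standard deviation
)

# Run the check only when coloc is available
if (requireNamespace("coloc", quietly = TRUE)) {
  coloc::check_dataset(d1, suffix = "1")
}
```

Because these simulated effects are small, the check may print a warning that no p value falls below warn.minp; that is a warning, not an error.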
Then scalars describing the samples used:
N
s, if type=="cc"
sdY, if type=="quant" and sdY is known
If sdY is unknown, it will be approximated, and this will require beta, varbeta, N, MAF.
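To make the approximation concrete, here is a rough sketch of one way sdY could be estimated from varbeta, MAF, and N. It assumes var(beta) is approximately sdY^2 / (2 * N * MAF * (1 - MAF)), which is reasonable for an additive model under Hardy-Weinberg equilibrium when each SNP explains little of the trait variance. The function name estimate_sdY is hypothetical, and this is not claimed to be coloc's exact implementation.

```r
# Hypothetical helper: approximate the trait standard deviation.
# Assumes var(beta) ~ sdY^2 / (2 * N * MAF * (1 - MAF)).
estimate_sdY <- function(varbeta, MAF, N) {
  genotype_scale <- 2 * N * MAF * (1 - MAF)  # N * var(genotype) under HWE
  inv_var <- 1 / varbeta
  # Regress through the origin: genotype_scale = sdY^2 * (1 / varbeta)
  fit <- lm(genotype_scale ~ inv_var - 1)
  sqrt(unname(coef(fit)["inv_var"]))
}

# Simulated check: build varbeta from a known sdY, then try to recover it
MAF <- c(0.1, 0.2, 0.3, 0.4)
N <- 5000
true_sdY <- 2
varbeta <- true_sdY^2 / (2 * N * MAF * (1 - MAF))
est <- estimate_sdY(varbeta, MAF, N)
```

Fitting a regression across all SNPs, rather than solving for sdY one SNP at a time, pools the information so that a single noisy varbeta has less influence on the estimate.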
Then, if not already covered above, the summary statistics describing the results: either beta and varbeta, or pvalues and MAF.
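The rules above can be sketched as a small function that reports which items a dataset still lacks. The helper missing_items is hypothetical and only mirrors the description given here; it does not reproduce the actual checks performed by check_dataset().

```r
# Hypothetical helper (not part of coloc): list the items that the rules
# described above would still ask for, given a dataset as a named list.
missing_items <- function(d) {
  has <- function(x) all(x %in% names(d))
  if (!has("type")) return("type")            # type must always be given
  out <- character(0)
  if (!has("N")) out <- c(out, "N")
  if (d$type == "cc" && !has("s")) out <- c(out, "s")
  if (d$type == "quant" && !has("sdY") &&
      !has(c("beta", "varbeta", "N", "MAF"))) {
    out <- c(out, "sdY (or beta, varbeta, N, MAF)")
  }
  if (!has(c("beta", "varbeta")) && !has(c("pvalues", "MAF"))) {
    out <- c(out, "beta + varbeta (or pvalues + MAF)")
  }
  out
}

# A case-control dataset with p values and MAF but no case proportion
cc_data <- list(pvalues = c(1e-8, 0.2), MAF = c(0.1, 0.3),
                type = "cc", N = 2000)
missing_items(cc_data)  # returns "s"
```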
check_dataset calls stop() unless a series of expectations on the dataset input format are met.
This is a helper function for use by other coloc functions, but you can use it directly to check the format of a dataset to be supplied to coloc.abf(), coloc.signals(), finemap.abf(), or finemap.signals().
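Since a failed check is signalled with stop(), a direct call can be wrapped in tryCatch() to capture the message instead of halting a script. The sketch below is illustrative: safe_check and demo_checker are hypothetical names, and demo_checker is a stand-in used only so the control flow can be shown without coloc installed.

```r
# Hypothetical wrapper: turn the error raised by a checker into a message.
# `checker` defaults to coloc::check_dataset when coloc is installed.
safe_check <- function(d, checker = NULL, ...) {
  if (is.null(checker)) {
    if (!requireNamespace("coloc", quietly = TRUE)) {
      return("coloc is not installed")
    }
    checker <- coloc::check_dataset
  }
  tryCatch({
    checker(d, ...)
    "ok"
  }, error = function(e) conditionMessage(e))
}

# Stand-in checker, used here only to demonstrate the control flow
demo_checker <- function(d, ...) {
  if (!("type" %in% names(d))) stop("variable type not set")
  NULL
}

res_bad  <- safe_check(list(N = 100), checker = demo_checker)
res_good <- safe_check(list(N = 100, type = "cc"), checker = demo_checker)
```

Here res_bad holds the error message and res_good holds "ok"; with coloc installed, omitting the checker argument would route the same dataset through check_dataset() instead.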