xLDblock: Function to obtain LD blocks

Description

xLDblock obtains LD blocks for a list of lead SNPs together with their significance levels.

Usage

xLDblock(data, include.LD = c("AFR", "AMR", "EAS", "EUR", "SAS"),
LD.customised = NULL, LD.r2 = 0.8, GR.SNP = "LDblock_GR", verbose = T,
RData.location = "http://galahad.well.ox.ac.uk/bigdata")

Arguments

data

a named input vector containing the significance level for nodes (dbSNP). The element names are dbSNP identifiers (starting with rs, or in the format 'chrN:xxx', where N is 1-22 or X and xxx is a number; for example, 'chr16:28525386'), and the element values are the significance levels (measured as p-value or fdr). Alternatively, it can be a matrix or data frame with two columns: 1st column for dbSNP, 2nd column for the significance level.
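The two accepted input forms can be sketched in base R as follows. The SNP identifiers and p-values are hypothetical, and the regular expression is only an illustrative check of the naming rule described above (it assumes rs identifiers are 'rs' followed by digits); it is not taken from the package itself.

```r
# Hypothetical SNP identifiers and p-values, for illustration only
pvals <- c(rs0000001 = 1e-8, rs0000002 = 5e-6, "chr16:28525386" = 3e-10)

# Equivalent two-column data frame: 1st column dbSNP, 2nd column significance
df <- data.frame(snp = names(pvals), pval = unname(pvals),
                 stringsAsFactors = FALSE)

# Illustrative check: rsID, or 'chrN:xxx' with N in 1-22 or X
valid <- grepl("^rs[0-9]+$|^chr([1-9]|1[0-9]|2[0-2]|X):[0-9]+$", df$snp)
```

Either `pvals` or `df` could then be passed as the `data` argument.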

include.LD

the population used to include additional SNPs in LD with the lead SNPs. LD SNPs are drawn from 1000 Genomes Project data (phase 3), which covers 26 populations and 5 super populations. The population can be one of the 5 super populations ("AFR", "AMR", "EAS", "EUR", "SAS"), which are the values listed in the Usage signature. Explanations for population codes can be found at http://www.1000genomes.org/faq/which-populations-are-part-your-study

LD.customised

a user-input matrix or data frame with 3 compulsory columns: 1st column for lead SNPs, 2nd column for LD SNPs, and 3rd column for the LD r2 value. Recommended additional columns are 'maf', 'distance' (to the nearest gene) and 'cadd'. It is designed to let users analyse their precalculated LD information. This customisation (if provided) takes priority over the built-in LD SNPs

LD.r2

the LD r2 threshold. By default, it is 0.8, meaning that SNPs in LD (r2>=0.8) with input SNPs are considered LD SNPs. It can be any value from 0.1 to 1
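As a minimal sketch of how a customised LD table and the r2 threshold relate, the base R snippet below builds a three-column table in the layout described for LD.customised and keeps the pairs with r2>=0.8. All identifiers and r2 values are hypothetical, and the filtering only mirrors the documented rule; it is not the internal code of xLDblock.

```r
# Hypothetical lead/LD SNP pairs with r2 values, for illustration only
ld_custom <- data.frame(
  lead = c("rs0000001", "rs0000001", "rs0000002"),
  ld   = c("rs0000011", "rs0000012", "rs0000021"),
  r2   = c(0.95, 0.62, 0.88),
  stringsAsFactors = FALSE
)

# Keep only the pairs meeting the threshold, as described for LD.r2
LD.r2 <- 0.8
kept <- ld_custom[ld_custom$r2 >= LD.r2, ]
```

Here the pair with r2 of 0.62 falls below the threshold and is dropped.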

GR.SNP

the genomic regions of SNPs. By default, it is 'LDblock_GR', that is, SNPs from dbSNP (version 150) restricted to GWAS SNPs and their LD SNPs (hg19). Beyond it, the user can also directly provide a customised GR object

verbose

logical to indicate whether messages will be displayed on the screen. By default, it is set to true (messages are displayed)

RData.location

a character string specifying the location of built-in RData files. See xRDataLoader for details

Value

an object of class "bLD", a list with the following components:

best: a GR object. It has optional meta-columns 'maf', 'distance' (to the nearest gene) and 'cadd', and compulsory meta-columns 'pval', 'score' (-log10(pval)), 'upstream' (the lower boundary away from the best SNP, non-positive value), 'downstream' (the upper boundary away from the best SNP, non-negative value) and 'num' (the number of SNPs in the block)
block: a GRL object, each element corresponding to a block for the best SNP with optional meta-columns 'maf', 'distance' (to the nearest gene) and 'cadd', and compulsory meta-columns 'pval', 'score' (-log10(pval)*R2, based on pval for its lead SNP), 'best' (the best SNP) and 'distance_to_best' (to the best SNP)
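The 'score' definitions above can be illustrated with a small base R calculation. The p-value and r2 values are hypothetical, and treating the lead SNP as having r2 of 1 with itself is an assumption made here so that the block formula reduces to -log10(pval) for the lead SNP; this is a sketch of the stated formulas, not the package's implementation.

```r
# Hypothetical lead-SNP p-value and r2 values, for illustration only
lead_pval <- 1e-8
r2 <- c(lead = 1, ld1 = 0.9, ld2 = 0.8)  # assumption: lead SNP has r2 = 1 with itself

# Block score as documented: -log10(pval of the lead SNP) * R2
score <- -log10(lead_pval) * r2
```

Under these assumptions, SNPs in weaker LD with the lead SNP receive proportionally lower scores.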

Examples

# NOT RUN {
# Load the XGR package and specify the location of built-in data
library(XGR)
RData.location <- "http://galahad.well.ox.ac.uk/bigdata"

# }
# NOT RUN {
# a) provide the seed SNPs with the significance info
## load ImmunoBase
data(ImmunoBase)
## get lead SNPs reported in AS GWAS and their significance info (p-values)
gr <- ImmunoBase$AS$variant
data <- GenomicRanges::mcols(gr)[,c('Variant','Pvalue')]

# b) get LD block (EUR population)
bLD <- xLDblock(data, include.LD="EUR", LD.r2=0.8,
RData.location=RData.location)

# c1) manhattan plot of the best
best <- bLD$best
best$value <- best$score
gp <- xGRmanhattan(best, top=length(best))
gp
# c2) manhattan plot of all LD block
grl_block <- bLD$block
gr_block <- BiocGenerics::unlist(grl_block, use.names=FALSE)
gr_block$value <- gr_block$score
top.label.query <- names(gr_block)[!is.na(gr_block$pval)]
#gr_block <- gr_block[as.character(GenomicRanges::seqnames(gr_block)) %in% c('chr1','chr2')]
gp <- xGRmanhattan(gr_block, top=length(gr_block),
top.label.query=top.label.query)
# c3) karyogram plot of the best
kp <- xGRkaryogram(gr=best, cytoband=TRUE, label=TRUE,
    RData.location=RData.location)
kp
# c4) circle plot of the best
library(ggbio)
gr_ideo <- xRDataLoader(RData.customised="hg19_ideogram",
RData.location=RData.location)$ideogram
#cp <- ggbio() + circle(kp$gr, geom="rect", color="steelblue", size=0.5)
cp <- ggbio() + circle(kp$gr, aes(x=start, y=num), geom="point",
color="steelblue", size=0.5)
cp <- cp + circle(gr_ideo, geom="ideo", fill="gray70") +
circle(gr_ideo, geom="scale", size=1.5) + circle(gr_ideo, geom="text",
aes(label=seqnames), vjust=0, size=3)
cp

# d) track plot of 1st LD block
gr_block <- bLD$block[[1]]
cnames <- c('score','maf','cadd')
ls_gr <- lapply(cnames, function(x) gr_block[,x])
names(ls_gr) <- cnames
ls_gr$score$Label <- names(gr_block)
ls_gr$score$Label[is.na(gr_block$pval)] <-''
GR.score.customised <- ls_gr
## cse.query
df_block <- as.data.frame(gr_block)
chr <- unique(df_block$seqnames)
xlim <- range(df_block$start)
cse.query <- paste0(chr,':',xlim[1],'-',xlim[2])
#cse.query <- paste0(chr,':',xlim[1]-1e4,'-',xlim[2]+1e4)
## xGRtrack
tks <- xGRtrack(cse.query=cse.query, GR.score="RecombinationRate",
GR.score.customised=GR.score.customised, RData.location=RData.location)
tks

###############
# Advanced use: get LD block (based on customised LD and SNP data)
###############
LD.customised <- xRDataLoader('LDblock_EUR',
RData.location=RData.location)
GR.SNP <- xRDataLoader('LDblock_GR', RData.location=RData.location)
bLD <- xLDblock(data, LD.customised=LD.customised, LD.r2=0.8,
GR.SNP=GR.SNP, RData.location=RData.location)
# }
