In Solexa ChIP-seq experiments, some anomalous positions contain an
extremely high number of tags at exactly the same coordinates. The
function scans the chromosomes, determining local tag density based on
the provided window.size, and applies two types of corrections:
1. removing all tags from positions whose count exceeds the local
density by eliminate.fold;
2. reducing the tag count at positions exceeding the local density by
cap.fold to the maximal allowed count.
The statistical significance of counts exceeding either of these two
threshold densities is calculated based on a Poisson model, with the
confidence interval determined by the z.threshold Z-score parameter.
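The two corrections can be illustrated with a minimal sketch for a single chromosome. This is one plausible reading of the description, not the package's own code: the function name sketch.filter.positions is hypothetical, positions are treated as a plain numeric vector, the local density is taken as the number of neighbouring tags (plus a pseudo-count of 1) divided by window.size, and capped positions are reduced to exactly the Poisson maximum.

```r
# Hypothetical sketch of the filtering logic for one chromosome.
sketch.filter.positions <- function(pos, window.size = 200, eliminate.fold = 10,
                                    cap.fold = 4, z.threshold = 3) {
  if (length(pos) < 2) return(pos)
  tab  <- table(pos)
  upos <- as.numeric(names(tab))   # unique positions, in ascending order
  cnt  <- as.integer(tab)          # number of tags at each unique position
  half <- floor(window.size / 2)

  # Tags within +/- half of each position, excluding the position's own tags.
  around <- vapply(seq_along(upos), function(i) {
    sum(cnt[abs(upos - upos[i]) <= half]) - cnt[i]
  }, numeric(1))

  # Assumed background density; the pseudo-count keeps the expectation above zero.
  density <- (around + 1) / window.size

  # Upper-tail probability corresponding to the Z-score.
  p <- pnorm(z.threshold, lower.tail = FALSE)
  max.elim <- qpois(p, density * eliminate.fold, lower.tail = FALSE)
  max.cap  <- qpois(p, density * cap.fold, lower.tail = FALSE)

  # 1. remove positions exceeding the elimination threshold
  cnt[cnt > max.elim] <- 0L
  # 2. cap the remaining positions exceeding the cap threshold
  capped <- cnt > max.cap
  cnt[capped] <- as.integer(max.cap[capped])

  rep(upos, cnt)
}

# Synthetic data: a 50-tag pile-up at 1000 and a 3-tag stack at 5000,
# each surrounded by four single-tag positions.
pos <- c(900, 950, rep(1000, 50), 1050, 1100,
         4900, 4950, rep(5000, 3), 5050, 5100)
res <- sketch.filter.positions(pos)
```

With the default parameters, the pile-up at 1000 is removed entirely, while the stack at 5000 is only reduced, since its count lies between the cap and elimination thresholds.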
remove.local.tag.anomalies(tags,
window.size = 200,
eliminate.fold = 10,
cap.fold = 4,
z.threshold = 3)
tags: Chromosome-list of tag vectors.
window.size: Size of the window used to assess local density. Increasing the window size considerably beyond the size of the binding features will result in flattened profiles, with bound positions exhibiting a difference of just 1 tag beyond the background.
eliminate.fold: Threshold defining the fold-over-background density above which a position is considered anomalous and removed completely.
cap.fold: Threshold fold-over-background density above which a position is capped to the maximum count that is statistically likely given the local tag density.
z.threshold: Z-score used to assess the significance of a given position exceeding either of the two density thresholds.
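As a worked illustration of how the Z-score and the two fold thresholds could interact, the following sketch converts z.threshold into an upper-tail probability and then into maximum allowed counts under a Poisson model. The background values and the helper max.count are hypothetical, and the use of pnorm and qpois here is an assumption about how the thresholds might be derived, not a statement of the package internals.

```r
z.threshold <- 3
# Upper-tail probability of a standard normal at the given Z-score.
tail.p <- pnorm(z.threshold, lower.tail = FALSE)

# Hypothetical local background densities (expected tags per position).
background <- c(0.025, 0.1, 0.5)

# Largest count still considered plausible when the expectation is
# background * fold: the smallest x with P(X > x) <= tail.p.
max.count <- function(fold) {
  qpois(tail.p, background * fold, lower.tail = FALSE)
}

thresholds <- data.frame(background    = background,
                         cap.max       = max.count(4),   # cap.fold = 4
                         eliminate.max = max.count(10))  # eliminate.fold = 10
print(thresholds)
```

Under these assumptions, a position is capped when its count exceeds cap.max and removed when it exceeds eliminate.max; raising z.threshold shrinks tail.p and therefore makes both limits more permissive.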
A modified chromosome-wise tag vector list.