
qrqc (version 1.26.0)

calcKL-methods: Calculate the Kullback-Leibler Divergence Between the k-mer Distribution by Position and the k-mer Distribution Across All Positions.

Description

calcKL takes an object of a class that inherits from SequenceSummary and has a kmers slot, and returns the individual terms of the K-L divergence sum. Each term corresponds to one item in the sample space, which in this case is a k-mer.

Usage

calcKL(x)

Arguments

x
an S4 object of a class that inherits from SequenceSummary.

Value

calcKL returns a data.frame with columns:
kmer
the k-mer sequence.
position
the position in the read.
kl
the K-L term for this k-mer in the K-L sum, calculated as p(i)*log2(p(i)/q(i)).
p
the probability for this k-mer, at this position.
q
the probability for this k-mer across all positions.
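
The relationship between the p, q, and kl columns can be illustrated with a small base R sketch. It does not call qrqc; the counts are made-up toy numbers, and the way q is pooled across positions here is an assumption about how the package forms it, not a statement of its internals.

```r
# Toy k-mer counts (made-up numbers): rows are k-mers, columns are positions.
counts <- matrix(c(8, 2, 5, 5, 2, 8), nrow = 2,
                 dimnames = list(kmer = c("AAA", "CCC"),
                                 position = c("1", "2", "3")))

# p: k-mer distribution at each position (each column sums to 1).
p <- sweep(counts, 2, colSums(counts), "/")

# q: k-mer distribution across all positions (assumed: pooled counts).
q <- rowSums(counts) / sum(counts)

# Per-k-mer K-L terms, p(i) * log2(p(i) / q(i)).
# q has one entry per row, so it recycles down each column of p.
kl <- p * log2(p / q)

# Summing the terms within a position gives that position's K-L divergence.
kl_by_position <- colSums(kl)
print(kl_by_position)
```

In this toy data, position 2 has the same k-mer distribution as the pooled distribution, so its divergence is zero, while positions 1 and 3 deviate from it and have positive divergence.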

See Also

kmerKLPlot, getKmer

Examples

  library(qrqc)
  library(ggplot2)

  ## Load a somewhat contaminated FASTQ file
  s.fastq <- readSeqFile(system.file('extdata', 'test.fastq',
    package='qrqc'), hash.prop=1)

  ## As with getQual, this function is provided so that custom graphics
  ## can be made easily. For example, K-L divergence by position:
  kld <- with(calcKL(s.fastq), aggregate(kl, list(position),
    sum))
  colnames(kld) <- c("position", "KL")
  p <- ggplot(kld) + geom_line(aes(x=position, y=KL), color="blue")
  p + scale_y_continuous("K-L divergence")
