Learn R Programming

localScore (version 2.0.3)

daudin: Daudin [p-value] [iid]

Description

Calculates the exact p-value in the identically and independently distributed of a given local score, a sequence length that 'must not be too large' and for a given score distribution

Usage

daudin(
  local_score,
  sequence_length,
  score_probabilities,
  sequence_min = NULL,
  sequence_max = NULL,
  score_values = NULL
)

Value

A double representing the probability of a local score as high as the one given as argument.

Arguments

local_score

the observed local score

sequence_length

length of the sequence

score_probabilities

the probabilities for each score from lowest to greatest (Optionnaly with scores as names)

sequence_min

minimum score (optional if score_values OR names(score_probabilities) is defined)

sequence_max

maximum score (optional if score_values OR names(score_probabilities) is defined)

score_values

vector of integer score values, associated to score_probabilities (optional if sequence_min and sequence_max OR names(score_probabilities) are defined)

Details

Either sequence_min and sequence_max are specified as input, OR all possible score values in score_values vector ; one of this choice is required. <cr> Small in this context depends heavily on your machine. On a 3,7GHZ machine this means for daudin(1000, 5000, c(0.2, 0.2, 0.2, 0.1, 0.2, 0.1), -2, 3) an execution time of ~2 seconds. This is due to the calculation method using matrix exponentiation which takes times. The size of the matrix of the exponentiation is equal to a+1 with a the local score value. The matrix must be put at the power n, with n the sequence length. Moreover, it is known that the local score value is expected to be in mean of order log(n).

Examples

Run this code
p1 <- daudin(local_score = 4, sequence_length = 50, 
       score_probabilities = c(0.2, 0.3, 0.1, 0.2, 0.1, 0.1), 
       sequence_min = -3, sequence_max = 2)
p2 <- daudin(local_score = 4, sequence_length = 50, 
       score_probabilities = c(0.2, 0.3, 0.1, 0.2, 0.1, 0.1), 
       score_values = as.integer(-3:2))
p1 == p2 # TRUE

prob <- c(0.08, 0.32, 0.08, 0.00, 0.08, 0.00, 0.00, 0.08, 0.02, 0.32, 0.02)
score_values <- which(prob != 0) - 6 # keep only non null probability scores
prob0 <- prob[prob != 0]             # and associated probability
p <- daudin(150, 10000, prob, sequence_min = -5, sequence_max =  5)
p0 <- daudin(150, 10000, prob0, score_values = score_values)
names(prob0) <- score_values
p1 <- daudin(150, 10000, prob0)
p == p0 # TRUE
p == p1 # TRUE

Run the code above in your browser using DataLab