coll.test: the Collision test

Description

The Collision test for testing random number generators.

Usage

coll.test(rand, lenSample = 2^14, segments = 2^10, tdim = 2, 
nbSample = 1000, echo = TRUE, ...)

Value

a list with the following components:

statistic: the value of the chi-squared statistic.

p.value: the p-value of the test.

observed: the observed counts.

expected: the expected counts under the null hypothesis.

residuals: the Pearson residuals, (observed - expected) / sqrt(expected).

Arguments

rand: a function generating random numbers. Its first argument must be the 'number of observations' argument, as in runif.
lenSample: numeric for the length of generated samples.
segments: numeric for the number of segments to which the interval [0, 1] is split.
tdim: numeric for the length of the disjoint t-tuples.
nbSample: numeric for the overall sample number.
echo: logical to plot detailed results, default TRUE
...: further arguments to pass to function rand

Author

Christophe Dutang.

Details

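We consider outputs of multiple calls to a random number generator rand. Let us denote by $n$ the length of each sample (i.e. the lenSample argument), $k$ the number of cells (determined by the segments and tdim arguments: splitting $[0, 1]$ into segments pieces along each of tdim dimensions gives segments^tdim cells) and $m$ the number of samples (i.e. the nbSample argument).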

A collision occurs when a random number falls in a cell that already contains at least one random number. Let $C$ denote the number of collisions in a sample.
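As an illustration of this definition, here is a minimal sketch for the one-dimensional case (tdim = 1). The helper count_collisions is hypothetical and is not part of the package; it assumes the numbers lie in [0, 1) and that cells are equal-width intervals.

```r
# Hypothetical helper (not a package function): count collisions in one sample.
# Assumes u contains numbers in [0, 1) and k equal-width cells (tdim = 1).
count_collisions <- function(u, k) {
  cell <- floor(u * k)               # cell index in 0, ..., k - 1
  length(u) - length(unique(cell))   # every repeat visit to a cell is one collision
}

# Four numbers, four cells: 0.10 and 0.15 share the first cell, so C = 1.
count_collisions(c(0.10, 0.15, 0.60, 0.90), k = 4)

# One sample of length 2^7 spread over 2^10 cells.
set.seed(1)
count_collisions(runif(2^7), k = 2^10)
```

The number of collisions equals the sample length minus the number of occupied cells, which is why no explicit loop over the sample is needed.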

The distribution of the collision number $C$ is given by $$ P(C = c) = \prod_{i=0}^{n-c-1}\frac{k-i}{k} \frac{1}{k^c} {}_2S_n^{n-c}, $$ where ${}_2S_n^j$ denotes the Stirling number of the second kind (the number of ways to partition $n$ items into $j$ non-empty subsets) and $c=0,\dots,n-1$.
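The exact formula can be evaluated directly for small $n$. The sketch below is an assumption about how one might do it, not the package's implementation: stirling2_row and coll_pmf are hypothetical names, and the Stirling numbers are built with the standard recurrence $S(i, j) = j\,S(i-1, j) + S(i-1, j-1)$ in double precision, so it is only suitable for small samples.

```r
# Hypothetical helper: Stirling numbers of the second kind S(n, j), j = 0, ..., n.
stirling2_row <- function(n) {
  S <- c(1, rep(0, n))                       # row 0: S(0, 0) = 1
  for (i in seq_len(n)) {
    # S(i, 0) = 0, then S(i, j) = j * S(i - 1, j) + S(i - 1, j - 1) for j = 1, ..., n
    S <- c(0, (1:n) * S[-1] + S[-(n + 1)])
  }
  S
}

# Hypothetical helper: P(C = c) for c = 0, ..., n - 1 with n numbers and k cells.
coll_pmf <- function(n, k) {
  S <- stirling2_row(n)
  sapply(0:(n - 1), function(ncoll) {
    occ <- n - ncoll                          # number of occupied cells
    prod((k - 0:(occ - 1)) / k) / k^ncoll * S[occ + 1]
  })
}

p <- coll_pmf(10, 20)
sum(p)            # the probabilities should sum to 1 up to rounding error
coll_pmf(2, 2)    # two numbers, two cells: P(C = 0) = P(C = 1) = 0.5
```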

However, this formula is impractical for large $n$, since the Stirling numbers need $O(n\log(n))$ time to be computed. We use a Gaussian approximation if $\frac{n}{k}>\frac{1}{32}$ and $n\geq 2^8$, a Poisson approximation if $\frac{n}{k} < \frac{1}{32}$, and the exact formula otherwise.
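The selection rule stated above can be written as a small function. This is only a transcription of the rule as documented here; choose_method is a hypothetical name and the package's internal code may be organised differently.

```r
# Hypothetical helper: which distribution the documented rule selects
# for sample length n and number of cells k.
choose_method <- function(n, k) {
  ratio <- n / k
  if (ratio > 1/32 && n >= 2^8) {
    "gaussian"
  } else if (ratio < 1/32) {
    "poisson"
  } else {
    "exact"
  }
}

choose_method(2^10, 2^10)   # "gaussian": n / k = 1 and n >= 2^8
choose_method(2^14, 2^20)   # "poisson":  n / k = 1/64
choose_method(2^7, 2^10)    # "exact":    n / k = 1/8 but n < 2^8
```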

Finally, we generate $m$ samples of random numbers and count the collisions in each sample. Comparing the observed frequencies of the collision counts with those expected under the distribution of $C$ gives a chi-squared statistic.
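To make this last step concrete, here is a toy sketch with $n = 2$ numbers and $k = 2$ cells, where $P(C = 0) = P(C = 1) = 1/2$ can be checked by hand. The variable names mirror the components listed under Value, but the binning and the degrees of freedom used here are assumptions, not necessarily what coll.test does internally.

```r
set.seed(42)
m <- 1000   # number of samples
n <- 2      # length of each sample
k <- 2      # number of cells

# Number of collisions in each of the m samples (tdim = 1).
ncoll <- replicate(m, {
  cell <- floor(runif(n) * k)
  n - length(unique(cell))
})

observed <- tabulate(ncoll + 1, nbins = n)   # counts for C = 0, ..., n - 1
expected <- m * c(0.5, 0.5)                  # m * P(C = c) for this toy case

statistic <- sum((observed - expected)^2 / expected)
residuals <- (observed - expected) / sqrt(expected)   # Pearson residuals
# Assumed degrees of freedom: number of bins minus one.
p.value <- pchisq(statistic, df = length(observed) - 1, lower.tail = FALSE)

list(statistic = statistic, p.value = p.value,
     observed = observed, expected = expected, residuals = residuals)
```

The statistic is the sum of the squared Pearson residuals, so large residuals point to the collision counts that deviate most from the null hypothesis.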

References

Planchet F., Jacquemin J. (2003), L'utilisation de methodes de simulation en assurance. Bulletin Francais d'Actuariat, vol. 6, 11, 3-69. (available online)

L'Ecuyer P. (2001), Software for uniform random number generation: distinguishing the good and the bad. Proceedings of the 2001 Winter Simulation Conference. (available online)

L'Ecuyer P., Simard R. (2007), TestU01: a C library for empirical testing of random number generators. ACM Trans. on Mathematical Software 33(4), 22.

Examples


# (1) poisson approximation
#
coll.test(runif, 2^7, 2^10, 1, 100)

# (2) exact distribution
#
coll.test(SFMT, 2^7, 2^10, 1, 100)
