coll.test: the Collision test

Description

The Collision test for testing random number generators.

Usage

coll.test(rand, lenSample = 2^14, segments = 2^10, tdim = 2, 
nbSample = 1000, echo = TRUE, ...)

Value

A list with the following components:

statistic: the value of the chi-squared statistic.
p.value: the p-value of the test.
observed: the observed counts.
expected: the expected counts under the null hypothesis.
residuals: the Pearson residuals, (observed - expected) / sqrt(expected).

Arguments

rand: a function generating random numbers. Its first argument must be the 'number of observations' argument, as in runif.
lenSample: numeric for the length of generated samples.
segments: numeric for the number of segments to which the interval [0, 1] is split.
tdim: numeric for the length of the disjoint t-tuples.
nbSample: numeric for the number of samples.
echo: logical to plot detailed results; default is TRUE.
...: further arguments to pass to the function rand.

Author

Christophe Dutang.

Details

We consider the outputs of multiple calls to a random number generator rand. Let $n$ denote the length of each sample (the lenSample argument), $k$ the number of cells (called nbCell here; it is not an argument in the Usage above and presumably equals segments^tdim) and $m$ the number of samples (the nbSample argument).

A collision occurs when a random number falls in a cell that already contains at least one random number. Let $C$ denote the number of collisions in a sample.
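
To make the definition concrete, here is a minimal sketch of counting collisions in one sample for the one-dimensional case (tdim = 1). The helper name count_collisions is hypothetical and not part of the package; it assumes the values lie in [0, 1) and that the cells are k equal-width intervals.

```r
# Hypothetical helper (not the package's code): number of collisions in a
# sample u of values in [0, 1), using k equal-width cells.
count_collisions <- function(u, k) {
  cell <- floor(u * k)               # cell index in 0, ..., k - 1
  # Every value beyond the first one in a cell is a collision, so
  # collisions = sample size - number of occupied cells.
  length(u) - length(unique(cell))
}

set.seed(1)
u <- runif(2^7)
count_collisions(u, 2^10)
```

With n values and j occupied cells the count is n - j, which is why the distribution of $C$ can be expressed through the number of occupied cells.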

The distribution of the number of collisions $C$ is given by $$ P(C = c) = \prod_{i=0}^{n-c-1}\frac{k-i}{k} \frac{1}{k^c} {}_2S_n^{n-c}, $$ where ${}_2S_n^j$ denotes the Stirling number of the second kind (the number of ways to partition $n$ elements into $j$ non-empty subsets) and $c=0,\dots,n-1$.
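
For small $n$ the formula can be evaluated directly. The sketch below is an illustration only, not the package's implementation; the function names stirling2_row and collision_pmf are hypothetical. The Stirling numbers are built with the standard recurrence $S(i, j) = j\,S(i-1, j) + S(i-1, j-1)$.

```r
# Hypothetical helper: Stirling numbers of the second kind S(n, j) for
# j = 0, ..., n; element j + 1 of the result holds S(n, j).
stirling2_row <- function(n) {
  S <- c(1, rep(0, n))               # row for i = 0: S(0, 0) = 1
  for (i in seq_len(n)) {
    j <- 1:i
    new <- numeric(n + 1)
    new[j + 1] <- j * S[j + 1] + S[j]  # S(i, j) = j S(i-1, j) + S(i-1, j-1)
    S <- new
  }
  S
}

# Hypothetical helper: P(C = c) for c = 0, ..., n - 1 with k cells.
# Intended for small n only, since S(n, j) grows very quickly.
collision_pmf <- function(n, k) {
  S <- stirling2_row(n)
  sapply(0:(n - 1), function(cc) {
    occ <- n - cc                              # number of occupied cells
    falling <- prod((k - 0:(occ - 1)) / k)     # prod_{i=0}^{n-c-1} (k - i) / k
    falling * S[occ + 1] / k^cc
  })
}

p <- collision_pmf(8, 16)
sum(p)   # the probabilities should sum to 1 up to rounding error
```

For example, with $n = 3$ and $k = 2$ the three values always produce at least one collision, and the formula gives probabilities 0, 0.75 and 0.25 for $c = 0, 1, 2$.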

However, this formula cannot be used for large $n$, since the Stirling numbers need $O(n\log(n))$ time to be computed. We use a Gaussian approximation if $\frac{n}{k}>\frac{1}{32}$ and $n\geq 2^8$, a Poisson approximation if $\frac{n}{k} < \frac{1}{32}$, and the exact formula otherwise.
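
The selection rule can be written as a small function. This is a sketch of the rule as stated in this paragraph; the function name choose_method is hypothetical, and the package's internal code may order or phrase the conditions differently.

```r
# Hypothetical helper: which distribution the rule above selects for a
# sample length n and a number of cells k.
choose_method <- function(n, k) {
  ratio <- n / k
  if (ratio > 1/32 && n >= 2^8) {
    "gaussian"
  } else if (ratio < 1/32) {
    "poisson"
  } else {
    "exact"
  }
}

choose_method(2^14, 2^20)  # n/k = 1/64: "poisson"
choose_method(2^10, 2^10)  # n/k = 1, n >= 2^8: "gaussian"
choose_method(2^7, 2^10)   # n/k = 1/8, n < 2^8: "exact"
```

Note that a ratio of exactly 1/32 satisfies neither strict inequality and therefore falls into the exact case.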

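Finally, we generate $m$ samples of random numbers and count the number of collisions in each of them. The observed frequencies of the collision counts are then compared with the expected frequencies under the distribution above, which yields a chi-squared statistic.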
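
The whole procedure can be sketched end to end as follows. This is an illustration under stated assumptions, not the package's implementation: the names collision_pmf and collision_chisq are hypothetical, only tdim = 1 is handled, the exact distribution is obtained through the occupancy recurrence (a numerically stable alternative to evaluating Stirling numbers directly), and the pooling of classes with small expected counts into a single class is an assumption made here to keep the statistic well behaved.

```r
# Hypothetical helper: exact P(C = c), c = 0, ..., n - 1, via the occupancy
# recurrence. q[j + 1] is the probability that j cells are occupied; one more
# draw either hits an occupied cell (prob. j / k) or a new one.
collision_pmf <- function(n, k) {
  q <- c(1, rep(0, n))                       # before any draw: 0 occupied cells
  j <- 1:n
  for (i in seq_len(n)) {
    q <- c(0, q[j + 1] * j / k + q[j] * (k - j + 1) / k)
  }
  rev(q[-1])                                 # c = n - (number of occupied cells)
}

# Hypothetical sketch of the test: m samples of length n, k cells.
collision_chisq <- function(rand, n, k, m, min_expected = 5) {
  # number of collisions in each of the m samples
  coll <- replicate(m, {
    u <- rand(n)
    n - length(unique(floor(u * k)))
  })
  observed <- tabulate(coll + 1, nbins = n)  # counts for c = 0, ..., n - 1
  expected <- m * collision_pmf(n, k)

  # Assumption: classes with a small expected count are pooled into one class.
  keep <- expected >= min_expected
  obs  <- c(observed[keep], sum(observed[!keep]))
  expd <- c(expected[keep], sum(expected[!keep]))
  ok   <- expd > 0                           # drop an empty pooled class
  obs  <- obs[ok]
  expd <- expd[ok]

  statistic <- sum((obs - expd)^2 / expd)
  list(statistic = statistic,
       p.value   = pchisq(statistic, df = length(obs) - 1, lower.tail = FALSE),
       observed  = obs,
       expected  = expd,
       residuals = (obs - expd) / sqrt(expd))
}

set.seed(42)
res <- collision_chisq(runif, n = 2^7, k = 2^10, m = 100)
res$statistic
res$p.value
```

The returned list mirrors the components described in Value, so the sketch can be compared side by side with the output of coll.test on the same settings, although the two need not agree numerically because the pooling and approximation choices may differ.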

References

Planchet F., Jacquemin J. (2003), L'utilisation de methodes de simulation en assurance. Bulletin Francais d'Actuariat, vol. 6, 11, 3-69. (available online)

L'Ecuyer P. (2001), Software for uniform random number generation: distinguishing the good and the bad. Proceedings of the 2001 Winter Simulation Conference. doi:10.1109/WSC.2001.977250

L'Ecuyer P., Simard R. (2007), TestU01: a C library for empirical testing of random number generators. ACM Trans. on Mathematical Software 33(4), 22. doi:10.1145/1268776.1268777

Examples

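library(randtoolbox)  # provides coll.test and SFMT

# (1) base R generator runif
# with lenSample = 2^7 and segments = 2^10, n/k = 1/8 and n < 2^8,
# so the rule in Details selects the exact distribution
coll.test(runif, 2^7, 2^10, 1, 100)

# (2) same settings with the SFMT generator (exact distribution)
#
coll.test(SFMT, 2^7, 2^10, 1, 100)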