To compute the null distribution of a test statistic (for a specific aggregation and score type, and for all partition sizes), the only information needed is the sample size, because the test statistic is "distribution free". The accuracy of the quantiles of the null distribution depends on the number of replicates used to construct the null tables. The required accuracy depends on the threshold used for rejecting the null hypothesis.
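The link between the number of replicates and the rejection threshold can be made concrete with the usual binomial standard error of a Monte Carlo tail probability. The following is a minimal sketch in base R, not code from the package; the helper name mc_se and the chosen thresholds and replicate counts are illustrative assumptions.

```r
# Monte Carlo standard error of an estimated tail probability p
# when the null table is built from B replicates: sqrt(p * (1 - p) / B).
# mc_se is a hypothetical helper, not part of the HHG package.
mc_se <- function(p, B) sqrt(p * (1 - p) / B)

thresholds <- c(0.05, 0.001)      # illustrative rejection thresholds
replicates <- c(1000L, 100000L)   # illustrative numbers of replicates

# Rows: rejection threshold; columns: number of replicates.
se <- outer(thresholds, replicates, mc_se)
dimnames(se) <- list(threshold = as.character(thresholds),
                     replicates = as.character(replicates))
print(signif(se, 2))

# Relative error: standard error divided by the threshold of each row.
print(signif(se / thresholds, 2))
```

With 1000 replicates the standard error at a 0.001 threshold is about as large as the threshold itself, which is why small thresholds call for many more replicates.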
This function creates an object for efficiently storing the null distribution of the test statistics (by partition size m). Use the returned object, together with hhg.univariate.ind.pvalue, to compute the P-value for the statistics computed by hhg.univariate.ind.stat.
Generated null tables also hold the distribution of statistics for combination types (comb.type=='MinP' and comb.type=='Fisher'), used by hhg.univariate.ind.combined.test.
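The workflow described above (build a null table, compute the statistic, look up the P-value) might look roughly like the sketch below. It assumes the HHG package is installed and relies only on default arguments, so that the null table and the statistic share the same settings; the simulated data are an illustrative assumption, and the exact argument names should be checked against the installed version of the package.

```r
set.seed(1)
n <- 30
x <- rnorm(n)
y <- x^2 + rnorm(n, sd = 0.5)   # simulated non-monotone dependence (illustrative)

# Run the package calls only if HHG is available.
if (requireNamespace("HHG", quietly = TRUE)) {
  library(HHG)
  # The null table depends only on the sample size and the statistic's settings.
  null_table <- hhg.univariate.ind.nulltable(n)
  # Statistic computed with the same (default) settings as the null table.
  stat <- hhg.univariate.ind.stat(x, y)
  # P-value looked up against the stored null distribution.
  print(hhg.univariate.ind.pvalue(stat, null_table))
}
```

Because the null table depends only on the sample size, the same object can be reused for any other pair of samples of size 30 tested with the same settings.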
Variant types "ADP-EQP" and "ADP-EQP-ML" are the computationally efficient versions of "ADP" and "ADP-ML". EQP type variants reduce calculation time by summing over a subset of partitions, in which a split between cells may be placed only every \(n/nr.atoms\) observations. This allows for a complexity of O(nr.atoms^4). These variants are only available for aggregation.type=='sum'.
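To give a feel for how much restricting the split points shrinks the set of partitions, the sketch below counts the m by m partitions with and without atoms. It is a simple combinatorial illustration in base R rather than the package's algorithm; the helper name and the values of n and nr_atoms are hypothetical.

```r
# Number of ways to cut n_points ordered observations into m contiguous cells:
# choose m - 1 split points among the n_points - 1 gaps between observations.
# n_partitions_axis is a hypothetical helper, not part of the HHG package.
n_partitions_axis <- function(n_points, m) choose(n_points - 1, m - 1)

n <- 1000        # sample size (hypothetical)
nr_atoms <- 40   # number of atoms (hypothetical value for illustration)
m <- 2:5

# An m x m partition picks its splits on the two axes independently.
full <- n_partitions_axis(n, m)^2          # splits allowed between any two observations
eqp  <- n_partitions_axis(nr_atoms, m)^2   # splits allowed only at atom boundaries

print(data.frame(m = m, all_partitions = full, eqp_partitions = eqp,
                 ratio = signif(full / eqp, 3)))
```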
Null tables may be compressed, using the compress argument. For each of the partition sizes (i.e. m or mXm), the null distribution is held at a compress.p0 resolution up to the compress.p percentile. Beyond that value, the distribution is held at a finer resolution defined by compress.p1 (since higher values are attained when a relation exists in the data, this is required for computing the p-value accurately in the tail of the null distribution).
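The idea of a two-resolution table can be sketched in base R as follows. This is a conceptual illustration only and does not reproduce the package's internal storage: the null sample is a chi-square stand-in for simulated statistics, the step sizes and cutoff are illustrative values, and p_from_table is a hypothetical helper.

```r
set.seed(1)
# Stand-in null sample; in practice these would be simulated test statistics.
null_stats <- rchisq(1e6, df = 4)

compress_p0 <- 0.001      # coarse step below the cutoff percentile (illustrative)
compress_p  <- 0.99       # cutoff percentile (illustrative)
compress_p1 <- 0.000001   # fine step above the cutoff (illustrative)

# Probability grid: coarse up to compress_p, fine from there up to 1.
probs <- c(seq(0, compress_p, by = compress_p0),
           seq(compress_p, 1, by = compress_p1)[-1])
table_q <- quantile(null_stats, probs = probs, names = FALSE)

# Hypothetical lookup: p-value = 1 - largest stored probability whose
# quantile does not exceed the observed statistic.
p_from_table <- function(stat) {
  idx <- pmax(findInterval(stat, table_q), 1L)
  1 - probs[idx]
}

cat("stored values:", length(table_q), "of", length(null_stats), "replicates\n")
cat("tail p-value from table:", p_from_table(20), "\n")
cat("exact chi-square tail:  ", pchisq(20, df = 4, lower.tail = FALSE), "\n")
```

The coarse grid keeps the bulk of the distribution small, while the fine grid preserves the resolution where small p-values are read off.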
For large data (n>100), it is recommended to use Fast.ADP.test, which is an optimized version of the hhg.univariate.ind.stat and hhg.univariate.ind.combined.test tests. Null tables for Fast.ADP.test can be constructed using Fast.ADP.nulltable.