bwsKSampleTest: Murakami's k-Sample BWS Test

Description

Performs Murakami's k-Sample BWS Test.

Usage

bwsKSampleTest(x, ...)
# S3 method for default
bwsKSampleTest(x, g, nperm = 1000, ...)
# S3 method for formula
bwsKSampleTest(formula, data, subset, na.action, nperm = 1000, ...)

Value

A list with class "htest" containing the following components:

method: a character string indicating what type of test was performed.
data.name: a character string giving the name(s) of the data.
statistic: the estimated quantile of the test statistic.
p.value: the p-value for the test.
parameter: the parameters of the test statistic, if any.
alternative: a character string describing the alternative hypothesis.
estimates: the estimates, if any.
null.value: the estimate under the null hypothesis, if any.

Arguments

x: a numeric vector of data values, or a list of numeric data vectors.
...: further arguments to be passed to or from methods.
g: a vector or factor object giving the group for the corresponding elements of "x". Ignored with a warning if "x" is a list.
nperm: number of permutations for the asymptotic permutation test. Defaults to 1000.
formula: a formula of the form response ~ group where response gives the data values and group a vector or factor of the corresponding groups.
data: an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
subset: an optional vector specifying a subset of observations to be used.
na.action: a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
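
The three call forms listed under Usage can be compared side by side. The following is a minimal sketch, assuming the PMCMRplus package (which provides bwsKSampleTest) is installed. The object names vals, grp, dat and the res_* results are illustrative only, and the data are those from the Examples section. Because the p-value is obtained by resampling, a seed is set before each call.

```r
## Data from the Examples section of this page (object names are illustrative)
vals <- c(2.9, 3.0, 2.5, 2.6, 3.2,       # normal subjects
          3.8, 2.7, 4.0, 2.4,            # obstructive airway disease
          2.8, 3.4, 3.7, 2.2, 2.0)       # asbestosis
grp  <- factor(rep(c("ns", "oad", "a"), times = c(5, 4, 5)),
               levels = c("ns", "oad", "a"))
dat  <- data.frame(x = vals, g = grp)

## Guarded so the snippet does nothing if PMCMRplus is not available
if (requireNamespace("PMCMRplus", quietly = TRUE)) {
  library(PMCMRplus)

  ## Default method: data vector plus grouping factor
  set.seed(1)
  res_default <- bwsKSampleTest(vals, grp, nperm = 2000)

  ## Default method: list of numeric vectors (g is not needed)
  set.seed(1)
  res_list <- bwsKSampleTest(split(vals, grp), nperm = 2000)

  ## Formula method
  set.seed(1)
  res_formula <- bwsKSampleTest(x ~ g, data = dat, nperm = 2000)

  print(res_default)
}
```

A larger nperm reduces the Monte Carlo error of the estimated p-value at the cost of run time.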

Details

Let $X_{ij} ~ (1 \le i \le k,~ 1 \le j \le n_i)$ denote independent and identically distributed variables obtained from an unknown continuous distribution $F_i(x)$. Let $R_{ij}$ be the rank of $X_{ij}$, where the $X_{ij}$ are jointly ranked from $1$ to $N, ~ N = \sum_{i=1}^k n_i$. In the $k$-sample test the null hypothesis, H: $F_i = F_j$, is tested against the alternative, A: $F_i \ne F_j ~~(i \ne j)$, with at least one inequality being strict. Murakami (2006) generalized the two-sample Baumgartner-Weiß-Schindler test (Baumgartner et al. 1998) and proposed a modified statistic $B_k^*$ defined by

$$ B_{k}^* = \frac{1}{k}\sum_{i=1}^k \left\{\frac{1}{n_i} \sum_{j=1}^{n_i} \frac{(R_{ij} - \mathsf{E}[R_{ij}])^2} {\mathsf{Var}[R_{ij}]}\right\}, $$

where

$$ \mathsf{E}[R_{ij}] = \frac{N + 1}{n_i + 1} j $$

and

$$ \mathsf{Var}[R_{ij}] = \frac{j}{n_i + 1} \left(1 - \frac{j}{n_i + 1}\right) \frac{\left(N-n_i\right)\left(N+1\right)}{n_i + 2}. $$

The $p$-values are estimated via an asymptotic bootstrap method. Note that $B_k^*$ detects differences in the unknown location parameters, in the unknown scale parameters, or in both, among the $k$ samples.
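
To make the formulas above concrete, the statistic can be computed directly in base R. This is a minimal sketch and not the PMCMRplus implementation. The helper names bws_k_stat and bws_k_perm_p are hypothetical. The sketch assumes that $R_{ij}$ is taken as the $j$-th smallest joint rank within sample $i$, which is what the expression for $\mathsf{E}[R_{ij}]$ implies. The resampling scheme shown (random reassignment of the group labels) is one plausible way to estimate a $p$-value and may differ from what the package does internally.

```r
## Hypothetical helper: B_k^* from the formulas in the Details section
bws_k_stat <- function(x, g) {
  g <- factor(g)
  N <- length(x)
  r <- rank(x)                        # joint ranks 1..N (midranks if ties occur)
  per_group <- vapply(levels(g), function(lev) {
    Ri <- sort(r[g == lev])           # ordered ranks of sample i (assumption)
    ni <- length(Ri)
    j  <- seq_len(ni)
    ER <- (N + 1) / (ni + 1) * j      # E[R_ij]
    VR <- j / (ni + 1) * (1 - j / (ni + 1)) *
          (N - ni) * (N + 1) / (ni + 2)   # Var[R_ij]
    mean((Ri - ER)^2 / VR)            # inner average over j
  }, numeric(1))
  mean(per_group)                     # outer average over the k samples
}

## Hypothetical helper: p-value estimated by permuting the group labels
bws_k_perm_p <- function(x, g, nperm = 1000) {
  obs  <- bws_k_stat(x, g)
  perm <- replicate(nperm, bws_k_stat(x, sample(g)))
  mean(perm >= obs)                   # share of permuted statistics >= observed
}

## Data from the Examples section
vals <- c(2.9, 3.0, 2.5, 2.6, 3.2, 3.8, 2.7, 4.0, 2.4,
          2.8, 3.4, 3.7, 2.2, 2.0)
grp  <- factor(rep(c("ns", "oad", "a"), times = c(5, 4, 5)))

b_obs <- bws_k_stat(vals, grp)
set.seed(1)
p_est <- bws_k_perm_p(vals, grp, nperm = 1000)
```

As a hand check, two samples of size two with joint ranks {1, 2} and {3, 4} give $\mathsf{Var}[R_{ij}] = 5/9$ for every $j$ and per-sample averages of $(0.8 + 3.2)/2 = 2$, so bws_k_stat(1:4, c(1, 1, 2, 2)) returns 2.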

References

Baumgartner, W., Weiss, P., Schindler, H. (1998) A nonparametric test for the general two-sample problem, Biometrics 54, 1129--1135.

Murakami, H. (2006) K-sample rank test based on modified Baumgartner statistic and its power comparison, J. Jpn. Comp. Statist. 19, 1--13.

Examples

## Hollander & Wolfe (1973), 116.
## Mucociliary efficiency from the rate of removal of dust in normal
## subjects, subjects with obstructive airway disease, and subjects
## with asbestosis.
x <- c(2.9, 3.0, 2.5, 2.6, 3.2) # normal subjects
y <- c(3.8, 2.7, 4.0, 2.4)      # with obstructive airway disease
z <- c(2.8, 3.4, 3.7, 2.2, 2.0) # with asbestosis
g <- factor(x = c(rep(1, length(x)),
                   rep(2, length(y)),
                   rep(3, length(z))),
             labels = c("ns", "oad", "a"))
dat <- data.frame(
   g = g,
   x = c(x, y, z))

## AD-Test
adKSampleTest(x ~ g, data = dat)

## BWS-Test
bwsKSampleTest(x ~ g, data = dat)

## Kruskal-Test
## Using incomplete beta approximation
kruskalTest(x ~ g, dat, dist="KruskalWallis")
## Using chisquare distribution
kruskalTest(x ~ g, dat, dist="Chisquare")

if (FALSE) {
## Check with kruskal.test from R stats
kruskal.test(x ~ g, dat)
}
## Using Conover's F
kruskalTest(x ~ g, dat, dist="FDist")

if (FALSE) {
## Check with aov on ranks
anova(aov(rank(x) ~ g, dat))
## Check with oneway.test
oneway.test(rank(x) ~ g, dat, var.equal = TRUE)
}

## Median Test asymptotic
medianTest(x ~ g, dat)

## Median Test with simulated p-values
set.seed(112)
medianTest(x ~ g, dat, simulate.p.value = TRUE)
