The function twoSampleLinearRankTest
compares two samples using a locally most powerful rank test (LMPRT)
to determine whether the two samples come from the same distribution.
The sections below explain the concepts of location and scale shifts,
linear rank tests, and LMPRTs.
Definitions of Location and Scale Shifts
Let \(X\) denote a random variable representing measurements from group 1 with
cumulative distribution function (cdf):
$$F_1(t) = Pr(X \le t) \;\;\;\;\;\; (1)$$
and let \(x_1, x_2, \ldots, x_m\) denote \(m\) independent observations from this
distribution. Let \(Y\) denote a random variable from group 2 with cdf:
$$F_2(t) = Pr(Y \le t) \;\;\;\;\;\; (2)$$
and let \(y_1, y_2, \ldots, y_n\) denote \(n\) independent observations from this
distribution. Set \(N = m + n\).
General Hypotheses to Test Differences Between Two Populations
A very general hypothesis to test whether two distributions are the same is
given by:
$$H_0: F_1(t) = F_2(t), -\infty < t < \infty \;\;\;\;\;\; (3)$$
versus the two-sided alternative hypothesis:
$$H_a: F_1(t) \ne F_2(t) \;\;\;\;\;\; (4)$$
with strict inequality for at least one value of \(t\).
The two possible one-sided hypotheses would be:
$$H_0: F_1(t) \ge F_2(t) \;\;\;\;\;\; (5)$$
versus the alternative hypothesis:
$$H_a: F_1(t) < F_2(t) \;\;\;\;\;\; (6)$$
and
$$H_0: F_1(t) \le F_2(t) \;\;\;\;\;\; (7)$$
versus the alternative hypothesis:
$$H_a: F_1(t) > F_2(t) \;\;\;\;\;\; (8)$$
A similar set of hypotheses to test whether the two distributions are the same is
given by (Conover, 1980, p.216):
$$H_0: Pr(X < Y) = 1/2 \;\;\;\;\;\; (9)$$
versus the two-sided alternative hypothesis:
$$H_a: Pr(X < Y) \ne 1/2 \;\;\;\;\;\; (10)$$
or
$$H_0: Pr(X < Y) \ge 1/2 \;\;\;\;\;\; (11)$$
versus the alternative hypothesis:
$$H_a: Pr(X < Y) < 1/2 \;\;\;\;\;\; (12)$$
or
$$H_0: Pr(X < Y) \le 1/2 \;\;\;\;\;\; (13)$$
versus the alternative hypothesis:
$$H_a: Pr(X < Y) > 1/2 \;\;\;\;\;\; (14)$$
Note that this second set of hypotheses (9)--(14) is not equivalent to the
set of hypotheses (3)--(8). For example, if \(X\) takes on the values 1 and 4
with probability 1/2 each, and \(Y\) takes on values only strictly inside the
interval (1, 4) (e.g., \(Y\) takes on the values
2 and 3 with probability 1/2 each), then the null hypothesis (9) is
true but the null hypothesis (3) is not. However, the null hypothesis (3)
implies the null hypothesis (9), (5) implies (11), and (7) implies (13).
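The discrete example above can be checked by enumeration. The following Python sketch (illustrative only; the values 1, 4, 2, and 3 come from the example above) computes \(Pr(X < Y)\) exactly:

```python
from itertools import product
from fractions import Fraction

# X takes on 1 and 4 with probability 1/2 each;
# Y takes on 2 and 3 with probability 1/2 each (values from the example above).
x_vals, y_vals = [1, 4], [2, 3]

# Pr(X < Y): each (x, y) pair occurs with probability 1/2 * 1/2 = 1/4.
p = sum(Fraction(1, 4) for x, y in product(x_vals, y_vals) if x < y)
print(p)  # 1/2, so the null hypothesis (9) holds even though F1 != F2
```

Only the pairs (1, 2) and (1, 3) satisfy \(x < y\), so \(Pr(X < Y) = 2 \times 1/4 = 1/2\).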
Location Shift
A special case of the alternative hypotheses (4), (6), and (8) above is the
location shift alternative:
$$H_a: F_1(t) = F_2(t - \Delta) \;\;\;\;\;\; (15)$$
where \(\Delta\) denotes the shift between the two groups. (Note: some references
refer to (15) above as a shift in the median, but in fact this kind of shift
represents a shift in every single quantile, not just the median.)
If \(\Delta\) is positive, this means that observations in group 1 tend to be
larger than observations in group 2, and if \(\Delta\) is negative, observations
in group 1 tend to be smaller than observations in group 2.
The alternative hypothesis (15) is called a location shift: the only
difference between the two distributions is a difference in location (e.g., the
standard deviation is assumed to be the same for both distributions). A location
shift is not applicable to distributions that are bounded below or above by some
constant, such as a lognormal distribution. For lognormal distributions, the
location shift could refer to a shift in location of the distribution of the
log-transformed observations.
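The defining relation \(F_1(t) = F_2(t - \Delta)\) can be verified numerically for two normal distributions that differ only in their means. This Python sketch uses an arbitrary illustrative shift value of 1.5:

```python
from statistics import NormalDist

delta = 1.5  # hypothetical location shift
F2 = NormalDist(mu=0.0, sigma=1.0).cdf    # group 2: standard normal
F1 = NormalDist(mu=delta, sigma=1.0).cdf  # group 1: same sd, mean shifted by delta

# F1(t) = F2(t - delta) at every t, i.e., every quantile shifts by delta
for t in [-1.0, 0.0, 0.7, 2.0]:
    assert abs(F1(t) - F2(t - delta)) < 1e-12
```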
For a location shift, the null hypothesis (3) can be generalized as:
$$H_0: F_1(t) = F_2(t - \Delta_0), -\infty < t < \infty \;\;\;\;\;\; (16)$$
where \(\Delta_0\) denotes the null shift between the two groups. Almost always,
however, the null shift is taken to be 0 and we will assume this for the rest of this
help file.
Alternatively, the null and alternative hypotheses can be written as
$$H_0: \Delta = 0 \;\;\;\;\;\; (17)$$
versus the alternative hypothesis
$$H_a: \Delta > 0 \;\;\;\;\;\; (18)$$
The other one-sided alternative hypothesis (\(\Delta < 0\)) and two-sided
alternative hypothesis (\(\Delta \ne 0\)) could be considered as well.
The general hypotheses (3)--(14) are not location shift hypotheses
(e.g., the standard deviation does not have to be the same for both distributions),
but they do allow for distributions that are bounded below or above by a constant
(e.g., lognormal distributions).
Scale Shift
A special kind of scale shift replaces the alternative hypothesis (15) with the
alternative hypothesis:
$$H_a: F_1(t) = F_2(t/\tau) \;\;\;\;\;\; (19)$$
where \(\tau\) denotes the shift in scale between the two groups. Alternatively,
the null and alternative hypotheses for this scale shift can be written as
$$H_0: \tau = 1 \;\;\;\;\;\; (20)$$
versus the alternative hypothesis
$$H_a: \tau > 1 \;\;\;\;\;\; (21)$$
The other one-sided alternative hypothesis (\(\tau < 1\)) and two-sided alternative
hypothesis (\(\tau \ne 1\)) could be considered as well.
This kind of scale shift often involves a shift in both location and scale. For
example, suppose the underlying distribution for both groups is
exponential with rate parameter \(\lambda\). Then
the mean and standard deviation of the reference group are both \(1/\lambda\), while
the mean and standard deviation of the treatment group are both \(\tau/\lambda\). In
this case, the alternative hypothesis (21) implies the more general alternative
hypothesis (8).
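Continuing the exponential example, the scale-shift relation (19) can be checked directly from the exponential cdf. The rate 0.5 and \(\tau = 2\) in this Python sketch are arbitrary illustrative values:

```python
import math

lam, tau = 0.5, 2.0  # hypothetical rate and scale shift

def F2(t):
    return 1.0 - math.exp(-lam * t)          # reference group: Exp(rate = lam)

def F1(t):
    return 1.0 - math.exp(-(lam / tau) * t)  # treatment group: Exp(rate = lam / tau)

# F1(t) = F2(t / tau) for all t >= 0, the scale-shift relation (19);
# the treatment mean and sd are tau / lam, versus 1 / lam for the reference group.
for t in [0.1, 1.0, 5.0]:
    assert math.isclose(F1(t), F2(t / tau))
```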
Linear Rank Tests
The usual nonparametric test to test the null hypothesis of the same distribution
for both groups versus the location-shift alternative (18) is the
Wilcoxon Rank Sum test
(Gilbert, 1987, pp.247-250; Helsel and Hirsch, 1992, pp.118-123;
Hollander and Wolfe, 1999). Note that the Mann-Whitney U test is equivalent to the
Wilcoxon Rank Sum test (Hollander and Wolfe, 1999; Conover, 1980, p.215,
Zar, 2010). Hereafter, this test will be abbreviated as the MWW test. The MWW test
is performed by combining the \(m\) \(X\) observations with the \(n\) \(Y\)
observations and ranking them from smallest to largest, and then computing the
statistic
$$W = \sum_{i=1}^m R_i \;\;\;\;\;\; (22)$$
where \(R_1, R_2, \ldots, R_m\) denote the ranks of the \(X\) observations when
the \(X\) and \(Y\) observations are combined and ranked. The null
hypothesis (5), (11), or (17) is rejected in favor of the alternative hypothesis
(6), (12) or (18) if the value of \(W\) is too large. For small sample sizes,
the exact distribution of \(W\) under the null hypothesis is fairly easy to
compute and may be found in tables (e.g., Hollander and Wolfe, 1999;
Conover, 1980, pp.448-452). For larger sample sizes, a normal approximation is
usually used (Hollander and Wolfe, 1999; Conover, 1980, p.217). For the
R function wilcox.test, an exact p-value is computed if the
samples contain fewer than 50 finite values and there are no ties.
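The rank-sum statistic in Equation (22) is easy to compute directly. The following Python sketch (with made-up data; midranks handle ties) computes \(W\):

```python
def rank_sum_W(x, y):
    """Wilcoxon rank-sum statistic W (Equation 22): the sum of the ranks of
    the x sample within the combined sample, using midranks for ties."""
    combined = sorted(x + y)

    def midrank(v):
        below = sum(1 for c in combined if c < v)
        tied = sum(1 for c in combined if c == v)
        return below + (tied + 1) / 2

    return sum(midrank(v) for v in x)

# hypothetical data: m = 3 observations in group 1, n = 4 in group 2
x = [1.2, 3.4, 2.2]
y = [0.5, 1.8, 2.9, 4.0]
print(rank_sum_W(x, y))  # 12.0: the ranks of x in the combined sample are 2, 6, 4
```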
It is important to note that the MWW test is actually testing the more general
hypotheses (9)-(14) (Conover, 1980, p.216; Divine et al., 2013), even though it
is often presented as only applying to location shifts.
The MWW W-statistic in Equation (22) is an example of a
linear rank statistic (Hettmansperger, 1984, p.147; Prentice, 1985),
which is any statistic that can be written in the form:
$$L = \sum_{i=1}^m a(R_i) \;\;\;\;\;\; (23)$$
where \(a()\) denotes a score function. Statistics of this form are also called
general scores statistics (Hettmansperger, 1984, p.147). The MWW test
uses the identity score function:
$$a(R_i) = R_i \;\;\;\;\;\; (24)$$
Any test based on a linear rank statistic is called a linear rank test.
Under the null hypothesis (3), (9), (17), or (20), the distribution of the linear
rank statistic \(L\) does not depend on the form of the underlying distribution of
the \(X\) and \(Y\) observations. Hence, tests based on \(L\) are
nonparametric (also called distribution-free). If the null hypothesis is not true,
however, the distribution of \(L\) will depend not only on the distributions of the
\(X\) and \(Y\) observations, but also upon the form of the score function
\(a()\).
Locally Most Powerful Linear Rank Tests
The decision of which scores to use may be based on the power of the test.
A locally most powerful rank test (LMPRT) of the null hypothesis (17) versus the
alternative (18) maximizes the slope of the power function (as a function of
\(\Delta\)) in the neighborhood of \(\Delta = 0\). An LMPRT of the null hypothesis
(20) versus the alternative (21) maximizes the slope of the power function (as a
function of \(\tau\)) in the neighborhood of \(\tau = 1\). That is, an LMPRT is the
best linear rank test for detecting small shifts in location or scale.
Table 1 below shows the score functions associated with the LMPRTs for various
assumed underlying distributions (Hettmansperger, 1984, Chapter 3;
Millard and Deverel, 1988, p.2090). A test based on the identity score function of
Equation (24) is equivalent to a test based on the score shown in Table 1 for the
logistic distribution; thus, the MWW test is the LMPRT for detecting a
location shift when the underlying observations follow the logistic distribution.
When the underlying distribution is normal or lognormal, the LMPRT for a location
shift uses the “Normal scores” shown in Table 1. When the underlying
distribution is exponential, the LMPRT for detecting a scale shift is based on the
“Savage scores” shown in Table 1.
Table 1. Scores of LMPRTs for Various Distributions

| Distribution                    | Score \(a(R_i)\)                  | Shift Type | Test Name                        |
|---------------------------------|-----------------------------------|------------|----------------------------------|
| Logistic                        | \([2/(N+1)]R_i - 1\)              | Location   | Wilcoxon Rank Sum                |
| Normal or Lognormal (log-scale) | \(\Phi^{-1}[R_i/(N+1)]\)*         | Location   | Van der Waerden or Normal scores |
| Double Exponential              | \(sign[R_i - (N+1)/2]\)           | Location   | Mood's Median                    |
| Exponential or Extreme Value    | \(\sum_{j=1}^{R_i} (N-j+1)^{-1}\) | Scale      | Savage scores                    |

* Denotes an approximation to the true score. The symbol \(\Phi\) denotes the
cumulative distribution function of the standard normal distribution, and \(sign\)
denotes the sign function.
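The score functions in Table 1 can be written down directly. The Python sketch below (with \(\Phi^{-1}\) taken from the standard library; the `kind` labels are my own) returns the score vector \(a(1), \ldots, a(N)\) for each row of the table:

```python
from statistics import NormalDist

def lmprt_scores(N, kind):
    """Score vector a(1), ..., a(N) for the LMPRT score functions in Table 1."""
    ranks = range(1, N + 1)
    if kind == "wilcoxon":   # logistic: a(R) = [2/(N+1)] R - 1
        return [2 * r / (N + 1) - 1 for r in ranks]
    if kind == "normal":     # normal scores: a(R) = Phi^{-1}[R/(N+1)]
        return [NormalDist().inv_cdf(r / (N + 1)) for r in ranks]
    if kind == "median":     # double exponential: a(R) = sign[R - (N+1)/2]
        return [(r > (N + 1) / 2) - (r < (N + 1) / 2) for r in ranks]
    if kind == "savage":     # exponential scale: a(R) = sum_{j=1}^{R} 1/(N-j+1)
        return [sum(1 / (N - j + 1) for j in range(1, r + 1)) for r in ranks]
    raise ValueError(kind)

print(lmprt_scores(4, "median"))  # [-1, -1, 1, 1]
```

Note that the location-shift scores are all antisymmetric about the middle rank, so they sum to (approximately) zero.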
A large sample normal approximation to the distribution of the linear rank statistic
\(L\) for arbitrary score functions is given by Hettmansperger (1984, p.148).
Under the null hypothesis (17) or (20), the mean and variance of \(L\) are given by:
$$E(L) = \mu_L = \frac{m}{N} \sum_{i=1}^N a_i = m \bar{a} \;\;\;\;\;\; (25)$$
$$Var(L) = \sigma_L^2 = \frac{mn}{N(N-1)} \sum_{i=1}^N (a_i - \bar{a})^2 \;\;\;\;\;\; (26)$$
Hettmansperger (1984, Chapter 3) shows that under the null hypothesis of no
difference between the two groups, the statistic
$$z = \frac{L - \mu_L}{\sigma_L} \;\;\;\;\;\; (27)$$
is approximately distributed as a standard normal random variable for
“large” sample sizes. This statistic will tend to be large if the
observations in group 1 tend to be larger than the observations in group 2.
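The normal approximation above can be sketched as follows (Python, hypothetical data; `a` is the full score vector for the combined sample and `ranks_x` holds the ranks of the group-1 observations):

```python
from math import sqrt

def linear_rank_z(a, ranks_x):
    """Standardized linear rank statistic z = (L - E(L)) / sd(L), where
    L = sum of a(R_i) over the group-1 ranks (Equation 23)."""
    N = len(a)
    m = len(ranks_x)
    n = N - m
    a_bar = sum(a) / N
    L = sum(a[r - 1] for r in ranks_x)
    mu_L = m * a_bar
    var_L = m * n / (N * (N - 1)) * sum((ai - a_bar) ** 2 for ai in a)
    return (L - mu_L) / sqrt(var_L)

# Wilcoxon (identity) scores with N = 4; group 1 holds the two largest ranks,
# so z is positive: the group-1 observations tend to be larger.
print(linear_rank_z([1, 2, 3, 4], [3, 4]))
```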