Two-sample linear rank test to detect a difference (usually a shift) between two distributions. The Wilcoxon Rank Sum test is a special case of a linear rank test. The function twoSampleLinearRankTest is part of EnvStats mainly because this help file gives the necessary background to explain two-sample linear rank tests for censored data (see twoSampleLinearRankTestCensored).
twoSampleLinearRankTest(x, y, location.shift.null = 0, scale.shift.null = 1,
    alternative = "two.sided", test = "wilcoxon", shift.type = "location")
x: numeric vector of values for the first sample. Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are allowed but will be removed.
y: numeric vector of values for the second sample. Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are allowed but will be removed.
location.shift.null: numeric scalar indicating the hypothesized value of the location shift under the null hypothesis. The default value is location.shift.null=0. This argument is ignored if shift.type="scale".
scale.shift.null: numeric scalar indicating the hypothesized value of the scale shift under the null hypothesis. The default value is scale.shift.null=1. This argument is ignored if shift.type="location".
alternative: character string indicating the kind of alternative hypothesis. The possible values are "two.sided" (the default), "less", and "greater". See the DETAILS section below for more information.
test: character string indicating which linear rank test to use. The possible values are "wilcoxon" (the default), "normal.scores", "moods.median", and "savage.scores".
shift.type: character string indicating which kind of shift is being tested. The possible values are "location" (the default) and "scale".
A list of class "htest" containing the results of the hypothesis test. See the help file for htest.object for details.
The function twoSampleLinearRankTest allows you to compare two samples using a locally most powerful rank test (LMPRT) to determine whether the two samples come from the same distribution. The sections below explain the concepts of location and scale shifts, linear rank tests, and LMPRT's.
Definitions of Location and Scale Shifts
Let
General Hypotheses to Test Differences Between Two Populations
A very general hypothesis to test whether two distributions are the same is
given by:
A similar set of hypotheses to test whether the two distributions are the same is
given by (Conover, 1980, p. 216):
Note that this second set of hypotheses (9)-(14) is not equivalent to the
set of hypotheses (3)-(8). For example, if
Location Shift
A special case of the alternative hypotheses (4), (6), and (8) above is the
location shift alternative:
The alternative hypothesis (15) is called a location shift: the only difference between the two distributions is a difference in location (e.g., the standard deviation is assumed to be the same for both distributions). A location shift is not applicable to distributions that are bounded below or above by some constant, such as a lognormal distribution. For lognormal distributions, the location shift could refer to a shift in location of the distribution of the log-transformed observations.
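The idea of a location shift, and of a shift on the log scale for lognormal data, can be illustrated with a small base R sketch. The seed, sample size, and shift value delta below are arbitrary choices made for illustration, and y is built directly from x only so that the comparison of shapes is exact; in practice the two samples would be independent.

```r
# Illustrative sketch in base R; delta, the seed, and the sample size are assumptions.
set.seed(1)
delta <- 0.5                       # hypothetical location shift

# Location shift: Y is distributed as X + delta, so only the location differs.
x <- rnorm(1000, mean = 3, sd = 1)
y <- x + delta
c(sd.x = sd(x), sd.y = sd(y))      # the spread is unchanged
mean(y) - mean(x)                  # the difference in location is delta

# Lognormal data are bounded below by 0, so the shift is applied to log(X).
x.ln <- rlnorm(1000, meanlog = 1, sdlog = 0.5)
y.ln <- exp(log(x.ln) + delta)     # location shift of delta on the log scale
ratio <- y.ln / x.ln               # multiplicative change on the original scale
range(ratio)                       # every ratio equals exp(delta)
```

On the original scale, a location shift in the log-transformed observations therefore corresponds to multiplying every value by a constant, which keeps the data positive.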
For a location shift, the null hypothesis (3) can be generalized as:
Alternatively, the null and alternative hypotheses can be written as
The general hypotheses (3)-(14) are not location shift hypotheses (e.g., the standard deviation does not have to be the same for both distributions), but they do allow for distributions that are bounded below or above by a constant (e.g., lognormal distributions).
Scale Shift
A special kind of scale shift replaces the alternative hypothesis (15) with the
alternative hypothesis:
This kind of scale shift often involves a shift in both location and scale. For
example, suppose the underlying distribution for both groups is
exponential, with parameter rate=
Linear Rank Tests
The usual nonparametric test of the null hypothesis of the same distribution
for both groups versus the location-shift alternative (18) is the
Wilcoxon Rank Sum test
(Gilbert, 1987, pp. 247-250; Helsel and Hirsch, 1992, pp. 118-123;
Hollander and Wolfe, 1999). Note that the Mann-Whitney U test is equivalent to the
Wilcoxon Rank Sum test (Hollander and Wolfe, 1999; Conover, 1980, p. 215;
Zar, 2010). Hereafter, this test will be abbreviated as the MWW test. The MWW test
is performed by combining the observations from the two samples, ranking them, and
computing the statistic W in Equation (22) from those ranks. In the R function
wilcox.test, an exact p-value is computed if the
samples contain fewer than 50 finite values and there are no ties.
It is important to note that the MWW test is actually testing the more general hypotheses (9)-(14) (Conover, 1980, p.216; Divine et al., 2013), even though it is often presented as only applying to location shifts.
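The ranking step of the MWW test can be sketched in base R as follows. The data reuse the simulated samples from the examples in this help file. The variable names are illustrative, and the only claim checked here is how the statistic reported by stats::wilcox.test relates to the rank sum of the first sample.

```r
# Minimal sketch: rank-sum statistic computed by hand (base R only).
set.seed(346)
x <- rnorm(15, mean = 3)
y <- rnorm(10, mean = 3.5)
m <- length(x)

# Combine both samples and rank the pooled observations.
r <- rank(c(x, y))

# Sum of the ranks that belong to the first sample.
rank.sum.x <- sum(r[seq_len(m)])

# stats::wilcox.test reports the rank sum minus its minimum possible value,
# which (without ties) is the number of pairs with x[i] > y[j].
W.by.hand <- rank.sum.x - m * (m + 1) / 2
U.pairs   <- sum(outer(x, y, ">"))

wt <- stats::wilcox.test(x, y)
c(wilcox.test = unname(wt$statistic), by.hand = W.by.hand, pairs = U.pairs)
```

The pair-counting form is the Mann-Whitney U statistic, which is one way to see the equivalence of the two tests mentioned above.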
The MWW W-statistic in Equation (22) is an example of a
linear rank statistic (Hettmansperger, 1984, p.147; Prentice, 1985),
which is any statistic that can be written in the form:
Locally Most Powerful Linear Rank Tests
The decision of what scores to use may be based on considering the power of the test.
A locally most powerful rank test (LMPRT) of the null hypothesis (17) versus the
alternative (18) maximizes the slope of the power (as a function of
Table 1 below shows the score functions associated with the LMPRT's for various assumed underlying distributions (Hettmansperger, 1984, Chapter 3; Millard and Deverel, 1988, p.2090). A test based on the identity score function of Equation (24) is equivalent to a test based on the score shown in Table 1 associated with the logistic distribution, thus the MWW test is the LMPRT for detecting a location shift when the underlying observations follow the logistic distribution. When the underlying distribution is normal or lognormal, the LMPRT for a location shift uses the “Normal scores” shown in Table 1. When the underlying distribution is exponential, the LMPRT for detecting a scale shift is based on the “Savage scores” shown in Table 1.
Table 1. Scores of LMPRT's for Various Distributions
Distribution | Score | Shift Type | Test Name
Logistic | | Location | Wilcoxon Rank Sum
Normal or Lognormal (log-scale) | | Location | Van der Waerden or Normal scores
Double Exponential | | Location | Mood's Median
Exponential or | | Scale | Savage scores
* Denotes an approximation to the true score. The symbol sign denotes the sign function.
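As a rough illustration of how a linear rank statistic is built from a score function, the base R sketch below uses the textbook forms of the four scores named in Table 1 together with the usual large-sample normal approximation. These forms are assumptions taken from the general literature on rank tests, not from the EnvStats source, so the scaling and sign convention may differ from what twoSampleLinearRankTest reports. The helper names score.funs and linear.rank.z are hypothetical, and the Savage score as written assumes no ties.

```r
# Hedged sketch: textbook score functions (assumed forms, not EnvStats internals).
score.funs <- list(
  wilcoxon      = function(r, N) r,                        # identity scores
  normal.scores = function(r, N) qnorm(r / (N + 1)),       # Van der Waerden
  moods.median  = function(r, N) sign(r - (N + 1) / 2),    # above/below median rank
  savage.scores = function(r, N) cumsum(1 / (N:1))[r]      # sum_{j=1}^{r} 1/(N-j+1)
)

# Linear rank statistic L = sum of the scores of the first sample, standardized
# by its permutation mean and variance under the null hypothesis.
linear.rank.z <- function(x, y, score) {
  m <- length(x); n <- length(y); N <- m + n
  a <- score(rank(c(x, y)), N)          # scores of all N pooled observations
  L <- sum(a[seq_len(m)])               # scores belonging to the first sample
  mu <- m * mean(a)
  sigma2 <- m * n / (N * (N - 1)) * sum((a - mean(a))^2)
  z <- (L - mu) / sqrt(sigma2)
  c(L = L, z = z, p.value = 2 * pnorm(-abs(z)))  # two-sided p-value
}

set.seed(346)
x <- rnorm(15, mean = 3)
y <- rnorm(10, mean = 3.5)
res <- sapply(score.funs, function(f) linear.rank.z(x, y, f))
res
```

With the identity scores, the two-sided p-value from this sketch agrees with stats::wilcox.test(x, y, exact = FALSE, correct = FALSE), which uses the same normal approximation without a continuity correction.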
A large sample normal approximation to the distribution of the linear rank statistic
Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. John Wiley and Sons, New York, Chapter 4.
Divine, G., H.J. Norton, R. Hunt, and J. Dinemann. (2013). A Review of Analysis and Sample Size Calculation Considerations for Wilcoxon Tests. Anesthesia & Analgesia 117, 699-710.
Hettmansperger, T.P. (1984). Statistical Inference Based on Ranks. John Wiley and Sons, New York, 323pp.
Hollander, M., and D.A. Wolfe. (1999). Nonparametric Statistical Methods, Second Edition. John Wiley and Sons, New York.
Millard, S.P., and S.J. Deverel. (1988). Nonparametric Statistical Methods for Comparing Two Sites Based on Data With Multiple Nondetect Limits. Water Resources Research, 24(12), 2087-2098.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL, pp. 432-435.
Prentice, R.L. (1985). Linear Rank Tests. In Kotz, S., and N.L. Johnson, eds. Encyclopedia of Statistical Science. John Wiley and Sons, New York. Volume 5, pp. 51-58.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.
# NOT RUN {
# Generate 15 observations from a normal distribution with parameters
# mean=3 and sd=1. Call these the observations from the reference group.
# Generate 10 observations from a normal distribution with parameters
# mean=3.5 and sd=1. Call these the observations from the treatment group.
# Compare the results of calling wilcox.test to those of calling
# twoSampleLinearRankTest with test="normal.scores".
# (The call to set.seed allows you to reproduce this example.)
set.seed(346)
x <- rnorm(15, mean = 3)
y <- rnorm(10, mean = 3.5)
wilcox.test(x, y)
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis: location shift = 0
#
#Alternative Hypothesis: True location shift is not equal to 0
#
#Test Name: Wilcoxon rank sum test
#
#Data: x and y
#
#Test Statistic: W = 32
#
#P-value: 0.0162759
twoSampleLinearRankTest(x, y, test = "normal.scores")
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis: Fy(t) = Fx(t)
#
#Alternative Hypothesis: Fy(t) != Fx(t) for at least one t
#
#Test Name: Two-Sample Linear Rank Test:
# Normal Scores Test
# Based on Normal Approximation
#
#Data: x = x
# y = y
#
#Sample Sizes: nx = 15
# ny = 10
#
#Test Statistic: z = -2.431099
#
#P-value: 0.01505308
#----------
# Clean up
#---------
rm(x, y)
#==========
# Following Example 6.6 on pages 6.22-6.26 of USEPA (1994b), perform the
# Wilcoxon Rank Sum test for the TcCB data (stored in EPA.94b.tccb.df).
# There are m=47 observations from the reference area and n=77 observations
# from the cleanup unit. Then compare the results using the other available
# linear rank tests. Note that Mood's median test yields a p-value less
# than 0.10, while the other tests yield non-significant p-values.
# In this case, Mood's median test is picking up the residual contamination
# in the cleanup unit. (See the example in the help file for quantileTest.)
names(EPA.94b.tccb.df)
#[1] "TcCB.orig" "TcCB" "Censored" "Area"
summary(EPA.94b.tccb.df$Area)
# Cleanup Reference
# 77 47
with(EPA.94b.tccb.df,
  twoSampleLinearRankTest(TcCB[Area=="Cleanup"], TcCB[Area=="Reference"]))
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis: Fy(t) = Fx(t)
#
#Alternative Hypothesis: Fy(t) != Fx(t) for at least one t
#
#Test Name: Two-Sample Linear Rank Test:
# Wilcoxon Rank Sum Test
# Based on Normal Approximation
#
#Data: x = TcCB[Area == "Cleanup"]
# y = TcCB[Area == "Reference"]
#
#Sample Sizes: nx = 77
# ny = 47
#
#Test Statistic: z = -1.171872
#
#P-value: 0.2412485
with(EPA.94b.tccb.df,
  twoSampleLinearRankTest(TcCB[Area=="Cleanup"],
    TcCB[Area=="Reference"], test="normal.scores"))$p.value
#[1] 0.3399484
with(EPA.94b.tccb.df,
  twoSampleLinearRankTest(TcCB[Area=="Cleanup"],
    TcCB[Area=="Reference"], test="moods.median"))$p.value
#[1] 0.09707393
with(EPA.94b.tccb.df,
  twoSampleLinearRankTest(TcCB[Area=="Cleanup"],
    TcCB[Area=="Reference"], test="savage.scores"))$p.value
#[1] 0.2884351
# }