Perform Fisher's one-sample randomization (permutation) test for location.
oneSamplePermutationTest(x, alternative = "two.sided", mu = 0, exact = FALSE,
n.permutations = 5000, seed = NULL, ...)
x
numeric vector of observations. Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are allowed but will be removed.
alternative
character string indicating the kind of alternative hypothesis. The possible values are "two.sided" (the default), "less", and "greater".
mu
numeric scalar indicating the hypothesized value of the mean. The default value is mu=0.
exact
logical scalar indicating whether to perform the exact permutation test (i.e., enumerate all possible permutations) or simply sample from the permutation distribution. The default value is exact=FALSE.
n.permutations
integer indicating how many times to sample from the permutation distribution when exact=FALSE. The default value is n.permutations=5000. This argument is ignored when exact=TRUE.
seed
positive integer to pass to the R function set.seed. The default is seed=NULL, in which case the current value of .Random.seed is used. Using the seed argument lets you reproduce the exact same result if all other arguments stay the same.
...
arguments that can be supplied to the format function. This argument is used when creating the names attribute for the statistic component of the returned list (see permutationTest.object).
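The removal of missing, undefined, and infinite values described for x can be reproduced by hand in base R. This is only a sketch of an equivalent filter, not necessarily how the function does it internally; the variable names are made up for illustration.

```r
# Illustrative data containing NA, NaN, Inf and -Inf
x <- c(1.5, NA, 2, NaN, Inf, 3, -Inf)

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so this keeps only
# the ordinary numeric observations
x_clean <- x[is.finite(x)]
x_clean
# 1.5 2.0 3.0
```

Filtering up front like this makes it easy to check how many observations actually enter the test.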
A list of class "permutationTest" containing the results of the hypothesis test. See the help file for permutationTest.object for details.
Randomization Tests
In 1935, R.A. Fisher introduced the idea of a randomization test (Manly, 2007, p. 107; Efron and Tibshirani, 1993, Chapter 15), which is based on trying to answer the question: “Did the observed pattern happen by chance, or does the pattern indicate the null hypothesis is not true?” A randomization test works by simply enumerating all of the possible outcomes under the null hypothesis, then seeing where the observed outcome fits in. A randomization test is also called a permutation test, because it involves permuting the observations during the enumeration procedure (Manly, 2007, p. 3).
In the past, randomization tests were not used as extensively as they are now because of the “large” computing resources needed to enumerate all of the possible outcomes, especially for large sample sizes. More powerful personal computers and software have made randomization tests much easier to perform. Depending on the sample size, however, it may still be too time-consuming to enumerate all possible outcomes. In this case, the randomization test can still be performed by sampling from the randomization distribution and comparing the observed outcome to this sampled permutation distribution.
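Sampling from the permutation distribution can be sketched in base R as follows. This is a minimal illustration, not the code used by oneSamplePermutationTest: it assumes the usual sign-flip formulation of the one-sample test (random signs attached to the absolute deviations from the hypothesized mean) and an upper one-sided alternative, and the helper name sampled_sign_flip_pvalue and its arguments are hypothetical.

```r
# Hypothetical helper: approximate the upper one-sided p-value by drawing
# n_perm random sign patterns instead of enumerating all of them.
sampled_sign_flip_pvalue <- function(x, mu0 = 0, n_perm = 5000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # fix the seed only when one is given
  y <- x - mu0                         # deviations from the hypothesized mean
  observed <- sum(y)                   # observed value of the sum
  perm_sums <- replicate(
    n_perm,
    sum(sample(c(-1, 1), length(y), replace = TRUE) * abs(y))
  )
  # Proportion of sampled sums at least as large as the observed sum
  mean(perm_sums >= observed)
}

# Same data and same seed give the same approximate p-value
p1 <- sampled_sign_flip_pvalue(c(6, 7, 8, 9), mu0 = 5, seed = 1)
p2 <- sampled_sign_flip_pvalue(c(6, 7, 8, 9), mu0 = 5, seed = 1)
identical(p1, p2)
```

Because the sums are sampled, the result varies from run to run unless the seed is fixed, and it approaches the exact value as n_perm grows.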
Fisher's One-Sample Randomization Test for Location
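Let $x_1, x_2, \ldots, x_n$ denote the $n$ observations, assumed to come from a distribution that is symmetric about its mean $\mu$, and let $\mu_0$ denote the hypothesized value of the mean (the argument mu). The null hypothesis is
$$H_0: \mu = \mu_0 \qquad (1)$$
The one-sided upper alternative (alternative="greater") is
$$H_a: \mu > \mu_0 \qquad (2)$$
and the one-sided lower alternative (alternative="less") is
$$H_a: \mu < \mu_0 \qquad (3)$$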
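For a one-sided upper alternative hypothesis (Equation (2)), the p-value is computed as the proportion of sums in the permutation distribution that are greater than or equal to the observed sum $T = \sum_{i=1}^{n} (x_i - \mu_0)$, where $\mu_0$ is the hypothesized mean (the argument mu). This is the test statistic shown as Sum(x - mu) in the printed results.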
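The p-value rule for the upper one-sided alternative can be sketched in base R. This is a minimal illustration, not the code behind oneSamplePermutationTest: it assumes the usual sign-flip formulation of Fisher's test, in which the permutation distribution is formed by attaching every possible combination of signs to the absolute deviations from the hypothesized mean. The helper name exact_sign_flip_pvalue is hypothetical.

```r
# Hypothetical helper: exact upper one-sided p-value from all 2^n sign patterns.
exact_sign_flip_pvalue <- function(x, mu0 = 0) {
  y <- x - mu0                  # deviations from the hypothesized mean
  n <- length(y)
  observed <- sum(y)            # observed sum
  # One row per sign pattern: 2^n rows, n columns of -1 / +1
  signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))
  perm_sums <- as.vector(signs %*% abs(y))
  # Small tolerance so floating-point ties still count as "equal"
  tol <- 1e-9 * max(1, abs(observed))
  mean(perm_sums >= observed - tol)
}

p_exact <- exact_sign_flip_pvalue(c(6, 7, 8, 9), mu0 = 5)
p_exact
# 0.0625 (1 of the 2^4 = 16 sign patterns reaches the observed sum of 10)
```

Full enumeration needs 2^n rows, so this approach is only practical for small samples; beyond that, sampling from the permutation distribution is the usual fallback.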
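Confidence Intervals Based on Permutation Tests
Based on the relationship between hypothesis tests and confidence intervals, it is possible to construct a two-sided or one-sided confidence interval for the mean based on the one-sample permutation test. For a bootstrap-based approach, see the function boot in the R package boot.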
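The link between tests and confidence intervals can be sketched by inverting the test over a grid of candidate means. This is only an illustration under stated assumptions, not a method provided by the function documented here: it uses a sign-flip permutation p-value for the upper one-sided alternative, a fixed grid of candidates, and hypothetical names (upper_p, grid, lower_bound). The resulting bound is only as precise as the grid spacing.

```r
# Hypothetical helper: exact upper one-sided p-value from all 2^n sign patterns.
upper_p <- function(x, mu0) {
  y <- x - mu0
  signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), length(y))))
  perm_sums <- as.vector(signs %*% abs(y))
  mean(perm_sums >= sum(y) - 1e-9)   # tolerance for floating-point ties
}

x <- c(6, 7, 8, 9, 10, 11)            # illustrative data
alpha <- 0.05
grid <- seq(min(x), max(x), by = 0.05)   # candidate hypothesized means (assumed range)
p_values <- vapply(grid, function(m) upper_p(x, m), numeric(1))

# Lower one-sided confidence bound: the smallest candidate mean that the
# upper one-sided test does not reject at level alpha
lower_bound <- min(grid[p_values > alpha])
lower_bound
```

A two-sided interval could be built the same way by also inverting the lower one-sided test, each at level alpha / 2.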
Efron, B., and R.J. Tibshirani. (1993). An Introduction to the Bootstrap. Chapman and Hall, New York, pp. 224-227.
Manly, B.F.J. (2007). Randomization, Bootstrap and Monte Carlo Methods in Biology. Third Edition. Chapman & Hall, New York, pp. 112-113.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL, pp. 404-406.
# NOT RUN {
# Load the EnvStats package, which provides oneSamplePermutationTest()
# and the EPA.92d.chromium.vec data used below.
library(EnvStats)
# Generate 10 observations from a logistic distribution with parameters
# location=7 and scale=2, and test the null hypothesis that the true mean
# is equal to 5 against the alternative that the true mean is greater than 5.
# Use the exact permutation distribution.
# (Note: the call to set.seed() allows you to reproduce this example).
set.seed(23)
dat <- rlogis(10, location = 7, scale = 2)
test.list <- oneSamplePermutationTest(dat, mu = 5,
  alternative = "greater", exact = TRUE)
# Print the results of the test
#------------------------------
test.list
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis: Mean (Median) = 5
#
#Alternative Hypothesis: True Mean (Median) is greater than 5
#
#Test Name: One-Sample Permutation Test
# (Exact)
#
#Estimated Parameter(s): Mean = 9.977294
#
#Data: dat
#
#Sample Size: 10
#
#Test Statistic: Sum(x - 5) = 49.77294
#
#P-value: 0.001953125
# Plot the results of the test
#-----------------------------
dev.new()
plot(test.list)
#==========
# The guidance document "Supplemental Guidance to RAGS: Calculating the
# Concentration Term" (USEPA, 1992d) contains an example of 15 observations
# of chromium concentrations (mg/kg) which are assumed to come from a
# lognormal distribution. These data are stored in the vector
# EPA.92d.chromium.vec. Here, we will use the permutation test to test
# the null hypothesis that the mean (median) of the log-transformed chromium
# concentrations is less than or equal to log(100 mg/kg) vs. the alternative
# that it is greater than log(100 mg/kg). Note that we *cannot* use the
# permutation test to test a hypothesis about the mean on the original scale
# because the data are not assumed to be symmetric about some mean, they are
# assumed to come from a lognormal distribution.
#
# We will sample from the permutation distribution.
# (Note: setting the argument seed=542 allows you to reproduce this example).
test.list <- oneSamplePermutationTest(log(EPA.92d.chromium.vec),
  mu = log(100), alternative = "greater", seed = 542)
test.list
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis: Mean (Median) = 4.60517
#
#Alternative Hypothesis: True Mean (Median) is greater than 4.60517
#
#Test Name: One-Sample Permutation Test
# (Based on Sampling
# Permutation Distribution
# 5000 Times)
#
#Estimated Parameter(s): Mean = 4.378636
#
#Data: log(EPA.92d.chromium.vec)
#
#Sample Size: 15
#
#Test Statistic: Sum(x - 4.60517) = -3.398017
#
#P-value: 0.7598
# Plot the results of the test
#-----------------------------
dev.new()
plot(test.list)
#----------
# Clean up
#---------
rm(test.list)
graphics.off()
# }