flip (version 2.5.0)

flip: flip

Description

The main function for univariate and multivariate testing under a permutation (and rotation) framework, plus some utilities.

flip is the main function for permutation (or rotation) tests.

It allows for multivariate one-sample, C>=2 samples and any regression tests. Covariates that are fitted in the model but are not under test are also allowed.

statTest="t" is the t statistic derived from the correlation between each X and each Y (i.e. a linear model for each pair of Xs and Ys). This is different from the fit of a multiple (multivariate) linear model, since the correlation does not consider the other covariates. The t test is valid only under the assumption that each variable in X is independent of each variable in Y. To get an adequate test while adjusting for covariates, use Z (see the example below).

The test statistic "sum" is the sum of values (or frequencies) of the given sample, centered on its expected value (i.e. computed on the overall sample).

"coeff" is the statistic based on the estimated coefficient of an lm. It produces a test for every possible combination of (columns of) X and Y (p-values can be combined using npc).

"cor" is the correlation (i.e. not the partial correlation) between each column of X and each column of Y. "cor.Spearman" (or "cor.rank") is the analogue for Spearman's rank correlation coefficient.

"ANOVA" is a synonym of "F". Only valid for dependence tests (i.e. non-constant X). "Mann-Whitney" is a synonym of "Wilcoxon". "rank" chooses between "Wilcoxon" and "Kruskal-Wallis" depending on whether there are two samples or more than two (respectively).

The "Wilcoxon" statistic is based on the 'sum of ranks of second sample minus n1*(n+1)/2' instead of 'sum of ranks of smallest sample minus nSmallest*(n+1)/2'. Therefore the statistic is centered on 0 and allows for two-sided alternatives. Although the p-values are correct in any case, X must be a two-level factor for the right test statistic to be computed. When X is not a two-level factor, the statistic measures the codeviance between X and the ranks of Y.
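The centering described above can be illustrated with a few lines of base R. This is only a sketch, not the package's internal code: it assumes that the centering constant is the null expectation of the rank sum, i.e. the size of the second sample (called n2 in the code, taken here to be what n1 denotes in the text) times (n+1)/2. All variable names are illustrative.

```r
# Base-R sketch of a zero-centered rank-sum statistic (not flip's internal code).
set.seed(1)
x <- factor(rep(c("a", "b"), times = c(4, 6)))   # two-level grouping factor
y <- rnorm(length(x))                            # continuous response

n      <- length(y)
r      <- rank(y)
second <- x == levels(x)[2]                      # rows of the second sample
n2     <- sum(second)

# Rank sum of the second sample minus its expectation under the null.
w_centered <- sum(r[second]) - n2 * (n + 1) / 2

# The same statistic computed on the first sample is its exact opposite,
# which is why the statistic is symmetric around 0.
w_first <- sum(r[!second]) - sum(!second) * (n + 1) / 2

# Permutation null distribution: shuffle the group labels.
perm_stats <- replicate(2000, {
  s <- sample(second)
  sum(r[s]) - n2 * (n + 1) / 2
})
two_sided_p <- mean(abs(perm_stats) >= abs(w_centered))
```

Because the statistic is centered on 0, the two-sided p-value can be computed by comparing absolute values, as in the last line.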

For paired samples (see also the argument Strata and the example below) the Signed Rank test is performed. To perform the Sign Test use option Sign (i.e. same as Signed Rank but without using magnitude of ranks).
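The difference between the Signed Rank test and the Sign Test can be sketched in base R on a vector of paired differences. This is an illustration under stated assumptions (random sign flipping of the differences as the resampling scheme, two-sided alternative), not the code used by flip; the function names signed_rank, sign_stat and p_for are hypothetical.

```r
# Base-R sketch: signed rank vs. sign statistic on paired differences.
set.seed(7)
d <- rnorm(12, mean = 0.8)                 # within-pair differences

signed_rank <- function(d) sum(sign(d) * rank(abs(d)))  # uses magnitudes via ranks
sign_stat   <- function(d) sum(sign(d))                 # uses signs only

# Under symmetry around 0 the sign of each difference is exchangeable,
# so the null distribution is obtained by random sign flips.
flips <- replicate(2000, sample(c(-1, 1), length(d), replace = TRUE))

p_for <- function(stat) {
  obs  <- stat(d)
  null <- apply(flips, 2, function(s) stat(s * d))
  mean(abs(null) >= abs(obs))              # two-sided p-value
}

p_signed_rank <- p_for(signed_rank)
p_sign        <- p_for(sign_stat)
```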

The "Fisher" test is allowed only with dichotomous Ys. The reported statistic is the bottom-right cell of the 2 by 2 frequency table. The "chisq.separated" test performs cell-wise chi-squared tests (see also Finos and Salmaso (2004) Communications in Statistics - Theory and Methods).

The "McNemar" test is based on the signs of the differences, hence it can also be used with ordinal or continuous responses. Only valid for symmetry tests (i.e. X is constant or NULL). The reported statistic for the "McNemar" test is the signed square root of the McNemar statistic. Hence it allows for one-tailed alternatives.

For ordered X, a stochastic ordering test can be performed using "t", "Wilcoxon" or "sum" and then combining the separate tests using npc.
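As a small worked example of the signed square root, the following base-R sketch computes the classical McNemar statistic from the signs of some made-up differences and then takes its signed root. It assumes the uncorrected form of the statistic (no continuity correction); whether flip applies any correction is not stated here.

```r
# Base-R sketch: signed square root of the (uncorrected) McNemar statistic.
d <- c(1.2, -0.4, 0.8, 2.1, -0.3, 0.5, 1.7, 0.9)   # illustrative differences

n_pos <- sum(d > 0)                                 # positive differences
n_neg <- sum(d < 0)                                 # negative differences

mcnemar_stat <- (n_pos - n_neg)^2 / (n_pos + n_neg) # always non-negative
signed_root  <- (n_pos - n_neg) / sqrt(n_pos + n_neg)
```

The sign of signed_root tells which direction dominates, which is the information a one-tailed alternative needs and which the squared statistic discards.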

When statTest is a function, its first argument must be Y. The same function is applied to the observed data Y and to a number of row-permuted versions of Y. The returned value must be a vector of test statistics. Please note that the argument tail must be defined accordingly. By default, the rows of Y are rearranged through permutation (without strata). More complex permutation strategies can be defined through a proper definition of the argument perm (see also permutationSpace).

For testType="rotation": As long as the number of orthogonalized residuals (i.e. the number of observations minus the number of columns in Z) is lower than 50, the function rom is used. When the number is larger, the faster version romFast is used instead. Although the latter is less accurate, with such a large sample size this is not expected to affect the control of the type I error.
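The contract for a user-defined statistic (Y as first argument, a vector of statistics as return value, the same function evaluated on permuted rows) can be mimicked by hand in base R. The sketch below does not call flip; my_stat, the number of permutations B and the way p-values are computed are assumptions made for illustration.

```r
# Base-R sketch of how a user-defined statistic is used in a permutation test.
set.seed(42)
X <- rep(0:1, each = 5)                       # two groups of five rows
Y <- matrix(rnorm(20), 10, 2) + 2 * X         # shift both columns in group 1
colnames(Y) <- c("A", "B")

# First argument is Y; returns one statistic per column of Y.
my_stat <- function(Y) {
  colMeans(Y[X == 1, , drop = FALSE]) - colMeans(Y[X == 0, , drop = FALSE])
}

obs <- my_stat(Y)                             # statistics on the observed data

# Same function applied to B row-permuted copies of Y (B x ncol(Y) matrix).
B <- 999
perm <- t(replicate(B, my_stat(Y[sample(nrow(Y)), , drop = FALSE])))

# Two-sided p-values; the observed statistic counts as one permutation.
obs_mat <- matrix(abs(obs), B, length(obs), byrow = TRUE)
p <- (1 + colSums(abs(perm) >= obs_mat)) / (B + 1)
```

The comparison on absolute values corresponds to a two-sided alternative; for a one-sided alternative the raw statistics would be compared instead, which is why the tail argument has to agree with the way the statistic is defined.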
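The notion of orthogonalized residuals and of a single rotation can be sketched in base R. This is an assumption-laden illustration and not the code of rom or romFast: it builds the anti-projection matrix of Z, keeps the eigenvectors with non-null eigenvalues, and rotates the resulting residuals with a random orthogonal matrix obtained from the QR decomposition of a Gaussian matrix, which is one common construction in the rotation-test literature.

```r
# Base-R sketch of one rotation step (an assumption, not flip's rom/romFast code).
set.seed(3)
n <- 8
Y <- matrix(rnorm(n * 2), n, 2)          # two response columns
Z <- cbind(1, rnorm(n))                  # intercept plus one covariate

# Anti-projection matrix of Z: I - Z (Z'Z)^-1 Z'
P <- diag(n) - Z %*% solve(crossprod(Z)) %*% t(Z)

# Eigenvectors with non-null eigenvalues span the space orthogonal to Z.
e <- eigen(P, symmetric = TRUE)
gamma <- e$vectors[, e$values > 1e-8, drop = FALSE]

# Orthogonalized residuals: n - ncol(Z) rows.
Yres <- crossprod(gamma, Y)
r <- nrow(Yres)

# Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
Q <- qr.Q(qr(matrix(rnorm(r * r), r, r)))
Yrot <- Q %*% Yres                       # one rotated data set
```

A rotation preserves the cross-product matrix of the residuals, so test statistics computed on Yrot are comparable with those computed on Yres; repeating the last two lines many times would give a resampling distribution.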

Usage

flip(Y, X = NULL, Z = NULL, data = NULL, tail = 0, perms = 1000,
  statTest = NULL, Strata = NULL, flipReturn, testType = NULL,
  returnGamma = TRUE, ...)

Arguments

Y

The response vector of the regression model. May be supplied as a vector or as a formula object. In the latter case, the right hand side of Y is passed on to alternative if that argument is missing, or otherwise to null.

X

The part of the design matrix corresponding to the alternative hypothesis. The covariates of the null model do not have to be supplied again here. May be given as a half formula object (e.g. ~a+b). In that case the intercept is always suppressed.

Z

The part of the design matrix corresponding to the null hypothesis. May be given as a design matrix or as a half formula object (e.g. ~a+b). The default for Z is ~1, i.e. only an intercept. This intercept may be suppressed, if desired, with Z = ~0.

data

Only used when Y, X, or Z is given in formula form. An optional data frame, list or environment containing the variables used in the formula. If the variables in a formula are not found in data, the variables are taken from environment(formula), typically the environment from which flip is called.

tail

Vector of values -1, 0 or 1 indicating the tail to be used in the test for each column of Y. tail=1 (-1) means that greater (smaller) values provide more evidence for the alternative hypothesis. tail=0 indicates a two-sided alternative. If the length of tail is smaller than the number of columns of Y, the values are recycled.

perms

The number of permutations to use. The default is perms = 1000. Alternatively it can be a matrix (i.e. the permutation space) or a list with elements number and seed.

statTest

Choose a test statistic from flip.statTest. See also Details section.

Strata

A vector whose unique values identify the strata. This option is used only with testType="permutation"; the parameter Z is not considered in this case. Also note that when each stratum contains only two levels with one observation per level, the problem becomes a paired two-sample problem and is therefore simplified to a one-sample test.

flipReturn

A list of objects indicating what will be included in the output, e.g. list(permP=TRUE,permT=TRUE,data=TRUE).

testType

By default testType="permutation". The option "combination" is more efficient when X is an indicator of groups (i.e. C>1 samples testing). When the total number of possible combinations exceeds 10 thousand, "permutation" is performed instead. As an alternative, if you choose "rotation", resampling is performed through random linear combinations (i.e. a rotation test is performed). This option is useful when only a few permutations are available, that is, when the minimum reachable significance level is high. See also the Details section for the algorithm used. The old syntax rotationTest=TRUE is maintained for compatibility but is deprecated; use testType="rotation" instead.

returnGamma

Logical. Should the eigenvectors (with corresponding non-null eigenvalues) of the anti-projection matrix of Z (i.e. I - Z(Z'Z)^-1 Z') be returned?

...

Further parameters. The following are still valid but deprecated:

permT.return = TRUE, permP.return = FALSE,

permSpace.return = FALSE, permY.return = FALSE. Use flipReturn instead.

dummyfy: a named list of logical values (e.g. list(X=TRUE,Y=TRUE))

rotationTest= TRUE. Deprecated, use testType='rotation' instead.

Value

An object of class flip.object. Several operations and plots can be made from this object. See also flip.object-class.

References

For the general framework of univariate and multivariate permutation tests see: Pesarin, F. (2001) Multivariate Permutation Tests with Applications in Biostatistics. Wiley, New York.

For Rotation tests see: Langsrud, O. (2005) Rotation tests, Statistics and Computing, 15, 1, 53-60

A. Solari, L. Finos, J.J. Goeman (2014) Rotation-based multiple testing in the multivariate linear model. Biometrics, 70 (4), 954-961.

Livio Finos and Fortunato Pesarin (2018) On zero-inflated permutation testing and some related problems. Statistical Papers.

See Also

The permutation spaces on which the test is based: permutationSpace function and useful functions associated with that object.

Multiplicity correction: flip.adjust and Global test: npc.

Examples

# NOT RUN {
Y=matrix(rnorm(50),10,5)
colnames(Y)=LETTERS[1:5]
Y[,1:2]=Y[,1:2] +1
res = flip(Y)
res
plot(res[[1]])
plot(res[2:3])
plot(res)

X=rep(0:1,5)
Y=Y+matrix(X*2,10,5)

data=data.frame(Y,X=X, Z=rnorm(10))
#testing dependence among Y's and X
(res = flip(Y,~X,data=data))
#same as:
#res = flip(A+B+C+D+E~X,data=data)


#testing dependence among Y's and X, also using covariates
res = flip(Y,~X,~Z,data=data)
res
#Note that
#flip(Y,X=~X,Z=~1,data=data)
#is different from
#flip(Y,~X,data=data)
#since the former is based on orthogonalized residuals of Y and X by Z.

# }
# NOT RUN {
#Rotation tests:
rot=flip(Y,X,Z=~1,testType="rotation")
# note the use of Z=~1.
# }
# NOT RUN {
#Using rank tests:
res = flip(Y,~X,data=data,statTest="Wilcoxon")
res

#testing symmetry of Y around 0
Y[,1:2]=Y[,1:2] +2
res = flip(Y)
res
plot(res)


#use of strata (in this case equal to paired samples)
data$S=rep(1:5,rep(2,5))
#paired t
flip(A+B+C+D+E~X,data=data,statTest="t",Strata=~S)
#signed Rank test
flip(A+B+C+D+E~X,data=data,statTest="Wilcox",Strata=~S)

# tests for categorical data
data=data.frame(X=rep(0:2,10))
data=data.frame(X=factor(data$X),Y=factor(rbinom(30,2,.2+.2*data$X)))
flip(~Y,~X,data=data,statTest="chisq")
# separated chisq (Finos and Salmaso, 2004. Nonparametric multi-focus analysis
# for categorical variables. CommStat - T.M.)
(res.sep=flip(~Y,~X,data=data,statTest="chisq.separated"))
npc(res.sep,"sumT2") #note that combined test statistic is the same as chisq

# }
# NOT RUN {
# User-defined test statistic:
my.fun <- function(Y){
  # slope of the regression of Y on X (X is found in the enclosing environment)
  summary(lm(Y~X))$coeff[2,"Estimate"]
}
X <- matrix(rep(0:2,5))
Y <- matrix(rnorm(mean=X,n=15))
res=flip(Y=Y,X=X,statTest=my.fun,tail=0)
res
hist(res)
# }