Calculate Hotelling's T-squared test statistic for the difference in two multivariate means.
hotelling.stat(x, y, shrinkage = FALSE, var.equal = TRUE)
x: an nx by p matrix containing the data points from sample 1, or a list containing elements mean, cov, and n, where mean is a mean vector of length p, cov is a variance-covariance matrix of dimension p by p, and n is the sample size
y: an ny by p matrix containing the data points from sample 2, or a list containing elements mean, cov, and n, where mean is a mean vector of length p, cov is a variance-covariance matrix of dimension p by p, and n is the sample size
shrinkage: set to TRUE if the covariance matrices are to be estimated using Schaefer and Strimmer's James-Stein shrinkage estimator. Note that this only works when raw data are supplied; it will not work if summary statistics are supplied.
var.equal: set to TRUE if the covariance matrices are (assumed to be) equal
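The two input forms described above can be sketched as follows. The data are simulated purely for illustration, and the names x_summary and y_summary are hypothetical; only the list elements mean, cov, and n come from the argument descriptions. The package call is guarded so the sketch still runs when Hotelling is not installed.

```r
# Simulated data, for illustration only (not from the package).
set.seed(1)
x <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)
y <- matrix(rnorm(25 * 3, mean = 0.5), nrow = 25, ncol = 3)

# Summary-statistic form: a list with elements mean, cov, and n.
x_summary <- list(mean = colMeans(x), cov = cov(x), n = nrow(x))
y_summary <- list(mean = colMeans(y), cov = cov(y), n = nrow(y))

# Only call the package function if the package is available.
if (requireNamespace("Hotelling", quietly = TRUE)) {
  from_raw     <- Hotelling::hotelling.stat(x, y)
  from_summary <- Hotelling::hotelling.stat(x_summary, y_summary)
}
```

Because shrinkage needs the raw observations, the summary-statistic form should be used with shrinkage left at its default of FALSE.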
A list containing the following components:
Hotelling's (unscaled) T-squared statistic
The scaling factor: this can be used by multiplying it with the test statistic, or by dividing the critical F value by it
a vector of length 2 containing the numerator and denominator degrees of freedom
The sample size of sample 1
The sample size of sample 2
The number of variables to be used in the comparison
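To illustrate how a scaling factor and degrees of freedom of this kind are used, here is a minimal base-R sketch of the textbook equal-covariance (pooled) case. It is written from scratch under that assumption and is not the package's internal code; all variable names (T2, m, df, S_pooled, T2_crit) are chosen for illustration, and the data are simulated.

```r
# Simulated data, for illustration only.
set.seed(42)
nx <- 20; ny <- 25; p <- 3
x <- matrix(rnorm(nx * p), nrow = nx)
y <- matrix(rnorm(ny * p, mean = 0.5), nrow = ny)

# Unscaled T-squared statistic with a pooled covariance matrix.
d        <- colMeans(x) - colMeans(y)
S_pooled <- ((nx - 1) * cov(x) + (ny - 1) * cov(y)) / (nx + ny - 2)
T2       <- (nx * ny / (nx + ny)) * drop(t(d) %*% solve(S_pooled, d))

# Scaling factor and F degrees of freedom for the pooled case.
m  <- (nx + ny - p - 1) / (p * (nx + ny - 2))
df <- c(p, nx + ny - p - 1)

# Option 1: scale the statistic and compare with the F distribution.
p_value <- pf(m * T2, df[1], df[2], lower.tail = FALSE)

# Option 2: divide the critical F value by m and compare with T2 directly.
T2_crit <- qf(0.95, df[1], df[2]) / m
reject  <- T2 > T2_crit
```

Both options lead to the same decision at the 5% level, since m * T2 > F_crit is equivalent to T2 > F_crit / m.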
Note that the sample size requirement is nx + ny - 1 > p. The procedure will stop if this is not met and the shrinkage estimator is not being used. The shrinkage estimator has not been rigorously tested for this application (small p, smaller n).
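The requirement in this note can be expressed as a small check. The helper below, check_sample_size, is hypothetical and only mirrors the rule as stated here; it is not a function from the package and does not reproduce the package's own error handling.

```r
# Hypothetical helper: TRUE if nx + ny - 1 > p; stops if the condition
# fails and shrinkage is not requested.
check_sample_size <- function(nx, ny, p, shrinkage = FALSE) {
  ok <- (nx + ny - 1) > p
  if (!ok && !shrinkage) {
    stop("need nx + ny - 1 > p unless shrinkage = TRUE")
  }
  ok
}

check_sample_size(20, 25, 3)                    # condition holds
check_sample_size(3, 3, 10, shrinkage = TRUE)   # condition fails, but no stop
```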
Hotelling, H. (1931). "The generalization of Student's ratio." Annals of Mathematical Statistics 2 (3): 360-378.
Schaefer, J., and K. Strimmer (2005). "A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics." Statist. Appl. Genet. Mol. Biol. 4: 32.
Opgen-Rhein, R., and K. Strimmer (2007). "Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach." Statist. Appl. Genet. Mol. Biol. 6: 9.
Nel, D.G., and C.A. Van der Merwe (1986). "A solution to the multivariate Behrens-Fisher problem." Comm. Statist. Theor. Meth. A15 (12): 3719-3736.
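library(Hotelling)
data(container.df)
# Split the measurements (all columns except the first) by group
split.data = split(container.df[,-1], container.df$gp)
x = split.data[[1]]
y = split.data[[2]]
# Test statistic with the default settings
hotelling.stat(x, y)
# Same comparison using the shrinkage covariance estimator
hotelling.stat(x, y, shrinkage = TRUE)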