genefilter (version 1.54.2)

rowFtests: t-tests and F-tests for rows or columns of a matrix

Description

t-tests and F-tests for rows or columns of a matrix, intended to be speed efficient.

Usage

rowttests(x, fac, tstatOnly = FALSE) 
colttests(x, fac, tstatOnly = FALSE)
fastT(x, ig1, ig2, var.equal = TRUE)

rowFtests(x, fac, var.equal = TRUE)
colFtests(x, fac, var.equal = TRUE)

Arguments

x
Numeric matrix. The matrix must not contain NA values. For rowttests and colttests, x can also be an ExpressionSet.
fac
Factor which codes the grouping to be tested. There must be 1 or 2 groups for the t-tests (corresponding to one- and two-sample t-test), and 2 or more for the F-tests. If fac is missing, this is taken as a one-group test (i.e. is only allowed for the t-tests). The length of the factor needs to correspond to the sample size: for the row* functions, the length of the factor must be the same as the number of columns of x, for the col* functions, it must be the same as the number of rows of x.

If x is an ExpressionSet, then fac may also be a character vector of length 1 with the name of a covariate in x.

tstatOnly
A logical variable indicating whether to skip the calculation of p-values from the t-distribution with the appropriate degrees of freedom. If TRUE, only the t-statistics are returned, which can be considerably faster. If FALSE (the default), p-values are computed as well.
ig1
The indices of the columns of x that correspond to group 1.
ig2
The indices of the columns of x that correspond to group 2.
var.equal
A logical variable indicating whether to treat the variances in the samples as equal. If 'TRUE', a simple F test for the equality of means in a one-way analysis of variance is performed. If 'FALSE', an approximate method of Welch (1951) is used, which generalizes the commonly known 2-sample Welch test to the case of arbitrarily many samples.
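The difference between the two settings of var.equal can be illustrated for a single row with base R's oneway.test, which offers the same choice between the classical one-way ANOVA F-test and Welch's approximation. This is only a sketch: the data and the names y and g are made up for illustration, and the assumption that rowFtests agrees with oneway.test row by row is not verified here.

```r
## One row of data with three groups (illustrative values only)
set.seed(2)
y <- rnorm(12)
g <- factor(rep(1:3, each = 4))

## var.equal = TRUE: classical F-test of a one-way analysis of variance
f_eq <- oneway.test(y ~ g, var.equal = TRUE)

## var.equal = FALSE: Welch's approximate test for unequal variances
f_welch <- oneway.test(y ~ g, var.equal = FALSE)

## The classical F statistic matches the overall F statistic of lm()
f_lm <- summary(lm(y ~ g))$fstatistic["value"]
stopifnot(isTRUE(all.equal(unname(f_eq$statistic), unname(f_lm))))

## Degrees of freedom: the numerator is the same for both tests, while
## the Welch denominator depends on the data and is usually not an integer
f_eq$parameter
f_welch$parameter
```

The data-dependent denominator is why, with var.equal = FALSE, a separate second degree of freedom is needed for every row.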

Value

  • A data.frame with columns statistic, p.value (optional in the case of the t-test functions) and dm, the difference of the group means (only in the case of the t-test functions). The row.names of the data.frame are taken from the corresponding dimension names of x. The degrees of freedom are provided in the attribute df. For the F-tests, if var.equal is 'FALSE', nrow(x)+1 degrees of freedom are given: the first value is the first (numerator) degree of freedom, which is the same for each row, and the remaining values are the second (denominator) degrees of freedom, one for each row.

Details

If fac is specified, rowttests performs for each row of x a two-sided, two-class t-test with equal variances. fac must be a factor of length ncol(x) with two levels, corresponding to the two groups. The sign of the resulting t-statistic corresponds to "group 1 minus group 2".

If fac is missing, rowttests performs for each row of x a two-sided one-class t-test against the null hypothesis 'mean=0'.

rowttests and colttests are implemented in C and should be reasonably fast and memory-efficient. fastT is an alternative implementation, in Fortran, possibly useful for certain legacy code.

rowFtests and colFtests are currently implemented using matrix algebra in R. Compared to the rowttests and colttests functions, they are slower and use more memory.
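To make the two-class test concrete, the following base R sketch computes a pooled-variance t-statistic for every row at once. It is not the package's C implementation; row_t_sketch is a hypothetical helper written only for illustration, and it does not require genefilter to be installed.

```r
## Hypothetical helper (not part of genefilter): row-wise two-sample
## t-test with equal variances, using base R matrix operations.
row_t_sketch <- function(x, fac) {
  fac <- factor(fac)
  stopifnot(is.matrix(x), nlevels(fac) == 2, length(fac) == ncol(x))
  g1 <- which(fac == levels(fac)[1])   # columns in group 1 (NAs dropped)
  g2 <- which(fac == levels(fac)[2])   # columns in group 2 (NAs dropped)
  n1 <- length(g1)
  n2 <- length(g2)
  m1 <- rowMeans(x[, g1, drop = FALSE])
  m2 <- rowMeans(x[, g2, drop = FALSE])
  ## Pooled sum of squares around each group's own mean
  ss <- rowSums((x[, g1, drop = FALSE] - m1)^2) +
        rowSums((x[, g2, drop = FALSE] - m2)^2)
  df <- n1 + n2 - 2
  se <- sqrt(ss / df * (1 / n1 + 1 / n2))
  dm <- m1 - m2                        # "group 1 minus group 2"
  statistic <- dm / se
  p.value <- 2 * pt(-abs(statistic), df)   # two-sided p-value
  data.frame(statistic = statistic, dm = dm, p.value = p.value)
}

set.seed(1)
x   <- matrix(rnorm(40), nrow = 4, ncol = 10)
fac <- factor(rep(c("a", "b"), each = 5))
res <- row_t_sketch(x, fac)

## Check the first row against t.test
ref <- t.test(x[1, ] ~ fac, var.equal = TRUE)
stopifnot(isTRUE(all.equal(unname(ref$statistic), res$statistic[1])))
```

Because every step is a whole-matrix operation, the cost of looping over rows in R is avoided, which is the same idea that motivates dedicated row-wise functions over repeated calls to t.test.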

References

B. L. Welch (1951), On the comparison of several mean values: an alternative approach. Biometrika, 38, 330-336.

See Also

mt.teststat

Examples

##
   ## example data
   ##
   x  = matrix(runif(40), nrow=4, ncol=10)
   f2 = factor(floor(runif(ncol(x))*2))
   f4 = factor(floor(runif(ncol(x))*4))

   ##
   ## one- and two-group row t-tests; 4-group F-test
   ##
   r1 = rowttests(x)
   r2 = rowttests(x, f2)
   r4 = rowFtests(x, f4)

   ## approximate equality
   about.equal = function(x,y,tol=1e-10)
     stopifnot(is.numeric(x), is.numeric(y), length(x)==length(y), all(abs(x-y) < tol))

   ##
   ## compare with the implementation in t.test
   ##
   for (j in 1:nrow(x)) {
     s1 = t.test(x[j,])
     about.equal(s1$statistic, r1$statistic[j])
     about.equal(s1$p.value,   r1$p.value[j])

     s2 = t.test(x[j,] ~ f2, var.equal=TRUE)
     about.equal(s2$statistic, r2$statistic[j])
     about.equal(s2$p.value,   r2$p.value[j])

     dm = -diff(tapply(x[j,], f2, mean))
     about.equal(dm, r2$dm[j])

     s4 = summary(lm(x[j,] ~ f4))
     about.equal(s4$fstatistic["value"], r4$statistic[j])
   }

   ##
   ## colttests
   ##
   c2 = colttests(t(x), f2)
   stopifnot(identical(r2, c2))

   ##
   ## missing values
   ##
   f2n = f2
   f2n[sample(length(f2n), 3)] = NA
   r2n = rowttests(x, f2n)
   for(j in 1:nrow(x)) {
     s2n = t.test(x[j,] ~ f2n, var.equal=TRUE)
     about.equal(s2n$statistic, r2n$statistic[j])
     about.equal(s2n$p.value,   r2n$p.value[j])
   }

   ##
   ## larger sample size
   ##
   x  = matrix(runif(1000000), nrow=4, ncol=250000)
   f2 = factor(floor(runif(ncol(x))*2))
   r2 = rowttests(x, f2) 
   for (j in 1:nrow(x)) {
     s2 = t.test(x[j,] ~ f2, var.equal=TRUE)
     about.equal(s2$statistic, r2$statistic[j])
     about.equal(s2$p.value,   r2$p.value[j])
   }

   ## single row matrix
   rowFtests(matrix(runif(10),1,10),as.factor(c(rep(1,5),rep(2,5))))
   rowttests(matrix(runif(10),1,10),as.factor(c(rep(1,5),rep(2,5))))
