rowRanks: Gets the rank of the elements in each row (column) of a matrix

Description

Gets the rank of the elements in each row (column) of a matrix.

Usage

rowRanks(x, rows = NULL, cols = NULL, ties.method = c("max", "average",
  "first", "last", "random", "max", "min", "dense"), dim. = dim(x), ...,
  useNames = TRUE)
colRanks(x, rows = NULL, cols = NULL, ties.method = c("max", "average",
  "first", "last", "random", "max", "min", "dense"), dim. = dim(x),
  preserveShape = FALSE, ..., useNames = TRUE)

Value

A matrix of type integer is returned, unless ties.method = "average" when it is of type numeric.

The rowRanks() function always returns an NxK matrix, where N (K) is the number of rows (columns) whose ranks are calculated.

The colRanks() function returns an NxK matrix, if preserveShape = TRUE, otherwise a KxN matrix.

Any names of x are ignored and absent in the result.

Arguments

x: An NxK matrix or, if dim. is specified, an N * K vector.
rows: A vector indicating subset of rows to operate over. If NULL, no subsetting is done.
cols: A vector indicating subset of columns to operate over. If NULL, no subsetting is done.
ties.method: A character string specifying how ties are treated. For details, see below.
dim.: An integer vector of length two specifying the dimension of x, also when not a matrix. Comment: The reason for this argument being named with a period at the end is purely technical (we get a run-time error if we try to name it dim).
...: Not used.
useNames: If TRUE (default), names attributes of the result are set, otherwise not.
preserveShape: A logical specifying whether the matrix returned should preserve the input shape of x, or not.

Missing values

Missing values are ranked as NA_integer_, as with na.last = "keep" in the rank() function.

Performance

The implementation is optimized for both speed and memory. To avoid coercing to doubles (and hence memory allocation), there is a unique implementation for integer matrices. Furthermore, it is more memory efficient to do colRanks(x, preserveShape = TRUE) than t(colRanks(x, preserveShape = FALSE)).

Author

Hector Corrada Bravo and Harris Jaffee. Peter Langfelder for adding 'ties.method' support. Brian Montgomery for adding more 'ties.method's. Henrik Bengtsson adapted the original native implementation of rowRanks() from Robert Gentleman's rowQ() in the Biobase package.

Details

These functions rank values and treats missing values the same way as rank(). For equal values ("ties"), argument ties.method determines how these are ranked among each other. More precisely, for the following values of ties.method, each index set of ties consists of:

"first" - increasing values that are all unique
"last" - decreasing values that are all unique
"min" - identical values equaling the minimum of their original ranks
"max" - identical values equaling the maximum of their original ranks
"average" - identical values that equal the sample mean of their original ranks. Because the average is calculated, the returned ranks may be non-integer values
"random" - randomly shuffled values of their original ranks.
"dense" - increasing values that are all unique and, contrary to "first", never contain any gaps

For more information on ties.method = "dense", see frank() of the data.table package. For more information on the other alternatives, see rank().

Note that, due to different randomization strategies, the shuffling order produced by these functions when using ties.method = "random" does not reproduce that of rank().

WARNING: For backward-compatibility reasons, the default is ties.method = "max", which differs from rank() which uses ties.method = "average" by default. Since we plan to change the default behavior in a future version, we recommend to explicitly specify the intended value of argument ties.method.