
order
returns a permutation which rearranges its first
argument into ascending or descending order, breaking ties by further
arguments. sort.list
is the same, using only one argument.
See the examples for how to use these functions to sort data frames,
etc.order(…, na.last = TRUE, decreasing = FALSE,
method = c("auto", "shell", "radix"))sort.list(x, partial = NULL, na.last = TRUE, decreasing = FALSE,
method = c("auto", "shell", "quick", "radix"))
NULL
values are not implemented.)"radix"
method, this can be a vector of
length equal to the number of arguments in …
. For the
other methods, it must be length one.NA
s.
If TRUE
, missing values in the data are put last; if
FALSE
, they are put first; if NA
, they are removed
(see ‘Note’.)"auto"
) implies "radix"
for short numeric vectors,
integer vectors, logical vectors and factors.
Otherwise, it implies "shell"
.
For details of methods "shell"
,
"quick"
, and "radix"
,
see the help for sort
.…
arguments,
as the names could match current or future control
arguments such as decreasing
. A sometimes-encountered unsafe
practice is to call do.call('order', df_obj)
where
df_obj
might be a data frame: copy df_obj
and
remove any names, for example using unname
.method = "quick"
),
so any unresolved ties will be left in their original ordering. Complex values are sorted first by the real part, then the imaginary
part. Except for method "radix"
, the sort order for character vectors
will depend on the collating sequence of the locale in use: see
Comparison
. The "shell"
method is generally the safest bet and is the
default method, except for short factors, numeric vectors,
integer vectors and logical vectors, where "radix"
is assumed.
Method "radix"
stably sorts logical,
numeric and character vectors in linear time. It outperforms the other
methods, although there are caveats (see sort
). Method
"quick"
for sort.list
is only supported for numeric
x
with na.last = NA
, is not stable, and is slower than
"radix"
. partial = NULL
is supported for compatibility with other
implementations of S, but no other values are accepted and ordering is
always complete. For a classed R object, the sort order is taken from
xtfrm
: as its help page notes, this can be slow unless a
suitable method has been defined or is.numeric(x)
is
true. For factors, this sorts on the internal codes, which is
particularly appropriate for ordered factors.sort
, rank
, xtfrm
.require(stats)
(ii <- order(x <- c(1,1,3:1,1:4,3), y <- c(9,9:1), z <- c(2,1:9)))
## 6 5 2 1 7 4 10 8 3 9
rbind(x, y, z)[,ii] # shows the reordering (ties via 2nd & 3rd arg)
## Suppose we wanted descending order on y.
## A simple solution for numeric 'y' is
rbind(x, y, z)[, order(x, -y, z)]
## More generally we can make use of xtfrm
cy <- as.character(y)
rbind(x, y, z)[, order(x, -xtfrm(cy), z)]
## The radix sort supports multiple 'decreasing' values:
rbind(x, y, z)[, order(x, cy, z, decreasing = c(FALSE, TRUE, FALSE),
method="radix")]
## Sorting data frames:
dd <- transform(data.frame(x, y, z),
z = factor(z, labels = LETTERS[9:1]))
## Either as above {for factor 'z' : using internal coding}:
dd[ order(x, -y, z), ]
## or along 1st column, ties along 2nd, ... *arbitrary* no.{columns}:
dd[ do.call(order, dd), ]
set.seed(1) # reproducible example:
d4 <- data.frame(x = round( rnorm(100)), y = round(10*runif(100)),
z = round( 8*rnorm(100)), u = round(50*runif(100)))
(d4s <- d4[ do.call(order, d4), ])
(i <- which(diff(d4s[, 3]) == 0))
# in 2 places, needed 3 cols to break ties:
d4s[ rbind(i, i+1), ]
## rearrange matched vectors so that the first is in ascending order
x <- c(5:1, 6:8, 12:9)
y <- (x - 5)^2
o <- order(x)
rbind(x[o], y[o])
## tests of na.last
a <- c(4, 3, 2, NA, 1)
b <- c(4, NA, 2, 7, 1)
z <- cbind(a, b)
(o <- order(a, b)); z[o, ]
(o <- order(a, b, na.last = FALSE)); z[o, ]
(o <- order(a, b, na.last = NA)); z[o, ]
## speed examples on an average laptop for long vectors:
## factor/small-valued integers:
x <- factor(sample(letters, 1e7, replace = TRUE))
system.time(o <- sort.list(x, method = "quick", na.last = NA)) # 0.1 sec
stopifnot(!is.unsorted(x[o]))
system.time(o <- sort.list(x, method = "radix")) # 0.05 sec, 2X faster
stopifnot(!is.unsorted(x[o]))
## large-valued integers:
xx <- sample(1:200000, 1e7, replace = TRUE)
system.time(o <- sort.list(xx, method = "quick", na.last = NA)) # 0.3 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.2 sec
## character vectors:
xx <- sample(state.name, 1e6, replace = TRUE)
system.time(o <- sort.list(xx, method = "shell")) # 2 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.007 sec, 300X faster
## double vectors:
xx <- rnorm(1e6)
system.time(o <- sort.list(xx, method = "shell")) # 0.4 sec
system.time(o <- sort.list(xx, method = "quick", na.last = NA)) # 0.1 sec
system.time(o <- sort.list(xx, method = "radix")) # 0.05 sec, 2X faster
Run the code above in your browser using DataLab