gtools (version 3.9.5)

running: Apply a Function Over Adjacent Subsets of a Vector

Description

Applies a function over subsets of the vector(s) formed by taking a fixed number of previous points.

Usage

running(
  X,
  Y = NULL,
  fun = mean,
  width = min(length(X), 20),
  allow.fewer = FALSE,
  pad = FALSE,
  align = c("right", "center", "left"),
  simplify = TRUE,
  by,
  ...
)

Value

List (if simplify==FALSE), vector, or matrix containing the results of applying the function fun to the subsets of X, or of X and Y.

Note that this function will create a vector or matrix even for objects which are not simplified by sapply.

Arguments

X

data vector

Y

data vector (optional)

fun

Function to apply. Default is mean

width

Integer giving the number of vector elements to include in the subsets. Defaults to the lesser of the length of the data and 20 elements.

allow.fewer

Boolean indicating whether the function should be computed for subsets with fewer than width points

pad

Boolean indicating whether the returned results should be 'padded' with NAs corresponding to sets with fewer than width elements. This only applies when allow.fewer is FALSE.

align

One of "right", "center", or "left". This controls the relative location of 'short' subsets with fewer than width elements: "right" allows short subsets only at the beginning of the sequence so that all of the complete subsets are at the end of the sequence (i.e. 'right aligned'), "left" allows short subsets only at the end of the data so that the complete subsets are 'left aligned', and "center" allows short subsets at both ends of the data so that complete subsets are 'centered'.

simplify

Boolean. If FALSE, the returned object will be a list containing one element per evaluation. If TRUE, the returned object will be coerced into a vector (if the computation returns a scalar) or a matrix (if the computation returns multiple values). Defaults to TRUE.

by

Integer separation between the starting points of successive groups. Setting by=width gives non-overlapping windows. Default is missing, in which case groups will start at each value in the X/Y range.

...

parameters to be passed to fun
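
As a rough illustration of how width and align determine each subset, the following base-R sketch computes the first and last index of the window at every position. The helper names window_bounds and n_in_window are hypothetical and not part of gtools, and the symmetric treatment of "center" is an assumption based on the argument description rather than on the package source.

```r
# Hypothetical helper (not part of gtools): first and last index of the
# window attached to each position 1..n, truncated at the ends of the data.
window_bounds <- function(n, width, align = c("right", "center", "left")) {
  align <- match.arg(align)
  i <- seq_len(n)
  half <- (width - 1) %/% 2   # assumes an odd width for "center"
  switch(align,
    right  = data.frame(from = pmax(i - width + 1, 1), to = i),
    left   = data.frame(from = i, to = pmin(i + width - 1, n)),
    center = data.frame(from = pmax(i - half, 1), to = pmin(i + half, n))
  )
}

# Number of points that fall in each window
n_in_window <- function(b) b$to - b$from + 1

# Which positions have 'short' windows for n = 8, width = 3?
short_right  <- n_in_window(window_bounds(8, 3, "right"))  < 3
short_left   <- n_in_window(window_bounds(8, 3, "left"))   < 3
short_center <- n_in_window(window_bounds(8, 3, "center")) < 3

which(short_right)    # short windows at the start of the data
which(short_left)     # short windows at the end of the data
which(short_center)   # short windows at both ends
```

The positions flagged as short are the ones that allow.fewer and pad act on.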

Author

Gregory R. Warnes greg@warnes.net, with contributions by Nitin Jain nitin.jain@pfizer.com.

Details

running applies the specified function to sequential windows on X and (optionally) Y. If Y is specified, the function must be bivariate, i.e. accept the X subset and the matching Y subset as its first two arguments.
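
The idea can be sketched in base R for the simplest case: right-aligned windows, complete subsets only, no padding. The function running_sketch below is a hypothetical illustration and not the gtools implementation; it only shows how a univariate or bivariate function is applied to each window.

```r
# Hypothetical base-R sketch (not the gtools implementation): apply `fun`
# to every complete, right-aligned window of `width` consecutive points.
running_sketch <- function(x, y = NULL, fun = mean, width = 5, ...) {
  n <- length(x)
  stopifnot(width >= 1, width <= n)
  ends <- width:n                       # last index of each complete window
  vapply(ends, function(to) {
    idx <- (to - width + 1):to          # the `width` points ending at `to`
    if (is.null(y)) {
      fun(x[idx], ...)                  # univariate: fun sees the X subset
    } else {
      fun(x[idx], y[idx], ...)          # bivariate: fun sees both subsets
    }
  }, numeric(1))
}

# Running mean of 1:10 over windows 1:5, 2:6, ..., 6:10
res_mean <- running_sketch(1:10, width = 5)

# Running correlation between x and x^2 over the same windows
res_cor <- running_sketch(1:10, (1:10)^2, fun = cor, width = 5)
```

Because this sketch uses vapply with numeric(1), it only handles functions that return a single number, whereas running can also return a matrix or list.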

See Also

wapply to apply a function over an x-y window centered at each x point, sapply, lapply

Examples

# show effect of pad
running(1:20, width = 5)
running(1:20, width = 5, pad = TRUE)

# show effect of align
running(1:20, width = 5, align = "left", pad = TRUE)
running(1:20, width = 5, align = "center", pad = TRUE)
running(1:20, width = 5, align = "right", pad = TRUE)

# show effect of simplify
running(1:20, width = 5, fun = function(x) x) # matrix
running(1:20, width = 5, fun = function(x) x, simplify = FALSE) # list

# show effect of by
running(1:20, width = 5) # normal
running(1:20, width = 5, by = 5) # non-overlapping
running(1:20, width = 5, by = 2) # starting every 2nd


# Use 'pad' to ensure correct length of vector, also show the effect
# of allow.fewer.
par(mfrow = c(2, 1))
plot(1:20, running(1:20, width = 5, allow.fewer = FALSE, pad = TRUE), type = "b")
plot(1:20, running(1:20, width = 5, allow.fewer = TRUE, pad = TRUE), type = "b")
par(mfrow = c(1, 1))

# plot running mean and central 2 standard deviation range
# (mean +/- 1 sd) estimated by the *last* 51 observations
dat <- rnorm(500, sd = 1 + (1:500) / 500)
plot(dat)
sdfun <- function(x, sign = 1) mean(x) + sign * sqrt(var(x))
lines(running(dat, width = 51, pad = TRUE, fun = mean), col = "blue")
lines(running(dat, width = 51, pad = TRUE, fun = sdfun, sign = -1), col = "red")
lines(running(dat, width = 51, pad = TRUE, fun = sdfun, sign = 1), col = "red")


# plot running correlation estimated by last 20 observations (red)
# against the true local correlation (blue)
sd.Y <- seq(0, 1, length.out = 500)

X <- rnorm(500, sd = 1)
Y <- rnorm(500, sd = sd.Y)

plot(running(X, X + Y, width = 20, fun = cor, pad = TRUE), col = "red", type = "s")

r <- 1 / sqrt(1 + sd.Y^2) # true cor of (X,X+Y)
lines(r, type = "l", col = "blue")