tsibble (version 0.6.2)

tidyverse: Tidyverse methods for tsibble

Description

  • arrange(): if the key and index are not arranged in past-to-future order, a warning is likely to be issued.

  • slice(): if row numbers are not in ascending order, a warning is likely to be issued.

  • select(): keeps the variables you mention as well as the index.

  • transmute(): keeps the variable you operate on, as well as the index and key.

  • summarise() will not collapse on the index variable.

  • Column-wise verbs, including select(), transmute(), summarise() and mutate(), keep the time context. That is, the index variable cannot be dropped from a tsibble. If any key variable is changed, the result is validated internally to check that it is still a tsibble. Use as_tibble() to drop the time context.

  • unnest() requires argument key = id() to get back to a tsibble.
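The index-retention rules above can be sketched with a small keyless tsibble. The data and the names `daily`, `kept`, `per_day` and `flat` are hypothetical and only serve the illustration; this is a minimal sketch, not an excerpt from the package.

```r
library(tsibble)
library(dplyr)

# Hypothetical toy series: one observation per day, no key.
daily <- tsibble(
  time = as.Date("2019-01-01") + 0:4,
  value = c(5, 3, 8, 1, 4),
  index = time
)

# Only `value` is requested, but the index column is retained.
kept <- select(daily, value)
names(kept)

# summarise() does not collapse the index: one row per time point.
per_day <- summarise(daily, total = sum(value))
nrow(per_day)

# Converting to a tibble first drops the time context,
# so the same call collapses to a single row (total = 21).
flat <- daily %>%
  as_tibble() %>%
  summarise(total = sum(value))
flat
```

Calling as_tibble() before the verb is the explicit way to opt out of the tsibble behaviour.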

Usage

# S3 method for tbl_ts
arrange(.data, ...)

# S3 method for grouped_ts
arrange(.data, ..., .by_group = FALSE)

# S3 method for tbl_ts
filter(.data, ...)

# S3 method for tbl_ts
slice(.data, ...)

# S3 method for tbl_ts
select(.data, ..., .drop = FALSE)

# S3 method for tbl_ts
rename(.data, ...)

# S3 method for tbl_ts
mutate(.data, ..., .drop = FALSE)

# S3 method for tbl_ts
transmute(.data, ..., .drop = FALSE)

# S3 method for tbl_ts
summarise(.data, ..., .drop = FALSE)

# S3 method for tbl_ts
summarize(.data, ..., .drop = FALSE)

# S3 method for tbl_ts
group_by(.data, ..., add = FALSE)

# S3 method for grouped_ts
ungroup(x, ...)

# S3 method for tbl_ts
left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

# S3 method for tbl_ts
right_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

# S3 method for tbl_ts
inner_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

# S3 method for tbl_ts
full_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

# S3 method for tbl_ts
semi_join(x, y, by = NULL, copy = FALSE, ...)

# S3 method for tbl_ts
anti_join(x, y, by = NULL, copy = FALSE, ...)

# S3 method for tbl_ts
gather(data, key = "key", value = "value", ..., na.rm = FALSE, convert = FALSE, factor_key = FALSE)

# S3 method for tbl_ts
spread(data, key, value, fill = NA, convert = FALSE, drop = TRUE, sep = NULL)

# S3 method for tbl_ts
nest(data, ..., .key = "data")

# S3 method for lst_ts
unnest(data, ..., key = id(), .drop = NA, .id = NULL, .sep = NULL, .preserve = NULL)

# S3 method for tbl_ts
unnest(data, ..., key = id(), .drop = NA, .id = NULL, .sep = NULL, .preserve = NULL)

# S3 method for grouped_ts
fill(data, ..., .direction = c("down", "up"))

Arguments

.data

A tbl_ts.

...

The same arguments as accepted by the corresponding dplyr generic.

.by_group

If TRUE, will sort first by grouping variable. Applies to grouped data frames only.

.drop

Deprecated; use as_tibble() instead of .drop = TRUE. FALSE returns a tsibble object, like the input. TRUE drops the tsibble class and returns a tibble.

add

When add = FALSE, the default, group_by() will override existing groups. To add to the existing groups, use add = TRUE.

y

tbls to join

by

a character vector of variables to join by. If NULL, the default, *_join() will do a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they're right (to suppress the message, simply explicitly list the variables that you want to join).

To join by different variables on x and y use a named vector. For example, by = c("a" = "b") will match x.a to y.b.
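As a minimal sketch of the named-vector form of `by`, the following joins a keyless tsibble to a plain tibble whose date column has a different name. The tables `sales` and `holidays` and their columns are hypothetical, made up for this illustration.

```r
library(tsibble)
library(dplyr)

# Hypothetical daily series with the index column named `day`.
sales <- tsibble(
  day = as.Date("2019-01-01") + 0:2,
  units = c(10, 12, 9),
  index = day
)

# Hypothetical lookup table whose date column is named `date`.
holidays <- tibble(
  date = as.Date("2019-01-01"),
  holiday = "New Year"
)

# Match sales$day to holidays$date; unmatched days get NA in `holiday`.
joined <- left_join(sales, holidays, by = c("day" = "date"))
joined
```

Listing the join variables explicitly also suppresses the message that a natural join would print.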

copy

If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.

suffix

If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

data

A data frame.

key

Unquoted variables to create the key (via id) after unnesting.

value

Names of new key and value columns, as strings or symbols.

This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with rlang::ensym() (note that this kind of interface where symbols do not represent actual objects is now discouraged in the tidyverse; we support it here for backward compatibility).

na.rm

If TRUE, will remove rows from output where the value column is NA.

convert

If TRUE, will automatically run type.convert() on the key column. This is useful if the column types are actually numeric, integer, or logical.

factor_key

If FALSE, the default, the key values will be stored as a character vector. If TRUE, will be stored as a factor, which preserves the original ordering of the columns.

fill

If set, missing values will be replaced with this value. Note that there are two types of missingness in the input: explicit missing values (i.e. NA), and implicit missings, rows that simply aren't present. Both types of missing value will be replaced by fill.

drop

If FALSE, will keep factor levels that don't appear in the data, filling in missing combinations with fill.

sep

If NULL, the column names will be taken from the values of key variable. If non-NULL, the column names will be given by "<key_name><sep><key_value>".

.key

The name of the new column, as a string or symbol.

This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with rlang::ensym() (note that this kind of interface where symbols do not represent actual objects is now discouraged in the tidyverse; we support it here for backward compatibility).

.id

Data frame identifier - if supplied, will create a new column with name .id, giving a unique identifier. This is most useful if the list column is named.

.sep

If non-NULL, the names of unnested data frame columns will combine the name of the original list-col with the names from nested data frame, separated by .sep.

.preserve

Optionally, list-columns to preserve in the output. These will be duplicated in the same way as atomic vectors. This has dplyr::select semantics so you can preserve multiple variables with .preserve = c(x, y) or .preserve = starts_with("list").

.direction

Direction in which to fill missing values. Currently either "down" (the default) or "up".
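The two directions can be compared on a small grouped tsibble. This is a sketch under assumptions: the `readings` data and the `sensor` grouping column are hypothetical, and the series is deliberately keyless so that the grouping comes only from group_by().

```r
library(tsibble)
library(dplyr)
library(tidyr)

# Hypothetical readings: two sensors observed on consecutive days,
# each with gaps recorded as NA.
readings <- tsibble(
  time = as.Date("2019-01-01") + 0:5,
  sensor = rep(c("a", "b"), each = 3),
  value = c(1, NA, 3, NA, 5, NA),
  index = time
)

# "down" carries the last observed value forward within each group,
# so the leading NA for sensor "b" has nothing to inherit and stays NA.
filled_down <- readings %>%
  group_by(sensor) %>%
  fill(value, .direction = "down")

# "up" carries the next observed value backward within each group,
# so the trailing NA for sensor "b" stays NA.
filled_up <- readings %>%
  group_by(sensor) %>%
  fill(value, .direction = "up")

filled_down$value
filled_up$value
```

Because filling happens within groups, values never leak from one sensor to the other.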

Examples

# NOT RUN {
# Sum over sensors ----
pedestrian %>%
  summarise(Total = sum(Count))
# Back to tibble
pedestrian %>%
  as_tibble() %>%
  summarise(Total = sum(Count))
# example from tidyr
stocks <- tsibble(
  time = as.Date('2009-01-01') + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4)
)
stocks %>% gather(stock, price, -time)
stocksm <- stocks %>% gather(stock, price, -time)
stocksm %>% spread(stock, price)
nested_stock <- stocksm %>% 
  nest(-stock)
stocksm %>% 
  group_by(stock) %>% 
  nest()
nested_stock %>% 
  unnest(key = id(stock))
stock_qtl <- stocksm %>% 
  group_by(stock) %>% 
  index_by(day3 = lubridate::floor_date(time, unit = "3 day")) %>% 
  summarise(
    value = list(quantile(price)), 
    qtl = list(c("0%", "25%", "50%", "75%", "100%"))
  )
unnest(stock_qtl, key = id(qtl))
# }