
SparkR (version 2.1.2)

window: window

Description

Bucketize rows into one or more time windows, given a column that specifies the timestamp. Window starts are inclusive but window ends are exclusive, e.g. 12:05 will be in the window [12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows on the order of months are not supported.
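
The half-open boundary rule can be illustrated in base R without a Spark session. This is only a sketch of the arithmetic, not SparkR's implementation: `tumbling_window` is a hypothetical helper, and it assumes non-overlapping windows aligned to the Unix epoch.

```r
# Hypothetical helper (not part of SparkR): find the fixed-width,
# non-overlapping window [start, end) that contains a timestamp.
# Windows are assumed to be aligned to 1970-01-01 00:00:00 UTC.
tumbling_window <- function(ts, width_secs) {
  secs  <- as.numeric(ts)                         # seconds since the epoch
  start <- floor(secs / width_secs) * width_secs  # inclusive lower bound
  list(
    start = as.POSIXct(start, origin = "1970-01-01", tz = "UTC"),
    end   = as.POSIXct(start + width_secs, origin = "1970-01-01", tz = "UTC")
  )
}

# 12:05 sits exactly on a boundary, so it opens a new five-minute window
# rather than closing the previous one.
ts <- as.POSIXct("2017-01-01 12:05:00", tz = "UTC")
w  <- tumbling_window(ts, 5 * 60)
format(w$start, "%H:%M")  # "12:05"
format(w$end, "%H:%M")    # "12:10"
```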

Usage

window(x, ...)

# S4 method for Column
window(x, windowDuration, slideDuration = NULL, startTime = NULL)

Arguments

x

a time Column. Must be of TimestampType.

...

further arguments to be passed to or from other methods.

windowDuration

a string specifying the width of the window, e.g. '1 second', '1 day 12 hours', '2 minutes'. Valid interval strings are 'week', 'day', 'hour', 'minute', 'second', 'millisecond', 'microsecond'. Note that the duration is a fixed length of time, and does not vary over time according to a calendar. For example, '1 day' always means 86,400,000 milliseconds, not a calendar day.

slideDuration

a string specifying the sliding interval of the window. Same format as windowDuration. A new window will be generated every slideDuration. Must be less than or equal to the windowDuration. This duration is likewise absolute, and does not vary according to a calendar.

startTime

the offset with respect to 1970-01-01 00:00:00 UTC with which to start window intervals. For example, to get hourly tumbling windows that start 15 minutes past the hour (12:15-13:15, 13:15-14:15, ...), provide startTime as "15 minutes".
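
How windowDuration, slideDuration and startTime interact can be sketched in base R. The helper below is hypothetical and only mimics the arithmetic described above, with all durations given in seconds. It is not how SparkR evaluates window, which delegates to Spark SQL. It returns the start of every window that contains a given timestamp.

```r
# Hypothetical helper (not part of SparkR): list the starts of all windows
# [start, start + width_secs) that contain ts, when a new window opens every
# slide_secs seconds, shifted by offset_secs from the Unix epoch.
sliding_window_starts <- function(ts, width_secs, slide_secs, offset_secs = 0) {
  secs <- as.numeric(ts)
  # Most recent window start at or before ts.
  last_start <- secs - ((secs - offset_secs) %% slide_secs)
  # At most ceiling(width / slide) windows can overlap a single instant.
  n <- ceiling(width_secs / slide_secs)
  starts <- last_start - (seq_len(n) - 1) * slide_secs
  # Keep only windows whose exclusive end lies after ts.
  starts <- sort(starts[secs < starts + width_secs])
  as.POSIXct(starts, origin = "1970-01-01", tz = "UTC")
}

# One-minute windows every 15 seconds, offset by 10 seconds:
# 09:01:00 falls into four overlapping windows.
ts <- as.POSIXct("2017-01-01 09:01:00", tz = "UTC")
starts <- sliding_window_starts(ts, 60, 15, 10)
format(starts, "%H:%M:%S")  # "09:00:10" "09:00:25" "09:00:40" "09:00:55"

# Hourly tumbling windows starting 15 minutes past the hour:
# 12:20 falls into the single window 12:15-13:15.
ts2 <- as.POSIXct("2017-01-01 12:20:00", tz = "UTC")
hourly <- sliding_window_starts(ts2, 3600, 3600, 15 * 60)
format(hourly, "%H:%M")     # "12:15"
```

When slide_secs equals width_secs the windows do not overlap, which corresponds to the tumbling case where slideDuration is left as NULL.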

Value

An output column of struct type, called 'window' by default, with the nested columns 'start' and 'end'.

See Also

Other datetime_funcs: add_months, date_add, date_format, date_sub, datediff, dayofmonth, dayofyear, from_unixtime, from_utc_timestamp, hour, last_day, minute, months_between, month, next_day, quarter, second, to_date, to_utc_timestamp, unix_timestamp, weekofyear, year

Examples

# NOT RUN {
  # One-minute windows every 15 seconds, starting 10 seconds after the minute,
  # e.g. 09:00:10-09:01:10, 09:00:25-09:01:25, 09:00:40-09:01:40, ...
  window(df$time, "1 minute", "15 seconds", "10 seconds")

  # One-minute tumbling windows starting 15 seconds after the minute,
  # e.g. 09:00:15-09:01:15, 09:01:15-09:02:15, ...
  window(df$time, "1 minute", startTime = "15 seconds")

  # Thirty-second windows every 10 seconds, e.g. 09:00:00-09:00:30, 09:00:10-09:00:40, ...
  window(df$time, "30 seconds", "10 seconds")
# }
