openair (version 1.5)

timeAverage: Function to calculate time averages for data frames

Description

Function to flexibly aggregate or expand data frames by different time periods, calculating vector-averaged wind direction where appropriate. The averaged periods can also take account of data capture rates.

Usage

timeAverage(mydata, avg.time = "day", data.thresh = 0, statistic = "mean",
  percentile = NA, start.date = NA, end.date = NA, interval = NA,
  vector.ws = FALSE, fill = FALSE, ...)

Arguments

mydata
A data frame containing a date field. The date field can be of class POSIXct or Date.
avg.time
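This defines the time period to average to. Can be "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". The unit can be preceded by a multiplier, as in the "2 week" and "15 min" examples below.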
data.thresh
The data capture threshold to use (%). A value of zero means that all available data will be used in a particular period, regardless of the number of values available. Conversely, a value of 100 will mean that all data will need to be present for th
statistic
The statistic to apply when aggregating the data; the default is the mean. Can be one of "mean", "max", "min", "median", "frequency", "sd" or "percentile".
percentile
The percentile level in % used when statistic = "percentile". The default is 95.
start.date
A string giving a start date to use. This is sometimes useful if a time series starts between obvious intervals. For example, for a 1-minute time series that starts 2009-11-29 12:07:00 that needs to be averaged up to 15-minute means,

Value

  • Returns a data frame with date in class POSIXct. Any non-numeric columns are removed, except a column named "site".

Details

This function calculates time averages for a data frame. It also treats wind direction correctly through vector-averaging. For example, the average of 350 degrees and 10 degrees is either 0 or 360, not 180. The calculations therefore average the wind components.

When a data capture threshold is set through data.thresh, timeAverage needs to know the original time interval of the input time series. The function will try to calculate this interval from the most common time gap (and will print the assumed time gap to the screen). This works most of the time, but it can fail, e.g. when very few data exist in a data frame or when the data are monthly (i.e. the time interval between months is not regular). In this case the user can explicitly specify the interval through interval, in the same format as avg.time, e.g. interval = "month".

It may also be useful to set start.date and end.date if the time series does not span the entire period of interest. For example, if a time series ended in October and annual means are required, setting end.date to the end of the year will ensure that the whole period is covered and that data.thresh is correctly calculated. The same goes for a time series that starts later in the year, where start.date should be set to the beginning of the year.

timeAverage should be useful in many circumstances where it is necessary to work with data that have different averaging times, for example hourly air pollution data and 15-minute meteorological data. To merge the two data sets, timeAverage can first be used to convert the meteorological data to 1-hour means. Alternatively, timeAverage can be used to expand the hourly data to 15-minute data (see the example below).

For the research community, timeAverage should be useful for dealing with outputs from instruments that use a range of time periods. It is also very useful for plotting data using timePlot. Often the data are too dense to see patterns, and setting different averaging periods helps with interpretation.
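
The vector-averaging of wind direction described above can be sketched in base R. This is a minimal illustration, not the package's own code: the helper name vector_mean_wd is hypothetical, and it assumes every observation has equal weight (unit vectors), whereas timeAverage itself may also use wind speed when averaging the components.

```r
# Hypothetical helper: vector-average wind directions given in degrees.
# Assumption: each observation is a unit vector (no wind speed weighting).
vector_mean_wd <- function(wd) {
  rad <- wd * pi / 180
  u <- mean(sin(rad), na.rm = TRUE)  # mean east-west component
  v <- mean(cos(rad), na.rm = TRUE)  # mean north-south component
  # Convert the mean components back to a direction in [0, 360)
  (atan2(u, v) * 180 / pi) %% 360
}

wd <- c(350, 10)
mean(wd)            # arithmetic mean: 180, which points the wrong way
vector_mean_wd(wd)  # effectively 0 or 360 (north), up to rounding error
```

Because the two east-west components cancel only to within floating-point error, the result can come out fractionally above 0 or fractionally below 360; both represent the same northerly direction.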

See Also

See timePlot that plots time series data and uses timeAverage to aggregate data where necessary.

Examples

## daily average values
daily <- timeAverage(mydata, avg.time = "day")

## daily average values ensuring at least 75% data capture
## i.e. at least 18 valid hours
daily <- timeAverage(mydata, avg.time = "day", data.thresh = 75)

## 2-weekly averages
fortnight <- timeAverage(mydata, avg.time = "2 week")

## make a 15-minute time series from an hourly one
min15 <- timeAverage(mydata, avg.time = "15 min", fill = TRUE)
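
The effect of the data.thresh = 75 example above can be illustrated with a small base R sketch. It does not call timeAverage. The data, the variable names (dates, nox) and the helper mean_with_capture are all hypothetical, and it assumes that a period is kept when its data capture is at least equal to the threshold and is otherwise set to NA.

```r
# Hypothetical hourly series covering two days.
# Day 1 has 24 valid hours; day 2 has only 12 valid hours (50% capture).
dates <- seq(as.POSIXct("2020-01-01 00:00", tz = "GMT"),
             by = "hour", length.out = 48)
nox <- c(rep(10, 24), rep(20, 12), rep(NA, 12))

# Hypothetical helper: mean that is NA unless at least `thresh` percent
# of the values in the period are present.
mean_with_capture <- function(x, thresh = 75) {
  capture <- 100 * sum(!is.na(x)) / length(x)
  if (capture >= thresh) mean(x, na.rm = TRUE) else NA_real_
}

# Group the hourly values by calendar day and apply the thresholded mean
day <- as.Date(dates, tz = "GMT")
daily <- tapply(nox, day, mean_with_capture, thresh = 75)
print(daily)
```

With these made-up values the first day keeps its mean because all 24 hours are present, while the second day falls below the 18 valid hours that a 75% threshold requires and becomes NA.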
