chillR (version 0.75)

make_all_day_table: Fill in missing days in incomplete time series

Description

Time series often have gaps, and these are often not marked by 'no data' values; the records are simply missing from the dataset. This function completes the time series by adding rows for all missing records, with all values in these rows set to 'NA'. With timestep = "hour", the function can also process hourly data. Where data are provided at a finer time resolution than timestep, values are aggregated to timestep resolution by calculating the mean (e.g. data at 15-minute resolution are aggregated to hourly averages with timestep = "hour", or to daily averages with timestep = "day").
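
The gap-filling idea can be illustrated in base R. The following is a minimal sketch, not chillR's implementation: it builds a complete daily date sequence and left-joins the observed records onto it, so that days absent from the input appear with NA values. The data frame `obs` and its values are made up for illustration.

```r
# Hypothetical daily record in which 3 and 4 January are missing entirely
obs <- data.frame(
  Date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-05")),
  Tmin = c(1.5, 0.8, -2.1),
  Tmax = c(6.0, 5.2, 3.3)
)

# Every day between the first and last record
all_days <- data.frame(Date = seq(min(obs$Date), max(obs$Date), by = "day"))

# Left join: days without a record get NA in all data columns
filled <- merge(all_days, obs, by = "Date", all.x = TRUE)
```

Here `filled` has one row per day from 1 to 5 January, with NA in `Tmin` and `Tmax` for the two days that were not in `obs`.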

Usage

make_all_day_table(
  tab,
  timestep = "day",
  input_timestep = timestep,
  tz = "GMT",
  add.DATE = TRUE,
  no_variable_check = FALSE,
  aggregation_hours = NULL
)

Value

data frame containing all the columns of the input data frame, but with one row for each time step (day or hour, depending on timestep) between the start and end of the dataset. Data values for the missing rows are filled in as 'NA'. Dates are expressed as c("YEARMODA","DATE","Year","Month","Day"). In this, 'DATE' is the date in ISOdate format.
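
For orientation, date columns of this kind can be built in base R. This sketch assumes that YEARMODA is the number formed as Year * 10000 + Month * 100 + Day, which matches its name but is an assumption here. `ISOdate()` is the base R function that returns a POSIXct date-time, set to 12:00 GMT by default. The example dates are arbitrary.

```r
Year  <- c(2020, 2020)
Month <- c(1, 12)
Day   <- c(5, 31)

dates <- data.frame(
  YEARMODA = Year * 10000 + Month * 100 + Day,  # assumed encoding, e.g. 20200105
  DATE     = ISOdate(Year, Month, Day),         # POSIXct at 12:00 GMT
  Year     = Year,
  Month    = Month,
  Day      = Day
)
```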

Arguments

tab

a data.frame containing a time series dataset. It should have columns c("Year", "Month", "Day") or c("YEAR", "MONTH","DAY") or "YEARMODA".

timestep

time step for the output table. This defaults to 'day' but can also be 'hour'.

input_timestep

time step of the input data. This can be 'day' or 'hour' and defaults to the value assigned to timestep. If timestep is 'day' and input_timestep is 'hour', hourly records are aggregated to daily Tmin, Tmean and Tmax.
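
The hourly-to-daily aggregation can be pictured with base R's `aggregate()`. This is a sketch under simple assumptions (a complete hourly record in a column named `Temp`, with made-up values); it does not reproduce how chillR treats days with missing hours.

```r
# Hypothetical hourly record for two complete days
hourly <- data.frame(
  Year  = 2020,
  Month = 1,
  Day   = rep(1:2, each = 24),
  Hour  = rep(0:23, times = 2),
  Temp  = c(seq(0, 11.5, by = 0.5), seq(2, 13.5, by = 0.5))
)

# One row per day with the daily minimum, mean and maximum
daily <- aggregate(
  Temp ~ Year + Month + Day,
  data = hourly,
  FUN  = function(x) c(Tmin = min(x), Tmean = mean(x), Tmax = max(x))
)

# aggregate() stores the three statistics in a matrix column; flatten it
daily <- cbind(daily[c("Year", "Month", "Day")], as.data.frame(daily$Temp))
```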

tz

time zone. Defaults to 'GMT'. While it isn't important in what time zone the temperatures were recorded, the onset of daylight saving time can cause problems. 'GMT' is the correct setting in cases where the recorded times weren't adjusted according to daylight saving time (i.e. no hours omitted or double-counted because of such adjustment).
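
The daylight saving problem can be seen with base R date-time functions. In this sketch the same 24 consecutive hours are printed once in GMT and once in a zone that observes daylight saving time. "Europe/Berlin" and the date are arbitrary choices for illustration, and the example assumes the system's time zone database includes that zone.

```r
# 24 consecutive hours starting at midnight GMT on 28 March 2021,
# the day daylight saving time began in the EU
hrs <- seq(ISOdatetime(2021, 3, 28, 0, 0, 0, tz = "GMT"),
           by = "hour", length.out = 24)

# Clock hours in GMT: every hour from "00" to "23" occurs exactly once
gmt_hours <- format(hrs, "%H", tz = "GMT")

# Clock hours in local time: the clock jumps from 01:59 to 03:00,
# so the hour "02" never occurs on this day
local_hours <- format(hrs, "%H", tz = "Europe/Berlin")
```

An hourly table built on local clock time would therefore contain an hour that cannot be matched to any record, which is the kind of problem the 'GMT' setting avoids.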

add.DATE

boolean parameter indicating whether a column called DATE, which contains the ISOdate, should be added to the output data.frame.

no_variable_check

boolean parameter indicating whether the function should skip the check for the usual chillR temperature variables in the dataset. Defaults to FALSE (the check is performed), but should be set to TRUE for different data formats.

aggregation_hours

vector or list of three integers that specify how the function should search for daily minimum and maximum temperatures in hourly datasets when not all hourly temperatures have been observed. This is only relevant during conversion from hourly to daily data. Tmin and Tmax can only be derived when temperatures have been recorded during the coldest and warmest parts of the day, respectively, so the function checks whether records are available for these times. The elements of `aggregation_hours` describe window sizes (in hours) for the times during which the coldest and warmest temperatures typically occur. The first two elements (which can be named `min_hours` and `max_hours`) specify the number of hours contained in the windows for the cold and warm parts of the day, respectively. The hours that make up each window are determined by computing mean hourly temperatures over the entire weather record, separately for each month to account for the impact of daylength. The third element, `hours_needed`, specifies how many records must be available within these windows. `aggregation_hours` defaults to NULL, in which case the parameter is ignored.
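
The window logic described above can be sketched in base R. This is an illustration under stated assumptions, not chillR's code: the data are synthetic, the windows are computed over the whole record rather than per month, and the helper `window_stat()` is hypothetical. How chillR computes the extreme value once the check passes is not shown in the source; here it is simply taken over all records available for the day.

```r
# Synthetic hourly data: 10 days of a smooth daily cycle,
# coldest at 03:00 and warmest at 15:00
hourly <- expand.grid(Hour = 0:23, Day = 1:10)
hourly$Temp <- 10 - 5 * cos(2 * pi * (hourly$Hour - 3) / 24)

min_hours <- 3; max_hours <- 3; hours_needed <- 2

# Mean temperature for each hour of the day, then hours ordered cold to warm
hour_means  <- tapply(hourly$Temp, hourly$Hour, mean)
hours_sorted <- as.integer(names(hour_means))[order(hour_means)]
cold_window <- head(hours_sorted, min_hours)  # the 3 coldest hours
warm_window <- tail(hours_sorted, max_hours)  # the 3 warmest hours

# One day on which the records for 02:00 and 03:00 are missing
day <- hourly[hourly$Day == 1, ]
day$Temp[day$Hour %in% c(2, 3)] <- NA

# Hypothetical helper: return the statistic only if enough records
# fall inside the window, otherwise NA
window_stat <- function(day, window, fun) {
  n_obs <- sum(!is.na(day$Temp[day$Hour %in% window]))
  if (n_obs >= hours_needed) fun(day$Temp, na.rm = TRUE) else NA_real_
}

Tmin <- window_stat(day, cold_window, min)  # only 1 record in the cold window
Tmax <- window_stat(day, warm_window, max)  # warm window fully observed
```

With these settings, `Tmin` is NA because only one of the three cold-window hours was recorded, while `Tmax` is available.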

Author

Eike Luedeling

References

Luedeling E, Kunz A and Blanke M, 2013. Identification of chilling and heat requirements of cherry trees - a statistical approach. International Journal of Biometeorology 57, 679-689.

Examples

library(chillR)

# fill in missing lines in a weather dataset (modified from KA_weather)
day_to_day <- make_all_day_table(KA_weather[c(1:10, 20:30), ], timestep = "day")

# fill in missing hours in the Winters_hours_gaps dataset
Winters_hours <- subset(Winters_hours_gaps, select = -c(Temp_gaps))[1:2000, ]
hour_to_hour <- make_all_day_table(Winters_hours, timestep = "hour",
                                   input_timestep = "hour")

# convert Winters_hours_gaps dataset into daily temperature data (min, max, mean)
hour_to_day <- make_all_day_table(Winters_hours, timestep = "day",
                                  input_timestep = "hour")

# same conversion, with aggregation_hours set so that records are required
# within the coldest and warmest parts of the day
hour_to_day <- make_all_day_table(Winters_hours, timestep = "day",
                                  input_timestep = "hour",
                                  aggregation_hours = c(3, 3, 2))