openair (version 0.3-8)

timePlot: Plot time series

Description

Plot time series quickly, perhaps for multiple pollutants, grouped or in separate panels.

Usage

timePlot(mydata,
pollutant = "nox",
group = FALSE,
stack = FALSE,
normalise = FALSE,
avg.time = "default",
data.thresh = 0,
statistic = "mean",
percentile = 95,
date.pad = FALSE,
type = "default",
layout = c(1, 1),
cols = "brewer1",
main = "",
ylab = pollutant,
lty = 1:length(pollutant),
lwd = 1,
key = TRUE,
strip = TRUE,
log = FALSE,
smooth = FALSE,
ci = TRUE,
key.columns = 1,
name.pol = pollutant,
date.breaks = 7,
auto.text = TRUE, ...)

Arguments

mydata
A data frame of hourly (or higher temporal resolution) data. Must include a date field and at least one variable to plot.
pollutant
Name of variable to plot. Two or more pollutants can be plotted, in which case a form like pollutant = c("nox", "co") should be used.
group
If more than one pollutant is chosen, should they all be plotted on the same graph together? The default is FALSE, which means they are plotted in separate panels with their own scales. If TRUE then they are plotte
stack
If TRUE the time series will be stacked by year. This option can be useful if there are several years worth of data making it difficult to see much detail when plotted on a single plot.
normalise
Should variables be normalised? The default is FALSE. If TRUE then the variable(s) are divided by their mean values. This helps to compare the shape of the diurnal trends for variables on very different scales.
avg.time
This defines the time period to average to. Can be "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". For much increased flexibility a number can precede these options followed by a space. For example, a timeAverage of 2
data.thresh
The data capture threshold to use (%) when aggregating the data using avg.time. A value of zero means that all available data will be used in a particular period regardless of the number of values available. Conversely, a value of
statistic
The statistic to apply when aggregating the data; default is the mean. Can be one of "mean", "max", "min", "median", "frequency", "sd", "percentile". Note that "sd" is the standard deviation and "frequency" is the number (frequency) of valid r
percentile
The percentile level in % used when statistic = "percentile" and when aggregating the data with avg.time. The default is 95. Not used if avg.time = "default".
date.pad
Should missing data be padded-out? This is useful where a data frame consists of two or more "chunks" of data with time gaps between them. By setting date.pad = TRUE the time gaps between the chunks are shown properly, rather than wi
type
Currently only "default" or "site" are available options. If type = "site" then plots for a single pollutant across several sites will be plotted.
layout
Determines how the panels are laid out. By default, plots will be shown in one column with the number of rows equal to the number of pollutants, for example. If the user requires 2 columns and two rows, layout should be set to layout
cols
Colours to be used for plotting. Options include "default", "increment", "heat", "spectral", "hue", "brewer1" (default) and user defined (see manual for more details). The same line colour can be set for all pollutant e.g. cols = "bla
main
The plot title; default is no title.
ylab
Name of y-axis variable. By default will use the name of pollutant(s).
lty
The line type used for plotting. Default is to provide different line types for different pollutants. If one requires a continuous line for all pollutants, then set lty = 1, for example. See lty option for standard p
lwd
The line width used; default is 1. To set a wider line for all pollutants, choose, for example, lwd = 2. Alternatively, varying line widths can be chosen depending on the pollutant. For example, if pollutant = c("nox",
key
Should a key be drawn? The default is TRUE.
strip
Should a strip be drawn? The default is TRUE.
log
Should the y-axis appear on a log scale? The default is FALSE. If TRUE a well-formatted log10 scale is used. This can be useful for plotting data for several different pollutants that exist on very different scales. I
smooth
Should a smooth line be applied to the data? The default is FALSE.
ci
If a smooth fit line is applied, then ci determines whether the 95% confidence intervals are shown.
key.columns
Number of columns to be used in the key. With many pollutants a single column can make the key too wide. The user can thus choose to use several columns by setting key.columns to be less than the number of pollutants.
name.pol
This option can be used to give alternative names for the variables plotted. Instead of taking the column headings as names, the user can supply replacements. For example, if a column had the name "nox" and the user wanted a different description
date.breaks
Number of major x-axis intervals to use. The function will try and choose a sensible number of dates/times as well as formatting the date/time appropriately to the range being considered. This does not always work as desired automatically.
auto.text
Either TRUE (default) or FALSE. If TRUE titles and axis labels will automatically try and format pollutant names and units properly e.g. by subscripting the `2' in NO2.
...
Other graphical parameters.
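
The interaction between avg.time and data.thresh can be illustrated with a small base R sketch. It is not openair's implementation: the data are invented, and the helper mean_with_capture is hypothetical. It assumes that a period is kept when its data capture is at least the threshold, as the "at least 75%" example comment in the Examples section suggests.

```r
# Hypothetical hourly data: two days, the second with only 6 valid hours.
dates <- seq(as.POSIXct("2003-08-06 00:00", tz = "GMT"),
             by = "hour", length.out = 48)
o3 <- c(rep(40, 24), rep(60, 6), rep(NA, 18))
hourly <- data.frame(date = dates, o3 = o3)

# Hypothetical helper (not part of openair): return the mean of x if the
# percentage of non-missing values reaches `thresh`, otherwise NA.
mean_with_capture <- function(x, thresh = 0) {
  capture <- 100 * sum(!is.na(x)) / length(x)
  if (capture < thresh || all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

# Aggregate to daily values with two different thresholds.
day <- as.Date(hourly$date, tz = "GMT")
daily_all <- tapply(hourly$o3, day, mean_with_capture, thresh = 0)
daily_75  <- tapply(hourly$o3, day, mean_with_capture, thresh = 75)
```

With a threshold of zero both days yield a mean; with a threshold of 75 the second day, which has only 25% data capture, becomes missing.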

Details

timePlot is the basic time series plotting function in openair. Its purpose is to make it quick and easy to plot time series for pollutants and other variables, and to plot potentially many variables together in as compact a way as possible.

The function can plot more than one variable at once. If more than one variable is chosen, the plot can either show all variables in the same panel on the same scale, with different line types (group = TRUE), or show each variable in its own panel with its own scale (group = FALSE). The general preference is not to plot two variables on the same graph with two different y-scales: it can be misleading to do so and is difficult with more than two variables. If there is an interest in plotting several variables together that have very different scales, it can be useful to normalise the data first, which can be done by setting normalise = TRUE. This option divides each variable by its mean and makes it easy to plot two or more variables on the same plot, generally with group = TRUE.

The user has fine control over the choice of colours, line widths and line types. This is useful, for example, to emphasise a particular variable with a specific line type, colour or width.

timePlot works very well with selectByDate, which is used for selecting particular date ranges quickly and easily. See examples below.

By default plots are shown with a colour key at the bottom and, in the case of multiple pollutants or sites, strips on the left of each plot. Sometimes this may be overkill and the user can opt to remove the key and/or the strip by setting key and/or strip to FALSE. One reason to do this is to maximise the plotting area and therefore the information shown.
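
The effect of normalise = TRUE, as described above, can be sketched in base R. This is only an illustration of dividing each variable by its mean, using invented values; it does not reproduce the internals of timePlot.

```r
# Hypothetical concentrations on very different scales.
conc <- data.frame(nox = c(100, 200, 300), no2 = c(10, 20, 30))

# Divide each column by its own mean, ignoring missing values.
normalised <- as.data.frame(
  lapply(conc, function(x) x / mean(x, na.rm = TRUE))
)

# Each normalised column now has a mean of 1, so both series share a scale.
colMeans(normalised)
```

After this step the two series vary around 1 and can be compared by shape in a single panel.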

See Also

MannKendall, smoothTrend, linearRelation, selectByDate and timeAverage for details on selecting averaging times and other statistics in a flexible way

Examples

# basic use, single pollutant
timePlot(mydata, pollutant = "nox")

# two pollutants in separate panels
timePlot(mydata, pollutant = c("nox", "no2"))

# two pollutants in the same panel with the same scale
timePlot(mydata, pollutant = c("nox", "no2"), group = TRUE)

# alternative: normalise concentrations and plot on the same scale
timePlot(mydata, pollutant = c("nox", "no2"), group = TRUE, normalise = TRUE)

# examples of selecting by date

# plot for nox in 1999
timePlot(selectByDate(mydata, year = 1999), pollutant = "nox")

# select specific date range for two pollutants
timePlot(selectByDate(mydata, start = "6/8/2003", end = "13/8/2003"),
pollutant = c("no2", "o3"))

# use the same (solid) line type for both pollutants
timePlot(mydata, pollutant = c("nox", "no2"), lty = 1)

# choose different line widths and a single colour
timePlot(selectByDate(mydata, year = 2004, month = 6),
pollutant = c("nox", "no2"), lwd = c(1, 2), cols = "black")

# different averaging times

# daily mean O3
timePlot(mydata, pollutant = "o3", avg.time = "day")

# daily mean O3 ensuring each day has data capture of at least 75%
timePlot(mydata, pollutant = "o3", avg.time = "day", data.thresh = 75)

# 2-week average of O3 concentrations
timePlot(mydata, pollutant = "o3", avg.time = "2 week")
