openair (version 2.9-0)

import: Generic data import for openair

Description

This function simplifies importing csv and text files into openair. In particular, it helps to get the date or date/time into the correct format. The file can contain either a date or date/time in a single column, or a date in one column and a time in another.

Usage

import(
  file = file.choose(),
  file.type = "csv",
  sep = ",",
  header.at = 1,
  data.at = 2,
  date = "date",
  date.format = "%d/%m/%Y %H:%M",
  time = NULL,
  time.format = NULL,
  tzone = "GMT",
  na.strings = c("", "NA"),
  quote = "\"",
  ws = NULL,
  wd = NULL,
  correct.time = NULL,
  ...
)

Arguments

file

The name of the file to be imported. The default, file = file.choose(), opens a file browser. Because the file is read with read.table (in utils), this can also be a character string giving a file path, a connection or a URL.

file.type

The file format. Defaults to the common ‘csv’ (comma delimited) format, but also allows ‘txt’ (tab delimited).

sep

Allows the user to specify a delimiter if it is not ‘,’ (csv) or TAB (txt). For example, ‘;’ is sometimes used to separate columns.

header.at

The file row holding header information or NULL if no header to be used.

data.at

The file row to start reading data from. When generating the data frame, the function will ignore all information before this row, and attempt to include all data from this row onwards.

date

Name of the field containing the date. This can be a date e.g. 10/12/2012 or a date-time format e.g. 10/12/2012 01:00.

date.format

The format of the date. This is given in ‘R’ format according to strptime. For example, a date format such as 1/11/2000 12:00 (day/month/year hour:minutes) is given the format “%d/%m/%Y %H:%M”. See examples below and strptime for more details.
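As a small illustration of the format string quoted above, the following base R sketch parses the same example date with strptime. It does not call import itself; the variable names are only for illustration.

```r
# Parse a day/month/year hour:minute string using the format from the text.
x <- "1/11/2000 12:00"
parsed <- as.POSIXct(strptime(x, format = "%d/%m/%Y %H:%M", tz = "GMT"))

# Print in ISO-like form to confirm day and month were read in the right order.
print(format(parsed, "%Y-%m-%d %H:%M"))
```

If the format does not match the text, strptime returns NA rather than raising an error, so it is worth checking for missing dates after importing.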

time

The name of the column containing a time, if there is one. This is used when a time is given in a separate column and date contains no information about time.

time.format

If there is a column for time then the time format must be supplied. Common examples include “%H:%M” (like 07:00) or an integer giving the hour, in which case the format is “%H”. Again, see examples below.
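To show what a separate date column and an integer hour column amount to, here is a minimal base R sketch that combines the two and parses them with "%d/%m/%Y" plus "%H". It is only an illustration of the formats and is not necessarily how import handles the columns internally; the data frame and column names are hypothetical.

```r
# Hypothetical data: date in one column, integer hour (0 to 23) in another.
dat <- data.frame(
  date = c("10/12/2012", "10/12/2012"),
  hour = c(7, 13)
)

# Join the two fields and parse with the date format followed by "%H".
stamp <- paste(dat$date, dat$hour)
dat$datetime <- as.POSIXct(strptime(stamp, format = "%d/%m/%Y %H", tz = "GMT"))

print(format(dat$datetime, "%Y-%m-%d %H:%M"))
```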

tzone

The time zone for the data. In order to avoid the complexities of DST (daylight saving time), openair assumes the data are in GMT (UTC) or a constant offset from GMT. Users can set a positive or negative offset in hours from GMT. For example, to set the time zone of the data to the time zone in New York (EST, 5 hours behind GMT) set tzone = "Etc/GMT+5". To set the time zone of the data to Central European Time (CET, 1 hour ahead of GMT) set tzone = "Etc/GMT-1". Note that the signs of these offsets are the opposite of what most users expect: "+" means behind GMT and "-" means ahead of GMT.
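The sign convention of the "Etc/GMT" zones can be checked in base R. The sketch below assumes the system time zone database includes the Etc zones (it normally does) and uses an arbitrary example date.

```r
# Noon in "Etc/GMT+5" (5 hours BEHIND GMT, e.g. New York standard time).
ny <- as.POSIXct("2020-01-15 12:00", tz = "Etc/GMT+5")

# Noon in "Etc/GMT-1" (1 hour AHEAD of GMT, e.g. Central European Time).
cet <- as.POSIXct("2020-01-15 12:00", tz = "Etc/GMT-1")

# Express both instants in UTC to see the direction of the offset.
print(format(ny,  "%Y-%m-%d %H:%M", tz = "UTC"))
print(format(cet, "%Y-%m-%d %H:%M", tz = "UTC"))
```

Noon in "Etc/GMT+5" falls later in the day in UTC, and noon in "Etc/GMT-1" falls earlier.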

na.strings

Strings that are to be interpreted as missing (NA). For example, this might be “-999” or “n/a”, and several strings can be supplied.
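Since import reads the file with read.table, the effect of na.strings can be sketched with read.table directly. The inline data and the column name nox are made up for illustration.

```r
# Hypothetical file contents with two different missing-value codes.
txt <- paste(
  "date,nox",
  "01/01/2020 00:00,12",
  "01/01/2020 01:00,-999",
  "01/01/2020 02:00,n/a",
  sep = "\n"
)

# Treat "-999" and "n/a" as missing, in addition to the defaults.
dat <- read.table(
  text = txt, sep = ",", header = TRUE,
  na.strings = c("", "NA", "-999", "n/a")
)

print(dat$nox)
```

Without the extra na.strings entries, the "n/a" value would force the whole column to be read as character rather than numeric.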

quote

String of characters (or character equivalents) the imported file may use to represent a character field.

ws

Name of the wind speed field, if present and different from “ws”, e.g. ws = "WSPD".

wd

Name of the wind direction field, if present and different from “wd”, e.g. wd = "WDIR".

correct.time

Numerical correction (in seconds) for the imported date. The default, NULL, turns this option off. This can be useful if the hour is represented as 1 to 24 (rather than the 0 to 23 assumed by R), in which case correct.time = -3600 will correct the hour.
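A correction in seconds is simply an offset added to a date-time. The base R sketch below shows the arithmetic for a value of -3600; it illustrates the idea only and does not reproduce import's internal code.

```r
# A timestamp recorded as "hour 1" in a 1-to-24 hour convention.
stamp <- as.POSIXct("2012-12-10 01:00", tz = "GMT")

# Adding a number to a POSIXct value shifts it by that many seconds.
correction <- -3600
corrected <- stamp + correction

print(format(corrected, "%Y-%m-%d %H:%M"))
```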

...

Other arguments passed to read.table.

Value

A data frame formatted for openair use.

Details

The function uses strptime to parse dates and times. Users should consider the examples for use of these formats.

The function can either deal with combined date-time formats e.g. 10/12/1999 23:00 or with two separate columns that deal with date and time. Often there is a column for the date and another for hour. For the latter, the option time.format = "%H" should be supplied. Note that R considers hours 0 to 23. However, if hours 1 to 24 are detected import will correct the hours accordingly.

import will also ensure wind speed and wind direction are correctly labelled (i.e. "ws", "wd") if ws or wd are given.
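The relabelling described here is equivalent in effect to renaming the columns of the imported data frame. A minimal base R sketch, using the hypothetical field names "WSPD" and "WDIR" from the argument descriptions:

```r
# Hypothetical data frame with non-standard wind field names.
dat <- data.frame(WSPD = c(2.1, 3.4), WDIR = c(180, 270))

# Rename the fields to the "ws" and "wd" labels that openair functions expect.
names(dat)[names(dat) == "WSPD"] <- "ws"
names(dat)[names(dat) == "WDIR"] <- "wd"

print(names(dat))
```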

Note that it is assumed that the input data are in GMT (UTC) format and in particular there is no consideration of daylight saving time i.e. where in the input data set an hour is missing in spring and duplicated in autumn.

Examples of use are given in the openair manual.

See Also

Dedicated import functions are available for selected file types, e.g. importAURN, importAURNCsv, importKCL, importADMS, etc.