countyweather (version 0.1.0)

daily_df: Return average daily weather data for a particular county.

Description

Returns a list with data on weather and stations for a selected county. This function serves as a wrapper to several functions from the rnoaa package, which pull weather data from all relevant stations in a county. This function filters and averages data returned by rnoaa functions across all weather stations in a county based on user-specified coverage specifications.

Usage

daily_df(stations, coverage = NULL, var = "all", date_min = NULL, date_max = NULL, average_data = TRUE)

Arguments

stations
A dataframe containing station metadata, returned from the function daily_stations.
coverage
A numeric value between 0 and 1 that specifies the desired coverage for each weather variable, expressed as a proportion (i.e., what fraction of each weather variable must be non-missing for a monitor's data to be included when calculating daily values averaged across monitors). The default is to include all monitors with any available data (i.e., coverage = 0).
var
A character vector specifying desired weather variables. For example, var = c("tmin", "tmax", "prcp") for minimum temperature, maximum temperature, and precipitation. The default is "all", which includes all available weather variables at any weather station in the county. For a full list of all possible variable names, see NOAA's README file for the Daily Global Historical Climatology Network (GHCN-Daily) at http://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt. Many of the weather variables are available for some, but not all, monitors, so the output from this function may not include every variable specified using this argument. If you specify a variable here and it is missing from the output dataset, it was not available in the time range for any monitor in the county.
date_min
A character string giving the desired starting date in ISO format ("yyyy-mm-dd"). The dataframe returned will include only stations that have data for dates on or after the specified date.
date_max
A character string giving the desired ending date in ISO format ("yyyy-mm-dd"). The dataframe returned will include only stations that have data for dates up to and including the specified date.
average_data
TRUE / FALSE to indicate if you want the function to average daily weather data across multiple monitors. If you choose FALSE, the function will return a dataframe with separate entries for each monitor, while TRUE (the default) outputs a single estimate for each day in the dataset, giving the average value of the weather metric across all available monitors in the county that day.
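
The interaction between coverage and average_data can be illustrated with a small base R sketch. This is not the package's implementation: the toy data frame, the monitor ids, the single tmax variable, and the use of >= for the threshold comparison are all assumptions made for illustration.

```r
# Toy data: one row per monitor and day (all values are invented).
toy <- data.frame(
  id   = rep(c("A", "B", "C"), each = 4),
  date = rep(as.Date("2010-01-01") + 0:3, times = 3),
  tmax = c(20, 21, 22, 23,   # monitor A: 100% non-missing
           19, NA, 21, 22,   # monitor B: 75% non-missing
           NA, NA, NA, 18)   # monitor C: 25% non-missing
)

coverage <- 0.70  # assumed threshold, as a proportion between 0 and 1

# Fraction of non-missing tmax values for each monitor.
calc_coverage <- tapply(!is.na(toy$tmax), toy$id, mean)

# Keep only monitors that meet the threshold (>= is an assumption).
keep <- names(calc_coverage)[calc_coverage >= coverage]
filtered <- toy[toy$id %in% keep, ]

# Average across the remaining monitors for each day, and count how
# many monitors actually reported a value on that day.
daily_mean <- tapply(filtered$tmax, filtered$date, mean, na.rm = TRUE)
reporting  <- tapply(!is.na(filtered$tmax), filtered$date, sum)

averaged <- data.frame(
  date           = as.Date(names(daily_mean)),
  tmax           = as.vector(daily_mean),
  tmax_reporting = as.vector(reporting)
)
averaged
```

In this sketch monitor C is dropped before averaging, and the tmax_reporting column plays the role of the "var"_reporting count described under Value.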

Value

A list with two elements. The element daily_data is a dataframe of daily weather data averaged across multiple monitors. It includes a column ("var"_reporting) for each weather variable giving the number of stations contributing to the average for that variable on that day. The element station_df is a dataframe of station metadata for each station contributing weather data. A weather station will have one row per weather variable to which it contributes data. In addition to information such as station id, name, latitude, and longitude, the station_df dataframe includes summary statistics about the weather values contributed by each station for each weather variable. These statistics are calc_coverage (the percent of non-missing values for each station-weather variable combination over the specified date range), standard_dev (the standard deviation), max and min (the maximum and minimum values), and range (the range of values in each station-weather variable combination). The element radius is the calculated radius within which stations were pulled from the county's center. Elements lat_center and lon_center are the latitude and longitude of the county's center.
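
As a rough illustration of the per-station statistics in station_df, the following base R sketch computes them for toy data. It is not the package's code: the data are invented, calc_coverage is shown as a proportion, and range is assumed to mean max minus min.

```r
# Toy data for a single weather variable (all values are invented).
toy <- data.frame(
  id   = rep(c("A", "B"), each = 4),
  tmax = c(20, 21, 22, 23,
           19, NA, 21, 22)
)

# One row of summary statistics per station.
station_stats <- do.call(rbind, lapply(split(toy$tmax, toy$id), function(x) {
  data.frame(
    calc_coverage = mean(!is.na(x)),        # share of non-missing values
    standard_dev  = sd(x, na.rm = TRUE),
    max           = max(x, na.rm = TRUE),
    min           = min(x, na.rm = TRUE),
    range         = max(x, na.rm = TRUE) - min(x, na.rm = TRUE)  # assumed definition
  )
}))

# Row names carry the station ids from split(); copy them into a column.
station_stats$id <- rownames(station_stats)
station_stats
```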

Examples

## Not run: 
stations <- daily_stations(fips = "12086", date_min = "2010-01-01",
                           date_max = "2010-02-01")
fips_list <- daily_df(stations = stations, coverage = 0.90,
                      var = c("tmax", "tmin", "prcp"),
                      date_min = "2010-01-01", date_max = "2010-02-01")
averaged_data <- fips_list$daily_data
head(averaged_data)
station_info <- fips_list$station_df
head(station_info)
## End(Not run)