exportRecords: Export Records from a REDCap Database

Description

Exports records from a REDCap Database, allowing for subsets of subjects, fields, records, and events.

Usage

exportRecords(
  rcon,
  factors = TRUE,
  fields = NULL,
  forms = NULL,
  records = NULL,
  events = NULL,
  labels = TRUE,
  dates = TRUE,
  drop = NULL,
  survey = TRUE,
  dag = TRUE,
  checkboxLabels = FALSE,
  colClasses = character(0),
  ...
)
# S3 method for redcapApiConnection
exportRecords(
  rcon,
  factors = TRUE,
  fields = NULL,
  forms = NULL,
  records = NULL,
  events = NULL,
  labels = TRUE,
  dates = TRUE,
  drop = NULL,
  survey = TRUE,
  dag = TRUE,
  checkboxLabels = FALSE,
  colClasses = character(0),
  ...,
  batch.size = -1,
  error_handling = getOption("redcap_error_handling"),
  config = list(),
  api_param = list(),
  form_complete_auto = TRUE
)
exportRecords_offline(
  dataFile,
  metaDataFile,
  factors = TRUE,
  fields = NULL,
  forms = NULL,
  labels = TRUE,
  dates = TRUE,
  checkboxLabels = FALSE,
  colClasses = NA,
  ...,
  meta_data
)

Arguments

rcon: A REDCap connection object as created by redcapConnection.
factors: Logical. Determines if categorical data from the database is returned as numeric codes or labelled factors. See 'Checkbox Variables' for more on how this interacts with the checkboxLabels argument.
fields: A character vector of fields to be returned. If NULL, all fields are returned.
forms: A character vector of forms to be returned. If NULL, all forms are returned.
records: A vector of study IDs to be returned. If NULL, all subjects are returned.
events: A character vector of events to be returned from a longitudinal database. If NULL, all events are returned.
labels: Logical. Determines if the variable labels are applied to the data frame.
dates: Logical. Determines if date variables are converted to POSIXct format during the download.
drop: An optional character vector of REDCap variable names to remove from the dataset; defaults to NULL. E.g., drop = c("date_dmy", "treatment"). It is OK for drop to contain variables that are not present; these names are ignored.
survey: specifies whether or not to export the survey identifier field (e.g., "redcap_survey_identifier") or survey timestamp fields (e.g., form_name+"_timestamp") when surveys are utilized in the project. If you do not pass in this flag, it will default to "true". If set to "true", it will return the redcap_survey_identifier field and also the survey timestamp field for a particular survey when at least one field from that survey is being exported. NOTE: If the survey identifier field or survey timestamp fields are imported via API data import, they will simply be ignored since they are not real fields in the project but rather are pseudo-fields.
dag: specifies whether or not to export the "redcap_data_access_group" field when data access groups are utilized in the project. If this flag is not passed to the REDCap API, the API defaults to "false"; note that exportRecords itself sets dag = TRUE by default (see Usage). NOTE: This flag is only viable if the user whose token is being used to make the API request is *not* in a data access group. If the user is in a group, then this flag will revert to its default value.
checkboxLabels: Logical. Determines the format of labels in checkbox variables. If FALSE, labels are applied as "Unchecked"/"Checked". If TRUE, they are applied as ""/"[field_label]", where [field_label] is the label assigned to the level in the data dictionary. This option is only available after REDCap version 6.0. See 'Checkbox Variables' for more on how this interacts with the factors argument.
colClasses: A (named) vector of column classes passed to read.csv calls. Useful to force the interpretation of a column in a specific type and avoid an unexpected recast.
...: Additional arguments to be passed between methods.
batch.size: Integer. Specifies the number of subjects to be included in each batch of a batched export. Non-positive numbers export the entire project in a single batch. Batching the export may be beneficial to prevent tying up smaller servers. See details for more explanation.
error_handling: An option for how to handle errors returned by the API. See redcap_error.
config: list. Additional configuration parameters to pass to POST. These are appended to any parameters in rcon$config.
api_param: list. Additional API parameters to pass into the body of the API call. This allows users to execute calls with options that may not otherwise be supported by redcapAPI.
form_complete_auto: logical(1). When TRUE (default), the [form]_complete fields for any form from which at least one variable is requested will automatically be retrieved. When FALSE, these fields must be explicitly requested.
dataFile: For the offline version, a character string giving the location of the dataset downloaded from REDCap. Note that this should be the raw (unlabeled) data set.
metaDataFile: A text string giving the location of the data dictionary downloaded from REDCap.
meta_data: Deprecated version of metaDataFile
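
As an illustration of how several of these arguments combine, the following is a minimal sketch rather than an authoritative recipe. The wrapper name export_subset, the field names, and the record IDs are hypothetical placeholders; only the argument names come from the Usage section above. The connection object is assumed to have been created beforehand with redcapConnection.

```r
# Hypothetical wrapper around exportRecords. The field names, record IDs, and
# dropped variable below are placeholders, not values from a real project.
# `rcon` is assumed to come from something like:
#   rcon <- redcapAPI::redcapConnection(url = "<API url>", token = "<API token>")
export_subset <- function(rcon) {
  redcapAPI::exportRecords(
    rcon,
    fields     = c("record_id", "age", "date_dmy"),  # subset of fields
    records    = c("1", "2", "3"),                   # subset of study IDs
    drop       = "date_dmy",                         # removed after export
    factors    = FALSE,                              # keep numeric codes
    batch.size = -1                                  # single, unbatched call
  )
}
```

Defining the wrapper does not contact the server; the API is only called when export_subset(rcon) is run with a valid connection.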

Checkbox Variables

There are four ways the data from checkbox variables may be represented depending on the values of factors and checkboxLabels. The most common are the first and third rows of the table below. When checkboxLabels = TRUE, either the coded value or the labelled value is returned if the box is checked, or an empty string if it is not.

`factors`	`checkboxLabels`	Output
`FALSE`	`FALSE`	0 / 1
`FALSE`	`TRUE`	"" / value
`TRUE`	`FALSE`	Unchecked / Checked
`TRUE`	`TRUE`	"" / label
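
The four rows of the table can be mimicked with a small base R sketch. This is an illustration of the displayed values only; format_checkbox is a hypothetical helper, not part of redcapAPI, and it does not model the classes (for example, factor) that the package actually returns.

```r
# Illustrative only: reproduces the values shown in the table above.
# `value` is the coded value of the checkbox option and `label` is its
# data dictionary label; both are supplied by the caller in this sketch.
format_checkbox <- function(checked, value, label, factors, checkboxLabels) {
  if (!factors && !checkboxLabels) return(as.integer(checked))       # 0 / 1
  if (!factors && checkboxLabels)  return(ifelse(checked, value, ""))  # "" / value
  if (factors && !checkboxLabels) {
    return(ifelse(checked, "Checked", "Unchecked"))                  # Unchecked / Checked
  }
  ifelse(checked, label, "")                                         # "" / label
}

checked <- c(TRUE, FALSE)
row1 <- format_checkbox(checked, "1", "Diabetes", factors = FALSE, checkboxLabels = FALSE)
row2 <- format_checkbox(checked, "1", "Diabetes", factors = FALSE, checkboxLabels = TRUE)
row3 <- format_checkbox(checked, "1", "Diabetes", factors = TRUE,  checkboxLabels = FALSE)
row4 <- format_checkbox(checked, "1", "Diabetes", factors = TRUE,  checkboxLabels = TRUE)
```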

REDCap API Documentation (6.5.0)

This function allows you to export a set of records for a project.

Note about export rights (6.0.0+): Please be aware that Data Export user rights will be applied to this API request. For example, if you have "No Access" data export rights in the project, then the API data export will fail and return an error. And if you have "De-Identified" or "Remove all tagged Identifier fields" data export rights, then some data fields *might* be removed and filtered out of the data set returned from the API. To make sure that no data is unnecessarily filtered out of your API request, you should have "Full Data Set" export rights in the project.

REDCap Version

5.8.2 (Perhaps earlier)

Known REDCap Limitations

None

Deidentified Batched Calls

Batched calls are not a feature of the REDCap API, but they can be emulated by making multiple calls to the API. Batching the export requires an initial call to the API that retrieves only the record IDs. The list of IDs is then broken into chunks, each containing at most batch.size IDs. Each batched call then sets the records argument to the IDs in its chunk.

When a user's permissions require a de-identified data export, a batched call should be expected to fail. This is because, upon export, REDCap will hash the identifiers. When R attempts to pass the hashed identifiers back to REDCap, REDCap will try to match the hashed identifiers to the unhashed identifiers in the database. No matches will be found, and the export will fail.

Users who are exporting de-identified data will have to settle for using unbatched calls to the API (i.e., batch.size = -1).

Author

Jeffrey Horner

Details

Exports made through the API are recorded in the Logging section of the project.

The 'offline' version of the function operates on the raw (unlabeled) data file downloaded from REDCap along with the data dictionary. This is made available for instances where the API cannot be accessed for some reason (such as waiting for API approval from the REDCap administrator).

It is unnecessary to include "redcap_event_name" in the fields argument. This field is automatically exported for any longitudinal database. If the user does include it in the fields argument, it is removed quietly in the parameter checks.

A 'batched' export is one where the export is performed over a series of API calls rather than one large call. For large projects on small servers, this may prevent a single user from tying up the server and forcing others to wait on a larger job. The batched export is performed by first calling the API to export the subject identifier field (the first field in the meta data). The unique IDs are then assigned a batch number, with no more than batch.size IDs in any single batch. The batches are exported from the API and stacked together.
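
The batching steps described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's actual implementation: fetch_batch is a hypothetical stand-in for a single API call restricted to the given record IDs, and batched_export is likewise a made-up name.

```r
# Hypothetical stand-in for one API call limited to `ids`; a real call would
# return the exported fields for those records.
fetch_batch <- function(ids) {
  data.frame(record_id = ids, stringsAsFactors = FALSE)
}

batched_export <- function(ids, batch.size) {
  ids <- unique(ids)
  # Non-positive batch.size (or nothing to split): one single call
  if (batch.size < 1 || length(ids) == 0) return(fetch_batch(ids))
  # Assign each unique ID a batch number, at most batch.size IDs per batch
  batch_no <- ceiling(seq_along(ids) / batch.size)
  batches  <- split(ids, batch_no)
  # Export each batch and stack the results into one data frame
  out <- do.call(rbind, lapply(batches, fetch_batch))
  rownames(out) <- NULL
  out
}

ids <- as.character(1:25)
batch_sizes <- unname(lengths(split(ids, ceiling(seq_along(ids) / 10))))
result <- batched_export(ids, batch.size = 10)
```

With 25 IDs and batch.size = 10, the IDs fall into three batches, the last one being smaller than the others.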

In longitudinal projects, batch.size may not necessarily be the number of records exported in each batch. If batch.size is 10 and there are four records per patient, each batch will consist of 40 records. Thus, if you are concerned about tying up the server with a large, longitudinal project, it would be prudent to use a smaller batch size.
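
The arithmetic in the preceding paragraph can be written out as a short sketch. Both helpers are hypothetical and assume every subject has the same number of rows, which a real longitudinal project may not satisfy.

```r
# Rows returned per batch when every subject contributes the same number of rows
rows_per_batch <- function(batch.size, rows_per_subject) {
  batch.size * rows_per_subject
}

# Hypothetical helper: largest batch.size that keeps a batch at or under a
# target number of rows (never less than one subject per batch)
suggest_batch_size <- function(target_rows, rows_per_subject) {
  max(1, floor(target_rows / rows_per_subject))
}

rows <- rows_per_batch(10, 4)            # the example from the text
smaller <- suggest_batch_size(10, 4)     # batch.size that stays near 10 rows
```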