This function imports data from various file types. It is a small wrapper
around haven::read_spss(), haven::read_stata(), haven::read_sas(),
readxl::read_excel() and data.table::fread() or readr::read_delim()
(the latter if package data.table is not installed). Thus, supported file
types for importing data are data files from SPSS, SAS or Stata, Excel files
or text files (like '.csv' files). All non-supported file types are passed
to rio::import().
data_read(path, path_catalog = NULL, encoding = NULL, verbose = TRUE, ...)
Value: A data frame.
path: Character string, the file path to the data file.
path_catalog: Character string, path to the catalog file. Only relevant for SAS data files.
encoding: The character encoding used for the file. Usually not needed.
verbose: Toggle warnings and messages.
...: Arguments passed to the related read_*() function.
data_read() is a wrapper around the haven, data.table, readr, readxl and rio
packages. Currently supported file types are .txt, .csv, .xls, .xlsx, .sav,
.por, .dta and .sas (and related files). All other file types are passed to
rio::import().
data_read() can also read the above-mentioned files from URLs or from
inside zip-compressed files. Thus, path can also be a URL to a file like
"http://www.url.com/file.csv". When path points to a zip-compressed file
that contains multiple files, the first supported file is extracted and
loaded.
data_read() detects the appropriate read_*() function based on the
file extension of the data file. Thus, in most cases it should be enough to
only specify the path argument. However, if more control is needed, all
arguments in ... are passed down to the related read_*() function.
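As a minimal sketch of the extension-based detection described above, the following example writes a small CSV to a temporary file and reads it back by passing only path. The temporary file and the use of the built-in mtcars data are assumptions made for illustration. Which text reader is used underneath depends on whether data.table is installed.

```r
library(datawizard)

# Write a small CSV to a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
write.csv(head(mtcars), tmp, row.names = FALSE)

# The ".csv" extension selects the text-file reader; only `path` is given.
# verbose = FALSE suppresses messages and warnings.
d <- data_read(tmp, verbose = FALSE)
str(d)

# Clean up the temporary file.
unlink(tmp)
```

Because arguments in ... go to whichever read_*() function is selected, their names must match that function; for an Excel file, for instance, readxl::read_excel() accepts a sheet argument.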
data_read() is most comparable to rio::import(). For data files from
SPSS, SAS or Stata, which support labelled data, variables are converted into
their most appropriate type. The major difference to rio::import() is
that data_read() automatically converts fully labelled variables into
factors. Variables where all values are labelled are converted into factors,
with the imported value labels set as factor levels. If a variable has no
value labels, or fewer value labels than values (that is, it is only
partially labelled), it is converted into a numeric or character vector
instead, and any value labels are preserved as the "labels" attribute.
Character vectors are preserved.
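The conversion rule above can be explored with a small round trip. This is a sketch under assumptions: it requires the haven and tibble packages, and the column names rating and score are hypothetical. It writes one fully labelled and one partially labelled variable to a temporary SPSS file and inspects what data_read() returns.

```r
library(datawizard)

# Fully labelled: both occurring values (1 and 2) have a value label.
rating <- haven::labelled(c(1, 2, 1, 2), labels = c(low = 1, high = 2))
# Partially labelled: only one of the four occurring values has a label.
score <- haven::labelled(c(1, 2, 3, 4), labels = c(lowest = 1))

# Write both variables to a temporary SPSS file.
tmp <- tempfile(fileext = ".sav")
haven::write_sav(tibble::tibble(rating = rating, score = score), tmp)

d <- data_read(tmp, verbose = FALSE)

# Per the description above, `rating` should come back as a factor and
# `score` as a numeric vector that keeps its "labels" attribute.
class(d$rating)
class(d$score)
attr(d$score, "labels")

# Clean up the temporary file.
unlink(tmp)
```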