DDIwR (version 0.9)

convert: Convert a dataset from one statistical software to another

Description

This function converts (or transfers) data between R, Stata, SPSS, SAS, Excel and DDI XML files. Unlike the regular import / export functions from packages haven or rio, this function uses the DDI standard as an exchange platform and facilitates a consistent conversion of the missing values.

Usage

convert(from, to = NULL, declared = TRUE, recode = TRUE, embed = TRUE, ...)

Arguments

from

A path to a file, or a data frame object

to

Character, the name of a software package or a path to a specific file

declared

Logical, return the resulting data frame as a declared object

recode

Logical, recode missing values

embed

Logical, embed the data when generating a DDI XML file

...

Additional parameters passed to exporting functions, see the Details section

Details

When the argument to specifies a certain statistical package ("R", "Stata", "SPSS", "SAS") or "Excel", the name of the destination file will be identical to the one in the argument from, with an automatically added software-specific extension.

Alternatively, the argument to can be specified as a path to a specific file, in which case the software package is determined from its file extension. The following extensions are currently recognized: .xml for DDI, .rds for R, .dta for Stata, .sav for SPSS, .sas7bdat for SAS, and .xlsx for Excel.
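The two ways of specifying to can be illustrated with a small base R sketch. It is not the code used inside convert(); the helpers software_from_path() and default_destination() are hypothetical names introduced here, and only the extension table is taken from the list above.

```r
# Extension-to-software table, copied from the list of recognized extensions
extensions <- c(xml = "DDI", rds = "R", dta = "Stata",
                sav = "SPSS", sas7bdat = "SAS", xlsx = "Excel")

# Hypothetical helper: which software does a destination path imply?
software_from_path <- function(path) {
  ext <- tools::file_ext(path)
  unname(extensions[ext])
}

# Hypothetical helper: destination file name when 'to' names a software
# package, i.e. the name from 'from' plus the software-specific extension
default_destination <- function(from, to) {
  ext <- names(extensions)[match(to, extensions)]
  paste0(tools::file_path_sans_ext(from), ".", ext)
}

software_from_path("survey/wave1.dta")
default_destination("test.sav", "Stata")
```

Under these assumptions the first call yields "Stata" and the second "test.dta".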

Additional parameters can be specified via the three dots argument ..., and are passed to the respective functions from packages haven and readxl. For instance, the function write_dta() has an additional argument called version when writing a Stata file.

The most important argument to consider is user_na, part of the function read_sav(). Although it defaults to FALSE in package haven, package DDIwR uses it with the value TRUE. Users who really want to deactivate it should explicitly specify user_na = FALSE in function convert().
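A minimal sketch of passing such arguments through the three dots follows. It assumes an SPSS file called test.sav in the working directory; the destination file names are made up for illustration, and the calls are skipped when DDIwR or the file is not available.

```r
# Assumed input file; nothing is run unless it exists and DDIwR is installed
input <- "test.sav"
can_run <- requireNamespace("DDIwR", quietly = TRUE) && file.exists(input)

if (can_run) {
  # 'version' is forwarded to haven::write_dta() when writing the Stata file
  DDIwR::convert(input, to = "test_v13.dta", version = 13)

  # 'user_na' is forwarded to haven::read_sav(); FALSE overrides the
  # value of TRUE that DDIwR uses by default
  DDIwR::convert(input, to = "test_no_user_na.dta", user_na = FALSE)
}
```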

If the argument to is left as NULL, the data is (invisibly) returned to the R environment. Conversion to R, either in the working space or as a data file, will result (by default) in a data frame containing declared labelled variables, as defined in package declared.
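Because the result is returned invisibly, it has to be assigned to be of any use. The sketch below assumes a file test.sav in the working directory and is skipped when DDIwR or the file is not available.

```r
input <- "test.sav"   # assumed input file
dataset <- NULL

if (requireNamespace("DDIwR", quietly = TRUE) && file.exists(input)) {
  # 'to' is left as NULL: no file is written, the data comes back to R
  dataset <- DDIwR::convert(input)

  # Inspect the classes of the first few variables
  print(lapply(dataset[seq_len(min(3, ncol(dataset)))], class))
}
```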

The current version reads and creates DDI Codebook version 2.5. Future versions are planned to extend the functionality to DDI Lifecycle versions 3.x, and to link to the future package DDI4R for the UML model based version 4. The package extends the standard DDI Codebook by offering the possibility to embed a CSV version of the raw data in the XML file containing the Codebook, in a notes child of the fileDscr component. This type of Codebook is unique to this package and is automatically detected when converting to another statistical software.

Converting the missing values to SAS is not tested, but it relies on the same package haven, which uses the ReadStat C library. Should it not work, it is also possible to use a setup file produced by function setupfile() and run the commands manually.

When the argument embed is deactivated, a CSV file will be produced in the same directory, using the same file name as the XML file.
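As a sketch of what this means in practice, the following assumes a file test.sav in the working directory and a destination test.xml next to it. The expected CSV path is derived here in base R only to show where the file should appear; the conversion is skipped when DDIwR or the input file is not available.

```r
input <- "test.sav"      # assumed input file
xml_path <- "test.xml"   # destination DDI Codebook

# Same directory and same file name as the XML file, with a .csv extension
csv_path <- paste0(tools::file_path_sans_ext(xml_path), ".csv")

if (requireNamespace("DDIwR", quietly = TRUE) && file.exists(input)) {
  # Write the Codebook without embedding the data
  DDIwR::convert(input, to = xml_path, embed = FALSE)

  # Check whether the companion CSV file was written
  print(file.exists(csv_path))
}
```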

The argument recode controls how missing values are treated. If the input file has SPSS-like numeric codes, they will be recoded to extended (a-z) missing types when converting to Stata or SAS. If the input has Stata-like extended codes, they will be recoded to SPSS-like numeric codes.
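The idea behind the recoding can be illustrated in base R. This is an illustration only: the numeric codes below are invented, and the way convert() actually assigns letters to codes may differ from the ordering assumed here.

```r
# Assumed SPSS-like numeric missing codes (made up for this example)
spss_missing <- c(-91, -92, -99)

# Assumption: one letter per distinct code, in decreasing order (-91 first)
codes <- sort(unique(spss_missing), decreasing = TRUE)
to_extended <- stats::setNames(letters[seq_along(codes)], codes)

# The opposite direction: Stata-like letters back to numeric codes
to_numeric <- stats::setNames(codes, to_extended)

to_extended[["-92"]]
to_numeric[["c"]]
```

With this assumed ordering, the code -92 maps to the extended missing type "b", and "c" maps back to -99.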

References

DDI - Data Documentation Initiative, see https://ddialliance.org/

See Also

setupfile, getMetadata, declared, labelled

Examples

# NOT RUN {
# Assuming an SPSS file called test.sav is located in the working directory
# the following command will extract the metadata in a DDI Codebook and
# produce a test.xml file in the same directory
convert("test.sav", to = "DDI")

# The data is included in the XML file by default (embed = TRUE), stated explicitly:
convert("test.sav", to = "DDI", embed = TRUE)

# To produce a Stata file:
convert("test.sav", to = "Stata")

# To produce an R file:
convert("test.sav", to = "R")

# To produce an Excel file:
convert("test.sav", to = "Excel")
# }
