
Haven

Haven lets you load foreign data formats (SAS, SPSS, and Stata) into R by wrapping the fantastic ReadStat C library written by Evan Miller. Haven offers similar functionality to the base foreign package, but:

  • Can read SAS's proprietary binary format (SAS7BDAT). The only other package on CRAN that does this, sas7bdat, was created to document the reverse-engineering effort, so its implementation is designed for experimentation rather than efficiency. Haven is significantly faster, should support a wider range of SAS files, and also works with SAS7BCAT files.

  • Can be faster. Some SPSS files seem to load about 4x faster, but others load more slowly. If you have a lot of SPSS files to import, you might want to benchmark both packages and pick the faster one.

  • Works with Stata 13 files (foreign only works up to Stata 12).

  • Can also write SPSS and Stata files (this is hard to test, so if you run into any problems, please let me know).

  • Can only read the data from the most common statistical packages (SAS, Stata and SPSS).

  • Always returns a data frame. Date-times are converted to the corresponding R classes, and labelled vectors are returned as a new labelled class. You can easily coerce these to factors or replace labelled values with missing values as appropriate. If you also use dplyr, you'll notice that large data frames are printed in a convenient way.

  • Uses underscores instead of dots ;)
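The labelled class mentioned above can be illustrated with a minimal sketch. It assumes the labelled(), as_factor(), and zap_empty() functions listed for this version of the package; the variable names and values are made up for illustration.

```r
library(haven)

# A labelled vector: numeric codes plus a named vector mapping labels to codes.
# (Hypothetical example data.)
sex <- labelled(c(1, 2, 2, 1), c(Male = 1, Female = 2))

# Coerce to a factor; the value labels become the factor levels.
sex_factor <- as_factor(sex)

# zap_empty() converts empty strings in a character vector to missing values.
city <- zap_empty(c("Oslo", "", "Lima"))
```

Converting to a factor is usually the convenient choice when the labels carry the meaning and the underlying codes are not needed for analysis.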

Haven is still a work in progress so please file an issue if it fails to correctly load a file that you're interested in.

Installation

# Install the released version from CRAN:
install.packages("haven")

# Install the cutting edge development version from GitHub:
# install.packages("devtools")
devtools::install_github("hadley/haven")

Usage

  • SAS: read_sas("path/to/file")
  • SPSS: read_por("path/to/file"), read_sav("path/to/file")
  • Stata: read_dta("path/to/file")
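As a rough sketch of how reading and writing fit together, the following round-trips a small data frame through a Stata file. It assumes write_dta() takes the data frame followed by the output path; the data frame and the temporary file are hypothetical.

```r
library(haven)

# A small example data frame (made up for illustration).
df <- data.frame(id = c(1, 2, 3), score = c(2.5, 3.0, 4.5))

# Write it to a temporary Stata file, then read it back.
path <- tempfile(fileext = ".dta")
write_dta(df, path)
df2 <- read_dta(path)
```

The object read back is a data frame, so it can be passed straight to the usual R tooling.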

Updating ReadStat

If you're working on the development version of haven, and you'd like to update the embedded ReadStat library, you can run the following code. It is not necessary if you're just using the package.

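# Download the current ReadStat sources and unpack them in a temporary directory
tmp <- tempfile()
download.file("https://github.com/WizardMac/ReadStat/archive/master.zip", tmp,
  method = "wget")
unzip(tmp, exdir = tempdir())

# Copy the C sources and headers into the package's src/ directory
src <- dir(file.path(tempdir(), "ReadStat-master", "src"), "\\.[ch]$", full.names = TRUE)
file.copy(src, "src/", overwrite = TRUE)
# Remove the rdata reader files after copying
unlink(c("src/readstat_rdata.c", "src/readstat_rdata.h"))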

Monthly Downloads: 713,526

Version: 0.2.0

License: MIT + file LICENSE

Maintainer: Hadley Wickham

Last Published: April 8th, 2015

Functions in haven (0.2.0)

  • labelled: Create a labelled vector.
  • read_dta: Read and write Stata DTA files.
  • read_sas: Read SAS files.
  • hms: Hours, minutes, seconds.
  • read_spss: Read SPSS (POR and SAV) files. Write SAV files.
  • zap_empty: Convert empty strings into missing values.
  • as_factor: Convert input to a factor.