tidyr

tidyr is a reframing of reshape2 designed to accompany the tidy data framework, and to work hand-in-hand with magrittr and dplyr to build a solid pipeline for data analysis.

Just as reshape2 did less than reshape, tidyr does less than reshape2. It's designed specifically for tidying data, not the general reshaping that reshape2 does, or the general aggregation that reshape did. In particular, built-in methods only work for data frames, and tidyr provides no margins or aggregation.

There are two fundamental verbs of data tidying:

  • gather() takes multiple columns, and gathers them into key-value pairs: it makes "wide" data longer.

  • spread() takes two columns (key and value) and spreads them into multiple columns: it makes "long" data wider.
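As a rough illustration of the two verbs, here is a minimal sketch on a made-up data frame. The `scores` data and its column names are assumptions for the example, not part of tidyr.

```r
library(tidyr)

# Hypothetical "wide" data: one row per person, one column per subject.
scores <- data.frame(
  name = c("Ann", "Bob"),
  math = c(90, 70),
  art  = c(80, 60),
  stringsAsFactors = FALSE
)

# gather(): collapse the subject columns into key-value pairs (wide -> long).
long <- gather(scores, key = subject, value = score, math, art)

# spread(): turn the key column back into one column per subject (long -> wide).
wide <- spread(long, key = subject, value = score)
```

Here `long` has one row per name and subject, while `wide` has one row per name again.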

These verbs have a number of synonyms:

tidyr         gather    spread
reshape(2)    melt      cast
spreadsheets  unpivot   pivot
databases     fold      unfold

tidyr also provides separate() and extract() functions, which make it easier to pull apart a column that represents multiple variables. The complement to separate() is unite().
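The following sketch shows how these three functions might be used on a column that packs a year and a month into one string. The `df` data frame is invented for illustration. extract() is called as tidyr::extract() here because other packages (magrittr, for example) export a function with the same name.

```r
library(tidyr)

# Hypothetical data: "date" holds two variables, year and month.
df <- data.frame(
  id   = 1:2,
  date = c("2016-08", "2015-12"),
  stringsAsFactors = FALSE
)

# separate(): split one column into several at a separator.
parts <- separate(df, date, into = c("year", "month"), sep = "-")

# unite(): the complement, pasting several columns back into one.
joined <- unite(parts, date, year, month, sep = "-")

# extract(): pull out regular-expression groups into new columns.
pulled <- tidyr::extract(df, date, into = c("year", "month"),
                         regex = "([0-9]+)-([0-9]+)")
```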

Installation

tidyr is available from CRAN. Install it with:

install.packages("tidyr")

The development version can be installed using:

# install.packages("devtools")
devtools::install_github("hadley/tidyr")

Getting started

To get started, read the tidy data vignette (vignette("tidy-data")) and check out the demos (demo(package = "tidyr")).

Note that tidyr is designed for use in conjunction with dplyr, so you should always load both:

library(tidyr)
library(dplyr)
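With both packages loaded, a tidying step can feed straight into a dplyr summary. This is only a sketch: the `scores` data frame and its columns are made up for the example.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide data: one column per subject.
scores <- data.frame(
  name = c("Ann", "Bob"),
  math = c(90, 70),
  art  = c(80, 60),
  stringsAsFactors = FALSE
)

# Tidy with tidyr, then aggregate with dplyr, all in one pipeline.
means <- scores %>%
  gather(key = subject, value = score, math, art) %>%
  group_by(subject) %>%
  summarise(mean_score = mean(score))
```

The aggregation is left to dplyr because tidyr itself provides none.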

References

If you'd like to learn more about these data reshaping operators, I'd recommend the following papers:

Monthly Downloads: 1,202,383
Version: 0.6.0
License: MIT + file LICENSE
Last Published: August 12th, 2016

Functions in tidyr (0.6.0)

  • complete_: Standard-evaluation version of complete.
  • expand: Expand data frame to include all combinations of values.
  • complete: Complete a data frame with missing combinations of data.
  • expand_: Expand (standard evaluation).
  • fill: Fill in missing values.
  • separate_rows_: Standard-evaluation version of separate_rows.
  • replace_na: Replace missing values.
  • full_seq: Create the full sequence of values in a vector.
  • nest_: Standard-evaluation version of nest.
  • %>%: Pipe operator.
  • separate_: Standard-evaluation version of separate.
  • gather_: Gather (standard-evaluation).
  • gather: Gather columns into key-value pairs.
  • nest: Nest repeated values in a list-variable.
  • table1: Example tabular representations.
  • unite_: Standard-evaluation version of unite.
  • unnest: Unnest a list column.
  • smiths: Some data about the Smith family.
  • spread_: Standard-evaluation version of spread.
  • separate_rows: Separate a collapsed column into multiple rows.
  • unite: Unite multiple columns into one.
  • unnest_: Standard-evaluation version of unnest.
  • separate: Separate one column into multiple columns.
  • spread: Spread a key-value pair across multiple columns.
  • who: World Health Organization TB data.
  • extract_: Standard-evaluation version of extract.
  • extract_numeric: Extract numeric component of variable.
  • fill_: Standard-evaluation version of fill.
  • drop_na_: Standard-evaluation version of drop_na.
  • drop_na: Drop rows containing missing values.
  • extract: Extract one column into multiple columns.
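
Several of the functions listed above deal with missing values and missing combinations. The sketch below shows how complete(), drop_na(), replace_na(), fill() and full_seq() might fit together. The `sales` data frame is invented for illustration.

```r
library(tidyr)

# Hypothetical data: shop B has no row for quarter 2, and one revenue is NA.
sales <- data.frame(
  shop    = c("A", "A", "B"),
  quarter = c(1, 2, 1),
  revenue = c(100, NA, 80),
  stringsAsFactors = FALSE
)

# complete(): add rows for the missing shop/quarter combinations.
all_cells <- complete(sales, shop, quarter)

# drop_na(): keep only the rows where revenue is present.
observed <- drop_na(all_cells, revenue)

# replace_na(): substitute a fixed value for NA, column by column.
zeros <- replace_na(all_cells, list(revenue = 0))

# fill(): carry the previous non-missing value downwards.
carried <- fill(all_cells, revenue)

# full_seq(): the full sequence spanning a vector, in steps of `period`.
quarters <- full_seq(c(1, 2, 4), period = 1)
```

Which of these is appropriate depends on what a missing value means in the data: dropping, zero-filling and carrying forward give different answers.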