workflowr: organized + reproducible + shareable data science in R

The workflowr R package makes it easier for researchers to organize their projects and share their results with colleagues.

Install the latest release (v0.11.0) by running this command in R or RStudio:

devtools::install_github("jdblischak/workflowr", build_vignettes = TRUE)

If you are already writing R code to analyze data, and know the basics of Git and GitHub, you can start taking advantage of workflowr immediately. In a matter of minutes, you can create a research website like this. (See also the Divvy data exploration project for a more elaborate example of a workflowr project.)

If you find any problems, or would like to suggest new features, please open an Issue.

Why use workflowr?

First, hopefully you don't need much convincing to write your analyses in R Markdown. It allows you to combine your R code, text, and figures in the same document! See the R Markdown website to learn about all of its features. Second, building a website with the rmarkdown package (as opposed to using knitr to produce Markdown files and passing these to a static site generator) enables you to use all the latest R packages (e.g. htmlwidgets) directly in your analyses. Third, the workflowr package provides functions that make it easier for a researcher to maintain a version-controlled R Markdown website. Specifically, workflowr:

  • Starts a project with all the necessary files (see ?wflow_start)
  • Provides an R Markdown template that automatically inserts the date and most recent Git commit ID (i.e. SHA1) at the top of the file to aid reproducibility (see ?wflow_open)
  • Saves generated figures into an organized directory structure
  • Handles all the version control operations to track code development and ensures all the R Markdown files are built in a reproducible manner (see ?wflow_publish)
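
To make the idea of stamping an analysis with the date and commit ID concrete, here is a minimal base-R sketch. It is not how workflowr implements this feature; `provenance_header()` is a hypothetical helper written for illustration, and it assumes the `git` command-line tool is on your PATH (it falls back gracefully when it is not).

```r
# Hypothetical helper, not part of workflowr: build two header lines holding
# today's date and the short SHA1 of the most recent commit in `repo`.
provenance_header <- function(repo = ".") {
  sha <- NA_character_
  git <- unname(Sys.which("git"))
  if (nzchar(git) && dir.exists(repo)) {
    # Run git from inside the repository, then restore the working directory
    old_wd <- setwd(repo)
    on.exit(setwd(old_wd), add = TRUE)
    out <- tryCatch(
      suppressWarnings(
        system2(git, c("rev-parse", "--short", "HEAD"),
                stdout = TRUE, stderr = FALSE)
      ),
      error = function(e) character(0)
    )
    # system2() attaches a "status" attribute only when the command fails
    failed <- !is.null(attr(out, "status"))
    if (!failed && length(out) == 1) sha <- out
  }
  c(
    paste("Last updated:", format(Sys.Date(), "%Y-%m-%d")),
    paste("Code version:", if (is.na(sha)) "unavailable" else sha)
  )
}

writeLines(provenance_header())
```

Recording the commit ID next to the results lets a reader check out exactly the code that produced them.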

Quick start

Workflowr builds on several software tools including Git, pandoc and knitr, but you do not need to have experience using any of these tools to get started with workflowr. You only need to know how to code in R and be generally familiar with the R Markdown format. A basic understanding of Git and the UNIX command line is helpful but not essential.

Here is a minimal set of steps to get you started with workflowr. If you are already using R and/or Git, you may be able to skip some of these steps.

  1. Install R (instructions from Software Carpentry).

  2. Install pandoc using one of the following methods:

    a. (Recommended) Install RStudio. RStudio includes an installation of pandoc. Furthermore, workflowr takes advantage of some RStudio features (although RStudio is not required to use workflowr).

    b. Install only pandoc following these instructions.

  3. (Optional) Install Git (instructions from Software Carpentry). You do not need to install Git to start using workflowr. You only need to install Git if you want to perform more advanced Git operations, which you are unlikely to need at the beginning of your project.

  4. Create an account on GitHub.

  5. Install the latest stable release of workflowr from GitHub using devtools:

```r
# Devtools must be installed first
#install.packages("devtools")
library("devtools")
# Install a compatible version of git2r
install_version("git2r", "0.21.0")
# If you receive an error on macOS or Windows, try specifying type = "binary"
#install_version("git2r", "0.21.0", type = "binary")
# Install workflowr from GitHub
install_github("jdblischak/workflowr", build_vignettes = TRUE)
```

  6. Work through the vignette "Getting started with workflowr" to learn how to set up a workflowr project. (You can view all the available vignettes locally with browseVignettes("workflowr").)

  7. Alternatively, if you have already started your project, read the vignette "Migrating an existing project to use workflowr" to learn how to convert your project to a workflowr project.

  8. Learn more about how to Customize your research website.

  9. If you find any unexpected behavior or think of an additional feature that would be nice to have, please open an Issue here. When writing your bug report or feature request, please note the version of workflowr you are using (which you can obtain by running packageVersion("workflowr")).
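
Before or after working through the steps above, it can help to check which of the tools are already available. The following is a rough base-R sketch, not a workflowr function: `check_prerequisites()` is a hypothetical helper, and it assumes that RStudio advertises its bundled pandoc through the `RSTUDIO_PANDOC` environment variable.

```r
# Hypothetical helper, not part of workflowr: report which of the tools
# mentioned in the Quick start can be found on this machine.
check_prerequisites <- function() {
  # pandoc may be on the PATH or bundled with RStudio (assumption: RStudio
  # sets the RSTUDIO_PANDOC environment variable for its own sessions)
  pandoc_found <- nzchar(unname(Sys.which("pandoc"))) ||
    nzchar(Sys.getenv("RSTUDIO_PANDOC"))
  c(
    pandoc    = pandoc_found,
    git       = nzchar(unname(Sys.which("git"))),
    devtools  = requireNamespace("devtools", quietly = TRUE),
    workflowr = requireNamespace("workflowr", quietly = TRUE)
  )
}

report <- check_prerequisites()
print(report)

# Details worth including in a bug report
print(R.version.string)
if (report[["workflowr"]]) print(packageVersion("workflowr"))
```

A `FALSE` for git is not a problem at the start of a project, since installing Git is optional.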

Upgrading

To upgrade workflowr to the most recent stable release, follow these steps:

  • Install the latest release of the package:

    devtools::install_github("jdblischak/workflowr", build_vignettes = TRUE)

  • Preview potential changes to your project files with wflow_update():

    library("workflowr")
    wflow_update()

  • To implement these changes, set dry_run = FALSE:

    wflow_update(dry_run = FALSE)
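
If you are unsure whether an upgrade is needed at all, you can compare the installed version against the release you want. This is a small sketch in base R; `needs_upgrade()` is a hypothetical helper, not something workflowr provides, and the version string passed to it is only an example.

```r
# Hypothetical helper: TRUE if `pkg` is missing or older than `minimum`.
needs_upgrade <- function(pkg, minimum) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  utils::packageVersion(pkg) < package_version(minimum)
}

# Example: is the installed workflowr older than v0.11.0 (or not installed)?
needs_upgrade("workflowr", "0.11.0")
```

Comparing `package_version` objects avoids the pitfalls of comparing version strings alphabetically (for example, "0.9.0" versus "0.11.0").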

More about this repository

This repository contains the workflowr R package. If your goal is to create a workflowr project, you do not need to fork this repository. Instead, follow the Quick start instructions above.

For the most part, I try to follow the guidelines from R packages by Hadley Wickham. The unit tests are performed with testthat, the documentation is built with roxygen2, the online package documentation is created with pkgdown, continuous integration testing is performed for Linux and macOS by Travis CI and for Windows by AppVeyor, and code coverage is calculated with covr and Codecov.

The template files used by wflow_start() to populate a new project are located in inst/infrastructure/. The R Markdown templates used by wflow_open() are located in inst/rmarkdown/templates/. The RStudio project template is configured by inst/rstudio/templates/project/wflow_start.dcf. The repository contains the files LICENSE and LICENSE.md to both adhere to R package conventions for defining the license and also to make the license clear in a more conventional manner (suggestions for improvement welcome). document.R is a convenience script for regenerating the documentation. build.sh is a convenience script for running R CMD check. The remaining directories are standard for R packages as described in the manual Writing R Extensions.
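
To look at these template files in an installed copy of the package rather than in the repository, note that R drops the `inst/` prefix at install time, so `inst/infrastructure/` becomes `infrastructure/` inside the package directory. The sketch below uses only base R; `installed_files()` is a hypothetical helper, and what it lists depends on the version you have installed.

```r
# Hypothetical helper: list the files installed under a package subdirectory.
# Returns character(0) if the package or the subdirectory is not found.
installed_files <- function(package, ...) {
  path <- system.file(..., package = package)
  if (!nzchar(path)) {
    return(character(0))
  }
  list.files(path, recursive = TRUE)
}

# Files that correspond to inst/infrastructure/ and inst/rmarkdown/templates/
installed_files("workflowr", "infrastructure")
installed_files("workflowr", "rmarkdown", "templates")
```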

If you are interested in contributing to this project, please see these instructions.

Background and related work

There is a lot of interest and development around reproducible research with R. Projects like workflowr are possible due to two key developments. First, the R packages knitr and rmarkdown have made it easy for any R programmer to generate reports that combine text, code, output, and figures. Second, the version control software Git, the Git hosting site GitHub, and the static website hosting service GitHub Pages have made it easy to share not only source code but also static HTML files (i.e. no need to purchase a domain name, set up a server, etc.).

My first attempt at sharing a reproducible project online was singleCellSeq. Basically, I started by copying the documentation website of rmarkdown and added some customizations to organize the generated figures and to insert the status of the Git repository directly into the HTML pages. The workflowr R package is my attempt to simplify my previous workflow and provide helper functions so that any researcher can take advantage of this workflow.

Workflowr encompasses multiple functions: 1) provides a project template, 2) version controls the R Markdown and HTML files, and 3) builds a website. Furthermore, it provides R functions to perform each of these steps. There are many other related works that provide similar functionality. Some are templates to be copied, some are R packages, and some involve more complex software (e.g. static blog software). Depending on your use case, one of the related works listed below may better suit your needs. Please check them out!

If you know of other related works I should include, please send a pull request to the "dev" branch.

Credits

Workflowr was developed, and is maintained, by John Blischak, a postdoctoral researcher in the laboratory of Matthew Stephens at The University of Chicago. He is funded by a grant from the Gordon and Betty Moore Foundation to MS. Peter Carbonetto and Matthew Stephens are co-authors.

We are very thankful to workflowr contributors for helping improve the package. We are also grateful to workflowr users for testing the package and providing feedback. Thanks especially to Lei Sun, Xiang Zhu, Wei Wang, and other members (past and present) of the Stephens lab.

The workflowr package uses many great open source packages. Especially critical for this project are the R packages git2r, knitr, and rmarkdown. Please see the vignette How the workflowr package works to learn about the software that makes workflowr possible.

License

Workflowr is available under the MIT license.

Citation

To cite workflowr in publications use:

John D. Blischak, Peter Carbonetto and Matthew Stephens (2017). The workflowr R package: a framework for reproducible and collaborative data science. R package version 0.11.0. https://github.com/jdblischak/workflowr

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {The workflowr R package: a framework for reproducible and collaborative data science},
    author = {John D. Blischak and Peter Carbonetto and Matthew Stephens},
    note = {R package version 0.11.0},
    year = {2017},
    url = {https://github.com/jdblischak/workflowr},
  }
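
The same information can usually be retrieved from within R. The following sketch uses `citation()` and `toBibtex()` from the utils package; `cite_if_installed()` is a hypothetical wrapper added here so that the example does not fail when the package is missing, and the output will reflect whichever version you have installed.

```r
# Hypothetical wrapper: return the citation for `pkg`, or NULL (with a
# message) if the package is not installed.
cite_if_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; no citation available.")
    return(invisible(NULL))
  }
  utils::citation(pkg)
}

cit <- cite_if_installed("workflowr")
if (!is.null(cit)) {
  print(cit)                    # plain-text citation
  print(utils::toBibtex(cit))   # BibTeX entry
}
```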

Pronunciation and spelling

It is common for R package names to end with an "r", and I tend to pronounce this as if it were "er" because I personally find this the easiest. Thus I pronounce the package "workflow + er". Other equally good options are "workflow + R" or "work + flower".

Workflowr should be capitalized at the beginning of a sentence, but otherwise the lowercase workflowr should be the preferred option.

Version: 0.11.0

License: MIT + file LICENSE

Monthly Downloads: 473

Last Published: August 23rd, 2023

Functions in workflowr (0.11.0)

  • extract_commit: Extract a commit from a Git repository
  • wflow_convert: Convert R Markdown files to workflowr template
  • wflow_git_config: Configure Git settings
  • create_links_page: Create a results page with links to analysis files
  • wflow_build: Build the site
  • wflow_publish: Publish the site
  • wflow_open: Open R Markdown analysis file(s)
  • wflow_commit: Commit files
  • wflow_status: Report status of workflowr project
  • wflow_git_pull: Pull files from remote repository
  • wflow_git_push: Push files to remote repository
  • workflowr: A workflow template for creating a research website
  • wflow_update: Update a workflowr project
  • wflow_view: View research website locally
  • wflow_remotes: Manage remote Git repositories
  • wflow_remove: Remove files
  • wflow_start: Start a new workflowr project