Together, the checkpoint package and the checkpoint server act as a CRAN time machine. The create_checkpoint function installs the packages referenced in the specified project to a local library exactly as they existed at the specified point in time. Only those packages are available to your session, thereby avoiding any package updates that came later and may have altered your results. In this way, anyone using the use_checkpoint function can ensure the reproducibility of your scripts or projects at any time. The checkpoint function serves as a simple umbrella interface to these functions. It first tests if the checkpoint exists, creates it if necessary with create_checkpoint, and then calls use_checkpoint.
checkpoint(
snapshot_date,
r_version = getRversion(),
checkpoint_location = "~",
...
)
create_checkpoint(
snapshot_date,
r_version = getRversion(),
checkpoint_location = "~",
project_dir = ".",
mran_url = getOption("checkpoint.mranUrl", "https://mran.microsoft.com"),
scan_now = TRUE,
scan_r_only = FALSE,
scan_rnw_with_knitr = TRUE,
scan_rprofile = TRUE,
force = FALSE,
log = TRUE,
num_workers = getOption("Ncpus", 1),
config = list(),
...
)
use_checkpoint(
snapshot_date,
r_version = getRversion(),
checkpoint_location = "~",
mran_url = getOption("checkpoint.mranUrl", "https://mran.microsoft.com"),
prepend = FALSE,
...
)
delete_checkpoint(
snapshot_date,
r_version = getRversion(),
checkpoint_location = "~",
confirm = TRUE
)
delete_all_checkpoints(checkpoint_location = "~", confirm = TRUE)
uncheckpoint()
snapshot_date: Date of snapshot to use in YYYY-MM-DD format, e.g. "2020-01-01". Specify a date on or after "2014-09-17". MRAN takes one snapshot per day. To list all valid snapshot dates on MRAN, use list_mran_snapshots.
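The date constraints above can be checked before any download is attempted. The following is a minimal sketch in base R; `is_valid_snapshot_date` is a hypothetical helper, not a function exported by checkpoint, and it only checks the format and the lower bound stated above, not whether MRAN actually holds a snapshot for that day.

```r
# Hypothetical helper (not part of checkpoint): validate a snapshot date string.
is_valid_snapshot_date <- function(x, earliest = as.Date("2014-09-17")) {
  # Require a single string with the strict YYYY-MM-DD shape before parsing
  if (!is.character(x) || length(x) != 1L ||
      !grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)) {
    return(FALSE)
  }
  # as.Date() yields NA for impossible dates such as "2020-02-30"
  d <- as.Date(x, format = "%Y-%m-%d")
  !is.na(d) && d >= earliest
}

is_valid_snapshot_date("2020-01-01")
is_valid_snapshot_date("2014-09-16")
```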
r_version: Optional character string, e.g. "3.6.2". If specified, this is compared to the current R.version, and if they differ, a warning is issued. The benefit of supplying this argument is that checkpoint can alert you when your R version changes while you are working on a project; such a change can lead to reproducibility issues just as easily as changes in third-party code. Consider supplying an explicit value for this argument, although checkpoint will still function without it.
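The comparison described above can be approximated with base R version objects. This is only a sketch of the idea: `check_r_version` is a hypothetical name, and the wording of the warning is an assumption rather than the message checkpoint emits.

```r
# Hypothetical helper: warn when the running R version differs from the expected one.
check_r_version <- function(expected, current = getRversion()) {
  expected <- numeric_version(expected)   # e.g. "3.6.2"
  same <- current == expected
  if (!same) {
    warning(
      sprintf("Expected R %s but running R %s",
              as.character(expected), as.character(current)),
      call. = FALSE
    )
  }
  invisible(same)
}

# No warning: the expected version is the one currently running
check_r_version(as.character(getRversion()))
```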
checkpoint_location: File path where the checkpoint library is stored. Default is "~", i.e. your home directory. Use cases for changing this include creating a checkpoint library on a portable drive (e.g. USB drive), or creating per-project checkpoints. The actual checkpoints will be created under a .checkpoint directory at this location.
...: For checkpoint, further arguments to pass to create_checkpoint and use_checkpoint. Ignored for create_checkpoint and use_checkpoint.
project_dir: A project path. This is the path to the root of the project that references the packages to be installed from the MRAN snapshot for the date specified for snapshot_date. Defaults to the current working directory.
mran_url: The base MRAN URL. The default is taken from the system option checkpoint.mranUrl, or if this is unset, https://mran.microsoft.com. Currently checkpoint 1.0 does not support local MRAN mirrors.
scan_now: If TRUE, scans for packages in the project folder (see 'Details'). If FALSE, skips the scanning process. Set this to FALSE if you only want to create the checkpoint subdirectory structure.
scan_r_only: If TRUE, limits the scanning of project files to R scripts only (those with the extension ".R").
scan_rnw_with_knitr: If TRUE, scans Sweave files (those with extension ".Rnw") with knitr::knitr, otherwise with utils::Stangle. Ignored if scan_r_only=TRUE.
scan_rprofile: If TRUE, includes the ~/.Rprofile startup file in the scan. See Startup.
force: If TRUE, suppresses the confirmation prompt if create_checkpoint is run with project directory set to the user home directory.
log: If TRUE, writes logging information (mostly the output from the methods of pkgdepends::pkg_installation_proposal) to the checkpoint directory.
num_workers: The number of parallel workers to use for installing packages. Defaults to the value of the system option Ncpus, or if this is unset, 1.
config: A named list of additional configuration options to pass to pkgdepends::new_pkg_installation_proposal. See 'Configuration' below.
prepend: If TRUE, adds the checkpoint directory to the beginning of the library search path. The default is FALSE, where the checkpoint directory replaces all but the system entries (the values of .Library and .Library.site) in the search path; this is to reduce the chances of accidentally calling non-checkpointed code. See .libPaths.
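The difference between prepending and replacing can be seen with base R's .libPaths() alone. The sketch below uses a temporary directory as a stand-in for a checkpoint library (the directory name is made up for illustration) and does not call checkpoint itself; it only mirrors the two behaviours described above.

```r
old_paths <- .libPaths()   # remember the current search path

# Hypothetical stand-in for a checkpoint library directory
lib <- file.path(tempdir(), "demo_checkpoint_lib")
dir.create(lib, recursive = TRUE, showWarnings = FALSE)

# Analogue of prepend = TRUE: new library first, existing entries kept
.libPaths(c(lib, old_paths))
prepended <- .libPaths()

# Analogue of prepend = FALSE: .libPaths(new) keeps only the new directory
# plus the system libraries (.Library.site and .Library)
.libPaths(lib)
replaced <- .libPaths()

.libPaths(old_paths)       # restore the original search path
```

Comparing `prepended` with `replaced` shows which user-level libraries drop out of the search path in the default mode.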
confirm: For delete_checkpoint and delete_all_checkpoints, whether to ask for confirmation first.
These functions are run mostly for their side-effects; however, create_checkpoint invisibly returns an object of class pkgdepends::pkg_installation_proposal if scan_now=TRUE, and NULL otherwise. checkpoint returns the result of create_checkpoint if the checkpoint had to be created, otherwise NULL.
The pkgdepends package, which powers checkpoint, allows you to customise the installation process via a list of configuration options. When creating a checkpoint, you can pass these options to pkgdepends via the config argument. A full list of options can be found at pkgdepends::pkg_config; note that create_checkpoint automatically sets the values of cran-mirror, library and r-version.
One important use case for the config argument is when you are using Windows or macOS, and the snapshot date does not include binary packages for your version of R. This can occur if either your version of R is too old, or the snapshot date is too far in the past. In this case, you should specify config=list(platforms="source") to get checkpoint to download the source packages instead (and then compile them locally). Note that if your packages include C, C++ or Fortran code, you will need to have the requisite compilers installed on your machine.
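If you already pass other options through config, the source-only setting can be merged into the existing list rather than typed out each time. This is a small sketch using utils::modifyList; `source_config` is a hypothetical helper name, and the only option it sets is the platforms="source" value mentioned above.

```r
# Hypothetical helper: force source packages while keeping any other options.
source_config <- function(config = list()) {
  utils::modifyList(config, list(platforms = "source"))
}

cfg <- source_config()
str(cfg)

# Intended use (requires the checkpoint package and network access):
# create_checkpoint("2014-09-17", config = source_config())
```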
The create_checkpoint and use_checkpoint functions store a marker in the snapshot folder every time they are called. This marker contains the system date, thus indicating the last time the snapshot was accessed.
create_checkpoint creates a local library (by default, located under your home directory) into which it installs copies of the packages required by your project as they existed on CRAN on the specified snapshot date. To determine the packages used in your project, the function scans all R code (.R, .Rmd, .Rnw, .Rhtml and .Rpres files) for library and require statements, as well as the namespacing operators :: and :::.
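To make the scanning step concrete, here is a simplified sketch that walks parsed R code looking for the same four constructs. `find_packages` is a hypothetical helper and is not the scanner checkpoint uses; among other simplifications, it treats the first argument of library() or require() as a package name even when character.only = TRUE is set, and it does not look inside default argument values.

```r
# Hypothetical helper (not checkpoint's scanner): list packages referenced in code.
find_packages <- function(code) {
  exprs <- parse(text = code, keep.source = FALSE)
  found <- character()

  # Recursively visit every call in the expression tree
  walk <- function(e) {
    fn <- e[[1]]
    if (is.name(fn)) {
      fn_name <- as.character(fn)
      if (fn_name %in% c("library", "require") && length(e) >= 2) {
        # library(MASS) and library("MASS") both count
        if (is.name(e[[2]]) || is.character(e[[2]])) {
          found <<- c(found, as.character(e[[2]]))
        }
      } else if (fn_name %in% c("::", ":::")) {
        # pkg::fun and pkg:::fun: the package is the left-hand operand
        found <<- c(found, as.character(e[[2]]))
      }
    }
    for (i in seq_along(e)) {
      if (is.call(e[[i]])) walk(e[[i]])
    }
  }

  for (i in seq_along(exprs)) {
    if (is.call(exprs[[i]])) walk(exprs[[i]])
  }
  unique(found[nzchar(found)])
}

code <- c(
  "library(MASS)",
  "require(\"foreach\")",
  "x <- dplyr::filter(df, a > 1)",
  "y <- stats:::Pillai(1, 2)"
)
find_packages(code)
```

Working on the parsed expressions rather than on raw text means that package names appearing only in comments or strings are not picked up.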
create_checkpoint will automatically add the rmarkdown package as a dependency if it finds any Rmarkdown-based files (those with extension .Rmd, .Rpres or .Rhtml) in your project. This allows you to continue working with such documents after checkpointing.
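The extension test described above might look like the following in base R. `needs_rmarkdown` is a hypothetical helper, and matching extensions case-insensitively is an assumption made here, not documented behaviour of checkpoint.

```r
# Hypothetical helper: does the file list contain any Rmarkdown-based files?
needs_rmarkdown <- function(files) {
  ext <- tolower(tools::file_ext(files))   # assumption: ignore case
  any(ext %in% c("rmd", "rpres", "rhtml"))
}

needs_rmarkdown(c("analysis.R", "report.Rmd"))
needs_rmarkdown(c("analysis.R", "paper.Rnw"))
```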
Checkpoint only installs packages that can be found on CRAN. This includes third-party packages, as well as those distributed as part of R that have the "recommended" priority. Base-priority packages (the workhorse engine of R, including utils, graphics, methods and so forth) are not checkpointed (but see the r_version argument above).
The package installation is carried out via the pkgdepends package, which has many features including cached downloads, parallel installs, and comprehensive reporting of outcomes. It also solves many problems that previous versions of checkpoint struggled with, such as being able to install packages that are in use, and reliably detecting the outcome of the installation process.
use_checkpoint modifies your R session to use only the packages installed by create_checkpoint. Specifically, it changes your library search path via .libPaths() to point to the checkpointed library, and then calls use_mran_snapshot to set the CRAN mirror for the session.
checkpoint is a convenience function that calls create_checkpoint if the checkpoint directory does not exist, and then use_checkpoint.
delete_checkpoint deletes a checkpoint, after ensuring that it is no longer in use. delete_all_checkpoints deletes all checkpoints under the given checkpoint location.
uncheckpoint is the reverse of use_checkpoint. It restores your library search path and CRAN mirror option to their original values, as they were before checkpoint was loaded. Call this before calling delete_checkpoint or delete_all_checkpoints.
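The save-and-restore pattern behind use_checkpoint and uncheckpoint can be sketched with base R only. The names below (`saved`, `use_snapshot`, `restore_session`) and the example URL are hypothetical, and this simplified stand-in does not reproduce what checkpoint stores internally; it only shows the two pieces of session state involved, the library search path and the repos option.

```r
# Hypothetical stand-ins for use_checkpoint() / uncheckpoint()
saved <- new.env()

use_snapshot <- function(lib, repo_url) {
  # Remember the state to restore later
  saved$lib_paths <- .libPaths()
  saved$repos <- getOption("repos")
  # Point the session at the snapshot library and repository
  .libPaths(lib)
  options(repos = c(CRAN = repo_url))
  invisible(NULL)
}

restore_session <- function() {
  .libPaths(saved$lib_paths)
  options(repos = saved$repos)
  invisible(NULL)
}

lib <- file.path(tempdir(), "demo_snapshot_lib")
dir.create(lib, recursive = TRUE, showWarnings = FALSE)

use_snapshot(lib, "https://example.org/snapshot")  # placeholder URL
getOption("repos")
restore_session()
```

Restoring the state first matters for deletion because a library that is still on the search path is, by definition, still in use.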
# NOT RUN {
# Create temporary project directory
example_project <- paste0("~/checkpoint_example_project_", Sys.Date())
dir.create(example_project, recursive = TRUE)
# Write dummy code file into the project directory so the scan can find it
cat("
library(MASS)
library(foreach)
", file = file.path(example_project, "checkpoint_example_code.R"))
# Create a checkpoint by specifying a snapshot date
# recommended practice is to specify the R version explicitly
rver <- getRversion()
create_checkpoint("2014-09-17", r_version=rver, project_dir=example_project)
use_checkpoint("2014-09-17", r_version=rver)
# more terse alternative is checkpoint(), which is equivalent to
# calling create_checkpoint() and then use_checkpoint() in sequence
checkpoint("2014-09-17", r_version=rver, project_dir=example_project)
# Check that CRAN mirror is set to MRAN snapshot
getOption("repos")
# Check that (1st) library path is set to ~/.checkpoint
.libPaths()
# Check which packages are installed in checkpoint library
installed.packages()
# restore initial state
uncheckpoint()
# delete the checkpoint
delete_checkpoint("2014-09-17", r_version=rver)
# }