googlesheets4
Overview
googlesheets4 provides an R interface to Google Sheets via the Sheets API v4. It is a reboot of an earlier package called googlesheets.
Why 4? Why googlesheets4? Did I miss googlesheets1 through 3? No. The idea is to name the package after the corresponding version of the Sheets API. In hindsight, the original googlesheets should have been googlesheets3.
Installation
You can install the released version of googlesheets4 from CRAN with:
install.packages("googlesheets4")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("tidyverse/googlesheets4")
Auth
googlesheets4 will, by default, help you interact with Sheets as an
authenticated Google user. If you don’t plan to write Sheets or to read
private Sheets, use gs4_deauth()
to indicate there is no need for a
token. See the article googlesheets4
auth
for more.
For this overview, we’ve logged into Google as a specific user in a hidden chunk.
Attach googlesheets4
library(googlesheets4)
Read
The main “read” function of the googlesheets4 package goes by two names, because we want it to make sense in two contexts:
read_sheet()
evokes other table-reading functions, likereadr::read_csv()
andreadxl::read_excel()
. Thesheet
in this case refers to a Google (spread)Sheet.range_read()
is the right name according to the naming convention used throughout the googlesheets4 package.
read_sheet()
and range_read()
are synonyms and you can use either
one. Here we’ll use read_sheet()
.
googlesheets4 is pipe-friendly (and
reexports %>%
), but works just fine without the pipe.
Read from
- a URL
- a Sheet ID
- a
dribble
produced by the googledrive package, which can lookup by file name
These all achieve the same thing:
# URL
read_sheet("https://docs.google.com/spreadsheets/d/1U6Cf_qEOhiR9AZqTqS3mbMF3zt2db48ZP5v3rkrAEJY/edit#gid=780868077")
#> ✓ Reading from "gapminder".
#> ✓ Range 'Africa'.
#> # A tibble: 624 x 6
#> country continent year lifeExp pop gdpPercap
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Algeria Africa 1952 43.1 9279525 2449.
#> 2 Algeria Africa 1957 45.7 10270856 3014.
#> 3 Algeria Africa 1962 48.3 11000948 2551.
#> 4 Algeria Africa 1967 51.4 12760499 3247.
#> 5 Algeria Africa 1972 54.5 14760787 4183.
#> # … with 619 more rows
# Sheet ID
read_sheet("1U6Cf_qEOhiR9AZqTqS3mbMF3zt2db48ZP5v3rkrAEJY")
#> ✓ Reading from "gapminder".
#> ✓ Range 'Africa'.
#> # A tibble: 624 x 6
#> country continent year lifeExp pop gdpPercap
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Algeria Africa 1952 43.1 9279525 2449.
#> 2 Algeria Africa 1957 45.7 10270856 3014.
#> 3 Algeria Africa 1962 48.3 11000948 2551.
#> 4 Algeria Africa 1967 51.4 12760499 3247.
#> 5 Algeria Africa 1972 54.5 14760787 4183.
#> # … with 619 more rows
# a googledrive "dribble"
googledrive::drive_get("gapminder") %>%
read_sheet()
#> ✓ The input `path` resolved to exactly 1 file.
#> ✓ Reading from "gapminder".
#> ✓ Range 'Africa'.
#> # A tibble: 624 x 6
#> country continent year lifeExp pop gdpPercap
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Algeria Africa 1952 43.1 9279525 2449.
#> 2 Algeria Africa 1957 45.7 10270856 3014.
#> 3 Algeria Africa 1962 48.3 11000948 2551.
#> 4 Algeria Africa 1967 51.4 12760499 3247.
#> 5 Algeria Africa 1972 54.5 14760787 4183.
#> # … with 619 more rows
Note: the only reason we can read a sheet named “gapminder” (the last example) is because the account we’re logged in as has a Sheet named “gapminder”.
See the article Find and Identify Sheets for more about specifying the Sheet you want to address. See the article Read Sheets for more about reading from specific sheets or ranges, setting column type, and getting low-level cell data.
Write
gs4_create()
creates a brand new Google Sheet and can optionally send
some initial data.
(ss <- gs4_create("fluffy-bunny", sheets = list(flowers = head(iris))))
#> ✓ Creating new Sheet: "fluffy-bunny".
#> Spreadsheet name: fluffy-bunny
#> ID: 1GtTKAaC0R2WI6gaGitsj1v_IGuDC6n6uVBwk8-aFExg
#> Locale: en_US
#> Time zone: Etc/GMT
#> # of sheets: 1
#>
#> (Sheet name): (Nominal extent in rows x columns)
#> flowers: 7 x 5
sheet_write()
(over)writes a whole data frame into a (work)sheet
within a (spread)Sheet.
head(mtcars) %>%
sheet_write(ss, sheet = "autos")
#> ✓ Writing to "fluffy-bunny".
#> ✓ Writing to sheet 'autos'.
ss
#> Spreadsheet name: fluffy-bunny
#> ID: 1GtTKAaC0R2WI6gaGitsj1v_IGuDC6n6uVBwk8-aFExg
#> Locale: en_US
#> Time zone: Etc/GMT
#> # of sheets: 2
#>
#> (Sheet name): (Nominal extent in rows x columns)
#> flowers: 7 x 5
#> autos: 7 x 11
sheet_append()
, range_write()
, range_flood()
, and range_clear()
are more specialized writing functions. See the article Write
Sheets
for more about writing to Sheets.
Where to learn more
Get started is a more extensive general introduction to googlesheets4.
Browse the articles index to find articles that cover various topics in more depth.
See the function index for an organized, exhaustive listing.
Contributing
If you’d like to contribute to the development of googlesheets4, please read these guidelines.
Please note that the googlesheets4 project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Privacy
Context
googlesheets4 draws on and complements / emulates other packages in the tidyverse:
- googlesheets is the package that googlesheets4 replaces. Main improvements in googlesheets4: (1) wraps the current, most modern Sheets API; (2) leaves all “whole file” operations to googledrive; and (3) uses shared infrastructure for auth and more, from the gargle package. The v3 API wrapped by googlesheets is deprecated. Starting in April/May 2020, features will gradually be disabled and it’s anticipated the API will fully shutdown in September 2020. At that point, the original googlesheets package must be retired.
- googledrive provides a fully-featured interface to the Google Drive API. Any “whole file” operations can be accomplished with googledrive: upload or download or update a spreadsheet, copy, rename, move, change permission, delete, etc. googledrive supports Team Drives.
- readxl is the tidyverse package for reading Excel files (xls or xlsx) into an R data frame. googlesheets4 takes cues from parts of the readxl interface, especially around specifying which cells to read.
- readr is the tidyverse package for reading delimited files (e.g., csv or tsv) into an R data frame. googlesheets4 takes cues from readr with respect to column type specification.