Learn R Programming

bigQueryR (version 0.5.0)

bqr_partition: Convert date-sharded tables to a single partitioned table

Description

Moves the old style date-sharded tables such as [TABLE_NAME]_YYYYMMDD to the new date partitioned format.

Usage

bqr_partition(sharded, partition, projectId = bqr_get_global_project(),
  datasetId = bqr_get_global_dataset())

Arguments

sharded

The prefix of date-sharded tables to merge into one partitioned table

partition

Name of partitioned table. Will create if not present already

projectId

The project ID

datasetId

The dataset ID

Value

A list of copy jobs for the sharded tables that will be copied to one partitioned table

Details

Performs lots of copy table operations via bqr_copy_table

Before partitioned tables became available, BigQuery users would often divide large datasets into separate tables organized by time period; usually daily tables, where each table represented data loaded on that particular date.

Dividing a dataset into daily tables helped to reduce the amount of data scanned when querying a specific date range. For example, if you have a a year's worth of data in a single table, a query that involves the last seven days of data still requires a full scan of the entire table to determine which data to return. However, if your table is divided into daily tables, you can restrict the query to the seven most recent daily tables.

Daily tables, however, have several disadvantages. You must manually, or programmatically, create the daily tables. SQL queries are often more complex because your data can be spread across hundreds of tables. Performance degrades as the number of referenced tables increases. There is also a limit of 1,000 tables that can be referenced in a single query. Partitioned tables have none of these disadvantages.

See Also

Partitioned Tables Help

Examples

Run this code
# NOT RUN {
# }
# NOT RUN {
 
 bqr_partition("ga_sessions_", "ga_partition")

# }
# NOT RUN {
# }

Run the code above in your browser using DataLab