TWIT_paginate_max_id: Pagination

Description

These internal functions implement pagination within rtweet.

Usage

TWIT_paginate_max_id(
  token,
  api,
  params,
  get_id = function(x) x$id_str,
  n = 1000,
  page_size = 200,
  since_id = NULL,
  max_id = NULL,
  count_param = "count",
  retryonratelimit = NULL,
  verbose = TRUE
)
TWIT_paginate_cursor(
  token,
  api,
  params,
  n = 5000,
  page_size = 5000,
  cursor = "-1",
  get_id = function(x) x$ids,
  retryonratelimit = NULL,
  verbose = TRUE
)
TWIT_paginate_chunked(
  token,
  api,
  params_list,
  retryonratelimit = NULL,
  verbose = TRUE
)
TWIT_paginate_premium(
  token,
  api,
  params,
  n = 100,
  page_size = 100,
  cursor = "next",
  retryonratelimit = NULL,
  verbose = TRUE
)

Value

A list with the JSON output of the API.

Arguments

token

Use this to override authentication for a single API call. In many cases you are better off changing the default for all calls. See auth_as() for details.

get_id

A single argument function that returns a vector of ids given the JSON response. The defaults are chosen to cover the most common cases, but you'll need to double check whenever implementing pagination for a new endpoint.

n

Desired number of results to return. Results are downloaded in pages when n is large; the default value will download a single page. Set n = Inf to download as many results as possible.

The Twitter API rate limits the number of requests you can perform in each 15 minute period. The easiest way to download more results than a single rate limit window allows is to use retryonratelimit = TRUE.

You are not guaranteed to get exactly n results back. You will get fewer results when tweets have been deleted or if you hit a rate limit. You will get more results if you ask for a number of tweets that's not a multiple of page size, e.g. if you request n = 150 and the page size is 200, you'll get 200 results back.
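The page arithmetic described above can be sketched as follows. This is a minimal illustration, not rtweet's implementation: estimate_pages() is a hypothetical helper, and it assumes that whole pages are always requested, as in the n = 150 example.

```r
# Hypothetical helper (not part of rtweet): estimate how many pages are
# requested and the largest number of results that could come back,
# assuming every request asks for a full page.
estimate_pages <- function(n, page_size) {
  pages <- ceiling(n / page_size)
  list(pages = pages, max_results = pages * page_size)
}

estimate_pages(150, 200)   # 1 page, up to 200 results
estimate_pages(1000, 200)  # 5 pages, up to 1000 results
```

The actual count can still be lower than max_results when tweets have been deleted or a rate limit is hit.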

since_id

Supply a vector of ids or a data frame of previous results to find tweets newer than since_id.

max_id

Supply a vector of ids or a data frame of previous results to find tweets older than max_id.
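To illustrate how max_id drives pagination toward older results, here is a minimal sketch against a mock endpoint. Everything in it is an assumption made for illustration: fake_api() and paginate_max_id_sketch() are hypothetical, the ids are small integers, and the loop is not taken from rtweet's source. Real tweet ids are 64-bit values returned as strings (id_str), so arithmetic on them needs more care than shown here.

```r
# Mock endpoint (an assumption for illustration): holds ids 1..450 and returns
# up to `count` ids that are <= max_id, newest (largest) first.
fake_api <- function(max_id = NULL, count = 200) {
  ids <- 450:1
  if (!is.null(max_id)) {
    ids <- ids[ids <= max_id]
  }
  list(id = head(ids, count))
}

# Hypothetical pagination loop: request pages until n is covered or the
# endpoint has nothing older to return.
paginate_max_id_sketch <- function(n = 1000, page_size = 200,
                                   get_id = function(x) x$id) {
  pages <- ceiling(n / page_size)
  results <- list()
  max_id <- NULL
  for (i in seq_len(pages)) {
    page <- fake_api(max_id = max_id, count = page_size)
    ids <- get_id(page)
    if (length(ids) == 0) break  # nothing older is left
    results[[length(results) + 1]] <- page
    # Step just below the oldest id seen so the next page does not repeat it.
    max_id <- min(ids) - 1
  }
  results
}

pages <- paginate_max_id_sketch(n = 1000, page_size = 200)
all_ids <- unlist(lapply(pages, function(p) p$id))
```

In this mock, the loop stops after three pages because only 450 ids exist, which mirrors how a request for n results can return fewer.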

retryonratelimit

If TRUE and a rate limit is exhausted, the function waits until the limit refreshes. Most Twitter rate limits refresh every 15 minutes. If FALSE and the rate limit is exceeded, the function terminates early with a warning; you'll still get back all results received up to that point. The default value, NULL, consults the option rtweet.retryonratelimit so that you can globally set it to TRUE, if desired.

If you expect a query to take hours or days to perform, you should not rely solely on retryonratelimit because it does not handle other common failure modes like temporarily losing your internet connection.
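The way a NULL default can defer to a global option might look like the sketch below. resolve_retry() is a hypothetical helper, not an rtweet function, and falling back to FALSE when the option is unset is an assumption; only the option name rtweet.retryonratelimit comes from the description above.

```r
# Hypothetical helper mirroring the documented behaviour: NULL defers to the
# rtweet.retryonratelimit option. Falling back to FALSE when the option is
# unset is an assumption.
resolve_retry <- function(retryonratelimit = NULL) {
  if (is.null(retryonratelimit)) {
    retryonratelimit <- getOption("rtweet.retryonratelimit", default = FALSE)
  }
  isTRUE(retryonratelimit)
}

old <- options(rtweet.retryonratelimit = TRUE)  # opt in globally
resolve_retry()        # value taken from the option
resolve_retry(FALSE)   # the explicit argument wins over the option
options(old)           # restore the previous setting
```

Setting the option once, for example in a script preamble, avoids passing retryonratelimit = TRUE to every call.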

verbose

Show progress bars and other messages indicating current progress?

cursor

Which page of results to return. The default will return the first page; you can supply the result from a previous call to continue pagination from where it left off.
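Cursor-based pagination and resuming from a previous cursor can be sketched with a mock endpoint. fake_cursor_api() and paginate_cursor_sketch() are hypothetical, and the loop is an illustration rather than rtweet's implementation. The sketch assumes the convention of Twitter's cursored endpoints, where "-1" requests the first page and a next cursor of "0" signals that no pages remain.

```r
# Mock cursored endpoint (an assumption for illustration): serves 12 ids in
# pages of `count` and reports the cursor for the following page, "0" when done.
fake_cursor_api <- function(cursor = "-1", count = 5) {
  all_ids <- 101:112
  start <- if (cursor == "-1") 1 else as.integer(cursor)
  end <- min(start + count - 1, length(all_ids))
  next_cursor <- if (end < length(all_ids)) as.character(end + 1) else "0"
  list(ids = all_ids[start:end], next_cursor_str = next_cursor)
}

# Hypothetical pagination loop: follow the cursor until the endpoint is
# exhausted or at least n ids have been collected.
paginate_cursor_sketch <- function(n = 5000, page_size = 5, cursor = "-1",
                                   get_id = function(x) x$ids) {
  results <- list()
  seen <- 0
  while (cursor != "0" && seen < n) {
    page <- fake_cursor_api(cursor = cursor, count = page_size)
    results[[length(results) + 1]] <- page
    seen <- seen + length(get_id(page))
    cursor <- page$next_cursor_str
  }
  list(pages = results, next_cursor = cursor)
}

first <- paginate_cursor_sketch(n = 5)                      # stops after one page
rest <- paginate_cursor_sketch(cursor = first$next_cursor)  # resumes from there
```

Keeping the last cursor alongside the results is what makes it possible to continue later without downloading the earlier pages again.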