Learn R Programming

rvest (version 1.0.1)

session: Simulate a session in web browser

Description

This set of functions allows you to simulate a user interacting with a website, using forms and navigating from page to page.

  • Create a session with session(url)

  • Navigate to a specified url with session_jump_to(), or follow a link on the page with session_follow_link().

  • Submit an html_form with session_submit().

  • View the history with session_history() and navigate back and forward with session_back() and session_forward().

  • Extract page contents with html_element() and html_elements(), or get the complete HTML document with read_html().

  • Inspect the HTTP response with httr::cookies(), httr::headers(), and httr::status_code().

Usage

session(url, ...)

is.session(x)

session_jump_to(x, url, ...)

session_follow_link(x, i, css, xpath, ...)

session_back(x)

session_forward(x)

session_history(x)

session_submit(x, form, submit = NULL, ...)

Arguments

url

A URL, either relative or absolute, to navigate to.

...

Any additional httr config to use throughout the session.

x

A session.

i

A integer to select the ith link or a string to match the first link containing that text (case sensitive).

css

Elements to select. Supply one of css or xpath depending on whether you want to use a CSS selector or XPath 1.0 expression.

xpath

Elements to select. Supply one of css or xpath depending on whether you want to use a CSS selector or XPath 1.0 expression.

form

An html_form to submit

submit

Which button should be used to submit the form?

  • NULL, the default, uses the first button.

  • A string selects a button by its name.

  • A number selects a button using its relative position.

Examples

Run this code
# NOT RUN {
s <- session("http://hadley.nz")
s %>%
  session_jump_to("hadley-wickham.jpg") %>%
  session_jump_to("/") %>%
  session_history()

s %>%
  session_jump_to("hadley-wickham.jpg") %>%
  session_back() %>%
  session_history()

# }
# NOT RUN {
s %>%
  session_follow_link(css = "p a") %>%
  html_elements("p")
# }

Run the code above in your browser using DataLab