pct

The goal of pct is to increase the accessibility and reproducibility of the data produced by the Propensity to Cycle Tool (PCT), a research project and web application hosted at www.pct.bike. For an overview of the data provided by the PCT, clicking on the previous link and trying it out is a great place to start. An academic paper on the PCT provides detail on the motivations for and methods underlying the project.

A major motivation behind the project was making transport evidence more accessible, encouraging evidence-based transport policies. The code base underlying the PCT is publicly available (see github.com/npct). However, the code hosted there is not easy to run or reproduce, which is where this package comes in: it provides quick access to the data underlying the PCT and enables some of the key results to be reproduced quickly. It was developed primarily for educational purposes (including for upcoming PCT training courses), but it may also be useful for people who want to build on the methods, for example to create a scenario of cycling uptake in their town, city or region.

In summary, if you want to know how PCT works, be able to reproduce some of its results, and build scenarios of cycling uptake to inform transport policies enabling cycling in cities worldwide, this package is for you!

Installation

# from CRAN
install.packages("pct")

You can install the development version of the package as follows:

remotes::install_github("ITSLeeds/pct")

Load the package as follows:

library(pct)

Documentation

Probably the best place to get further information on the PCT is from the package’s website at https://itsleeds.github.io/pct/

There you will find the package's vignettes, along with documentation for each of the functions at itsleeds.github.io/pct/reference/. Below we describe some of the basics.

Get PCT data

From feedback, we hear that the use of the data is critical in decision making. Therefore, one area where the package could be useful is making the data “easily” available to be processed.

  • get_pct: the basic function to obtain data available here.

The rest of these should be self-explanatory.

  • get_pct_centroids
  • get_pct_lines
  • get_pct_rnet
  • get_pct_routes_fast
  • get_pct_routes_quiet
  • get_pct_zones
  • uptake_pct_godutch
  • uptake_pct_govtarget

For example, to get the centroids in West Yorkshire:

centroids = get_pct_centroids(region = "west-yorkshire")
plot(centroids[, "geo_name"])

Likewise to download the desire lines for “west-yorkshire”:

lines = get_pct_lines(region = "west-yorkshire")
lines = lines[order(lines$all, decreasing = TRUE), c("all")]
plot(lines[1:10,], lwd = 4)
# view the lines on a map
# mapview::mapview(lines[1:3000, c("geo_name1")])

Estimate cycling uptake

An important part of the PCT is its ability to create model scenarios of cycling uptake. Key to the PCT uptake model is ‘distance decay’, meaning that short trips are more likely to be cycled than long trips. The functions uptake_pct_govtarget() and uptake_pct_godutch() implement uptake models used in the PCT. They take distance and hilliness per desire line as inputs and output the proportion of people who could be expected to cycle if that scenario were realised. The scenarios of cycling uptake produced by these functions are not predictions of what will happen, but illustrative snapshots of what could happen if overall propensity to cycle reached a certain level. The uptake levels produced by the Go Dutch and Government Target scenarios (which represent increases in cycling, not final levels) are illustrated in the graph below (other scenarios could be produced; see the source code to see how these models work):

distances = 1:20
hilliness = 0:5
uptake_df = data.frame(
  distances = rep(distances, 6),
  hilliness = rep(hilliness, each = 20)
)
p_govtarget = uptake_pct_govtarget(
    distance = uptake_df$distances,
    gradient = uptake_df$hilliness
    )
p_godutch = uptake_pct_godutch(
    distance = uptake_df$distances,
    gradient = uptake_df$hilliness
    )
uptake_df = rbind(
  cbind(uptake_df, scenario = "govtarget", pcycle = p_govtarget),
  cbind(uptake_df, scenario = "godutch", pcycle = p_godutch)
)
library(ggplot2)
ggplot(uptake_df) +
  geom_line(aes(
    distances,
    pcycle,
    linetype = scenario,
    colour = as.character(hilliness)
  )) +
  scale_color_discrete("Gradient (%)")
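
To make the idea of distance decay concrete, the following is a minimal sketch of a logit-style uptake function in base R. The function name toy_uptake() and all of its coefficients are invented for illustration. They are not the values used by uptake_pct_govtarget() or uptake_pct_godutch(), which may use a richer functional form (see the package source code).

```r
# Toy distance-decay model: NOT the PCT's model, all coefficients are made up.
toy_uptake = function(distance_km, gradient_pct,
                      intercept = -1.5, b_dist = -0.2, b_grad = -0.3) {
  # Linear predictor on the logit scale: longer and hillier trips score lower
  logit_p = intercept + b_dist * distance_km + b_grad * gradient_pct
  # plogis() is the inverse logit, mapping any real number into (0, 1)
  plogis(logit_p)
}

toy_short = toy_uptake(distance_km = 2, gradient_pct = 1)
toy_long = toy_uptake(distance_km = 15, gradient_pct = 1)
toy_hilly = toy_uptake(distance_km = 2, gradient_pct = 5)
print(c(short = toy_short, long = toy_long, hilly = toy_hilly))
```

With negative coefficients on distance and gradient, the short flat trip gets the highest proportion, which is the qualitative pattern the plots in this section show.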

The proportion of trips made by cycling along each origin-destination (OD) pair therefore depends on the trip distance and hilliness. The equivalent plot for hilliness is as follows:

distances = c(1, 3, 6, 10, 15, 21)
hilliness = seq(0, 10, by = 0.2)
uptake_df = 
  data.frame(
    expand.grid(distances, hilliness)
  )
names(uptake_df) = c("distances", "hilliness")
p_govtarget = uptake_pct_govtarget(
    distance = uptake_df$distances,
    gradient = uptake_df$hilliness
    )
p_godutch = uptake_pct_godutch(
    distance = uptake_df$distances,
    gradient = uptake_df$hilliness
    )
uptake_df = rbind(
  cbind(uptake_df, scenario = "govtarget", pcycle = p_govtarget),
  cbind(uptake_df, scenario = "godutch", pcycle = p_godutch)
)
ggplot(uptake_df) +
  geom_line(aes(
    hilliness,
    pcycle,
    linetype = scenario,
    colour = formatC(distances, flag = "0", width = 2)
  )) +
  scale_color_discrete("Distance (km)")

Note: if distances or gradient values appear to be provided in incorrect units, they will automatically be updated:

distances = uptake_df$distances * 1000
hilliness = uptake_df$hilliness / 100
res = uptake_pct_godutch(distances, hilliness, verbose = TRUE)
#> Distance assumed in m, switching to km
#> Gradient assumed to be gradient, switching to % (*100)
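
The kind of unit check described above could be sketched as follows. This is a hypothetical helper, not the package's implementation: the name toy_fix_units() and the thresholds (a mean distance above 1000, a mean gradient below 0.1) are assumptions chosen for this example.

```r
# Hypothetical helper, NOT part of pct: guesses units from typical magnitudes.
# The thresholds (1000 and 0.1) are assumptions made for this sketch.
toy_fix_units = function(distance, gradient) {
  if (mean(distance) > 1000) {
    message("Distance looks like metres, converting to km")
    distance = distance / 1000
  }
  if (mean(gradient) < 0.1) {
    message("Gradient looks like a proportion, converting to % (*100)")
    gradient = gradient * 100
  }
  list(distance = distance, gradient = gradient)
}

toy_fixed = toy_fix_units(
  distance = c(1500, 4000, 9000),
  gradient = c(0.01, 0.03, 0.05)
)
str(toy_fixed)
```

A heuristic like this can misfire, for example on a set of completely flat routes, so passing values in the expected units (km and %) is the safer choice.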

The main input dataset into the PCT is OD data, together with the geographic zones or centroids needed to convert each OD pair into a geographic desire line. Typical input data is provided in the packaged datasets od_leeds and zones_leeds, as shown in the next section.

Reproduce PCT for Leeds

This example shows how scenarios of cycling uptake can be generated, and how ‘distance decay’ works (short trips are more likely to be cycled than long trips).

The input data looks like this (origin-destination data and geographic zone data):

class(od_leeds)
#> [1] "tbl_df"     "tbl"        "data.frame"
od_leeds[c(1:3, 12)]
#> # A tibble: 10 x 4
#>    area_of_residence area_of_workplace   all bicycle
#>    <chr>             <chr>             <dbl>   <dbl>
#>  1 E02002363         E02006875           922      43
#>  2 E02002373         E02006875          1037      73
#>  3 E02002384         E02006875           966      13
#>  4 E02002385         E02006875           958      52
#>  5 E02002392         E02006875           753      19
#>  6 E02002404         E02006875          1145      10
#>  7 E02002411         E02006875           929      27
#>  8 E02006852         E02006875          1221      99
#>  9 E02006861         E02006875          1177      56
#> 10 E02006876         E02006875          1035      10
class(zones_leeds)
#> [1] "sf"         "data.frame"
zones_leeds[1:3, ]
#> Simple feature collection with 3 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -1.727245 ymin: 53.90046 xmax: -1.294313 ymax: 53.94589
#> Geodetic CRS:  WGS 84
#>      objectid  msoa11cd  msoa11nm msoa11nmw st_areasha st_lengths
#> 2270     2270 E02002330 Leeds 001 Leeds 001    3460674  10002.983
#> 2271     2271 E02002331 Leeds 002 Leeds 002   21870986  26417.665
#> 2272     2272 E02002332 Leeds 003 Leeds 003    2811303   8586.548
#>                            geometry
#> 2270 MULTIPOLYGON (((-1.392046 5...
#> 2271 MULTIPOLYGON (((-1.340405 5...
#> 2272 MULTIPOLYGON (((-1.682211 5...

The stplanr package can be used to convert the non-geographic OD data into geographic desire lines as follows:

library(sf)
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
desire_lines = stplanr::od2line(flow = od_leeds, zones = zones_leeds[2])
#> Creating centroids representing desire line start and end points.
plot(desire_lines[c(1:3, 12)])
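
Conceptually, this conversion matches the zone codes in each OD pair to point coordinates and joins the two points with a straight line. The base R sketch below illustrates that idea with invented zone codes and planar coordinates. It is not how stplanr::od2line() is implemented, and it produces plain columns rather than sf geometries.

```r
# Toy inputs: zone codes and coordinates are invented for illustration.
toy_centroids = data.frame(
  code = c("A", "B", "C"),
  x = c(0, 3, 0),
  y = c(0, 4, 5)
)
toy_od = data.frame(
  origin = c("A", "C"),
  destination = c("B", "A"),
  all = c(100, 50)
)

# Look up the start and end coordinates of each OD pair by zone code
idx_from = match(toy_od$origin, toy_centroids$code)
idx_to = match(toy_od$destination, toy_centroids$code)
toy_lines = cbind(
  toy_od,
  x_from = toy_centroids$x[idx_from], y_from = toy_centroids$y[idx_from],
  x_to = toy_centroids$x[idx_to], y_to = toy_centroids$y[idx_to]
)

# Straight-line (Euclidean) length of each desire line, in coordinate units
toy_lines$length = sqrt((toy_lines$x_to - toy_lines$x_from)^2 +
                        (toy_lines$y_to - toy_lines$y_from)^2)
print(toy_lines)
```

Real zone data uses longitude/latitude coordinates, so lengths there need a geographic distance calculation rather than the planar formula used in this toy example.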

We can convert these straight lines into routes with a routing service, e.g.:

segments_fast = stplanr::route(l = desire_lines, route_fun = cyclestreets::journey)
#> Most common output is sf

This routing operation returns useful information at the segment level. We will convert the route segments into complete routes with dplyr:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
routes_fast = segments_fast %>% 
  group_by(area_of_residence, area_of_workplace) %>% 
  summarise(
    all = unique(all),
    bicycle = unique(bicycle),
    length = sum(distances),
    av_incline = mean(gradient_smooth) * 100
  ) 
#> `summarise()` has grouped output by 'area_of_residence'. You can override using the `.groups` argument.
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar
#> although coordinates are longitude/latitude, st_union assumes that they are planar

The results at the route level are as follows:

plot(routes_fast)

Now we estimate cycling uptake:

routes_fast$uptake = uptake_pct_govtarget(distance = routes_fast$length, gradient = routes_fast$av_incline)
routes_fast$bicycle_govtarget = routes_fast$bicycle +
  round(routes_fast$uptake * routes_fast$all)

Let’s see how many people started cycling:

sum(routes_fast$bicycle_govtarget) - sum(routes_fast$bicycle)
#> [1] 400

An extra 400 people cycling to work, on just 10 desire lines, is not bad! What % cycling is this, for those routes?

sum(routes_fast$bicycle_govtarget) / sum(routes_fast$all)
#> [1] 0.07906931
sum(routes_fast$bicycle) / sum(routes_fast$all)
#> [1] 0.03963324

It’s gone from 4% to 8%, a realistic increase if cycling were enabled by good infrastructure and policies.
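
These shares can be checked by hand from the numbers already printed in this section. The sketch below copies the all and bicycle columns from the od_leeds printout and uses the difference of 400 reported by the sum() comparison. The variable names are chosen for this example only.

```r
# Values copied from the od_leeds printout shown earlier in this section
all_trips = c(922, 1037, 966, 958, 753, 1145, 929, 1221, 1177, 1035)
bicycle_trips = c(43, 73, 13, 52, 19, 10, 27, 99, 56, 10)
extra_cyclists = 400  # difference reported by the sum() comparison above

# Cycling mode share before and after adding the scenario's extra cyclists
baseline_share = sum(bicycle_trips) / sum(all_trips)
scenario_share = (sum(bicycle_trips) + extra_cyclists) / sum(all_trips)
round(100 * c(baseline = baseline_share, scenario = scenario_share), 1)
```

The totals are 402 cyclists out of 10143 commuters at baseline and 802 under the scenario, which reproduces the two proportions printed above (about 4% and 8%).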

Now: where to prioritise that infrastructure and those policies?

routes_fast_linestrings = sf::st_cast(routes_fast, "LINESTRING")
#> Warning in st_cast.sf(routes_fast, "LINESTRING"): repeating attributes for all
#> sub-geometries for which they may not be constant
rnet = stplanr::overline(routes_fast_linestrings, attrib = c("bicycle", "bicycle_govtarget"))
lwd = rnet$bicycle_govtarget / mean(rnet$bicycle_govtarget)
plot(rnet["bicycle_govtarget"], lwd = lwd)
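
The idea behind building a route network is that flows on routes sharing the same stretch of road are added together. The toy example below shows that aggregation step in base R, with invented route and segment identifiers. stplanr::overline() works on line geometries and is considerably more involved, so treat this as an illustration of the principle only.

```r
# Toy routes: each row is one segment of one route; ids and flows are invented.
toy_segments = data.frame(
  route = c("r1", "r1", "r2", "r2", "r3"),
  segment = c("s1", "s2", "s2", "s3", "s2"),
  bicycle = c(10, 10, 5, 5, 20)
)

# Sum the flow of every route that uses each segment
toy_rnet = aggregate(bicycle ~ segment, data = toy_segments, FUN = sum)

# Segments with the highest combined flow come first
toy_rnet = toy_rnet[order(toy_rnet$bicycle, decreasing = TRUE), ]
print(toy_rnet)
```

Segment s2 is used by all three routes, so it carries the largest combined flow and would be the first candidate for investment in this toy network.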

We can view the results in an interactive map and share with policy makers, stakeholders, and the public! E.g. (see interactive map here):

mapview::mapview(rnet, zcol = "bicycle_govtarget", lwd = lwd * 2)

Current limitations

  • This package currently does not estimate cycling uptake associated with intrazonal flows and people with no fixed job data
  • This package currently does not estimate health benefits

Next steps and further resources (work in progress)

  • Add additional scenarios of cycling uptake from different places (e.g. goCambridge)
  • Add additional distance decay functions
  • Make it easy to use data from other cities around the world
  • Show how to create raster tiles of cycling uptake

Package information

  • Version: 0.9.1
  • License: GPL-3
  • Monthly downloads: 579
  • Last published: June 25th, 2021

Functions in pct (0.9.1)

  • get_pct_routes_quiet: Get quiet road network results from the PCT
  • desire_lines_leeds: Cycle route desire lines for Leeds
  • get_od: Get origin destination data from the 2011 Census
  • get_pct_routes_fast: Get fast road network results from the PCT
  • get_pct_lines: Get desire lines results from the PCT
  • get_desire_lines: Desire lines
  • get_pct_centroids: Get centroid results from the PCT
  • get_pct_zones: Get zone results from the PCT
  • pct_regions_lookup: Lookup table matching PCT regions to local authorities
  • get_pct_rnet: Get road network results from the PCT
  • od_leeds: Example OD data for Leeds
  • wight_od: Official origin-destination data for the Isle of Wight
  • wight_routes_30: Cycle route data for the Isle of Wight
  • leeds_uber_sample: Top 15 min mean journey times within Leeds from Uber
  • uptake_pct_govtarget: Calculate cycling uptake for UK 'Government Target' scenario
  • pct_regions: PCT regions from www.pct.bike
  • wight_lines_30: Desire lines from the PCT for the Isle of Wight
  • routes_fast_leeds: Fastest cycle routes for the desire_lines_leeds
  • santiago_lines: Desire lines in central Santiago
  • mode_names: Mode names in the Census
  • santiago_zones: Zones in central Santiago
  • model_pcycle_pct_2020: Model cycling levels as a function of explanatory variables
  • uptake_pct_godutch: Calculate cycling uptake for UK 'Go Dutch' scenario
  • wight_zones: Zones and centroid data from the PCT for the Isle of Wight
  • zones_leeds: Zone data for Leeds
  • santiago_routes_cs: 200 cycle routes in central Santiago, Chile
  • santiago_od: OD data in central Santiago
  • rnet_leeds: Route network for Leeds
  • get_centroids_ew: Download MSOA centroids for England and Wales
  • get_pct: Generic function to get regional data from the PCT