Let's start with the city where A/B Street began: Seattle, U.S. The seaport
city is home to over 700,000 people, including A/B Street creator
Dustin Carlino, who has been developing tools to empirically study the
impact of small changes within the road network: this means “you can
transform that street parking into a bus lane or fix that pesky left
turn at a traffic signal, measure the effects, then propose actually
making the change”. For the past two years, Seattle has been a key area
of study within the A/B Street simuverse, which makes it a great
starting point for understanding how the abstr package can be used to
generate site data for A/B Street.
This example demonstrates how to wrangle the data for the three key
components needed to generate scenarios for A/B Street: OD data, site
zones and site buildings. With these components, the abstr package is
ready to convert data frames into simulations! The kinds of data
processing steps demonstrated in this vignette could be applied to
generate ‘scenario.json’ files for other cities, and represent a
reproducibility challenge: can you reproduce the results shown in the
final .json file at the end of the vignette?
Let's get started.
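The chunks below call most functions with the `package::` prefix, so they assume that abstr and the packages it relies on (readr, dplyr, tidyr, stringr, tibble, sf, osmextract) are installed, and that a pipe is attached. A minimal setup sketch:
library(abstr)     # provides ab_scenario(), ab_json(), ab_save(), ab_time_normal()
library(magrittr)  # provides the %>% pipe used throughout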
Now let's fetch the polygon area for Montlake. To be consistent with what's currently in A/B Street, we can grab the official polygon from the GitHub repo. Following this, we can clean the data and convert it to WGS84.
montlake_poly_url = "https://raw.githubusercontent.com/a-b-street/abstreet/master/importer/config/us/seattle/montlake.poly"
raw_boundary_vec = readr::read_lines(montlake_poly_url)
# drop the .poly header ("boundary", "1") and footer ("END") lines, keeping only coordinate pairs
boundary_matrix = raw_boundary_vec[(raw_boundary_vec != "boundary") & (raw_boundary_vec != "1") & (raw_boundary_vec != "END")] %>%
  stringr::str_trim() %>%
  tibble::as_tibble() %>%
  # split each coordinate pair into two numeric columns
  dplyr::mutate(y_boundary = as.numeric(lapply(stringr::str_split(value, " "), `[[`, 1)),
                x_boundary = as.numeric(lapply(stringr::str_split(value, " "), `[[`, 2))) %>%
  dplyr::select(-value) %>%
  as.matrix()
# build an sf polygon in WGS84 (EPSG:4326) from the coordinate matrix
boundary_sf_poly = sf::st_sf(geometry = sf::st_sfc(sf::st_polygon(list(boundary_matrix)), crs = 4326))
Next, we fetch zone data for the Seattle district. This comes from Soundcast and needs to be clipped to our polygon boundary.
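The zone-loading code itself is not shown above, so here is a minimal sketch of this step. The download URL, shapefile name and TAZ column are assumptions about the Soundcast inputs; what matters for the rest of the vignette is that it produces all_zones_tbl (all Seattle TAZ polygons with a TAZ id column) and zones_in_boundary_tbl (the zones intersecting the Montlake boundary), which later chunks rely on:
# sketch only: the Soundcast TAZ source below is an assumed URL, not the exact input
zones_zip = file.path(tempdir(), "soundcast_zones.zip")
download.file("https://github.com/psrc/soundcast/raw/master/inputs/base_year/taz2010.zip", zones_zip)
unzip(zones_zip, exdir = tempdir())
# read the TAZ polygons and reproject to WGS84 to match boundary_sf_poly
all_zones_tbl = sf::st_read(file.path(tempdir(), "taz2010.shp")) %>%
  sf::st_transform(4326)
# keep only the zones intersecting the Montlake boundary
zones_in_boundary_tbl = all_zones_tbl[boundary_sf_poly, ]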
Now we need to get some OD data into the mix. Finding this data can be tricky for some cities; luckily, Soundcast provides granular data for trips in Seattle in 2014. Below, this data is converted to an OD matrix and filtered to trips that start or finish in Montlake zones. The data is then transformed into a wide format and filtered to only include OD pairs with at least 25 trips. Voilà, the OD data is ready to go. Finally, the OD data is matched against all zones in the Montlake area.
## process the disaggregated Soundcast trips data
all_trips_tbl = readr::read_csv("http://abstreet.s3-website.us-east-2.amazonaws.com/dev/data/input/us/seattle/trips_2014.csv.gz")
## create an OD matrix
od_tbl_long = dplyr::select(all_trips_tbl, otaz, dtaz, mode) %>%
  # recode Soundcast's numeric mode codes into A/B Street's four modes
  dplyr::mutate(mode = dplyr::case_when(mode %in% c(1, 9) ~ "Walk",
                                        mode == 2 ~ "Bike",
                                        mode %in% c(3, 4, 5) ~ "Drive",
                                        mode %in% c(6, 7, 8) ~ "Transit",
                                        TRUE ~ as.character(NA))) %>%
  dplyr::filter(!is.na(mode)) %>%
  dplyr::group_by(otaz, dtaz, mode) %>%
  dplyr::summarize(n = dplyr::n()) %>%
  dplyr::ungroup() %>%
  # only keep an entry if the origin or destination is in a Montlake zone
  dplyr::filter((otaz %in% zones_in_boundary_tbl$TAZ) | (dtaz %in% zones_in_boundary_tbl$TAZ))
# create a wide OD matrix and filter out any OD pairs with fewer than 25 trips
montlake_od_tbl = tidyr::pivot_wider(od_tbl_long, names_from = mode, values_from = n, values_fill = 0) %>%
  dplyr::rename(o_id = otaz, d_id = dtaz) %>%
  dplyr::mutate(total = Drive + Transit + Bike + Walk) %>%
  dplyr::filter(total >= 25) %>%
  dplyr::select(-total)
# keep only the zone geometries that appear in the filtered OD data
montlake_zone_tbl = dplyr::right_join(all_zones_tbl,
                                      tibble::tibble("TAZ" = unique(c(montlake_od_tbl$o_id, montlake_od_tbl$d_id))),
                                      by = "TAZ") %>%
  dplyr::select(TAZ) %>%
  dplyr::rename(id = TAZ)
A/B Street functions by generating buildings based on OSM entries;
luckily, the osmextract package makes this an easy process in R. OSM
buildings must be valid sf objects so that they can be matched against
the zone areas. To speed things up, the later part of this chunk
selects 20% of the buildings in each zone.
osm_polygons = osmextract::oe_read("http://download.geofabrik.de/north-america/us/washington-latest.osm.pbf", layer = "multipolygons")
# OSM building= tag values to keep
building_types = c("yes", "house", "detached", "residential", "apartments",
                   "commercial", "retail", "school", "industrial", "semidetached_house",
                   "church", "hangar", "mobile_home", "warehouse", "office",
                   "college", "university", "public", "garages", "cabin", "hospital",
                   "dormitory", "hotel", "service", "parking", "manufactured",
                   "civic", "farm", "manufacturing", "floating_home", "government",
                   "bungalow", "transportation", "motel", "manufacture", "kindergarten",
                   "house_boat", "sports_centre")
osm_buildings = osm_polygons %>%
  dplyr::filter(building %in% building_types) %>%
  dplyr::select(osm_way_id, name, building)
# keep only geometrically valid buildings
osm_buildings_valid = osm_buildings[sf::st_is_valid(osm_buildings), ]
# spatial subset: buildings that intersect the Montlake zones
montlake_osm_buildings_all = osm_buildings_valid[montlake_zone_tbl, ]
# # use to visualize the building data
# tmap::tm_shape(boundary_sf_poly) + tmap::tm_borders() +
#   tmap::tm_shape(montlake_osm_buildings_all) + tmap::tm_polygons(col = "building")
# Filter down large objects for package -----------------------------------
montlake_osm_buildings_all_joined = montlake_osm_buildings_all %>%
  sf::st_join(montlake_zone_tbl)
set.seed(2021)
# select 20% of buildings in each zone to reduce file size for this example
# remove this filter or increase the sampling to include more buildings
montlake_osm_buildings_sample = montlake_osm_buildings_all_joined %>%
  dplyr::filter(!is.na(osm_way_id)) %>%
  sf::st_drop_geometry() %>%
  dplyr::group_by(id) %>%
  dplyr::sample_frac(0.20) %>%
  dplyr::ungroup()
montlake_osm_buildings_tbl = montlake_osm_buildings_all %>%
  dplyr::filter(osm_way_id %in% montlake_osm_buildings_sample$osm_way_id)
So now we are ready to generate simulation files with abstr. To do
this, let's combine the elements outlined above: the zone
(montlake_zone_tbl), building (montlake_osm_buildings_tbl) and OD
(montlake_od_tbl) data. We do this using the ab_scenario() function in
the abstr package, which generates a data frame representing travel
between the Montlake buildings. While the OD data contains information
on origin and destination zones, ab_scenario() ‘disaggregates’ the
data, randomly selecting a building within each origin and destination
zone to simulate travel at the individual level. The chunk below
illustrates the process using only a sample of the montlake_od data,
showing travel between three pairs of zones:
# use subset of OD data for speed
set.seed(42)
montlake_od_minimal = montlake_od_tbl[sample(nrow(montlake_od_tbl), size = 3), ]
output_sf = ab_scenario(
  od = montlake_od_minimal,
  zones = montlake_zone_tbl,
  zones_d = NULL,
  origin_buildings = montlake_osm_buildings_tbl,
  destination_buildings = montlake_osm_buildings_tbl,
  pop_var = 3,
  time_fun = ab_time_normal,
  output = "sf",
  modes = c("Walk", "Bike", "Drive", "Transit")
)
# # visualize the results
# tmap::tm_shape(output_sf) + tmap::tm_lines(col = "mode") +
#   tmap::tm_shape(montlake_zone_tbl) + tmap::tm_borders()
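Before saving, it can be worth a quick sanity check on the disaggregated output; each row of output_sf is one simulated trip between two sampled buildings:
# peek at the first few simulated trips
head(output_sf)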
# build json output
ab_save(ab_json(output_sf, time_fun = ab_time_normal,
                scenario_name = "Montlake Example"),
        f = "montlake.json")
Let’s see what is in the file. The first trip schedule should look something like this, matching A/B Street’s scenario schema.
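One way to inspect it yourself is a quick sketch like the following, which assumes the jsonlite package is installed (it is not used elsewhere in this vignette) and that the saved file follows A/B Street’s scenario schema, with a top-level people array of trips:
# read the scenario back in and show the first person's first trip
scenario = jsonlite::read_json("montlake.json")
str(scenario$people[[1]]$trips[[1]])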
After generating montlake.json, you can import and simulate it in A/B
Street: in sandbox mode, open the scenario selector, choose the option
to import a JSON scenario, and select the montlake.json file. After you
successfully import this file once, it will be available in the list of
scenarios under the “Montlake Example” name, or whatever name is
specified by the JSON file.