Building on the initial exploratory data analysis on variable distribution, I’d like to join the spatial information from ZAT folder with the variables from ZAT_INDICADORES.xlsx and visualize the geo-referenced data.
Files included
ZAT/
This folder contains boundary files for the ZAT units – a table of 1141 rows, 6 cols (each row being a ZAT unit); (dropping z dimension to create interactive maps).
Reading layer `ZAT' from data source
`D:\LocalGitHub\BEPIDL\data\ZAT\ZAT_geo\ZAT.shp' using driver `ESRI Shapefile'
Simple feature collection with 1141 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension: XY, XYZ
Bounding box: xmin: -74.88439 ymin: 3.70344 xmax: -73.05211 ymax: 5.83076
z_range: zmin: 0 zmax: 0
Geodetic CRS: MAGNA-SIRGAS
Code
zat %>%st_zm() %>%leaflet() %>%addProviderTiles(providers$CartoDB.Voyager) %>%addPolygons()
ZAT_INDICADORES.xlsx
This is a data table of 919 rows, 18 cols (each row being a ZAT unit).
According to the codebook, the indicators in this table seem to be the “old” version. For all the ZAT included, there are no missing data, while some of the values are zeros. As seen from the row count, it contains fewer ZAT units than the shapefiles. (first 10 rows shown below)
Here we’ll join the ZAT shapefiles with the ZAT_INDICADORES.xlsx by ZAT ID, keeping the ZAT units that we have data for, resulting in a table of 919 rows and 23 columns.
Code
georef_zat <- zat_data %>%left_join(zat, by ="ZAT") %>%st_as_sf()
Visualizing Indicators in Maps
We’ll use the georeferenced data to visualize the indicators in maps (dropping z dimension, datum set to WGS84). With loose categorizing the variables into 3 groups: public transits, road features and traffic lights, we’ll show interactive maps for one variable for each group along with distributions of all variables.
Public transit
Code
library(glue)# set color palette for circlespal <-colorNumeric(palette =c("orange","navy"),domain = georef_zat$BUSTOPDENS)zat_label <-glue("ZAT{georef_zat$ZAT} Bus stop density: {round(georef_zat$BUSTOPDENS,2)}/km2")georef_zat %>%st_zm() %>%st_transform(crs =st_crs("+proj=longlat +datum=WGS84")) %>%leaflet() %>%addProviderTiles(providers$CartoDB.Voyager) %>%addPolygons(color ="white", weight =0.5,smoothFactor =0.5,opacity =1,fillColor =~pal(BUSTOPDENS),fillOpacity =0.8,highlightOptions =highlightOptions(weight =5,color ="#666",fillOpacity =0.8,bringToFront =TRUE),label = zat_label,labelOptions =labelOptions(style =list("font-family"="Fira Sans, sans-serif","font-size"="1.2em" )) ) %>%addLegend("bottomleft",pal = pal,values =~BUSTOPDENS,title ="Bus Stop Density in Bogotá</br>(ZAT-level)",opacity =1)
Code
zat_data_long <- zat_data %>%select(-2, -3) %>%pivot_longer(-ZAT, names_to ="indicator")bus <-c('BUSTOPDENS', 'NUMRBP', 'LONGRBP', 'NUMRT', 'LONGRT')onroad <-c('LONGMV', 'LRDENS', 'BPRDRATE', 'NUMINT', 'INTDENS')tlight <-c('NUMTTFLIGH', 'NUMPTFLIGH')byroad <-c('NUMTTREES', 'NUMSTTREES', 'NUMBRIDGES')zat_data_long %>%filter(indicator%in%all_of(bus)) %>%ggplot(aes(x=value)) +geom_histogram(fill ="skyblue", color ="blue")+facet_wrap(~indicator, scales ="free") +theme_minimal() +labs(title ="Distribution of Indicators related to Bus/BRT")+theme(strip.background =element_rect(fill ="#dadada", color ="white"),panel.grid.minor =element_blank(),plot.title =element_text(size =15, face ="bold") )
zat_data_long %>%filter(indicator%in%all_of(onroad)) %>%ggplot(aes(x=value)) +geom_histogram(fill ="skyblue", color ="blue")+facet_wrap(~indicator, scales ="free") +theme_minimal()+labs(title ="Distribution of Indicators related to Features on the Road") +theme(strip.background =element_rect(fill ="#dadada", color ="white"),panel.grid.minor =element_blank(),plot.title =element_text(size =15, face ="bold") )
Code
zat_data_long %>%filter(indicator%in%all_of(byroad)) %>%ggplot(aes(x=value)) +geom_histogram(fill ="skyblue", color ="blue", alpha =0.7)+facet_wrap(~indicator, scales ="free") +theme_minimal()+labs(title ="Distribution of Indicators related to Features by the Road") +theme(strip.background =element_rect(fill ="#dadada", color ="white"),panel.grid.minor =element_blank(),plot.title =element_text(size =15, face ="bold") )
zat_data_long %>%filter(indicator%in%all_of(tlight)) %>%ggplot(aes(x=value)) +geom_histogram(fill ="skyblue", color ="blue", alpha =0.7)+facet_wrap(~indicator, scales ="free") +theme_minimal()+labs(title ="Distribution of Indicators about Traffic Lights") +theme(strip.background =element_rect(fill ="#dadada", color ="white"),panel.grid.minor =element_blank(),plot.title =element_text(size =15, face ="bold") )
Initial takeaway: some of the indicators seem to show similar spatial patterns, though there is a significant number of lower count values in the dataset which might make the visualization less clear.