As an essential part of census data analysis, spatial analysis provides valuable insight into geographic distribution, demographic dynamics and socioeconomic disparities. Derived from census variables, SVI is designed to represent the relative resilience of communities within a given area. Interpreting SVI in the spatial context not only is important for managing public heath crisis, but also helps researchers/policymakers/urban planners to better understand the interactions between SVI variables and geographic factors (such as transportation networks, land use patterns, environmental factors, and access to amenities or public services).
CDC/ATSDR Social Vulnerability Index (SVI) database includes county-/census tract-level SVI for US and each state, in the format of both table(.csv) and map(shapefile). In addition, we can explore the SVI maps and tables on CDC/ATSDR SVI Interactive Map.
For spatial analysis at a different geographic level or using more
up-to-date census data, we can use findSVI to obtain a SVI table with
geometry information, and create maps in R or export to other tools
(such as QGIS) using sf::st_write()
from the sf package.
library(findSVI)
library(dplyr)
library(purrr)
library(leaflet)
library(tidyr)
library(glue)
library(htmltools)
library(sf)
Get census data with geometry
First, we are using get_census_data()
to obtain
ZCTA-level data with simple feature geometry for PA for 2020 (Census API
required).
pa_zcta_2020_geo_data <- get_census_data(
year = 2020,
state = "PA",
geography = "zcta",
geometry = TRUE)
Here, we are showing the first 10 rows of the data. With the
geometry = TRUE
, we’ll get a tibble with an additional
column containing simple feature geometry
(MULTIPOLYGON
).
#> Simple feature collection with 10 features and 134 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -80.48268 ymin: 40.10397 xmax: -79.73383 ymax: 40.83308
#> Geodetic CRS: NAD83
#> # A tibble: 10 × 135
#> GEOID NAME B06009_002E B06009_002M B09001_001E B09001_001M B11012_010E
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 15001 ZCTA5 15001 1655 417 5450 582 724
#> 2 15003 ZCTA5 15003 441 125 2352 245 250
#> 3 15004 ZCTA5 15004 62 43 63 70 21
#> 4 15005 ZCTA5 15005 293 92 1380 200 65
#> 5 15006 ZCTA5 15006 37 58 13 28 0
#> 6 15007 ZCTA5 15007 0 11 22 35 0
#> 7 15009 ZCTA5 15009 404 121 3253 384 206
#> 8 15010 ZCTA5 15010 1237 242 5636 444 719
#> 9 15012 ZCTA5 15012 638 180 2750 302 336
#> 10 15014 ZCTA5 15014 136 75 527 96 105
#> # ℹ 128 more variables: B11012_010M <dbl>, B11012_015E <dbl>,
#> # B11012_015M <dbl>, B16005_007E <dbl>, B16005_007M <dbl>, B16005_008E <dbl>,
#> # B16005_008M <dbl>, B16005_012E <dbl>, B16005_012M <dbl>, B16005_013E <dbl>,
#> # B16005_013M <dbl>, B16005_017E <dbl>, B16005_017M <dbl>, B16005_018E <dbl>,
#> # B16005_018M <dbl>, B16005_022E <dbl>, B16005_022M <dbl>, B16005_023E <dbl>,
#> # B16005_023M <dbl>, B16005_029E <dbl>, B16005_029M <dbl>, B16005_030E <dbl>,
#> # B16005_030M <dbl>, B16005_034E <dbl>, B16005_034M <dbl>, …
Get SVI with geometry
After getting the data ready, we can supply this tibble with simple
feature geometry to get_svi()
.
pa_zcta_2020_geo_svi <- get_svi(
year = 2020,
data = pa_zcta_2020_geo_data
)
get_svi()
will return the full SVI table with every SVI
variables, intermediate percentile ranks, theme-specific SVIs and
overall SVI (consistent with CDC/ATSDR SVI database, without MOE).
pa_zcta_2020_geo_svi %>% glimpse()
#> Rows: 1,776
#> Columns: 64
#> $ GEOID <chr> "15001", "15003", "15004", "15005", "15006", "15007", "150…
#> $ NAME <chr> "ZCTA5 15001", "ZCTA5 15003", "ZCTA5 15004", "ZCTA5 15005"…
#> $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-80.43758 4..., MULTIPOLYGON …
#> $ E_TOTPOP <dbl> 31129, 11212, 380, 9191, 292, 629, 15114, 27205, 15243, 30…
#> $ E_HU <dbl> 16070, 6084, 207, 4275, 131, 167, 7057, 12699, 7709, 1671,…
#> $ E_HH <dbl> 14093, 5104, 138, 3948, 131, 167, 6509, 11363, 6883, 1450,…
#> $ E_POV150 <dbl> 5567, 2656, 75, 916, 0, 387, 1904, 4761, 2639, 761, 24, 14…
#> $ E_UNEMP <dbl> 787, 212, 0, 123, 0, 0, 323, 811, 376, 41, 11, 201, 0, 116…
#> $ E_HBURD <dbl> 3037, 1234, 6, 609, 0, 51, 1424, 2322, 1186, 334, 39, 1729…
#> $ E_NOHSDP <dbl> 1655, 441, 62, 293, 37, 0, 404, 1237, 638, 136, 25, 542, 2…
#> $ E_UNINSUR <dbl> 1319, 534, 38, 250, 0, 19, 315, 923, 470, 175, 21, 454, 13…
#> $ E_AGE65 <dbl> 6395, 2090, 42, 2534, 58, 414, 3718, 5815, 3595, 464, 277,…
#> $ E_AGE17 <dbl> 5450, 2352, 63, 1380, 13, 22, 3253, 5636, 2750, 527, 348, …
#> $ E_DISABL <dbl> 4871, 1656, 61, 1288, 0, 295, 2196, 4584, 2629, 379, 94, 1…
#> $ E_SNGPNT <dbl> 815, 356, 21, 92, 0, 0, 271, 820, 348, 162, 35, 319, 7, 3,…
#> $ E_LIMENG <dbl> 142, 86, 0, 19, 0, 0, 57, 42, 86, 0, 0, 46, 182, 0, 0, 4, …
#> $ E_MINRTY <dbl> 5764, 1771, 30, 459, 0, 0, 869, 3819, 1048, 264, 65, 1684,…
#> $ E_MUNIT <dbl> 955, 196, 0, 134, 0, 0, 528, 811, 353, 48, 10, 1290, 0, 0,…
#> $ E_MOBILE <dbl> 537, 38, 25, 23, 0, 0, 18, 251, 445, 0, 0, 3, 0, 93, 24, 3…
#> $ E_CROWD <dbl> 94, 8, 0, 0, 0, 0, 23, 135, 22, 7, 0, 70, 9, 0, 0, 23, 52,…
#> $ E_NOVEH <dbl> 1189, 587, 0, 135, 0, 0, 335, 1243, 481, 136, 10, 781, 46,…
#> $ E_GROUPQ <dbl> 619, 26, 0, 213, 0, 0, 304, 1268, 144, 91, 0, 438, 0, 0, 0…
#> $ EP_POV150 <dbl> 18.2, 23.7, 19.7, 10.2, 0.0, 61.5, 12.8, 18.3, 17.3, 26.0,…
#> $ EP_UNEMP <dbl> 4.8, 3.5, 0.0, 2.5, 0.0, 0.0, 4.5, 6.1, 5.0, 2.5, 1.8, 2.5…
#> $ EP_HBURD <dbl> 21.5, 24.2, 4.3, 15.4, 0.0, 30.5, 21.9, 20.4, 17.2, 23.0, …
#> $ EP_NOHSDP <dbl> 7.0, 5.5, 22.5, 4.1, 13.3, 0.0, 3.6, 6.6, 5.7, 6.2, 2.9, 4…
#> $ EP_UNINSUR <dbl> 4.3, 4.8, 10.0, 2.8, 0.0, 3.0, 2.1, 3.4, 3.1, 5.9, 1.7, 3.…
#> $ EP_AGE65 <dbl> 20.5, 18.6, 11.1, 27.6, 19.9, 65.8, 24.6, 21.4, 23.6, 15.2…
#> $ EP_AGE17 <dbl> 17.5, 21.0, 16.6, 15.0, 4.5, 3.5, 21.5, 20.7, 18.0, 17.3, …
#> $ EP_DISABL <dbl> 15.9, 14.8, 16.1, 14.3, 0.0, 46.9, 14.8, 17.1, 17.2, 12.8,…
#> $ EP_SNGPNT <dbl> 5.8, 7.0, 15.2, 2.3, 0.0, 0.0, 4.2, 7.2, 5.1, 11.2, 7.1, 4…
#> $ EP_LIMENG <dbl> 0.5, 0.8, 0.0, 0.2, 0.0, 0.0, 0.4, 0.2, 0.6, 0.0, 0.0, 0.3…
#> $ EP_MINRTY <dbl> 18.5, 15.8, 7.9, 5.0, 0.0, 0.0, 5.7, 14.0, 6.9, 8.6, 5.1, …
#> $ EP_MUNIT <dbl> 5.9, 3.2, 0.0, 3.1, 0.0, 0.0, 7.5, 6.4, 4.6, 2.9, 2.0, 17.…
#> $ EP_MOBILE <dbl> 3.3, 0.6, 12.1, 0.5, 0.0, 0.0, 0.3, 2.0, 5.8, 0.0, 0.0, 0.…
#> $ EP_CROWD <dbl> 0.7, 0.2, 0.0, 0.0, 0.0, 0.0, 0.4, 1.2, 0.3, 0.5, 0.0, 1.0…
#> $ EP_NOVEH <dbl> 8.4, 11.5, 0.0, 3.4, 0.0, 0.0, 5.1, 10.9, 7.0, 9.4, 2.0, 1…
#> $ EP_GROUPQ <dbl> 2.0, 0.2, 0.0, 2.3, 0.0, 0.0, 2.0, 4.7, 0.9, 3.0, 0.0, 2.9…
#> $ EPL_POV150 <dbl> 0.5313, 0.7039, 0.5837, 0.2295, 0.0000, 0.9920, 0.3109, 0.…
#> $ EPL_UNEMP <dbl> 0.5595, 0.3761, 0.0000, 0.2593, 0.0000, 0.0000, 0.5259, 0.…
#> $ EPL_HBURD <dbl> 0.5174, 0.6615, 0.0490, 0.2234, 0.0000, 0.8627, 0.5436, 0.…
#> $ EPL_NOHSDP <dbl> 0.4001, 0.2798, 0.9393, 0.1896, 0.7911, 0.0000, 0.1635, 0.…
#> $ EPL_UNINSUR <dbl> 0.4804, 0.5417, 0.8597, 0.2890, 0.0000, 0.3186, 0.2186, 0.…
#> $ EPL_AGE65 <dbl> 0.5516, 0.4240, 0.0959, 0.8678, 0.5091, 0.9881, 0.7832, 0.…
#> $ EPL_AGE17 <dbl> 0.3258, 0.5840, 0.2690, 0.1986, 0.0545, 0.0499, 0.6317, 0.…
#> $ EPL_DISABL <dbl> 0.5877, 0.4991, 0.5997, 0.4594, 0.0000, 0.9903, 0.4991, 0.…
#> $ EPL_SNGPNT <dbl> 0.6929, 0.7989, 0.9641, 0.2963, 0.0000, 0.0000, 0.5322, 0.…
#> $ EPL_LIMENG <dbl> 0.6555, 0.7469, 0.0000, 0.5312, 0.0000, 0.0000, 0.6226, 0.…
#> $ EPL_MINRTY <dbl> 0.8173, 0.7860, 0.6118, 0.4801, 0.0000, 0.0000, 0.5176, 0.…
#> $ EPL_MUNIT <dbl> 0.7801, 0.6724, 0.0000, 0.6684, 0.0000, 0.0000, 0.8199, 0.…
#> $ EPL_MOBILE <dbl> 0.4422, 0.2872, 0.7499, 0.2775, 0.0000, 0.0000, 0.2479, 0.…
#> $ EPL_CROWD <dbl> 0.5305, 0.3886, 0.0000, 0.0000, 0.0000, 0.0000, 0.4479, 0.…
#> $ EPL_NOVEH <dbl> 0.7071, 0.8234, 0.0000, 0.3613, 0.0000, 0.0000, 0.5105, 0.…
#> $ EPL_GROUPQ <dbl> 0.8031, 0.5585, 0.0000, 0.8212, 0.0000, 0.0000, 0.8031, 0.…
#> $ SPL_theme1 <dbl> 2.4887, 2.5630, 2.4317, 1.1908, 0.7911, 2.1733, 1.7625, 2.…
#> $ SPL_theme2 <dbl> 2.8135, 3.0529, 1.9287, 2.3533, 0.5636, 2.0283, 3.0688, 3.…
#> $ SPL_theme3 <dbl> 0.8173, 0.7860, 0.6118, 0.4801, 0.0000, 0.0000, 0.5176, 0.…
#> $ SPL_theme4 <dbl> 3.2630, 2.7301, 0.7499, 2.1284, 0.0000, 0.0000, 2.8293, 3.…
#> $ RPL_theme1 <dbl> 0.5028, 0.5380, 0.4745, 0.0999, 0.0392, 0.3785, 0.2236, 0.…
#> $ RPL_theme2 <dbl> 0.7418, 0.8473, 0.2520, 0.4597, 0.0148, 0.2985, 0.8593, 0.…
#> $ RPL_theme3 <dbl> 0.8173, 0.7860, 0.6118, 0.4801, 0.0000, 0.0000, 0.5176, 0.…
#> $ RPL_theme4 <dbl> 0.8224, 0.6515, 0.1266, 0.4631, 0.0000, 0.0000, 0.6850, 0.…
#> $ SPL_themes <dbl> 9.3825, 9.1320, 5.7221, 6.1526, 1.3547, 4.2016, 8.1782, 9.…
#> $ RPL_themes <dbl> 0.8059, 0.7707, 0.2287, 0.2707, 0.0136, 0.1022, 0.6118, 0.…
For visualization purposes, we’ll simplify the table and keep only the GEOID, geometry and SVIs.
svi <- pa_zcta_2020_geo_svi %>%
select(GEOID, geometry, contains("RPL_theme"))
svi %>% head(10)
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -80.48268 ymin: 40.10397 xmax: -79.73383 ymax: 40.83308
#> Geodetic CRS: NAD83
#> # A tibble: 10 × 7
#> GEOID geometry RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4
#> <chr> <MULTIPOLYGON [°]> <dbl> <dbl> <dbl> <dbl>
#> 1 15001 (((-80.43758 40.55899, -80… 0.503 0.742 0.817 0.822
#> 2 15003 (((-80.2366 40.59886, -80.… 0.538 0.847 0.786 0.652
#> 3 15004 (((-80.38652 40.34327, -80… 0.474 0.252 0.612 0.127
#> 4 15005 (((-80.24356 40.65536, -80… 0.0999 0.460 0.480 0.463
#> 5 15006 (((-79.88531 40.63274, -79… 0.0392 0.0148 0 0
#> 6 15007 (((-79.93657 40.65095, -79… 0.378 0.298 0 0
#> 7 15009 (((-80.45619 40.71337, -80… 0.224 0.859 0.518 0.685
#> 8 15010 (((-80.48268 40.73145, -80… 0.478 0.894 0.761 0.922
#> 9 15012 (((-79.90417 40.15197, -79… 0.310 0.863 0.568 0.749
#> 10 15014 (((-79.74922 40.60994, -79… 0.557 0.195 0.633 0.658
#> # ℹ 1 more variable: RPL_themes <dbl>
Interactive maps for overall SVI
With the simple feature geometry, we can visualize SVI patterns and perform spatial SVI analysis with any mapping tool. Here, we’re using a powerful package leaflet for interactive maps.
First we’ll examine the missing value.
missing <- svi %>% filter(is.na(RPL_theme1))
missing %>% glimpse()
#> Rows: 13
#> Columns: 7
#> $ GEOID <chr> "15260", "15616", "15691", "16312", "17077", "17120", "1782…
#> $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-79.95539 4..., MULTIPOLYGON (…
#> $ RPL_theme1 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_theme2 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_theme3 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_theme4 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_themes <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
Looks like there are a few ZCTAs with missing values in SVIs. (Upon checking the full SVI table, we can see that most of the individual SVI variables are 0 for those ZCTAs.) So we’ll remove those ZCTAs for better visualization.
To set up the interactive map for overall SVI, leaflet()
does most of the heavy lifting. Here, we’ll just add some customized
color palette and labels for the appearance.
svi_clean <- svi %>% drop_na()
#above shows CRS NAD83, change to 4326 (WGS84) to avoid warning as below:
#Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
#Need '+proj=longlat +datum=WGS84'
st_crs(svi_clean) <- 4326
#> Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
#> that
#set color palette
pal <- colorNumeric(
palette = c("orange","navy"),
domain = svi_clean$RPL_themes
)
#set label
zcta_label <- glue("<h3 style='margin: 0px'>{svi_clean$GEOID}</h3>
overall SVI: {svi_clean$RPL_themes}") %>%
map(~HTML(.x))
leaflet(svi_clean) %>%
addProviderTiles(providers$CartoDB.Voyager) %>%
addPolygons(color = "white",
weight = 0.5,
smoothFactor = 0.5,
opacity = 1,
fillColor = ~pal(RPL_themes),
fillOpacity = 0.8,
highlightOptions = highlightOptions(
weight = 5,
color = "white",
fillOpacity = 0.8,
bringToFront = TRUE),
label = zcta_label,
labelOptions = labelOptions(
style = list(
"font-family" = "Fira Sans, sans-serif",
"font-size" = "1.2em"
))
) %>%
addLegend("bottomleft",
pal = pal,
values = ~RPL_themes,
title = "Overall SVI in all ZCTAs in PA (2020)",
#labFormat = labelFormat(prefix = "$"),
opacity = 1)
From the interactive map, we can visualize easily how SVI varies in different regions in PA and zoom in to examine specific ZCTAs of interest, making it a helpful approach to explore new ideas, patterns and analyses.