As an essential part of census data analysis, spatial analysis provides valuable insight into geographic distribution, demographic dynamics and socioeconomic disparities. Derived from census variables, SVI is designed to represent the relative resilience of communities within a given area. Interpreting SVI in the spatial context not only is important for managing public heath crisis, but also helps researchers/policymakers/urban planners to better understand the interactions between SVI variables and geographic factors (such as transportation networks, land use patterns, environmental factors, and access to amenities or public services).
CDC/ATSDR Social Vulnerability Index (SVI) database includes county-/census tract-level SVI for US and each state, in the format of both table(.csv) and map(shapefile). In addition, we can explore the SVI maps and tables on CDC/ATSDR SVI Interactive Map.
For spatial analysis at a different geographic level or using more
up-to-date census data, we can use findSVI to obtain a SVI table with
geometry information, and create maps in R or export to other tools
(such as QGIS) using sf::st_write()
from the sf package.
library(findSVI)
library(dplyr)
library(purrr)
library(leaflet)
library(tidyr)
library(glue)
library(htmltools)
library(sf)
Get census data with geometry
First, we are using get_census_data()
to obtain
ZCTA-level data with simple feature geometry for Philadelphia (county),
PA for 2020 (Census API required).
phl_zcta_2020_geo_data <- get_census_data(
year = 2020,
state = "PA",
county = "Philadelphia",
geography = "zcta",
geometry = TRUE)
Here, we are showing the first 10 rows of the data. With the
geometry = TRUE
, we’ll get a tibble with an additional
column containing simple feature geometry
(MULTIPOLYGON
).
#> Simple feature collection with 10 features and 134 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -75.21402 ymin: 39.87734 xmax: -74.97371 ymax: 40.11273
#> Geodetic CRS: NAD83
#> # A tibble: 10 × 135
#> GEOID NAME B06009_002E B06009_002M B09001_001E B09001_001M B11012_010E
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 19102 ZCTA5 19102 0 15 117 114 59
#> 2 19103 ZCTA5 19103 468 122 1503 378 66
#> 3 19104 ZCTA5 19104 3044 411 8411 1246 1891
#> 4 19106 ZCTA5 19106 565 198 715 187 101
#> 5 19107 ZCTA5 19107 961 258 1019 384 118
#> 6 19109 ZCTA5 19109 0 11 0 11 0
#> 7 19111 ZCTA5 19111 6489 746 16620 1410 2115
#> 8 19112 ZCTA5 19112 0 11 0 11 0
#> 9 19114 ZCTA5 19114 2085 500 6085 933 480
#> 10 19115 ZCTA5 19115 2568 551 6851 800 711
#> # ℹ 128 more variables: B11012_010M <dbl>, B11012_015E <dbl>,
#> # B11012_015M <dbl>, B16005_007E <dbl>, B16005_007M <dbl>, B16005_008E <dbl>,
#> # B16005_008M <dbl>, B16005_012E <dbl>, B16005_012M <dbl>, B16005_013E <dbl>,
#> # B16005_013M <dbl>, B16005_017E <dbl>, B16005_017M <dbl>, B16005_018E <dbl>,
#> # B16005_018M <dbl>, B16005_022E <dbl>, B16005_022M <dbl>, B16005_023E <dbl>,
#> # B16005_023M <dbl>, B16005_029E <dbl>, B16005_029M <dbl>, B16005_030E <dbl>,
#> # B16005_030M <dbl>, B16005_034E <dbl>, B16005_034M <dbl>, …
Get SVI with geometry
After getting the data ready, we can supply this tibble with simple
feature geometry to get_svi()
.
phl_zcta_2020_geo_svi <- get_svi(
year = 2020,
data = phl_zcta_2020_geo_data
)
get_svi()
will return the full SVI table with every SVI
variables, intermediate percentile ranks, theme-specific SVIs and
overall SVI (consistent with CDC/ATSDR SVI database, without MOE).
phl_zcta_2020_geo_svi %>% glimpse()
#> Rows: 48
#> Columns: 64
#> $ GEOID <chr> "19102", "19103", "19104", "19106", "19107", "19109", "191…
#> $ NAME <chr> "ZCTA5 19102", "ZCTA5 19103", "ZCTA5 19104", "ZCTA5 19106"…
#> $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-75.16854 3..., MULTIPOLYGON …
#> $ E_TOTPOP <dbl> 5335, 25113, 52480, 13536, 14689, 0, 67159, 0, 31448, 3550…
#> $ E_HU <dbl> 3967, 18038, 20416, 8597, 8920, 0, 26250, 0, 13909, 14976,…
#> $ E_HH <dbl> 3365, 15765, 16508, 7625, 7823, 0, 24355, 0, 13032, 14391,…
#> $ E_POV150 <dbl> 834, 3494, 21918, 894, 3350, 0, 17235, 0, 4817, 5594, 7856…
#> $ E_UNEMP <dbl> 84, 453, 1692, 261, 388, 0, 2208, 0, 856, 1042, 840, 247, …
#> $ E_HBURD <dbl> 931, 4870, 8191, 1670, 2891, 0, 8433, 0, 3906, 4598, 4698,…
#> $ E_NOHSDP <dbl> 0, 468, 3044, 565, 961, 0, 6489, 0, 2085, 2568, 2640, 290,…
#> $ E_UNINSUR <dbl> 165, 608, 2605, 463, 677, 0, 6089, 0, 1143, 2386, 2143, 43…
#> $ E_AGE65 <dbl> 896, 5330, 3771, 2521, 1485, 0, 10956, 0, 5758, 8673, 7151…
#> $ E_AGE17 <dbl> 117, 1503, 8411, 715, 1019, 0, 16620, 0, 6085, 6851, 6812,…
#> $ E_DISABL <dbl> 333, 2147, 5885, 1057, 1723, 0, 11275, 0, 4722, 5561, 5217…
#> $ E_SNGPNT <dbl> 59, 71, 2060, 101, 146, 0, 2562, 0, 589, 1032, 438, 183, 9…
#> $ E_LIMENG <dbl> 0, 490, 877, 156, 732, 0, 6133, 0, 771, 4160, 5210, 133, 2…
#> $ E_MINRTY <dbl> 1270, 7579, 35619, 2864, 7181, 0, 39350, 0, 8930, 12535, 1…
#> $ E_MUNIT <dbl> 3543, 12841, 6474, 5989, 6123, 0, 3367, 0, 2321, 4401, 240…
#> $ E_MOBILE <dbl> 28, 48, 22, 0, 13, 0, 42, 0, 9, 96, 27, 0, 14, 16, 23, 0, …
#> $ E_CROWD <dbl> 42, 265, 750, 187, 380, 0, 1255, 0, 129, 573, 458, 57, 96,…
#> $ E_NOVEH <dbl> 1591, 8644, 8893, 2061, 4527, 0, 3579, 0, 1028, 2009, 1513…
#> $ E_GROUPQ <dbl> 410, 571, 14238, 1246, 1615, 0, 588, 0, 386, 513, 788, 108…
#> $ EP_POV150 <dbl> 16.9, 14.2, 56.3, 7.2, 24.7, NaN, 25.9, NaN, 15.5, 16.1, 2…
#> $ EP_UNEMP <dbl> 2.2, 2.7, 7.4, 2.9, 4.1, NA, 6.6, NA, 5.2, 6.2, 5.1, 4.3, …
#> $ EP_HBURD <dbl> 27.7, 30.9, 49.6, 21.9, 37.0, NaN, 34.6, NaN, 30.0, 32.0, …
#> $ EP_NOHSDP <dbl> 0.0, 2.2, 13.3, 4.7, 8.3, NA, 14.1, NA, 8.9, 9.6, 10.3, 3.…
#> $ EP_UNINSUR <dbl> 3.1, 2.4, 5.0, 3.7, 4.6, NA, 9.1, NA, 3.7, 6.8, 6.4, 4.1, …
#> $ EP_AGE65 <dbl> 16.8, 21.2, 7.2, 18.6, 10.1, NA, 16.3, NA, 18.3, 24.4, 21.…
#> $ EP_AGE17 <dbl> 2.2, 6.0, 16.0, 5.3, 6.9, NaN, 24.7, NaN, 19.3, 19.3, 20.0…
#> $ EP_DISABL <dbl> 6.2, 8.7, 11.4, 8.5, 11.8, NA, 16.9, NA, 15.1, 15.9, 15.6,…
#> $ EP_SNGPNT <dbl> 1.8, 0.5, 12.5, 1.3, 1.9, NaN, 10.5, NaN, 4.5, 7.2, 3.4, 3…
#> $ EP_LIMENG <dbl> 0.0, 2.0, 1.7, 1.2, 5.1, NaN, 9.9, NaN, 2.6, 12.6, 16.2, 1…
#> $ EP_MINRTY <dbl> 23.8, 30.2, 67.9, 21.2, 48.9, NaN, 58.6, NaN, 28.4, 35.3, …
#> $ EP_MUNIT <dbl> 89.3, 71.2, 31.7, 69.7, 68.6, NaN, 12.8, NaN, 16.7, 29.4, …
#> $ EP_MOBILE <dbl> 0.7, 0.3, 0.1, 0.0, 0.1, NA, 0.2, NA, 0.1, 0.6, 0.2, 0.0, …
#> $ EP_CROWD <dbl> 1.2, 1.7, 4.5, 2.5, 4.9, NaN, 5.2, NaN, 1.0, 4.0, 3.6, 1.2…
#> $ EP_NOVEH <dbl> 47.3, 54.8, 53.9, 27.0, 57.9, NA, 14.7, NA, 7.9, 14.0, 11.…
#> $ EP_GROUPQ <dbl> 7.7, 2.3, 27.1, 9.2, 11.0, NaN, 0.9, NaN, 1.2, 1.4, 2.3, 1…
#> $ EPL_POV150 <dbl> 0.2000, 0.0889, 0.9778, 0.0000, 0.4222, NA, 0.4667, NA, 0.…
#> $ EPL_UNEMP <dbl> 0.0000, 0.0222, 0.5111, 0.0444, 0.1333, NA, 0.4222, NA, 0.…
#> $ EPL_HBURD <dbl> 0.2222, 0.3111, 1.0000, 0.0222, 0.6000, NA, 0.5333, NA, 0.…
#> $ EPL_NOHSDP <dbl> 0.0000, 0.0222, 0.6222, 0.1333, 0.2667, NA, 0.6444, NA, 0.…
#> $ EPL_UNINSUR <dbl> 0.0444, 0.0000, 0.2889, 0.0889, 0.2444, NA, 0.8222, NA, 0.…
#> $ EPL_AGE65 <dbl> 0.7556, 0.9333, 0.0222, 0.8667, 0.2000, NA, 0.6889, NA, 0.…
#> $ EPL_AGE17 <dbl> 0.0000, 0.0444, 0.2000, 0.0222, 0.0889, NA, 0.8222, NA, 0.…
#> $ EPL_DISABL <dbl> 0.0000, 0.0667, 0.1778, 0.0444, 0.2000, NA, 0.6444, NA, 0.…
#> $ EPL_SNGPNT <dbl> 0.0667, 0.0222, 0.6667, 0.0444, 0.0889, NA, 0.5778, NA, 0.…
#> $ EPL_LIMENG <dbl> 0.0000, 0.4667, 0.4000, 0.3333, 0.6444, NA, 0.8000, NA, 0.…
#> $ EPL_MINRTY <dbl> 0.0889, 0.1778, 0.5556, 0.0444, 0.4000, NA, 0.5111, NA, 0.…
#> $ EPL_MUNIT <dbl> 1.0000, 0.9778, 0.8667, 0.9556, 0.9333, NA, 0.5111, NA, 0.…
#> $ EPL_MOBILE <dbl> 0.9111, 0.7111, 0.2667, 0.0000, 0.2667, NA, 0.5333, NA, 0.…
#> $ EPL_CROWD <dbl> 0.1556, 0.3111, 0.8667, 0.5556, 0.9111, NA, 0.9556, NA, 0.…
#> $ EPL_NOVEH <dbl> 0.8889, 0.9778, 0.9556, 0.4667, 1.0000, NA, 0.2222, NA, 0.…
#> $ EPL_GROUPQ <dbl> 0.8667, 0.5556, 1.0000, 0.8889, 0.9556, NA, 0.2889, NA, 0.…
#> $ SPL_theme1 <dbl> 0.4666, 0.4444, 3.4000, 0.2888, 1.6666, NA, 2.8888, NA, 1.…
#> $ SPL_theme2 <dbl> 0.8223, 1.5333, 1.4667, 1.3110, 1.2222, NA, 3.5333, NA, 2.…
#> $ SPL_theme3 <dbl> 0.0889, 0.1778, 0.5556, 0.0444, 0.4000, NA, 0.5111, NA, 0.…
#> $ SPL_theme4 <dbl> 3.8223, 3.5334, 3.9557, 2.8668, 4.0667, NA, 2.5111, NA, 1.…
#> $ RPL_theme1 <dbl> 0.0444, 0.0222, 0.6889, 0.0000, 0.3333, NA, 0.5778, NA, 0.…
#> $ RPL_theme2 <dbl> 0.0222, 0.2222, 0.2000, 0.1111, 0.0667, NA, 0.8667, NA, 0.…
#> $ RPL_theme3 <dbl> 0.0889, 0.1778, 0.5556, 0.0444, 0.4000, NA, 0.5111, NA, 0.…
#> $ RPL_theme4 <dbl> 0.9556, 0.9333, 0.9778, 0.7111, 1.0000, NA, 0.5778, NA, 0.…
#> $ SPL_themes <dbl> 5.2001, 5.6889, 9.3780, 4.5110, 7.3555, NA, 9.4443, NA, 5.…
#> $ RPL_themes <dbl> 0.1778, 0.2667, 0.6889, 0.0444, 0.3778, NA, 0.7111, NA, 0.…
For visualization purposes, we’ll simplify the table and keep only the GEOID, geometry and SVIs.
svi <- phl_zcta_2020_geo_svi %>%
select(GEOID, geometry, contains("RPL_theme"))
svi %>% head(10)
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -75.21402 ymin: 39.87734 xmax: -74.97371 ymax: 40.11273
#> Geodetic CRS: NAD83
#> # A tibble: 10 × 7
#> GEOID geometry RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4
#> <chr> <MULTIPOLYGON [°]> <dbl> <dbl> <dbl> <dbl>
#> 1 19102 (((-75.16854 39.94663, -75… 0.0444 0.0222 0.0889 0.956
#> 2 19103 (((-75.18166 39.95151, -75… 0.0222 0.222 0.178 0.933
#> 3 19104 (((-75.21367 39.96003, -75… 0.689 0.2 0.556 0.978
#> 4 19106 (((-75.15476 39.94573, -75… 0 0.111 0.0444 0.711
#> 5 19107 (((-75.16506 39.95361, -75… 0.333 0.0667 0.4 1
#> 6 19109 (((-75.16434 39.94935, -75… NA NA NA NA
#> 7 19111 (((-75.10653 40.04938, -75… 0.578 0.867 0.511 0.578
#> 8 19112 (((-75.19692 39.90088, -75… NA NA NA NA
#> 9 19114 (((-75.03509 40.07462, -75… 0.178 0.422 0.156 0.111
#> 10 19115 (((-75.07456 40.08912, -75… 0.378 0.822 0.289 0.822
#> # ℹ 1 more variable: RPL_themes <dbl>
Interactive maps for overall SVI
With the simple feature geometry, we can visualize SVI patterns and perform spatial SVI analysis with any mapping tool. Here, we’re using a powerful package leaflet for interactive maps.
First we’ll examine the missing value.
svi %>% filter(is.na(RPL_theme1))
#> Simple feature collection with 2 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -75.19692 ymin: 39.87734 xmax: -75.13541 ymax: 39.95011
#> Geodetic CRS: NAD83
#> # A tibble: 2 × 7
#> GEOID geometry RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4
#> * <chr> <MULTIPOLYGON [°]> <dbl> <dbl> <dbl> <dbl>
#> 1 19109 (((-75.16434 39.94935, -75.… NA NA NA NA
#> 2 19112 (((-75.19692 39.90088, -75.… NA NA NA NA
#> # ℹ 1 more variable: RPL_themes <dbl>
Looks like there are a few ZCTAs with missing values in SVIs. (Upon checking the full SVI table, we can see that most of the individual SVI variables are 0 for those ZCTAs.) So we’ll remove those ZCTAs for better visualization.
To set up the interactive map for overall SVI, leaflet()
does most of the heavy lifting. Here, we’ll just add some customized
color palette and labels for the appearance.
svi_clean <- svi %>% drop_na()
#above shows CRS NAD83, change to 4326 (WGS84) to avoid warning as below:
#Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
#Need '+proj=longlat +datum=WGS84'
st_crs(svi_clean) <- 4326
#> Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
#> that
#set color palette
pal <- colorNumeric(
palette = c("orange","navy"),
domain = svi_clean$RPL_themes
)
#set label
zcta_label <- glue("<h3 style='margin: 0px'>{svi_clean$GEOID}</h3>
overall SVI: {svi_clean$RPL_themes}") %>%
map(~HTML(.x))
leaflet(svi_clean) %>%
addProviderTiles(providers$CartoDB.Voyager) %>%
addPolygons(color = "white",
weight = 0.5,
smoothFactor = 0.5,
opacity = 1,
fillColor = ~pal(RPL_themes),
fillOpacity = 0.8,
highlightOptions = highlightOptions(
weight = 5,
color = "white",
fillOpacity = 0.8,
bringToFront = TRUE),
label = zcta_label,
labelOptions = labelOptions(
style = list(
"font-family" = "Fira Sans, sans-serif",
"font-size" = "1.2em"
))
) %>%
addLegend("bottomleft",
pal = pal,
values = ~RPL_themes,
title = "Overall SVI in all ZCTAs in Philadelphia, PA (2020)",
#labFormat = labelFormat(prefix = "$"),
opacity = 1)
From the interactive map, we can visualize easily how SVI varies in different regions in PA and zoom in to examine specific ZCTAs of interest, making it a helpful approach to explore new ideas, patterns and analyses.