Mapping SVI for Spatial Analysis • findSVI

As an essential part of census data analysis, spatial analysis provides valuable insight into geographic distribution, demographic dynamics and socioeconomic disparities. Derived from census variables, SVI is designed to represent the relative resilience of communities within a given area. Interpreting SVI in the spatial context not only is important for managing public heath crisis, but also helps researchers/policymakers/urban planners to better understand the interactions between SVI variables and geographic factors (such as transportation networks, land use patterns, environmental factors, and access to amenities or public services).

CDC/ATSDR Social Vulnerability Index (SVI) database includes county-/census tract-level SVI for US and each state, in the format of both table(.csv) and map(shapefile). In addition, we can explore the SVI maps and tables on CDC/ATSDR SVI Interactive Map.

For spatial analysis at a different geographic level or using more up-to-date census data, we can use findSVI to obtain a SVI table with geometry information, and create maps in R or export to other tools (such as QGIS) using sf::st_write() from the sf package.

library(findSVI)
library(dplyr)
library(purrr)
library(leaflet)
library(tidyr)
library(glue)
library(htmltools)
library(sf)

Get census data with geometry

First, we are using get_census_data() to obtain ZCTA-level data with simple feature geometry for Philadelphia (county), PA for 2020 (Census API required).

phl_zcta_2020_geo_data <- get_census_data(
  year = 2020, 
  state = "PA", 
  county = "Philadelphia",
  geography = "zcta", 
  geometry = TRUE)

Here, we are showing the first 10 rows of the data. With the geometry = TRUE, we’ll get a tibble with an additional column containing simple feature geometry (MULTIPOLYGON).

#> Simple feature collection with 10 features and 134 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -75.21402 ymin: 39.87734 xmax: -74.97371 ymax: 40.11273
#> Geodetic CRS:  NAD83
#> # A tibble: 10 × 135
#>    GEOID NAME        B06009_002E B06009_002M B09001_001E B09001_001M B11012_010E
#>    <chr> <chr>             <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#>  1 19102 ZCTA5 19102           0          15         117         114          59
#>  2 19103 ZCTA5 19103         468         122        1503         378          66
#>  3 19104 ZCTA5 19104        3044         411        8411        1246        1891
#>  4 19106 ZCTA5 19106         565         198         715         187         101
#>  5 19107 ZCTA5 19107         961         258        1019         384         118
#>  6 19109 ZCTA5 19109           0          11           0          11           0
#>  7 19111 ZCTA5 19111        6489         746       16620        1410        2115
#>  8 19112 ZCTA5 19112           0          11           0          11           0
#>  9 19114 ZCTA5 19114        2085         500        6085         933         480
#> 10 19115 ZCTA5 19115        2568         551        6851         800         711
#> # ℹ 128 more variables: B11012_010M <dbl>, B11012_015E <dbl>,
#> #   B11012_015M <dbl>, B16005_007E <dbl>, B16005_007M <dbl>, B16005_008E <dbl>,
#> #   B16005_008M <dbl>, B16005_012E <dbl>, B16005_012M <dbl>, B16005_013E <dbl>,
#> #   B16005_013M <dbl>, B16005_017E <dbl>, B16005_017M <dbl>, B16005_018E <dbl>,
#> #   B16005_018M <dbl>, B16005_022E <dbl>, B16005_022M <dbl>, B16005_023E <dbl>,
#> #   B16005_023M <dbl>, B16005_029E <dbl>, B16005_029M <dbl>, B16005_030E <dbl>,
#> #   B16005_030M <dbl>, B16005_034E <dbl>, B16005_034M <dbl>, …

Get SVI with geometry

After getting the data ready, we can supply this tibble with simple feature geometry to get_svi().

phl_zcta_2020_geo_svi <- get_svi(
  year = 2020, 
  data = phl_zcta_2020_geo_data
  )

get_svi() will return the full SVI table with every SVI variables, intermediate percentile ranks, theme-specific SVIs and overall SVI (consistent with CDC/ATSDR SVI database, without MOE).

phl_zcta_2020_geo_svi %>% glimpse()
#> Rows: 48
#> Columns: 64
#> $ GEOID       <chr> "19102", "19103", "19104", "19106", "19107", "19109", "191…
#> $ NAME        <chr> "ZCTA5 19102", "ZCTA5 19103", "ZCTA5 19104", "ZCTA5 19106"…
#> $ geometry    <MULTIPOLYGON [°]> MULTIPOLYGON (((-75.16854 3..., MULTIPOLYGON …
#> $ E_TOTPOP    <dbl> 5335, 25113, 52480, 13536, 14689, 0, 67159, 0, 31448, 3550…
#> $ E_HU        <dbl> 3967, 18038, 20416, 8597, 8920, 0, 26250, 0, 13909, 14976,…
#> $ E_HH        <dbl> 3365, 15765, 16508, 7625, 7823, 0, 24355, 0, 13032, 14391,…
#> $ E_POV150    <dbl> 834, 3494, 21918, 894, 3350, 0, 17235, 0, 4817, 5594, 7856…
#> $ E_UNEMP     <dbl> 84, 453, 1692, 261, 388, 0, 2208, 0, 856, 1042, 840, 247, …
#> $ E_HBURD     <dbl> 931, 4870, 8191, 1670, 2891, 0, 8433, 0, 3906, 4598, 4698,…
#> $ E_NOHSDP    <dbl> 0, 468, 3044, 565, 961, 0, 6489, 0, 2085, 2568, 2640, 290,…
#> $ E_UNINSUR   <dbl> 165, 608, 2605, 463, 677, 0, 6089, 0, 1143, 2386, 2143, 43…
#> $ E_AGE65     <dbl> 896, 5330, 3771, 2521, 1485, 0, 10956, 0, 5758, 8673, 7151…
#> $ E_AGE17     <dbl> 117, 1503, 8411, 715, 1019, 0, 16620, 0, 6085, 6851, 6812,…
#> $ E_DISABL    <dbl> 333, 2147, 5885, 1057, 1723, 0, 11275, 0, 4722, 5561, 5217…
#> $ E_SNGPNT    <dbl> 59, 71, 2060, 101, 146, 0, 2562, 0, 589, 1032, 438, 183, 9…
#> $ E_LIMENG    <dbl> 0, 490, 877, 156, 732, 0, 6133, 0, 771, 4160, 5210, 133, 2…
#> $ E_MINRTY    <dbl> 1270, 7579, 35619, 2864, 7181, 0, 39350, 0, 8930, 12535, 1…
#> $ E_MUNIT     <dbl> 3543, 12841, 6474, 5989, 6123, 0, 3367, 0, 2321, 4401, 240…
#> $ E_MOBILE    <dbl> 28, 48, 22, 0, 13, 0, 42, 0, 9, 96, 27, 0, 14, 16, 23, 0, …
#> $ E_CROWD     <dbl> 42, 265, 750, 187, 380, 0, 1255, 0, 129, 573, 458, 57, 96,…
#> $ E_NOVEH     <dbl> 1591, 8644, 8893, 2061, 4527, 0, 3579, 0, 1028, 2009, 1513…
#> $ E_GROUPQ    <dbl> 410, 571, 14238, 1246, 1615, 0, 588, 0, 386, 513, 788, 108…
#> $ EP_POV150   <dbl> 16.9, 14.2, 56.3, 7.2, 24.7, NaN, 25.9, NaN, 15.5, 16.1, 2…
#> $ EP_UNEMP    <dbl> 2.2, 2.7, 7.4, 2.9, 4.1, NA, 6.6, NA, 5.2, 6.2, 5.1, 4.3, …
#> $ EP_HBURD    <dbl> 27.7, 30.9, 49.6, 21.9, 37.0, NaN, 34.6, NaN, 30.0, 32.0, …
#> $ EP_NOHSDP   <dbl> 0.0, 2.2, 13.3, 4.7, 8.3, NA, 14.1, NA, 8.9, 9.6, 10.3, 3.…
#> $ EP_UNINSUR  <dbl> 3.1, 2.4, 5.0, 3.7, 4.6, NA, 9.1, NA, 3.7, 6.8, 6.4, 4.1, …
#> $ EP_AGE65    <dbl> 16.8, 21.2, 7.2, 18.6, 10.1, NA, 16.3, NA, 18.3, 24.4, 21.…
#> $ EP_AGE17    <dbl> 2.2, 6.0, 16.0, 5.3, 6.9, NaN, 24.7, NaN, 19.3, 19.3, 20.0…
#> $ EP_DISABL   <dbl> 6.2, 8.7, 11.4, 8.5, 11.8, NA, 16.9, NA, 15.1, 15.9, 15.6,…
#> $ EP_SNGPNT   <dbl> 1.8, 0.5, 12.5, 1.3, 1.9, NaN, 10.5, NaN, 4.5, 7.2, 3.4, 3…
#> $ EP_LIMENG   <dbl> 0.0, 2.0, 1.7, 1.2, 5.1, NaN, 9.9, NaN, 2.6, 12.6, 16.2, 1…
#> $ EP_MINRTY   <dbl> 23.8, 30.2, 67.9, 21.2, 48.9, NaN, 58.6, NaN, 28.4, 35.3, …
#> $ EP_MUNIT    <dbl> 89.3, 71.2, 31.7, 69.7, 68.6, NaN, 12.8, NaN, 16.7, 29.4, …
#> $ EP_MOBILE   <dbl> 0.7, 0.3, 0.1, 0.0, 0.1, NA, 0.2, NA, 0.1, 0.6, 0.2, 0.0, …
#> $ EP_CROWD    <dbl> 1.2, 1.7, 4.5, 2.5, 4.9, NaN, 5.2, NaN, 1.0, 4.0, 3.6, 1.2…
#> $ EP_NOVEH    <dbl> 47.3, 54.8, 53.9, 27.0, 57.9, NA, 14.7, NA, 7.9, 14.0, 11.…
#> $ EP_GROUPQ   <dbl> 7.7, 2.3, 27.1, 9.2, 11.0, NaN, 0.9, NaN, 1.2, 1.4, 2.3, 1…
#> $ EPL_POV150  <dbl> 0.2000, 0.0889, 0.9778, 0.0000, 0.4222, NA, 0.4667, NA, 0.…
#> $ EPL_UNEMP   <dbl> 0.0000, 0.0222, 0.5111, 0.0444, 0.1333, NA, 0.4222, NA, 0.…
#> $ EPL_HBURD   <dbl> 0.2222, 0.3111, 1.0000, 0.0222, 0.6000, NA, 0.5333, NA, 0.…
#> $ EPL_NOHSDP  <dbl> 0.0000, 0.0222, 0.6222, 0.1333, 0.2667, NA, 0.6444, NA, 0.…
#> $ EPL_UNINSUR <dbl> 0.0444, 0.0000, 0.2889, 0.0889, 0.2444, NA, 0.8222, NA, 0.…
#> $ EPL_AGE65   <dbl> 0.7556, 0.9333, 0.0222, 0.8667, 0.2000, NA, 0.6889, NA, 0.…
#> $ EPL_AGE17   <dbl> 0.0000, 0.0444, 0.2000, 0.0222, 0.0889, NA, 0.8222, NA, 0.…
#> $ EPL_DISABL  <dbl> 0.0000, 0.0667, 0.1778, 0.0444, 0.2000, NA, 0.6444, NA, 0.…
#> $ EPL_SNGPNT  <dbl> 0.0667, 0.0222, 0.6667, 0.0444, 0.0889, NA, 0.5778, NA, 0.…
#> $ EPL_LIMENG  <dbl> 0.0000, 0.4667, 0.4000, 0.3333, 0.6444, NA, 0.8000, NA, 0.…
#> $ EPL_MINRTY  <dbl> 0.0889, 0.1778, 0.5556, 0.0444, 0.4000, NA, 0.5111, NA, 0.…
#> $ EPL_MUNIT   <dbl> 1.0000, 0.9778, 0.8667, 0.9556, 0.9333, NA, 0.5111, NA, 0.…
#> $ EPL_MOBILE  <dbl> 0.9111, 0.7111, 0.2667, 0.0000, 0.2667, NA, 0.5333, NA, 0.…
#> $ EPL_CROWD   <dbl> 0.1556, 0.3111, 0.8667, 0.5556, 0.9111, NA, 0.9556, NA, 0.…
#> $ EPL_NOVEH   <dbl> 0.8889, 0.9778, 0.9556, 0.4667, 1.0000, NA, 0.2222, NA, 0.…
#> $ EPL_GROUPQ  <dbl> 0.8667, 0.5556, 1.0000, 0.8889, 0.9556, NA, 0.2889, NA, 0.…
#> $ SPL_theme1  <dbl> 0.4666, 0.4444, 3.4000, 0.2888, 1.6666, NA, 2.8888, NA, 1.…
#> $ SPL_theme2  <dbl> 0.8223, 1.5333, 1.4667, 1.3110, 1.2222, NA, 3.5333, NA, 2.…
#> $ SPL_theme3  <dbl> 0.0889, 0.1778, 0.5556, 0.0444, 0.4000, NA, 0.5111, NA, 0.…
#> $ SPL_theme4  <dbl> 3.8223, 3.5334, 3.9557, 2.8668, 4.0667, NA, 2.5111, NA, 1.…
#> $ RPL_theme1  <dbl> 0.0444, 0.0222, 0.6889, 0.0000, 0.3333, NA, 0.5778, NA, 0.…
#> $ RPL_theme2  <dbl> 0.0222, 0.2222, 0.2000, 0.1111, 0.0667, NA, 0.8667, NA, 0.…
#> $ RPL_theme3  <dbl> 0.0889, 0.1778, 0.5556, 0.0444, 0.4000, NA, 0.5111, NA, 0.…
#> $ RPL_theme4  <dbl> 0.9556, 0.9333, 0.9778, 0.7111, 1.0000, NA, 0.5778, NA, 0.…
#> $ SPL_themes  <dbl> 5.2001, 5.6889, 9.3780, 4.5110, 7.3555, NA, 9.4443, NA, 5.…
#> $ RPL_themes  <dbl> 0.1778, 0.2667, 0.6889, 0.0444, 0.3778, NA, 0.7111, NA, 0.…

For visualization purposes, we’ll simplify the table and keep only the GEOID, geometry and SVIs.

svi <- phl_zcta_2020_geo_svi %>% 
  select(GEOID, geometry, contains("RPL_theme"))
svi %>% head(10) 
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -75.21402 ymin: 39.87734 xmax: -74.97371 ymax: 40.11273
#> Geodetic CRS:  NAD83
#> # A tibble: 10 × 7
#>    GEOID                    geometry RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4
#>    <chr>          <MULTIPOLYGON [°]>      <dbl>      <dbl>      <dbl>      <dbl>
#>  1 19102 (((-75.16854 39.94663, -75…     0.0444     0.0222     0.0889      0.956
#>  2 19103 (((-75.18166 39.95151, -75…     0.0222     0.222      0.178       0.933
#>  3 19104 (((-75.21367 39.96003, -75…     0.689      0.2        0.556       0.978
#>  4 19106 (((-75.15476 39.94573, -75…     0          0.111      0.0444      0.711
#>  5 19107 (((-75.16506 39.95361, -75…     0.333      0.0667     0.4         1    
#>  6 19109 (((-75.16434 39.94935, -75…    NA         NA         NA          NA    
#>  7 19111 (((-75.10653 40.04938, -75…     0.578      0.867      0.511       0.578
#>  8 19112 (((-75.19692 39.90088, -75…    NA         NA         NA          NA    
#>  9 19114 (((-75.03509 40.07462, -75…     0.178      0.422      0.156       0.111
#> 10 19115 (((-75.07456 40.08912, -75…     0.378      0.822      0.289       0.822
#> # ℹ 1 more variable: RPL_themes <dbl>

Interactive maps for overall SVI

With the simple feature geometry, we can visualize SVI patterns and perform spatial SVI analysis with any mapping tool. Here, we’re using a powerful package leaflet for interactive maps.

First we’ll examine the missing value.

svi %>% filter(is.na(RPL_theme1)) 
#> Simple feature collection with 2 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -75.19692 ymin: 39.87734 xmax: -75.13541 ymax: 39.95011
#> Geodetic CRS:  NAD83
#> # A tibble: 2 × 7
#>   GEOID                     geometry RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4
#> * <chr>           <MULTIPOLYGON [°]>      <dbl>      <dbl>      <dbl>      <dbl>
#> 1 19109 (((-75.16434 39.94935, -75.…         NA         NA         NA         NA
#> 2 19112 (((-75.19692 39.90088, -75.…         NA         NA         NA         NA
#> # ℹ 1 more variable: RPL_themes <dbl>

Looks like there are a few ZCTAs with missing values in SVIs. (Upon checking the full SVI table, we can see that most of the individual SVI variables are 0 for those ZCTAs.) So we’ll remove those ZCTAs for better visualization.

To set up the interactive map for overall SVI, leaflet() does most of the heavy lifting. Here, we’ll just add some customized color palette and labels for the appearance.

svi_clean <- svi %>% drop_na()

#above shows CRS NAD83, change to 4326 (WGS84) to avoid warning as below:
#Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
#Need '+proj=longlat +datum=WGS84'
st_crs(svi_clean) <- 4326
#> Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
#> that

#set color palette
pal <- colorNumeric(
  palette = c("orange","navy"),
  domain = svi_clean$RPL_themes
)

#set label
zcta_label <- glue("<h3 style='margin: 0px'>{svi_clean$GEOID}</h3>
                    overall SVI: {svi_clean$RPL_themes}") %>%
  map(~HTML(.x))

leaflet(svi_clean) %>% 
  addProviderTiles(providers$CartoDB.Voyager) %>% 
  addPolygons(color = "white", 
              weight = 0.5,
              smoothFactor = 0.5,
              opacity = 1,
              fillColor = ~pal(RPL_themes),
              fillOpacity = 0.8,
              highlightOptions = highlightOptions(
                weight = 5,
                color = "white",
                fillOpacity = 0.8,
                bringToFront = TRUE),
              label = zcta_label,
              labelOptions = labelOptions(
                style = list(
                  "font-family" = "Fira Sans, sans-serif",
                  "font-size" = "1.2em"
                ))
              ) %>% 
  addLegend("bottomleft",
            pal = pal,
            values = ~RPL_themes,
            title = "Overall SVI in all ZCTAs in Philadelphia, PA (2020)",
            #labFormat = labelFormat(prefix = "$"),
            opacity = 1)

From the interactive map, we can visualize easily how SVI varies in different regions in PA and zoom in to examine specific ZCTAs of interest, making it a helpful approach to explore new ideas, patterns and analyses.