Skip to contents

As an essential part of census data analysis, spatial analysis provides valuable insight into geographic distribution, demographic dynamics and socioeconomic disparities. Derived from census variables, SVI is designed to represent the relative resilience of communities within a given area. Interpreting SVI in the spatial context not only is important for managing public heath crisis, but also helps researchers/policymakers/urban planners to better understand the interactions between SVI variables and geographic factors (such as transportation networks, land use patterns, environmental factors, and access to amenities or public services).

CDC/ATSDR Social Vulnerability Index (SVI) database includes county-/census tract-level SVI for US and each state, in the format of both table(.csv) and map(shapefile). In addition, we can explore the SVI maps and tables on CDC/ATSDR SVI Interactive Map.

For spatial analysis at a different geographic level or using more up-to-date census data, we can use findSVI to obtain a SVI table with geometry information, and create maps in R or export to other tools (such as QGIS) using sf::st_write() from the sf package.

Get census data with geometry

First, we are using get_census_data() to obtain ZCTA-level data with simple feature geometry for PA for 2020 (Census API required).

pa_zcta_2020_geo_data <- get_census_data(
  year = 2020, 
  state = "PA", 
  geography = "zcta", 
  geometry = TRUE)

Here, we are showing the first 10 rows of the data. With the geometry = TRUE, we’ll get a tibble with an additional column containing simple feature geometry (MULTIPOLYGON).

#> Simple feature collection with 10 features and 134 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -80.48268 ymin: 40.10397 xmax: -79.73383 ymax: 40.83308
#> Geodetic CRS:  NAD83
#> # A tibble: 10 × 135
#>    GEOID NAME        B06009_002E B06009_002M B09001_001E B09001_001M B11012_010E
#>    <chr> <chr>             <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#>  1 15001 ZCTA5 15001        1655         417        5450         582         724
#>  2 15003 ZCTA5 15003         441         125        2352         245         250
#>  3 15004 ZCTA5 15004          62          43          63          70          21
#>  4 15005 ZCTA5 15005         293          92        1380         200          65
#>  5 15006 ZCTA5 15006          37          58          13          28           0
#>  6 15007 ZCTA5 15007           0          11          22          35           0
#>  7 15009 ZCTA5 15009         404         121        3253         384         206
#>  8 15010 ZCTA5 15010        1237         242        5636         444         719
#>  9 15012 ZCTA5 15012         638         180        2750         302         336
#> 10 15014 ZCTA5 15014         136          75         527          96         105
#> # ℹ 128 more variables: B11012_010M <dbl>, B11012_015E <dbl>,
#> #   B11012_015M <dbl>, B16005_007E <dbl>, B16005_007M <dbl>, B16005_008E <dbl>,
#> #   B16005_008M <dbl>, B16005_012E <dbl>, B16005_012M <dbl>, B16005_013E <dbl>,
#> #   B16005_013M <dbl>, B16005_017E <dbl>, B16005_017M <dbl>, B16005_018E <dbl>,
#> #   B16005_018M <dbl>, B16005_022E <dbl>, B16005_022M <dbl>, B16005_023E <dbl>,
#> #   B16005_023M <dbl>, B16005_029E <dbl>, B16005_029M <dbl>, B16005_030E <dbl>,
#> #   B16005_030M <dbl>, B16005_034E <dbl>, B16005_034M <dbl>, …

Get SVI with geometry

After getting the data ready, we can supply this tibble with simple feature geometry to get_svi().

pa_zcta_2020_geo_svi <- get_svi(
  year = 2020, 
  data = pa_zcta_2020_geo_data
  )

get_svi() will return the full SVI table with every SVI variables, intermediate percentile ranks, theme-specific SVIs and overall SVI (consistent with CDC/ATSDR SVI database, without MOE).

pa_zcta_2020_geo_svi %>% glimpse()
#> Rows: 1,776
#> Columns: 64
#> $ GEOID       <chr> "15001", "15003", "15004", "15005", "15006", "15007", "150…
#> $ NAME        <chr> "ZCTA5 15001", "ZCTA5 15003", "ZCTA5 15004", "ZCTA5 15005"…
#> $ geometry    <MULTIPOLYGON [°]> MULTIPOLYGON (((-80.43758 4..., MULTIPOLYGON …
#> $ E_TOTPOP    <dbl> 31129, 11212, 380, 9191, 292, 629, 15114, 27205, 15243, 30…
#> $ E_HU        <dbl> 16070, 6084, 207, 4275, 131, 167, 7057, 12699, 7709, 1671,…
#> $ E_HH        <dbl> 14093, 5104, 138, 3948, 131, 167, 6509, 11363, 6883, 1450,…
#> $ E_POV150    <dbl> 5567, 2656, 75, 916, 0, 387, 1904, 4761, 2639, 761, 24, 14…
#> $ E_UNEMP     <dbl> 787, 212, 0, 123, 0, 0, 323, 811, 376, 41, 11, 201, 0, 116…
#> $ E_HBURD     <dbl> 3037, 1234, 6, 609, 0, 51, 1424, 2322, 1186, 334, 39, 1729…
#> $ E_NOHSDP    <dbl> 1655, 441, 62, 293, 37, 0, 404, 1237, 638, 136, 25, 542, 2…
#> $ E_UNINSUR   <dbl> 1319, 534, 38, 250, 0, 19, 315, 923, 470, 175, 21, 454, 13…
#> $ E_AGE65     <dbl> 6395, 2090, 42, 2534, 58, 414, 3718, 5815, 3595, 464, 277,…
#> $ E_AGE17     <dbl> 5450, 2352, 63, 1380, 13, 22, 3253, 5636, 2750, 527, 348, …
#> $ E_DISABL    <dbl> 4871, 1656, 61, 1288, 0, 295, 2196, 4584, 2629, 379, 94, 1…
#> $ E_SNGPNT    <dbl> 815, 356, 21, 92, 0, 0, 271, 820, 348, 162, 35, 319, 7, 3,…
#> $ E_LIMENG    <dbl> 142, 86, 0, 19, 0, 0, 57, 42, 86, 0, 0, 46, 182, 0, 0, 4, …
#> $ E_MINRTY    <dbl> 5764, 1771, 30, 459, 0, 0, 869, 3819, 1048, 264, 65, 1684,…
#> $ E_MUNIT     <dbl> 955, 196, 0, 134, 0, 0, 528, 811, 353, 48, 10, 1290, 0, 0,…
#> $ E_MOBILE    <dbl> 537, 38, 25, 23, 0, 0, 18, 251, 445, 0, 0, 3, 0, 93, 24, 3…
#> $ E_CROWD     <dbl> 94, 8, 0, 0, 0, 0, 23, 135, 22, 7, 0, 70, 9, 0, 0, 23, 52,…
#> $ E_NOVEH     <dbl> 1189, 587, 0, 135, 0, 0, 335, 1243, 481, 136, 10, 781, 46,…
#> $ E_GROUPQ    <dbl> 619, 26, 0, 213, 0, 0, 304, 1268, 144, 91, 0, 438, 0, 0, 0…
#> $ EP_POV150   <dbl> 18.2, 23.7, 19.7, 10.2, 0.0, 61.5, 12.8, 18.3, 17.3, 26.0,…
#> $ EP_UNEMP    <dbl> 4.8, 3.5, 0.0, 2.5, 0.0, 0.0, 4.5, 6.1, 5.0, 2.5, 1.8, 2.5…
#> $ EP_HBURD    <dbl> 21.5, 24.2, 4.3, 15.4, 0.0, 30.5, 21.9, 20.4, 17.2, 23.0, …
#> $ EP_NOHSDP   <dbl> 7.0, 5.5, 22.5, 4.1, 13.3, 0.0, 3.6, 6.6, 5.7, 6.2, 2.9, 4…
#> $ EP_UNINSUR  <dbl> 4.3, 4.8, 10.0, 2.8, 0.0, 3.0, 2.1, 3.4, 3.1, 5.9, 1.7, 3.…
#> $ EP_AGE65    <dbl> 20.5, 18.6, 11.1, 27.6, 19.9, 65.8, 24.6, 21.4, 23.6, 15.2…
#> $ EP_AGE17    <dbl> 17.5, 21.0, 16.6, 15.0, 4.5, 3.5, 21.5, 20.7, 18.0, 17.3, …
#> $ EP_DISABL   <dbl> 15.9, 14.8, 16.1, 14.3, 0.0, 46.9, 14.8, 17.1, 17.2, 12.8,…
#> $ EP_SNGPNT   <dbl> 5.8, 7.0, 15.2, 2.3, 0.0, 0.0, 4.2, 7.2, 5.1, 11.2, 7.1, 4…
#> $ EP_LIMENG   <dbl> 0.5, 0.8, 0.0, 0.2, 0.0, 0.0, 0.4, 0.2, 0.6, 0.0, 0.0, 0.3…
#> $ EP_MINRTY   <dbl> 18.5, 15.8, 7.9, 5.0, 0.0, 0.0, 5.7, 14.0, 6.9, 8.6, 5.1, …
#> $ EP_MUNIT    <dbl> 5.9, 3.2, 0.0, 3.1, 0.0, 0.0, 7.5, 6.4, 4.6, 2.9, 2.0, 17.…
#> $ EP_MOBILE   <dbl> 3.3, 0.6, 12.1, 0.5, 0.0, 0.0, 0.3, 2.0, 5.8, 0.0, 0.0, 0.…
#> $ EP_CROWD    <dbl> 0.7, 0.2, 0.0, 0.0, 0.0, 0.0, 0.4, 1.2, 0.3, 0.5, 0.0, 1.0…
#> $ EP_NOVEH    <dbl> 8.4, 11.5, 0.0, 3.4, 0.0, 0.0, 5.1, 10.9, 7.0, 9.4, 2.0, 1…
#> $ EP_GROUPQ   <dbl> 2.0, 0.2, 0.0, 2.3, 0.0, 0.0, 2.0, 4.7, 0.9, 3.0, 0.0, 2.9…
#> $ EPL_POV150  <dbl> 0.5313, 0.7039, 0.5837, 0.2295, 0.0000, 0.9920, 0.3109, 0.…
#> $ EPL_UNEMP   <dbl> 0.5595, 0.3761, 0.0000, 0.2593, 0.0000, 0.0000, 0.5259, 0.…
#> $ EPL_HBURD   <dbl> 0.5174, 0.6615, 0.0490, 0.2234, 0.0000, 0.8627, 0.5436, 0.…
#> $ EPL_NOHSDP  <dbl> 0.4001, 0.2798, 0.9393, 0.1896, 0.7911, 0.0000, 0.1635, 0.…
#> $ EPL_UNINSUR <dbl> 0.4804, 0.5417, 0.8597, 0.2890, 0.0000, 0.3186, 0.2186, 0.…
#> $ EPL_AGE65   <dbl> 0.5516, 0.4240, 0.0959, 0.8678, 0.5091, 0.9881, 0.7832, 0.…
#> $ EPL_AGE17   <dbl> 0.3258, 0.5840, 0.2690, 0.1986, 0.0545, 0.0499, 0.6317, 0.…
#> $ EPL_DISABL  <dbl> 0.5877, 0.4991, 0.5997, 0.4594, 0.0000, 0.9903, 0.4991, 0.…
#> $ EPL_SNGPNT  <dbl> 0.6929, 0.7989, 0.9641, 0.2963, 0.0000, 0.0000, 0.5322, 0.…
#> $ EPL_LIMENG  <dbl> 0.6555, 0.7469, 0.0000, 0.5312, 0.0000, 0.0000, 0.6226, 0.…
#> $ EPL_MINRTY  <dbl> 0.8173, 0.7860, 0.6118, 0.4801, 0.0000, 0.0000, 0.5176, 0.…
#> $ EPL_MUNIT   <dbl> 0.7801, 0.6724, 0.0000, 0.6684, 0.0000, 0.0000, 0.8199, 0.…
#> $ EPL_MOBILE  <dbl> 0.4422, 0.2872, 0.7499, 0.2775, 0.0000, 0.0000, 0.2479, 0.…
#> $ EPL_CROWD   <dbl> 0.5305, 0.3886, 0.0000, 0.0000, 0.0000, 0.0000, 0.4479, 0.…
#> $ EPL_NOVEH   <dbl> 0.7071, 0.8234, 0.0000, 0.3613, 0.0000, 0.0000, 0.5105, 0.…
#> $ EPL_GROUPQ  <dbl> 0.8031, 0.5585, 0.0000, 0.8212, 0.0000, 0.0000, 0.8031, 0.…
#> $ SPL_theme1  <dbl> 2.4887, 2.5630, 2.4317, 1.1908, 0.7911, 2.1733, 1.7625, 2.…
#> $ SPL_theme2  <dbl> 2.8135, 3.0529, 1.9287, 2.3533, 0.5636, 2.0283, 3.0688, 3.…
#> $ SPL_theme3  <dbl> 0.8173, 0.7860, 0.6118, 0.4801, 0.0000, 0.0000, 0.5176, 0.…
#> $ SPL_theme4  <dbl> 3.2630, 2.7301, 0.7499, 2.1284, 0.0000, 0.0000, 2.8293, 3.…
#> $ RPL_theme1  <dbl> 0.5028, 0.5380, 0.4745, 0.0999, 0.0392, 0.3785, 0.2236, 0.…
#> $ RPL_theme2  <dbl> 0.7418, 0.8473, 0.2520, 0.4597, 0.0148, 0.2985, 0.8593, 0.…
#> $ RPL_theme3  <dbl> 0.8173, 0.7860, 0.6118, 0.4801, 0.0000, 0.0000, 0.5176, 0.…
#> $ RPL_theme4  <dbl> 0.8224, 0.6515, 0.1266, 0.4631, 0.0000, 0.0000, 0.6850, 0.…
#> $ SPL_themes  <dbl> 9.3825, 9.1320, 5.7221, 6.1526, 1.3547, 4.2016, 8.1782, 9.…
#> $ RPL_themes  <dbl> 0.8059, 0.7707, 0.2287, 0.2707, 0.0136, 0.1022, 0.6118, 0.…

For visualization purposes, we’ll simplify the table and keep only the GEOID, geometry and SVIs.

svi <- pa_zcta_2020_geo_svi %>% 
  select(GEOID, geometry, contains("RPL_theme"))
svi %>% head(10) 
#> Simple feature collection with 10 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -80.48268 ymin: 40.10397 xmax: -79.73383 ymax: 40.83308
#> Geodetic CRS:  NAD83
#> # A tibble: 10 × 7
#>    GEOID                    geometry RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4
#>    <chr>          <MULTIPOLYGON [°]>      <dbl>      <dbl>      <dbl>      <dbl>
#>  1 15001 (((-80.43758 40.55899, -80…     0.503      0.742       0.817      0.822
#>  2 15003 (((-80.2366 40.59886, -80.…     0.538      0.847       0.786      0.652
#>  3 15004 (((-80.38652 40.34327, -80…     0.474      0.252       0.612      0.127
#>  4 15005 (((-80.24356 40.65536, -80…     0.0999     0.460       0.480      0.463
#>  5 15006 (((-79.88531 40.63274, -79…     0.0392     0.0148      0          0    
#>  6 15007 (((-79.93657 40.65095, -79…     0.378      0.298       0          0    
#>  7 15009 (((-80.45619 40.71337, -80…     0.224      0.859       0.518      0.685
#>  8 15010 (((-80.48268 40.73145, -80…     0.478      0.894       0.761      0.922
#>  9 15012 (((-79.90417 40.15197, -79…     0.310      0.863       0.568      0.749
#> 10 15014 (((-79.74922 40.60994, -79…     0.557      0.195       0.633      0.658
#> # ℹ 1 more variable: RPL_themes <dbl>

Interactive maps for overall SVI

With the simple feature geometry, we can visualize SVI patterns and perform spatial SVI analysis with any mapping tool. Here, we’re using a powerful package leaflet for interactive maps.

First we’ll examine the missing value.

missing <- svi %>% filter(is.na(RPL_theme1)) 
missing %>% glimpse()
#> Rows: 13
#> Columns: 7
#> $ GEOID      <chr> "15260", "15616", "15691", "16312", "17077", "17120", "1782…
#> $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-79.95539 4..., MULTIPOLYGON (…
#> $ RPL_theme1 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_theme2 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_theme3 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_theme4 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ RPL_themes <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA

Looks like there are a few ZCTAs with missing values in SVIs. (Upon checking the full SVI table, we can see that most of the individual SVI variables are 0 for those ZCTAs.) So we’ll remove those ZCTAs for better visualization.

To set up the interactive map for overall SVI, leaflet() does most of the heavy lifting. Here, we’ll just add some customized color palette and labels for the appearance.

svi_clean <- svi %>% drop_na()

#above shows CRS NAD83, change to 4326 (WGS84) to avoid warning as below:
#Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD83 +no_defs).
#Need '+proj=longlat +datum=WGS84'
st_crs(svi_clean) <- 4326
#> Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
#> that

#set color palette
pal <- colorNumeric(
  palette = c("orange","navy"),
  domain = svi_clean$RPL_themes
)

#set label
zcta_label <- glue("<h3 style='margin: 0px'>{svi_clean$GEOID}</h3>
                    overall SVI: {svi_clean$RPL_themes}") %>%
  map(~HTML(.x))

leaflet(svi_clean) %>% 
  addProviderTiles(providers$CartoDB.Voyager) %>% 
  addPolygons(color = "white", 
              weight = 0.5,
              smoothFactor = 0.5,
              opacity = 1,
              fillColor = ~pal(RPL_themes),
              fillOpacity = 0.8,
              highlightOptions = highlightOptions(
                weight = 5,
                color = "white",
                fillOpacity = 0.8,
                bringToFront = TRUE),
              label = zcta_label,
              labelOptions = labelOptions(
                style = list(
                  "font-family" = "Fira Sans, sans-serif",
                  "font-size" = "1.2em"
                ))
              ) %>% 
  addLegend("bottomleft",
            pal = pal,
            values = ~RPL_themes,
            title = "Overall SVI in all ZCTAs in PA (2020)",
            #labFormat = labelFormat(prefix = "$"),
            opacity = 1)

From the interactive map, we can visualize easily how SVI varies in different regions in PA and zoom in to examine specific ZCTAs of interest, making it a helpful approach to explore new ideas, patterns and analyses.