The goal of findSVI is to calculate regional CDC/ATSDR Social Vulnerability Index (SVI) (former site: www.atsdr.cdc.gov/placeandhealth/svi/index.html) at a geographic level of interest using US census data from American Community Survey.
Overview
CDC/ATSDR releases SVI biannually at the counties/census tracts level for US or an individual state. findSVI aims to support more flexible and specific SVI analysis with additional options for years (2012-2022) and geographic levels (e.g., ZCTA/places, combining multiple states).
To find SVI for one or multiple year-state pair(s):
-
find_svi()
: retrieves US census data (Census API key required) and calculates SVI based on CDC/ATSDR SVI documentation (www.atsdr.cdc.gov/placeandhealth/svi/data_documentation_download.html) for each year-state pair at the same geography level.
In most cases, find_svi()
would be the easiest option. If you’d like to include simple feature geometry or have more customized requests for census data retrieval (e.g., different geography level for each year-state pair, multiple states for one year), you can process individual entry using the following:
-
get_census_data()
: retrieves US census data (Census API key required); -
get_svi()
: calculates SVI from the census data supplied.
Essentially, find_svi()
is a wrapper function for get_census_data()
and get_svi()
that also supports iteration over 1-year-and-1-state pairs at the same geography level.
Installation
Install the findSVI package via CRAN:
install.packages("findSVI")
Alternatively, you can install the development version of findSVI from GitHub with:
# install.packages("devtools")
devtools::install_github("heli-xu/findSVI")
Usage
To find county-level SVI for New Jersey (NJ) for 2017, and for Pennsylvania (PA) for 2018:
library(findSVI)
library(dplyr)
summarise_results <- find_svi(
year = c(2017, 2018),
state = c("NJ", "PA"),
geography = "county"
)
summarise_results %>%
group_by(year, state) %>%
slice_head(n = 5)
#> # A tibble: 10 × 8
#> # Groups: year, state [2]
#> GEOID RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4 RPL_themes year state
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 34001 0.95 0.8 0.65 1 0.95 2017 NJ
#> 2 34003 0.2 0.3 0.55 0.45 0.25 2017 NJ
#> 3 34005 0.3 0.5 0.35 0.4 0.3 2017 NJ
#> 4 34007 0.7 0.9 0.55 0.6 0.75 2017 NJ
#> 5 34009 0.65 0.6 0.1 0.55 0.45 2017 NJ
#> 6 42001 0.212 0.242 0.697 0.227 0.182 2018 PA
#> 7 42003 0.136 0.0758 0.742 0.576 0.212 2018 PA
#> 8 42005 0.621 0.530 0.0152 0.167 0.227 2018 PA
#> 9 42007 0.182 0.409 0.530 0.348 0.197 2018 PA
#> 10 42009 0.712 0.606 0.0758 0.288 0.394 2018 PA
(First 5 rows of results for 2017-NJ and 2018-PA are shown. ‘RPL_themes` indicates overall SVI, and ’RPL_theme1’ to ‘RPL_theme4’ indicate theme-specific SVIs.)
To retrieve county-level census data and then get SVI for PA for 2020:
data <- get_census_data(2020, "county", "PA")
data[1:10, 1:10]
#> # A tibble: 10 × 10
#> GEOID NAME B06009_002E B06009_002M B09001_001E B09001_001M B11012_010E
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 42001 Adams Coun… 7788 602 20663 NA 1237
#> 2 42003 Allegheny … 45708 1713 228296 49 24311
#> 3 42005 Armstrong … 3973 305 12516 9 912
#> 4 42007 Beaver Cou… 7546 640 31915 NA 3380
#> 5 42009 Bedford Co… 3996 317 9386 11 468
#> 6 42011 Berks Coun… 36488 1356 93714 44 8812
#> 7 42013 Blair Coun… 7292 679 24920 19 2552
#> 8 42015 Bradford C… 4395 362 13358 NA 969
#> 9 42017 Bucks Coun… 25651 1306 128008 53 8222
#> 10 42019 Butler Cou… 6118 468 37577 NA 2121
#> # ℹ 3 more variables: B11012_010M <dbl>, B11012_015E <dbl>, B11012_015M <dbl>
(First 10 rows and columns are shown, with the rest of columns being other census variables for SVI calculation.)
#> Rows: 67
#> Columns: 63
#> $ GEOID <chr> "42001", "42003", "42005", "42007", "42009", "42011", "420…
#> $ NAME <chr> "Adams County, Pennsylvania", "Allegheny County, Pennsylva…
#> $ E_TOTPOP <dbl> 102627, 1218380, 65356, 164781, 48154, 419062, 122495, 607…
#> $ E_HU <dbl> 42525, 602416, 32852, 79587, 24405, 167514, 56960, 30691, …
#> $ E_HH <dbl> 39628, 545695, 28035, 72086, 19930, 156389, 51647, 25084, …
#> $ E_POV150 <dbl> 13573, 212117, 13566, 28766, 10130, 77317, 27397, 13731, 5…
#> $ E_UNEMP <dbl> 2049, 32041, 1735, 4249, 1033, 12196, 2765, 1331, 14477, 4…
#> $ E_HBURD <dbl> 9088, 133524, 5719, 15764, 3952, 40982, 12146, 5520, 57197…
#> $ E_NOHSDP <dbl> 7788, 45708, 3973, 7546, 3996, 36488, 7292, 4395, 25651, 6…
#> $ E_UNINSUR <dbl> 5656, 46333, 2632, 6242, 3310, 25627, 6155, 3992, 25208, 6…
#> $ E_AGE65 <dbl> 20884, 230745, 14496, 35351, 10950, 72293, 25372, 12948, 1…
#> $ E_AGE17 <dbl> 20663, 228296, 12516, 31915, 9386, 93714, 24920, 13358, 12…
#> $ E_DISABL <dbl> 13860, 163671, 11431, 25878, 7797, 57961, 20278, 8731, 653…
#> $ E_SNGPNT <dbl> 1719, 29689, 1159, 4167, 681, 10507, 3096, 1397, 11396, 29…
#> $ E_LIMENG <dbl> 1318, 9553, 130, 606, 64, 16570, 388, 172, 11502, 449, 185…
#> $ E_MINRTY <dbl> 11624, 269795, 2096, 18205, 1672, 123611, 7120, 2733, 1089…
#> $ E_MUNIT <dbl> 821, 82729, 1180, 4563, 635, 11010, 3629, 1011, 25508, 660…
#> $ E_MOBILE <dbl> 2882, 4147, 3289, 3012, 3491, 4628, 4094, 4419, 4764, 6464…
#> $ E_CROWD <dbl> 468, 4697, 238, 693, 217, 1878, 451, 472, 2916, 489, 446, …
#> $ E_NOVEH <dbl> 1726, 72338, 2058, 5824, 961, 13331, 4216, 2086, 11711, 49…
#> $ E_GROUPQ <dbl> 4140, 33976, 795, 2933, 481, 13171, 3289, 736, 9462, 5592,…
#> $ EP_POV150 <dbl> 13.8, 17.9, 21.0, 17.7, 21.4, 19.0, 22.9, 22.9, 9.7, 13.2,…
#> $ EP_UNEMP <dbl> 3.9, 4.9, 5.5, 5.1, 4.5, 5.6, 4.7, 4.7, 4.2, 4.6, 5.2, 10.…
#> $ EP_HBURD <dbl> 22.9, 24.5, 20.4, 21.9, 19.8, 26.2, 23.5, 22.0, 23.8, 19.4…
#> $ EP_NOHSDP <dbl> 10.8, 5.2, 8.2, 6.2, 11.3, 12.8, 8.3, 10.2, 5.7, 4.6, 8.0,…
#> $ EP_UNINSUR <dbl> 5.6, 3.8, 4.1, 3.8, 6.9, 6.2, 5.1, 6.6, 4.1, 3.3, 4.1, 3.2…
#> $ EP_AGE65 <dbl> 20.3, 18.9, 22.2, 21.5, 22.7, 17.3, 20.7, 21.3, 18.7, 18.8…
#> $ EP_AGE17 <dbl> 20.1, 18.7, 19.2, 19.4, 19.5, 22.4, 20.3, 22.0, 20.4, 20.0…
#> $ EP_DISABL <dbl> 13.7, 13.6, 17.6, 15.8, 16.3, 14.0, 16.8, 14.5, 10.5, 12.8…
#> $ EP_SNGPNT <dbl> 4.3, 5.4, 4.1, 5.8, 3.4, 6.7, 6.0, 5.6, 4.7, 3.8, 5.3, 8.1…
#> $ EP_LIMENG <dbl> 1.4, 0.8, 0.2, 0.4, 0.1, 4.2, 0.3, 0.3, 1.9, 0.3, 0.1, 0.0…
#> $ EP_MINRTY <dbl> 11.3, 22.1, 3.2, 11.0, 3.5, 29.5, 5.8, 4.5, 17.4, 5.6, 7.6…
#> $ EP_MUNIT <dbl> 1.9, 13.7, 3.6, 5.7, 2.6, 6.6, 6.4, 3.3, 10.1, 7.9, 5.7, 2…
#> $ EP_MOBILE <dbl> 6.8, 0.7, 10.0, 3.8, 14.3, 2.8, 7.2, 14.4, 1.9, 7.7, 4.7, …
#> $ EP_CROWD <dbl> 1.2, 0.9, 0.8, 1.0, 1.1, 1.2, 0.9, 1.9, 1.2, 0.6, 0.8, 1.2…
#> $ EP_NOVEH <dbl> 4.4, 13.3, 7.3, 8.1, 4.8, 8.5, 8.2, 8.3, 4.9, 6.4, 11.0, 9…
#> $ EP_GROUPQ <dbl> 4.0, 2.8, 1.2, 1.8, 1.0, 3.1, 2.7, 1.2, 1.5, 3.0, 5.1, 1.7…
#> $ EPL_POV150 <dbl> 0.0758, 0.2727, 0.5303, 0.2424, 0.5606, 0.3788, 0.6818, 0.…
#> $ EPL_UNEMP <dbl> 0.1212, 0.4242, 0.6818, 0.5000, 0.2576, 0.6970, 0.3636, 0.…
#> $ EPL_HBURD <dbl> 0.5303, 0.6970, 0.2424, 0.4394, 0.1970, 0.8636, 0.5909, 0.…
#> $ EPL_NOHSDP <dbl> 0.7273, 0.0152, 0.2424, 0.1061, 0.8182, 0.9091, 0.2727, 0.…
#> $ EPL_UNINSUR <dbl> 0.5152, 0.1061, 0.1364, 0.1061, 0.7424, 0.6667, 0.3939, 0.…
#> $ EPL_AGE65 <dbl> 0.4848, 0.2727, 0.7879, 0.7121, 0.8788, 0.0909, 0.5606, 0.…
#> $ EPL_AGE17 <dbl> 0.5909, 0.1970, 0.2576, 0.3333, 0.3939, 0.9091, 0.6212, 0.…
#> $ EPL_DISABL <dbl> 0.2576, 0.2273, 0.7727, 0.5000, 0.5909, 0.3333, 0.6667, 0.…
#> $ EPL_SNGPNT <dbl> 0.2273, 0.6364, 0.1515, 0.7424, 0.0455, 0.8636, 0.7879, 0.…
#> $ EPL_LIMENG <dbl> 0.7576, 0.6515, 0.0909, 0.2879, 0.0303, 0.9697, 0.1667, 0.…
#> $ EPL_MINRTY <dbl> 0.6515, 0.8636, 0.0303, 0.6364, 0.0455, 0.9242, 0.2879, 0.…
#> $ EPL_MUNIT <dbl> 0.1515, 0.9545, 0.4242, 0.6970, 0.1970, 0.7727, 0.7576, 0.…
#> $ EPL_MOBILE <dbl> 0.4394, 0.0303, 0.6818, 0.2121, 0.9091, 0.1515, 0.5000, 0.…
#> $ EPL_CROWD <dbl> 0.4091, 0.1818, 0.0909, 0.2576, 0.3333, 0.4091, 0.1818, 0.…
#> $ EPL_NOVEH <dbl> 0.0000, 0.9848, 0.4545, 0.5909, 0.0455, 0.6818, 0.6061, 0.…
#> $ EPL_GROUPQ <dbl> 0.6667, 0.4697, 0.0758, 0.2879, 0.0455, 0.5455, 0.4394, 0.…
#> $ SPL_theme1 <dbl> 1.9698, 1.5152, 1.8333, 1.3940, 2.5758, 3.5152, 2.3029, 2.…
#> $ SPL_theme2 <dbl> 2.3182, 1.9849, 2.0606, 2.5757, 1.9394, 3.1666, 2.8031, 2.…
#> $ SPL_theme3 <dbl> 0.6515, 0.8636, 0.0303, 0.6364, 0.0455, 0.9242, 0.2879, 0.…
#> $ SPL_theme4 <dbl> 1.6667, 2.6211, 1.7272, 2.0455, 1.5304, 2.5606, 2.4849, 2.…
#> $ RPL_theme1 <dbl> 0.2424, 0.1667, 0.1970, 0.1364, 0.5455, 0.9242, 0.3636, 0.…
#> $ RPL_theme2 <dbl> 0.3788, 0.2121, 0.2273, 0.5758, 0.1667, 0.9091, 0.6970, 0.…
#> $ RPL_theme3 <dbl> 0.6515, 0.8636, 0.0303, 0.6364, 0.0455, 0.9242, 0.2879, 0.…
#> $ RPL_theme4 <dbl> 0.1212, 0.5606, 0.1515, 0.2576, 0.0455, 0.5152, 0.4848, 0.…
#> $ SPL_themes <dbl> 6.6062, 6.9848, 5.6514, 6.6516, 6.0911, 10.1666, 7.8788, 8…
#> $ RPL_themes <dbl> 0.2273, 0.2879, 0.0909, 0.2424, 0.1667, 0.9545, 0.5152, 0.…
To find SVI for custom geographic boundaries:
cz_svi <- find_svi_x(
year = 2020,
geography = "county",
xwalk = cty_cz_2020_xwalk #county-commuting zone crosswalk
)
…where xwalk
is supplied by users to define the relationship between a Census geography (‘GEOID’) and the custom geographic level (‘GEOID2’). The Census geography should be fully nested in the custom geographic level of interest. As an example, first 10 rows of the county-commuting zone crosswalk are shown below:
cty_cz_2020_xwalk %>% head(10)
#> GEOID GEOID2
#> 1 01069 3
#> 2 01023 9
#> 3 01005 3
#> 4 01107 4
#> 5 01033 10
#> 6 04012 37
#> 7 04001 32
#> 8 05081 55
#> 9 05121 46
#> 10 06037 37
With the crosswalk, county-level census data are aggregated to the commuting zone-level, and SVI is calculated for each commuting zone. Below shows the overall and theme-specific SVI of the first 10 rows, with GEOIDs representing the commuting zone IDs.
#> # A tibble: 10 × 6
#> GEOID RPL_theme1 RPL_theme2 RPL_theme3 RPL_theme4 RPL_themes
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0.778 0.833 0.885 0.730 0.826
#> 2 2 0.734 0.436 0.698 0.388 0.625
#> 3 3 0.871 0.892 0.703 0.570 0.833
#> 4 4 0.881 0.498 0.838 0.947 0.876
#> 5 5 0.560 0.675 0.684 0.333 0.606
#> 6 6 0.799 0.813 0.605 0.302 0.720
#> 7 7 0.821 0.680 0.802 0.875 0.842
#> 8 8 0.694 0.888 0.438 0.0842 0.570
#> 9 9 0.899 0.969 0.838 0.918 0.962
#> 10 10 0.357 0.507 0.589 0.134 0.335