SVI Calculation - Process multiple SVI calculations with findSVI and purrr


install.packages("devtools")
devtools::install_github("heli-xu/findSVI")

Make a table for the requests

Based on the GitHub issue, we’ll put the request information in a table so it is easier to access and manipulate later.


library(tidyverse)
df_request <- tribble(
  ~state, ~year, ~n,
  "AZ", 2015, 8,
  "AZ", 2016, 2,
  "AZ", 2017, 2,
  "AZ", 2018, 2,
  "AZ", 2019, 3,
  "FL", 2014, 4,
  "FL", 2015, 8,
  "GA", 2015, 8,
  "GA", 2016, 2,
  "GA", 2017, 2,
  "GA", 2018, 2,
  "GA", 2019, 3,
  "KY", 2012, 4,
  "KY", 2013, 4,
  "KY", 2014, 4,
  "KY", 2015, 8,
  "MA", 2013, 4,
  "MA", 2014, 4,
  "MA", 2015, 8,
  "MA", 2016, 2,
  "MA", 2017, 2,
  "NJ", 2012, 4,
  "NJ", 2013, 4,
  "NJ", 2014, 4,
  "NJ", 2015, 8,
  "NJ", 2016, 2,
  "NJ", 2017, 2,
  "NY", 2017, 4,
  "NY", 2018, 4
) %>%
  select(-n)

df_request

state	year
AZ	2015
AZ	2016
AZ	2017
AZ	2018
AZ	2019
FL	2014
FL	2015
GA	2015
GA	2016
GA	2017
GA	2018
GA	2019
KY	2012
KY	2013
KY	2014
KY	2015
MA	2013
MA	2014
MA	2015
MA	2016
MA	2017
NJ	2012
NJ	2013
NJ	2014
NJ	2015
NJ	2016
NJ	2017
NY	2017
NY	2018

Testing with one request entry

Using findSVI, we retrieve ZCTA-level census data for AZ in 2018 and calculate SVI from that data. Here we’ll keep only the ZCTAs (GEOIDs) and RPLs (SVI), leaving out the individual SVI variables and intermediate rankings.


library(findSVI)
# One-time setup: register a Census API key (census_api_key() comes from tidycensus)
# tidycensus::census_api_key("YOUR KEY GOES HERE")
data <- findSVI::get_census_data(2018, "zcta", "AZ")
result <- findSVI::get_svi(2018, data) %>% 
  select(GEOID, contains('RPL_theme')) %>% 
  glimpse()

Rows: 405
Columns: 6
$ GEOID      <chr> "85003", "85004", "85006", "85007", "85008", "85009", "8501…
$ RPL_theme1 <dbl> 0.4025, 0.6177, 0.8405, 0.7722, 0.7646, 0.8633, 0.1418, 0.2…
$ RPL_theme2 <dbl> 0.0684, 0.0380, 0.3722, 0.7266, 0.2506, 0.5063, 0.0962, 0.1…
$ RPL_theme3 <dbl> 0.7063, 0.6076, 0.9038, 0.8506, 0.8886, 0.9620, 0.2253, 0.5…
$ RPL_theme4 <dbl> 0.7873, 0.8962, 0.9544, 0.9570, 0.9620, 0.9899, 0.6658, 0.7…
$ RPL_themes <dbl> 0.4911, 0.5848, 0.9089, 0.9013, 0.8532, 0.9595, 0.2582, 0.4…

It looks like findSVI is working.
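
Beyond eyeballing the output, a couple of quick checks can back up that impression. The sketch below is only an illustration: `check_svi()` is a hypothetical helper (not part of findSVI), the four rows are copied from the `glimpse()` output above, and it assumes that RPL values are percentile ranks between 0 and 1 with missing values stored as `NA`.

```r
library(dplyr)
library(tibble)

# First four rows copied from the glimpse() output above; in practice
# you would pass the full `result` tibble instead.
result_head <- tribble(
  ~GEOID,  ~RPL_theme1, ~RPL_theme2, ~RPL_theme3, ~RPL_theme4, ~RPL_themes,
  "85003", 0.4025, 0.0684, 0.7063, 0.7873, 0.4911,
  "85004", 0.6177, 0.0380, 0.6076, 0.8962, 0.5848,
  "85006", 0.8405, 0.3722, 0.9038, 0.9544, 0.9089,
  "85007", 0.7722, 0.7266, 0.8506, 0.9570, 0.9013
)

# check_svi() is a hypothetical helper, not a findSVI function.
check_svi <- function(x) {
  rpl <- select(x, starts_with("RPL_theme"))
  list(
    # each ZCTA should appear once per state-year result
    unique_geoid = !any(duplicated(x$GEOID)),
    # assumption: RPL columns hold values in [0, 1], NA when missing
    rpl_in_range = all(vapply(
      rpl,
      function(col) all(col >= 0 & col <= 1, na.rm = TRUE),
      logical(1)
    ))
  )
}

checks <- check_svi(result_head)
checks
```

If either element comes back `FALSE` on the real `result`, that would be worth investigating before running the full set of requests.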

Iterating all entries with purrr

Now we’ll iterate through all state-year combinations with findSVI, applying a purrr-style function to one request row at a time.
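
Since each request hits the Census API, it can help to dry-run the iteration pattern offline first. The sketch below is one possible way to do that with `purrr::pmap()` and `purrr::possibly()`; `fake_svi()` is a hypothetical stand-in for the two findSVI calls, and the "XX" row is made up to exercise the failure path. It is not the approach used for the actual results in this post.

```r
library(dplyr)
library(purrr)
library(tibble)

requests <- tribble(
  ~state, ~year,
  "AZ", 2015,
  "FL", 2014,
  "XX", 2015   # deliberately invalid, to exercise the failure path
)

# fake_svi() is a hypothetical stand-in for get_census_data() + get_svi():
# it returns a tiny tibble and errors for an unknown state abbreviation.
fake_svi <- function(state, year) {
  if (!state %in% state.abb) stop("unknown state: ", state)
  tibble(
    GEOID = c("00001", "00002"),
    RPL_themes = c(0.25, 0.75),
    year = year,
    state = state
  )
}

# possibly() turns an error into NULL, so one bad request
# does not stop the whole run.
safe_svi <- possibly(fake_svi, otherwise = NULL)

# pmap() passes the columns of `requests` to the function by name;
# bind_rows() drops the NULL left by the failed request.
dry_run <- pmap(requests, safe_svi) %>%
  bind_rows()

dry_run
```

Swapping the body of `fake_svi()` for the real findSVI calls would give a purrr-based version of the same loop, with failed requests skipped rather than aborting the run.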


library(cli)

all_result <- df_request %>%
  # one group per row, so each state-year request is processed on its own
  group_by(row_number()) %>%
  group_map( ~ {
    year_tmp <- .x$year
    state_tmp <- .x$state
    cli_alert("starting pull for {state_tmp} - {year_tmp}")
    # retrieve ZCTA-level census data, then calculate SVI for this request
    data <- findSVI::get_census_data(year_tmp, "zcta", state_tmp)
    result <- findSVI::get_svi(year_tmp, data) %>%
      select(GEOID, contains('RPL_theme')) %>%
      # tag rows with their request so results stay identifiable after binding
      mutate(year  = year_tmp, state = state_tmp)
    cli_alert_success("Finished pull for {state_tmp} - {year_tmp}")
    return(result)
  }) %>% bind_rows()

GEOID	RPL_theme1	RPL_theme2	RPL_theme3	RPL_theme4	RPL_themes	year	state
85003	0.5628	0.0477	0.7211	0.9724	0.6658	2015	AZ
85004	0.5327	0.0653	0.6457	0.9146	0.5955	2015	AZ
85006	0.8819	0.5578	0.9020	0.9598	0.9397	2015	AZ
85007	0.8116	0.7839	0.8819	0.9774	0.9523	2015	AZ
85008	0.7764	0.2060	0.8920	0.9673	0.8568	2015	AZ
85009	0.8844	0.5327	0.9573	0.9950	0.9724	2015	AZ
85012	0.2136	0.0553	0.4271	0.6457	0.3065	2015	AZ
85013	0.2663	0.0955	0.5804	0.7990	0.4146	2015	AZ
85014	0.4950	0.2638	0.6985	0.8342	0.6307	2015	AZ
85015	0.8040	0.6256	0.8467	0.9296	0.9070	2015	AZ
85016	0.4070	0.1131	0.6281	0.8442	0.5402	2015	AZ
85017	0.8543	0.6583	0.9246	0.9899	0.9648	2015	AZ
85018	0.2513	0.3593	0.5653	0.6533	0.4171	2015	AZ
85019	0.9196	0.6859	0.9296	0.7839	0.9196	2015	AZ
85020	0.4271	0.3166	0.6432	0.8065	0.5829	2015	AZ

The first 15 rows of the result table are shown. In this table, “GEOID” represents the ZCTA, and the “RPL_” columns are the corresponding theme-specific SVI and overall SVI. While results for all requests are combined in one table, the ranking and calculation are done separately for each request entry (state-year combination). A complete SVI table, including the individual variables in each theme, can be obtained with findSVI for a specific state-year combination.
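
To see why ranking within each request matters, here is a toy illustration with made-up scores. It uses `dplyr::percent_rank()` as a stand-in for a percentile ranking, which is not necessarily the formula findSVI uses, and compares ranking within each state-year against ranking over the pooled table.

```r
library(dplyr)
library(tibble)

# Made-up scores for two hypothetical requests
toy <- tribble(
  ~state, ~year, ~GEOID, ~score,
  "AZ", 2015, "a1", 10,
  "AZ", 2015, "a2", 20,
  "AZ", 2015, "a3", 30,
  "GA", 2015, "g1", 25,
  "GA", 2015, "g2", 35,
  "GA", 2015, "g3", 45
)

ranked <- toy %>%
  group_by(state, year) %>%
  # rank within each state-year, as in the combined table above
  mutate(rank_within = percent_rank(score)) %>%
  ungroup() %>%
  # rank across all rows at once, for comparison only
  mutate(rank_pooled = percent_rank(score))

ranked
```

In the toy data, each state's lowest-scoring ZCTA gets a within-group rank of 0 and its highest gets 1, while the pooled ranks mix the two states together. The values in the combined table should therefore be read relative to other ZCTAs in the same state and year, not across the whole table.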