layout: true <div class="my-footer"><span><a href="https://dajmcdon.github.io/epi-modelling-calgary-2023" style="color:white">dajmcdon.github.io/epi-modelling-calgary-2023</a></span></div> --- background-image: url("gfx/cover-1.svg") background-size: contain background-position: top <br/><br/><br/><br/><br/><br/><br/><br/><br/><br/> # Statistical approaches to epidemic forecasting ##
Evaluation and software <br> .pull-left[ ###Daniel J. McDonald ###University of British Columbia #### 10 February 2023 ] .pull-right.center[ ![:scale 30%](gfx/qrcodes-1.png) <br> ] --- ## Mathematical modelling of disease / epidemics is very old * .tertiary[Daniel Bernoulli (1760)] - studies inoculation against smallpox * .tertiary[John Snow (1855)] - cholera epidemic in London tied to a water pump * .tertiary[Ronald Ross (1902)] - Nobel Prize in Medicine for work on malaria * .tertiary[Kermack and McKendrick (1927-1933)] - basic epidemic (mathematical) model .center[ ![:scale 50%](https://www-tc.pbs.org/wgbh/masterpiece/wp-content/uploads/2019/02/victoria-s3-e4-historical-01-3200x1800.jpg) ] --- ## Forecasting is also old, but not that old ### US CDC Flu Challenge began in 2013 .center[ ![:scale 60%](https://www.cdc.gov/flu/images/weekly/flu-forecasting.jpg?_=29012) ] > CDC’s Influenza Division has collaborated each flu season with .tertiary[external researchers] on flu forecasting. <br>CDC has provided forecasting teams data, relevant public health forecasting targets, and forecast accuracy metrics while .tertiary[teams submit their forecasts], which are based on a variety of methods and data sources, each week. --- ## The Covid-19 Pandemic * CDC pivoted their Flu Challenge to Covid-19 in June 2020 * Similar efforts in Germany and then Europe .pull-left[ ![](https://covid19forecasthub.org/images/forecast-hub-logo_DARKBLUE.png) ] .pull-right[ ![:scale 50%](https://covid19forecasthub.eu/images/logo.svg) ] * .secondary[Nothing similar for Canada] * CDC now has Flu and Covid simultaneously --- ## Why the Forecast Hubs? * Collect public forecasts in a standard format and visualize * Used internally by CDC * Turns out, most individual teams' forecasts are ... not great * Combine submissions into an "Ensemble" ![https://doi.org/10.1371/journal.pcbi.1007486](gfx/pnas.png) --- # Outline .large[ .tertiary[0.] .grey[History and background] .tertiary[1.] Standard forecasting models. .tertiary[2.] Forecast evaluation .tertiary[3.] Gaps in forecasting .tertiary[4.] Some lessons learned .tertiary[5.] Software for the community ] --- class: inverse, middle, center # Standard forecasting models --- <img src="gfx/submitted-hosp-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## Quick overview of types .pull-left[ * BPagano - SIR * CMU - Time Series * Colorado (4 versions) - SEIR, different scenarios * Georgia Tech - Deep learning... * IHME (rarely) - SEIR * Johns Hopkins (3 in Applied Physics) - SEIR, Time series, ensemble * Johns Hopkins (other, Infection Dynamics) - SEIR ] .pull-right[ * Dean Karlen (Physicist at UVic) - SEIR type * MOBS-Gleam - Agent based mobility model * USC - SEIR * UVA - Ensemble of VAR, LSTM, SEIR * Prolix - "Offsets obtained by correlations, best linear approximation of reproduction rates (using vaccination approximation) by least euclidean distance, and linear prediction." ] .large.fourth-color[SIR-type] .large.tertiary[Time series] .secondary.large[Ensemble / Deep Learning / ML] --- ## Time series (simplification) > On day `\(t\)`, model hospitalizations at day `\(t+a\)` as a function of today's hospitalizations and those in the recent past. .secondary[For example:] `$$\widehat{y}_{t+a} = a_{0} y_{t} + a_{7}y_{t-7} + a_{14} y_{t-14} + b_0 x_t + b_{7} x_{t-7}$$` .secondary[COVIDhub Baseline:] `$$\widehat{y}_{t+a} = y_{t}$$` * Seemingly, straightforward to do. * Performs surprisingly well, most of the time. * Not so good when you'd most want it to work. * Uses essentially no information about epidemics. --- ## SIR-type (compartmental) models - Stochastic Version .pull-left-wide[ Suppose each of N people in a bucket at time t: 1. .large[.primary[Susceptible(t)]] : not sick, but could get sick 1. .large[.secondary[Infected(t)]] : sick, can make others sick 1. .large[.tertiary[Removed(t)]] : either recovered or dead; not sick, can't get sick, can't make anyone sick <hr> **Assume**: During period h, each .primary[S] meets kh people. **Assume**: Prob( .primary[S] meets .secondary[I] and becomes .secondary[I] ) = c. .fourth-color[So]: Prob( .primary[S] `\(\rightarrow\)` .secondary[I] ) = `\(1 - (1 - c I(t) / N )^{hk} \approx kchI(t) / N\)` .fourth-color[So]: New .secondary[I] in time h `\(\sim Binom(S(t),\ kchI(t) / N)\)` **Assume**: Prob( .secondary[I] `\(\rightarrow\)` .tertiary[R]) = `\(\gamma h\)` .fourth-color[So]: new removals in time h `\(\sim Binom(I(t),\ \gamma h)\)` ] .pull-right-narrow[
] --- ## SIR-type (compartmental) models - Stochastic Version .pull-left-wide[ ### Over-all equations: `\begin{aligned} C(t+h) & = \mathrm{Binom}\left(S(t),\ \frac{\beta}{N} h I(t)\right)\\ D(t+h) & = \mathrm{Binom}\left(I(t),\ \gamma h\right)\\ S(t+h) & = S(t) - C(t+h)\\ I(t+h) & = I(t) + C(t+h) - D(t+h)\\ R(t+h) & = R(t) + D(t+h) \end{aligned}` ### In the deterministic limit, `\(N\rightarrow\infty,\ h\rightarrow 0\)` `\begin{aligned} N &= S(0) + I(0) + R(0)\\ \frac{dS}{dt} & = -\frac{\beta}{N} S(t)I(t)\\ \frac{dI}{dt} & = \frac{\beta}{N} I(t)S(t) - \gamma I(t)\\ \frac{dR}{dt} & = \gamma I(t) \end{aligned}` **"_the_ SIR model"** is often ambiguous between these ] .pull-right-narrow[
] --- ## Data issues - .tertiary[Ideally] we'd see .large[.primary[S], .secondary[I], .tertiary[R]] at all times - Easier to observe new infections, .large[.secondary[I(t+h) - I(t)]] - Removals by death are easier to observe than removals by recovery, so we mostly see .large[.tertiary[(R(t+h) - R(t))] × (death rate)] - The interval between measurements, say `\(\Delta\)`, is often `\(\gg h\)` - Measuring .large[.secondary[I(t)]] and .large[.tertiary[R(t)]] (or their rates of change) is hard + testing/reporting is sporadic and error prone + Need to model test error (false positives, false negatives) _and_ who gets tested + Need to model lag between testing and reporting - Parameters (especially, `\(\beta\)`) change during the epidemic + Changing behavior, changing policy, environmental factors, vaccines, variants, ... --- ## Connecting to Data - Likelihood calculations are straightforward if we can measure .large[.secondary[I(t)], .tertiary[R(t)]] at all times 0, h, 2h, … T - Or .large[.secondary[I(0)]], .large[.tertiary[R(0)]] and all the increments .large[.secondary[I(t+h) - I(t)], .tertiary[R(t+h) - R(t)]] - Still have to optimize numerically - Likelihood calculations already become difficult if the time between observations `\(\Delta \gg h\)` + Generally, `\(\Delta \approx\)` 1 day + In principle, this just defines another Markov process, with a longer interval `\(\Delta\)` between steps, but to get the likelihood of a `\(\Delta\)` step we have to sum over all possible paths of `\(h\)` steps adding up to it - Other complications if we don't observe all the compartments, and/or have a lot of noise in our observations + We don't and we do. --- ## Connecting to Data .pull-left[ - Often more tractable to avoid likelihood (Conditional least squares, simulation-based inference) - Intrinsic issue: Initially, everything just looks exponential + So it's hard to discriminate between distinct models + So even assuming an SIR model, it's easier to estimate `\(\beta - \gamma\)` than `\((\beta, \gamma)\)` or `\(\beta/\gamma\)` - Can sometimes **calibrate** or fix the parameters based on other sources + E.g., `\(1/\gamma =\)` average time someone is infectious, which could be determined from clinical studies / observations ] .pull-right[ <blockquote class="twitter-tweet"><p lang="en" dir="ltr">I have been thinking about how different people interpret data differently. And made this xkcd style graphic to illustrate this. <a href="https://t.co/a8LvlmZxT7">pic.twitter.com/a8LvlmZxT7</a></p>— Jens von Bergmann (@vb_jens) <a href="https://twitter.com/vb_jens/status/1372251931444350976?ref_src=twsrc%5Etfw">March 17, 2021</a></blockquote> ] --- ## These models fit well "in sample" .pull-left[ * Track observed cases closely (they should) * Can provide nuanced policy advice on some topics * Many questions depend on modulating `\(\beta\)` 1. What happens if we lock down? 2. What happens if we mask? 3. What happens if we have school online? 4. Vaccine passport? * Vaccination modeling is easier, directly removes susceptibles * What about out-of-sample? ] .pull-right[.center[
]] --- ## What does this "look like"? ```r sim_SIR <- function(TT, N = 1000, beta = .1, gamma = .01) { S <- double(TT); I <- double(TT); R <- double(TT) S[1] <- N - 1; I[1] <- 1; R[1] <- 0 for (tt in 2:TT) { contagions <- rbinom(1, size = S[tt - 1], prob = beta * I[tt - 1] / N) removals <- rbinom(1, size = I[tt - 1], prob = gamma) S[tt] <- S[tt - 1] - contagions I[tt] <- I[tt - 1] + contagions - removals R[tt] <- R[tt - 1] + removals } tibble(S = S, I = I, R = R, time = 1:TT) } ``` --- ## What does this "look like"? <img src="gfx/sim-sir-plot-1.svg" width="100%" style="display: block; margin: auto;" /> --- class: inverse, middle, center # So how do they do? <br><br> Out-of-sample COVID Forecasting --- ## The forecast format * Various targets .secondary[incident deaths], .tertiary[incident cases], .fourth-color[hospital admissions] * Produce point forecasts * Quantiles to measure uncertainty (set of 23) * Predict 1–4 epiweeks or 1–28 days ahead <img src="gfx/unnamed-chunk-1-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## Forecast evaluation [Weighted Interval Score (Bracher et al., 2000)](https://arxiv.org/abs/2005.12881) `$$WIS = \frac{2}{K} \sum_{k=1}^K \max\{\tau_k (y - q_k),\ (1-\tau_k)(q_k - y)\}$$` .pull-left[ * Calculated for each target (forecast date, location, horizon) * For each Interval: 1. Width of interval. 1. Under prediction (AE to the top) 1. Over prediction (AE to the bottom) * Weighted average using probability content * Mathematically equivalent to an average of quantile losses * Discrete approximation of CRPS ] .pull-right[ <img src="gfx/unnamed-chunk-2-1.svg" width="100%" style="display: block; margin: auto;" /> ] --- ## Comparing across space and time .pull-left[ * Error proportional to cases * Large right skew * "Adjust" by scaling to a Baseline * Baseline is flat line + residual quantiles <img src="gfx/wis-map-1.svg" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ <img src="gfx/wis-densities-1.svg" width="100%" style="display: block; margin: auto;" /> ] --- ## Performance over time (Incident Deaths) <img src="gfx/death-scores-all-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## Overall performance (Incident Deaths) <img src="gfx/overall-death-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## What about Incident Cases? <img src="gfx/overall-cases-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## What about Hospital Admissions? <img src="gfx/overall-hosp-1.svg" width="100%" style="display: block; margin: auto;" /> --- class: middle, center, h2 .large[Live evaluation dashboard] <https://delphi.cmu.edu/forecast-eval/> --- class: middle, center, inverse # Gaps in forecasting --- ## Predicting surges in Florida <iframe src="https://ubc-stat-dajmcdon.shinyapps.io/fl-fcast-visualizer/" width="100%" height="700px" style="border:none"></iframe> --- ## PI Coverage <img src="gfx/coverage-plot-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## The terrible case of Ohio <img src="gfx/plot-oh-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## The terrible case of Ohio <img src="gfx/bad-forecast-1.svg" width="100%" style="display: block; margin: auto;" /> --- class: inverse, middle, center # Forecasting lessons --- ## Forecasting lessons Human in the loop <img src="gfx/manual-corrections.png" width="100%" height="90%" style="display: block; margin: auto;" /> --- ## Forecasting lessons Human in the loop <img src="gfx/quality-control.png" width="90%" style="display: block; margin: auto;" /> --- ## Forecasting lessons Simple models [M, Bien, Green, Hu, ..., Tibshirani](http://medrxiv.org/content/10.1101/2021.06.22.21259346v1) <img src="gfx/add-hrr-1.svg" width="100%" style="display: block; margin: auto;" /> --- class: inverse, middle, center # Software: signal processing and simple models --- ## `{epiprocess}` .secondary[<https://cmu-delphi.github.io/epiprocess/>] 1. Data types * `epi_df` — basically a `data.frame` but with important meta information 1. `as_of` tag to denote the data vintage 2. time type 3. geo type 4. additional keys (e.g., age, gender, race) * `epi_archive` — a collection of `epi_df`s of different vintages * But stored so as to eliminate redundancies * Allows for important operations (filling, merging, snapshots, ) 2. Fundamental functionality — the `slide()` * Rolling correlations * Moving averages * outlier correction * arbitrary functions --- ## Example of `slide()` ```r x <- x %>% # covid cases in 2 locations downloaded with `{epidatr}` group_by(geo_value) %>% mutate(smooth_spline = growth_rate(time_value, cases, method = "smooth_spline"), trend_filter = growth_rate(time_value, cases, method = "trend_filter")) ``` <img src="gfx/unnamed-chunk-8-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## Example of `slide()` ```r x <- x %>% # covid cases in 2 locations downloaded with `{epidatr}` group_by(geo_value) %>% mutate(smooth_spline = growth_rate(time_value, cases, method = "smooth_spline"), trend_filter = growth_rate(time_value, cases, method = "trend_filter")) ``` <img src="gfx/unnamed-chunk-10-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## Outlier detection ```r x <- x %>% mutate(outliers = detect_outlr_rm(time_value, cases)) # uses rolling median, but lots of fancy alternative options ``` <img src="gfx/unnamed-chunk-13-1.svg" width="100%" style="display: block; margin: auto;" /> --- ## Examining an `epi_archive` <img src="gfx/unnamed-chunk-14-1.svg" width="100%" style="display: block; margin: auto;" /> * Easy to inspect revision behaviour * Can use it for pseudo-prospective forecasts --- class: middle, center, inverse # `{epipredict}` <https://cmu-delphi.github.io/epipredict> --- ### .tertiary[A set of basic, easy-to-use forecasters that work out of the box.] You can do a limited amount of customization. We currently provide: - Baseline flat-line forecaster - Autoregressive forecaster (not an "AR" model, you don't want this) - Autoregressive classifier ### .tertiary[A framework for creating custom forecasters out of modular components.] 1. .secondary[Preprocessor]: do things to the data before model training 2. .secondary[Trainer]: train a model on data, resulting in a fitted model object 3. .secondary[Predictor]: make predictions, using a fitted model object 4. .secondary[Postprocessor]: do things to the predictions before returning .tertiary[A very specialized plug-in to `{tidymodels}`] --- ## Basic autoregressive forecaster * Predict `deaths` 1-week ahead on 0, 7, 14 day lags of cases and deaths. * Use `lm`, produce intervals with residual quantiles. ```r library(epipredict) jhu <- case_death_rate_subset %>% filter(time_value > max(time_value) - 60) canned <- arx_forecaster(jhu, outcome = "death_rate", predictors = c("case_rate", "death_rate")) canned$predictions ``` ``` ## An `epi_df` object, 56 x 6 with metadata: ## * geo_type = state ## * time_type = day ## * as_of = 2022-05-31 12:08:25 ## ## # A tibble: 56 × 6 ## geo_value time_value .pred .pred_distn forecast_date target_date ## * <chr> <date> <dbl> <dist> <date> <date> ## 1 ak 2021-12-31 0.397 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 2 al 2021-12-31 0.262 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 3 ar 2021-12-31 0.410 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 4 as 2021-12-31 0.0983 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 5 az 2021-12-31 0.610 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 6 ca 2021-12-31 0.283 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 7 co 2021-12-31 0.500 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 8 ct 2021-12-31 0.518 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 9 dc 2021-12-31 0.797 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## 10 de 2021-12-31 0.605 [0.05, 0.95]<q-rng> 2021-12-31 2022-01-07 ## # … with 46 more rows ``` --- ### Use random forests instead ```r rf <- arx_forecaster( jhu, outcome = "death_rate", predictors = c("case_rate", "death_rate"), trainer = parsnip::rand_forest(mode = "regression"), # use ranger args_list = arx_args_list( ahead = 14, lags = list(c(0:4, 7, 14), c(0, 7, 14)), levels = c(0.05, 1:9/10, 0.95), quantile_by_key = "geo_value" ) ) ``` ### Or boosting ```r xgb <- arx_forecaster( jhu, outcome = "death_rate", predictors = c("case_rate", "death_rate"), trainer = parsnip::boost_tree(mode = "regression", trees = 20), # use xgboost args_list = arx_args_list( ahead = 14, lags = list(c(0:4, 7, 14), c(0, 7, 14)), levels = c(0.05, 1:9/10, 0.95), quantile_by_key = "geo_value" ) ) ``` --- ## Code a forecaster by hand ```r # A preprocessing recipe describing how to turn raw data into features / response r <- epi_recipe(jhu) %>% step_epi_lag(case_rate, lag = c(0, 1, 2, 3, 7, 14)) %>% step_epi_lag(death_rate, lag = c(0, 7, 14)) %>% step_epi_ahead(death_rate, ahead = 14) %>% step_epi_naomit() # A postprocessing routine describing what to do to the predictions f <- frosting() %>% layer_predict() %>% layer_threshold(starts_with(".pred"), lower = 0) %>% # predictions/intervals should be non-negative layer_add_target_date(target_date = max(jhu$time_value) + 14) # Bundle up the preprocessor, training engine, and postprocessor ewf <- epi_workflow(r, quantile_reg(tau = c(.1, .5, .9)), f) # Fit it to some data (we could fit this to ANY data that has the same format) trained_ewf <- ewf %>% fit(jhu) # examines the recipe to determine what we need to make the prediction latest <- get_test_data(r, jhu) preds <- trained_ewf %>% predict(new_data = latest) ``` --- ## Examine the model ```r extract_fit_engine(trained_ewf) ``` ``` ## Call: ## quantreg::rq(formula = ..y ~ ., tau = ~c(0.1, 0.5, 0.9), data = data, ## na.action = function (object, ...) ## UseMethod("na.omit"), method = "br", model = FALSE) ## ## Coefficients: ## tau= 0.1 tau= 0.5 tau= 0.9 ## (Intercept) -8.982762e-03 -0.0006293589 -0.0003799987 ## lag_0_case_rate 2.062774e-03 0.0022135082 0.0010485960 ## lag_1_case_rate -9.368519e-05 -0.0002539653 0.0005056500 ## lag_2_case_rate 3.388688e-04 0.0001240708 0.0017839004 ## lag_3_case_rate 3.455396e-04 0.0001238113 0.0002803659 ## lag_7_case_rate -4.971177e-04 0.0004952626 0.0002953601 ## lag_14_case_rate 2.430204e-03 0.0040881872 0.0037889962 ## lag_0_death_rate 2.620076e-02 0.1620189335 0.2056693415 ## lag_7_death_rate 3.524037e-02 0.1224601311 0.4249898859 ## lag_14_death_rate 4.888930e-02 0.1396515777 0.4339184700 ## ## Degrees of freedom: 1792 total; 1782 residual ``` --- ## Plot the forecasts at some locations <img src="gfx/unnamed-chunk-20-1.svg" width="100%" style="display: block; margin: auto;" /> --- class: middle, inverse, center # Thoughts and thanks --- ## Some concluding thoughts .pull-left[ * **Data is trouble** 1. We've spent lots of time trying to make data available 2. Even when Public Health doesn't cooperate 3. Nowcasting 4. Low SNR 5. Very nonstationary * **Forecast evaluation is not settled** 1. Not optimized to predict turning points. 2. How do we create ensembles? 3. Better to predict squishy concepts: "surge", "upswing", "big upswing"? ] .pull-right[ * **Hugely important to backtest properly** 1. Data is constantly revised 2. We see up to 10% "improvement" if we use finalized data * **Understanding the spatio-temporal dynamics is open** 1. How do "waves" propagate? 2. How important is mixing? 3. Seasonality? 4. Effects of NPIs? * **True Counterfactual causal inference is open** * **We have a massive survey dataset** ] <hr> .tertiary.center.large[We're building software so others can use it.] --- ## Many thanks * Ryan, Logan, Addison, Rob, Larry, Valerie, Alden, and the rest of [Delphi](https://delphi.cmu.edu/) * Jens, Dean, Dan, Sally and [BC COVID Modelling group](https://bccovid-19group.ca) * Funding to me from NSERC, CANSSI * I get to benefit from the results of funding from Google, Facebook, Amazon, Change Healthcare, Quidel, SafeGraph, Qualtrics, CDC, CSTE * Forecast evaluation from [Reich Lab](http://covid19forecasthub.org) and [Delphi](https://delphi.cmu.edu/forecast-eval/) <hr> <br><br> .center[ .tertiary.large[Questions?] On these or other issues. ]