---
title: "Running a Simulation with metaRVM"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Running a Simulation with metaRVM}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

This vignette demonstrates how to run a `metaRVM` simulation using the example configuration and data files included with the package. Working through it is a good way to get started and to understand the basic workflow.

## Locating the Example Files

The `MetaRVM` package includes a set of example files in its `extdata` directory. To run the example, we first need to locate these files. The `system.file()` function is the recommended way to do this, as it finds the files wherever the package is installed.

```{r}
# Locate the example YAML configuration file
yaml_file <- system.file("extdata", "example_config.yaml", package = "MetaRVM")
print(yaml_file)
```

The `yaml_file` variable now holds the full path to the example configuration file. This file refers to the other example data files (also in the `extdata` directory) by relative paths. The content of the YAML file is shown below.

```yaml
run_id: ExampleRun
population_data:
  mapping: demographic_mapping_n24.csv
  initialization: population_init_n24.csv
  vaccination: vaccination_n24.csv
mixing_matrix:
  weekday_day: m_weekday_day.csv
  weekday_night: m_weekday_night.csv
  weekend_day: m_weekend_day.csv
  weekend_night: m_weekend_night.csv
disease_params:
  ts: 0.5
  tv: 0.25
  ve: 0.4
  dv: 180
  dp: 1
  de: 3
  da: 5
  ds: 6
  dh: 8
  dr: 180
  pea: 0.3
  psr: 0.95
  phr: 0.97
simulation_config:
  start_date: 01/01/2023 # m/d/Y
  length: 150
  nsim: 1
```

## Running the Simulation

Once we have the path to the configuration file, the simulation can be run using the `metaRVM()` function.

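
Before running, it can be worth confirming that the path actually resolved: `system.file()` returns an empty string, rather than raising an error, when the file or package cannot be found. The sketch below shows one way to guard against that. The helper name `check_example_path()` is hypothetical and is not part of the package, and the demonstration uses a file from base R's `stats` package so that it runs without `MetaRVM` installed.

```r
# Hypothetical helper (not part of MetaRVM): fail early on an unresolved path.
check_example_path <- function(path) {
  # system.file() signals "not found" with "", not with an error
  if (!nzchar(path) || !file.exists(path)) {
    stop("Example file not found; is the package installed?", call. = FALSE)
  }
  invisible(path)
}

# A file that ships with every R installation resolves to a real path
found <- system.file("DESCRIPTION", package = "stats")
check_example_path(found)

# A file that does not exist resolves to an empty string
not_found <- system.file("extdata", "no_such_file.yaml", package = "stats")
nzchar(not_found)  # FALSE
```

In this vignette the equivalent check would be `check_example_path(yaml_file)`.
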
```{r, results='hide'}
# Load the MetaRVM package
library(MetaRVM)
options(odin.verbose = FALSE)

# Run the simulation
sim_out <- metaRVM(yaml_file)
```

The `metaRVM()` function parses the YAML file, reads the associated data files, runs the simulation, and returns a `MetaRVMResults` object.

## Deep-dive into `MetaRVM` Classes

### Working with Configuration Files

The simulation can be run by passing a YAML configuration file path directly, or by first creating a `MetaRVMConfig` object.

```{r}
# Load configuration from YAML file
config_obj <- MetaRVMConfig$new(yaml_file)

# Examine the configuration
config_obj
```

### Exploring Configuration Parameters

The `MetaRVMConfig` class provides several methods to explore the simulation arguments:

```{r}
# List all available parameters
param_names <- config_obj$list_parameters()
head(param_names, 10)

# Get a summary of parameter types and sizes
param_summary <- config_obj$parameter_summary()
head(param_summary, 10)
```

### Accessing Demographic Information

One of MetaRVM's key features is demographic stratification and its ability to define parameters for specific demographic strata.

```{r}
# Get demographic categories
age_categories <- config_obj$get_age_categories()
race_categories <- config_obj$get_race_categories()
zones <- config_obj$get_zones()

cat("Age categories:", paste(age_categories, collapse = ", "), "\n")
cat("Race categories:", paste(race_categories, collapse = ", "), "\n")
cat("Geographic zones:", paste(zones, collapse = ", "), "\n")
```

### Alternative Ways to Run the Simulation

```{r}
# Method 1: Direct from file path
# sim_out <- metaRVM(yaml_file)

# Method 2: From MetaRVMConfig object
sim_out <- metaRVM(config_obj)

# Method 3: From parsed configuration list
config_list <- parse_config(yaml_file)
sim_out <- metaRVM(config_list)
```

## Exploring the Results

The `metaRVM()` function returns a `MetaRVMResults` object with formatted, analysis-ready data.

The results are formatted with calendar dates and demographic attributes, and stored in a data frame called `results`:

```{r}
# Look at the structure of formatted results
head(sim_out$results)

# Check unique values for key variables
cat("Disease states:", paste(unique(sim_out$results$disease_state), collapse = ", "), "\n")
cat("Date range:", paste(range(sim_out$results$date), collapse = " to "), "\n")
```

### Data Subsetting and Filtering

The `subset_data()` method provides flexible filtering across all demographic and temporal dimensions. It returns an object of class `MetaRVMResults`.

```{r}
# Subset by a single criterion
hospitalized_data <- sim_out$subset_data(disease_states = "H")
hospitalized_data$results

# Subset by multiple demographic categories
elderly_data <- sim_out$subset_data(
  age = c("65+"),
  disease_states = c("H", "D")
)
elderly_data$results

# Specific date range; it must fall inside the simulated window, which in
# the example configuration starts on 2023-01-01 and has length 150
peak_period <- sim_out$subset_data(
  date_range = c(as.Date("2023-02-01"), as.Date("2023-03-31")),
  disease_states = "H"
)
peak_period$results
```

# Specifying Disease Parameters via Distributions

`metaRVM` allows disease parameters to be specified as distributions, which is useful for capturing uncertainty. When a parameter is defined by a distribution, each simulation instance draws a new value from that distribution. For more details on the available distributions and their parameters, refer to the `yaml-configuration` vignette.

An example YAML file with parameter distributions, `example_config_dist.yaml`, is included in the package.

Here is its content:

```{r}
# Locate the example YAML configuration file with distributions
yaml_file_dist <- system.file("extdata", "example_config_dist.yaml", package = "MetaRVM")
```

```yaml
run_id: ExampleRun_Dist
population_data:
  mapping: demographic_mapping_n24.csv
  initialization: population_init_n24.csv
  vaccination: vaccination_n24.csv
mixing_matrix:
  weekday_day: m_weekday_day.csv
  weekday_night: m_weekday_night.csv
  weekend_day: m_weekend_day.csv
  weekend_night: m_weekend_night.csv
disease_params:
  ts: 0.5
  tv: 0.25
  ve:
    dist: uniform
    min: 0.3
    max: 0.5
  dv: 180
  dp: 1
  de: 3
  da:
    dist: uniform
    min: 4
    max: 6
  ds:
    dist: uniform
    min: 5
    max: 7
  dh:
    dist: lognormal
    mu: 2
    sd: 0.5
  dr: 180
  pea: 0.3
  psr: 0.95
  phr: 0.97
simulation_config:
  start_date: 01/01/2023 # m/d/Y
  length: 150
  nsim: 20 # Increased nsim for meaningful summary statistics
```

To run a simulation with this configuration, we pass the file path to `metaRVM()`.

```{r, results='hide', message=FALSE, warning=FALSE}
# Run the simulation with the new configuration
sim_out_dist <- metaRVM(yaml_file_dist)
```

## Generating Summary Statistics across Demographics

The `MetaRVMResults` class provides basic summarization across multiple simulation instances. This is meaningful when one or more disease parameters are specified via distributions and the configuration requests more than one simulation (`nsim` greater than 1). The `summarize` method returns an object of class `MetaRVMSummary`, which has a `plot` method.

Now that we have run a simulation with parameter distributions, we can use the `summarize` method to see the variability in the results.

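
Conceptually, a summary with `stats = c("median", "quantile")` reduces the replicate trajectories to per-day statistics. The sketch below reproduces that idea in base R on made-up numbers. It is an illustration of the statistic only, not the package's internal implementation, and the object names (`hosp`, `summary_tbl`) are hypothetical.

```r
# Made-up data: 20 replicate trajectories (rows) over 5 days (columns).
# These numbers are illustrative only and do not come from MetaRVM.
set.seed(1)
n_sim <- 20
n_days <- 5
hosp <- matrix(rpois(n_sim * n_days, lambda = 30), nrow = n_sim, ncol = n_days)

# Per-day median across replicates
med <- apply(hosp, 2, median)

# Per-day 5th and 95th percentiles across replicates (a 2 x n_days matrix)
band <- apply(hosp, 2, quantile, probs = c(0.05, 0.95))

summary_tbl <- data.frame(
  day = seq_len(n_days),
  median = med,
  q05 = band[1, ],
  q95 = band[2, ]
)
summary_tbl
```

With a single replicate the band would collapse onto the median, which is why the example configuration raises `nsim` to 20. The package's `summarize` method is applied to the actual simulation output next.
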
```{r, fig.height = 4, fig.width = 8, fig.align = "center"}
library(ggplot2)

# Summarize hospitalizations by age group
hospital_summary_dist <- sim_out_dist$summarize(
  group_by = c("age"),
  disease_states = "n_IsympH",
  stats = c("median", "quantile"),
  quantiles = c(0.05, 0.95)
)

# Plot the summary
hospital_summary_dist$plot() +
  ggtitle("Daily Hospitalizations by Age Group (median, 5th to 95th percentile band)") +
  theme_bw()
```

```{r, fig.height = 6, fig.width = 8, fig.align = "center"}
# Summary of hospitalizations by age and race group
hospital_summary <- sim_out_dist$summarize(
  group_by = c("age", "race"),
  disease_states = "n_IsympH",
  stats = c("median", "quantile"),
  quantiles = c(0.05, 0.95)
)
hospital_summary

# Visualize the summary
hospital_summary$plot() +
  ggtitle("Daily Hospitalizations by Age and Race") +
  theme_bw()
```

# Specifying Disease Parameters by Demographics

Disease parameters can also be specified for individual demographic subgroups. These subgroup-specific parameters override the global parameters. For more details, refer to the `yaml-configuration` vignette.

The example file `example_config_subgroup_dist.yaml` demonstrates this feature. It also includes parameters defined by distributions.

```{r}
# Locate the example YAML configuration file with subgroup parameters
yaml_file_subgroup <- system.file("extdata", "example_config_subgroup_dist.yaml", package = "MetaRVM")
```

```yaml
run_id: ExampleRun_Subgroup_Dist
population_data:
  mapping: demographic_mapping_n24.csv
  initialization: population_init_n24.csv
  vaccination: vaccination_n24.csv
mixing_matrix:
  weekday_day: m_weekday_day.csv
  weekday_night: m_weekday_night.csv
  weekend_day: m_weekend_day.csv
  weekend_night: m_weekend_night.csv
disease_params:
  ts: 0.5
  tv: 0.25
  ve:
    dist: uniform
    min: 0.3
    max: 0.5
  dv: 180
  dp: 1
  de: 3
  da: 5
  ds: 6
  dh:
    dist: lognormal
    mu: 2
    sd: 0.5
  dr: 180
  pea: 0.3
  psr: 0.95
  phr: 0.97
sub_disease_params:
  age:
    0-17:
      pea: 0.08
    18-64:
      ts: 0.6
    65+:
      # This fixed value will override the global lognormal distribution for dh
      dh: 10
      phr: 0.9227
simulation_config:
  start_date: 01/01/2023 # m/d/Y
  length: 150
  nsim: 20
```

Now, let's run the simulation with this configuration.

```{r, results='hide', message = FALSE}
# Run the simulation with the subgroup configuration
sim_out_subgroup <- metaRVM(yaml_file_subgroup)
```

We can now plot the results to see the impact of the subgroup-specific parameters. For example, we can compare hospitalizations in the "65+" age group, which has a fixed `dh` of 10, with the other age groups, which use the global `dh` drawn from a lognormal distribution.

```{r, fig.height = 6, fig.width = 8, fig.align = "center"}
# Summarize hospitalizations by age group
hospital_summary_subgroup <- sim_out_subgroup$summarize(
  group_by = c("age"),
  disease_states = "H",
  stats = c("median", "quantile"),
  quantiles = c(0.025, 0.975)
)

# Plot the summary
hospital_summary_subgroup$plot() +
  ggtitle("Daily Hospitalizations by Age Group (Subgroup Parameters)") +
  theme_bw()
```
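
The override rule used in this section can be pictured with plain R lists. The following is a conceptual sketch only: it assumes that a subgroup value simply replaces the global value of the same name, and it does not reflect how the package resolves parameters internally. The object names (`global_params`, `age_overrides`, `effective`) are hypothetical, and the values are copied from the example YAML above.

```r
# Global disease parameters (a subset of the example YAML); dh is kept as a
# nested list standing in for the lognormal specification
global_params <- list(
  ts = 0.5, pea = 0.3, phr = 0.97,
  dh = list(dist = "lognormal", mu = 2, sd = 0.5)
)

# Subgroup-specific overrides keyed by age category
age_overrides <- list(
  "0-17"  = list(pea = 0.08),
  "18-64" = list(ts = 0.6),
  "65+"   = list(dh = 10, phr = 0.9227)
)

# Effective parameters per age group: an override replaces the global value
# of the same name, and every other parameter keeps its global value
effective <- lapply(age_overrides, function(ov) {
  utils::modifyList(global_params, ov)
})

effective[["65+"]]$dh    # fixed value 10 replaces the lognormal spec
effective[["0-17"]]$dh   # still the global lognormal spec
```

Under this reading, only the "65+" group has a deterministic `dh`, so any spread in its hospitalization band comes from the other sampled parameters (here `ve`).
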