epichains: Methods for simulating and analysing the size and length of transmission chains from branching process models

License: MIT | R-CMD-check | Codecov test coverage | Lifecycle: experimental

epichains is an R package to simulate, analyse, and visualise the size and length of branching processes with a given offspring distribution. These models are often used in infectious disease epidemiology, where the chains represent chains of transmission, and the offspring distribution represents the distribution of secondary infections caused by an infected individual.
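To make the terms "size" and "length" concrete, here is a minimal base R sketch of a branching process. It is only an illustration of the idea and not the package's implementation: simulate_one_chain() and its max_size argument are hypothetical names, and the sketch assumes that the index case counts as generation 1.

```r
# Hypothetical helper (not part of epichains): simulate a single chain and
# return its size (total cases) and length (number of generations).
simulate_one_chain <- function(offspring_dist, ..., max_size = 100) {
  size <- 1L          # the index case
  chain_length <- 1L  # assumption: the index case forms generation 1
  n_current <- 1L     # number of cases in the current generation
  while (n_current > 0 && size < max_size) {
    # every case in the current generation draws its secondary infections
    n_next <- sum(offspring_dist(n_current, ...))
    if (n_next > 0) {
      size <- size + n_next
      chain_length <- chain_length + 1L
    }
    n_current <- n_next
  }
  c(size = size, length = chain_length)
}

set.seed(1)
# five independent chains with a Poisson(0.8) offspring distribution
chains <- t(replicate(5, simulate_one_chain(rpois, lambda = 0.8)))
chains
```

Any function that takes the number of draws as its first argument (for example rnbinom with its size and mu arguments) could be passed in place of rpois.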

epichains re-implements bpmodels, providing bespoke functions and data structures that allow easy manipulation and interoperability with other Epiverse-TRACE packages (for example, superspreading and epiparameter) and, potentially, with existing packages for handling transmission chains (for example, epicontacts).

epichains is developed at the Centre for the Mathematical Modelling of Infectious Diseases at the London School of Hygiene and Tropical Medicine as part of the Epiverse Initiative.

Installation

Install the released version of the package:

install.packages("epichains")

The latest development version of the epichains package can be installed via

# check whether {remotes} is installed
if (!require("remotes")) install.packages("remotes")
remotes::install_github("epiverse-trace/epichains")

If this fails, try using the pak R package via

# check whether {pak} is installed
if (!require("pak")) install.packages("pak")
pak::pak("epiverse-trace/epichains")

If both of these options fail, please file an issue with a full log of the error messages. Here is an example of an issue reporting an installation failure. This will help us to improve the installation process.

To load the package, use

library("epichains")

Quick start

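epichains provides three main functions: simulate_chains(), which simulates transmission chains; simulate_chain_stats(), which simulates only the chain statistic (size or length); and likelihood(), which estimates the (log-)likelihood of observed chain sizes or lengths.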

The objects returned by the simulate_*() functions can be summarised with summary(). Running summary() on the output of simulate_chains() will return the same output as simulate_chain_stats() using the same inputs.

Objects returned from simulate_chains() can be aggregated into a <data.frame> of cases per time or generation with the function aggregate().

The simulated <epichains> object can be plotted in various ways using plot(). See the plotting section in vignette("epichains") for two use cases.

Simulation

For the simulation functionality, let’s look at a simple example where we simulate transmission chains from \(20\) index cases, with a constant generation time of \(3\) and a Poisson offspring distribution with mean \(1\). We are tracking the chain “size” statistic and will cap all chain sizes at \(25\) cases. We will then look at the summary of the simulation, and aggregate it into cases per generation.

set.seed(32)
# Simulate chains
sim_chains <- simulate_chains(
  n_chains = 20,
  statistic = "size",
  offspring_dist = rpois,
  stat_threshold = 25,
  generation_time = function(n) {
    rep(3, n)
  }, # constant generation time of 3
  lambda = 1 # mean of the Poisson distribution
)
# View the head of the simulation
head(sim_chains)
#>    chain infector infectee generation time
#> 21     1        1        2          2    3
#> 22     2        1        2          2    3
#> 23     3        1        2          2    3
#> 24     3        1        3          2    3
#> 25     4        1        2          2    3
#> 26     6        1        2          2    3

# Summarise the simulation
summary(sim_chains)
#> `epichains_summary` object 
#> 
#>  [1]   5  17   4   8   1  16   9 Inf   5  18   5   1 Inf  24   1  14  19   2   4
#> [20]  14
#> 
#>  Simulated sizes: 
#> 
#> Max: >=25
#> Min: 1

# Aggregate the simulation into cases per generation
chains_aggregated <- aggregate(sim_chains, by = "generation")

# view the time series of cases per generation
chains_aggregated
#>    generation cases
#> 1           1    20
#> 2           2    26
#> 3           3    36
#> 4           4    43
#> 5           5    31
#> 6           6    25
#> 7           7    20
#> 8           8     9
#> 9           9     3
#> 10         10     1
#> 11         11     1
#> 12         12     1
#> 13         13     1
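
Conceptually, aggregating by generation amounts to counting the rows of the who-infected-whom table within each generation. The base R sketch below illustrates this on a small hand-made table; toy_chains is hypothetical data with the same column names as the output above, and the code is not how epichains implements aggregate().

```r
# Hypothetical toy data mimicking the columns of a simulated chains table
toy_chains <- data.frame(
  chain = c(1, 2, 1, 1, 2, 1),
  infector = c(NA, NA, 1, 1, 1, 2),
  infectee = c(1, 1, 2, 3, 2, 4),
  generation = c(1, 1, 2, 2, 2, 3)
)

# Count the cases (rows) in each generation
cases_per_generation <- as.data.frame(
  table(generation = toy_chains$generation),
  responseName = "cases"
)
# table() stores the generation as a factor; convert it back to integer
cases_per_generation$generation <-
  as.integer(as.character(cases_per_generation$generation))
cases_per_generation
```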

Inference

Let’s look at the following example where we estimate the log-likelihood of observing a hypothetical chain_lengths dataset.

set.seed(32)
# randomly generate 20 chain lengths between 1 and 40
chain_lengths <- sample(1:40, 20, replace = TRUE)
chain_lengths
#>  [1]  6 11 20  9 40 33 39 27  6 12 39 35  9 25  6 15 12  6 37 35

# estimate the log-likelihood of the observed chain lengths
likelihood_eg <- likelihood(
  chains = chain_lengths,
  statistic = "length",
  offspring_dist = rpois,
  lambda = 0.99
)
# Print the estimate
likelihood_eg
#> [1] -104.2917
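
For intuition about where such a number can come from, the sketch below computes a chain-length log-likelihood for a Poisson offspring distribution in base R, using the standard result that the probability of extinction by a given generation is obtained by iterating the probability generating function \(G(s) = \exp(\lambda (s - 1))\). It is a sketch under stated assumptions and not the package's code: pois_length_logprob() is a hypothetical helper, the observed lengths are made up, chains are assumed independent and fully observed, and the index case is assumed to count as generation 1. It does not claim to reproduce the value above.

```r
# Hypothetical helper (not an epichains function): log-probabilities of
# chain lengths 1..max_length under a Poisson(lambda) offspring distribution.
pois_length_logprob <- function(max_length, lambda) {
  # q[k + 1] holds P(length <= k); q[1] is P(length <= 0) = 0
  q <- numeric(max_length + 1)
  for (k in seq_len(max_length)) {
    # iterate the Poisson probability generating function
    q[k + 1] <- exp(lambda * (q[k] - 1))
  }
  # P(length = k) = P(length <= k) - P(length <= k - 1)
  log(diff(q))
}

observed_lengths <- c(1, 1, 2, 3, 1, 5) # hypothetical data
logprob <- pois_length_logprob(max(observed_lengths), lambda = 0.9)
# independent chains: the log-likelihood is the sum of log-probabilities
loglik <- sum(logprob[observed_lengths])
loglik
```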

Each of the listed functionalities is demonstrated in detail in the “Getting Started” vignette.

Package vignettes

The theory behind the models provided here can be found in the theory vignette.

We have also collated a bibliography of branching process applications in epidemiology, which can be found in the literature vignette.

Specific use cases of epichains can be found in the online documentation as package vignettes, under “Articles”.

As far as we know, below are the existing R packages for simulating branching processes and transmission chains.

Reporting bugs

To report a bug please open an issue.

Contribute

Contributions to {epichains} are welcomed. Please follow the package contributing guide.

Code of conduct

Please note that the epichains project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Citing this package

citation("epichains")
#> To cite package 'epichains' in publications use:
#> 
#>   Azam J, Funk S, Finger F (2024). _epichains: Simulating and Analysing
#>   Transmission Chain Statistics Using Branching Process Models_. R
#>   package version 0.1.1, https://epiverse-trace.github.io/epichains/,
#>   <https://github.com/epiverse-trace/epichains>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {epichains: Simulating and Analysing Transmission Chain Statistics Using
#> Branching Process Models},
#>     author = {James M. Azam and Sebastian Funk and Flavio Finger},
#>     year = {2024},
#>     note = {R package version 0.1.1, 
#> https://epiverse-trace.github.io/epichains/},
#>     url = {https://github.com/epiverse-trace/epichains},
#>   }