A toolkit for the infectious disease modeller

the odin, dust, mcstate and orderly suite of R packages

Marc Baguelin


Our Raison d’être (why?)

From 1946 to 2024

  • 1946:
    • Scarcity
    • Not powerful by today’s standards
    • Technical skills needed to operate
    • Complete knowledge possible
  • 2024:
    • Widely available
    • Incredibly more powerful
    • No technical skills needed
    • No one understands 100% how it works

Expectations for ML in G/P Health?

  • Availability of access: broad access to computing resources
  • Computing power/complexity: Models (and associated data) are expected to be big and complex
  • Technical skills: Low entry point
  • Control: Not so much discussed, but relies on expert committee/peer review, emphasizes reproducible and transparent evidence

What happened during COVID-19?

  • Availability: broad access to computing resources, but with disparities, e.g. for HPC
  • Complexity: More models, and on average much more complex than before
  • Technical skills: People with little experience in epidemiological modelling built models
  • Control: Very variable in space and time, definitely not full transparency, explosion of pre-prints

How do we get 1-2-3 with 4?

  • We want (1) widely available (2) complex model with (3) low technical entry point BUT (4) reproducible and truly transparent
  • Many ML tools have (1*), (2) and (3) but not (4)
  • Our response to the “black box” problem: modularity (and open source!)
  • When pipelines get complex, no single person can control everything, but each module can be trusted
  • Experts can confidently focus on their bit

Core Components

  • odin: A domain-specific modelling language for generating systems of ordinary differential equations and deterministic and stochastic difference equations
  • dust: Facilitates high-performance parallel computation
  • mcstate: Integrates efficient Bayesian inference algorithms
  • orderly: Simplifies and enhances reproducibility of collaborative data analyses
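
To make "stochastic difference equations" concrete, here is a minimal base-R sketch of a discrete-time stochastic SIR model (susceptible, infected, recovered) advanced for many independent realisations at once. It does not use odin or dust and is not their API: the helper name `sir_step`, the time step `dt`, the population size and the parameter values are all assumptions chosen for illustration.

```r
# One time step of a discrete-time stochastic SIR model, vectorised over
# independent realisations ("particles"). Hypothetical helper, base R only.
sir_step <- function(state, beta, sigma, dt) {
  N <- state$S + state$I + state$R
  # Per-individual probabilities of infection and recovery during dt
  p_SI <- 1 - exp(-beta * state$I / N * dt)
  p_IR <- 1 - exp(-sigma * dt)
  # Binomial draws keep every compartment a non-negative integer
  n_SI <- rbinom(length(state$S), state$S, p_SI)
  n_IR <- rbinom(length(state$I), state$I, p_IR)
  list(S = state$S - n_SI,
       I = state$I + n_SI - n_IR,
       R = state$R + n_IR)
}

set.seed(1)
n_particles <- 100  # assumed number of realisations
state <- list(S = rep(990, n_particles),
              I = rep(10, n_particles),
              R = rep(0, n_particles))
for (i in seq_len(200)) {
  state <- sir_step(state, beta = 4, sigma = 2, dt = 0.05)
}
# Spread of the cumulative number recovered across realisations
summary(state$R)
```

Each realisation evolves independently of the others, which is the property that makes this kind of model a natural fit for parallel execution.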

A flavour of odin

  • Simple ‘R-like’ language to write models in R
  • Intuitive mapping with mathematical objects
deriv(S) <- -beta * S * I / N
deriv(I) <- beta * S * I / N - sigma * I
deriv(R) <- sigma * I

initial(S) <- N - I0
initial(I) <- I0
initial(R) <- 0

N <- user(1e6)
I0 <- user(1)
beta <- user(4)
sigma <- user(2)
\[\begin{aligned} \frac{dS}{dt} &= -\beta S \frac{I}{N}\\ \frac{dI}{dt} &= \beta S \frac{I}{N} - \sigma I\\ \frac{dR}{dt} &= \sigma I \end{aligned}\]
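
As a rough check of what these equations compute, the same SIR system can be integrated numerically in base R without odin. This is only a sketch: the helper names `sir_rhs` and `rk4`, the fixed step size and the integration horizon are assumptions made for illustration, and odin generates its own compiled code and solver rather than anything like this.

```r
# Right-hand side of the SIR system above; y is a named vector c(S, I, R)
sir_rhs <- function(y, beta, sigma) {
  N <- sum(y)
  infection <- beta * y[["S"]] * y[["I"]] / N
  recovery <- sigma * y[["I"]]
  c(S = -infection, I = infection - recovery, R = recovery)
}

# Fixed-step fourth-order Runge-Kutta integrator (hypothetical helper)
rk4 <- function(y, f, dt, n_steps, ...) {
  for (i in seq_len(n_steps)) {
    k1 <- f(y, ...)
    k2 <- f(y + dt / 2 * k1, ...)
    k3 <- f(y + dt / 2 * k2, ...)
    k4 <- f(y + dt * k3, ...)
    y <- y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
  }
  y
}

# Same default values as the user() parameters in the odin code
N <- 1e6
I0 <- 1
y0 <- c(S = N - I0, I = I0, R = 0)
# Integrate from t = 0 to t = 30 (assumed horizon, long enough for the
# epidemic to finish with these parameters)
y_end <- rk4(y0, sir_rhs, dt = 0.01, n_steps = 3000, beta = 4, sigma = 2)
```

With beta = 4 and sigma = 2 the basic reproduction number is beta / sigma = 2, so roughly 80% of the population ends up in R, and S + I + R stays equal to N throughout.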

What’s there now (and beyond)

History of project

  • odin and orderly pre-existed the COVID pandemic
  • Decision to use odin and build the Imperial UK real-time model “from scratch”
  • Decision based on discussion with our research software engineer team
  • Ambition to build tools with “legacy”
  • Creating mcstate (inference) and dust (efficient parallelisation)
  • The process has created a huge technical debt

Application during COVID-19

  • Huge impact, informing critical government decisions in the UK through SPI-M and SAGE, from April 2020 to May 2022, while ensuring constant public access to our model code
  • Despite no real advertisement of the tools (and lack of documentation), the toolkit has been adopted by several groups worldwide


An inclusive tool

  • Empowering researchers to use the toolkit efficiently for policy scenarios by lowering the barrier to entry for (complex) modelling pipelines
  • Accelerating uptake and adoption in academic (incl. teaching) and operational settings
  • Supporting computation from web browser to HPC and GPUs

Web app

We can use the wodin web interface for odin to fit to data


Harnessing contemporary approaches

  • Inference tools (mcstate) developed during the pandemic focused on PMCMC and MCMC
  • Developing mcstate2, including support for:
    • Automatic differentiation for odin models
    • Gradient descent, HMC and NUTS
    • Parallel tempering
    • Efficient hierarchical modelling of large models
  • Modular approach where “models” talk to “samplers”
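
The idea of “models” talking to “samplers” can be sketched in base R as a small interface: a model exposes a log-density function, and a sampler uses nothing else. This is not the mcstate2 API. The names `make_normal_model` and `sample_random_walk`, the toy normal target and the tuning values are hypothetical, and a plain random-walk Metropolis sampler stands in for the more advanced samplers listed above.

```r
# A "model" here is just a list holding a log-density function.
make_normal_model <- function(mean, sd) {
  list(log_density = function(x) dnorm(x, mean, sd, log = TRUE))
}

# A "sampler" only needs the model's log_density; it knows nothing else
# about the model. This one is a plain random-walk Metropolis sampler.
sample_random_walk <- function(model, initial, n_steps, step_sd) {
  x <- initial
  lp <- model$log_density(x)
  out <- numeric(n_steps)
  for (i in seq_len(n_steps)) {
    proposal <- x + rnorm(1, 0, step_sd)
    lp_proposal <- model$log_density(proposal)
    # Accept with probability min(1, exp(lp_proposal - lp))
    if (log(runif(1)) < lp_proposal - lp) {
      x <- proposal
      lp <- lp_proposal
    }
    out[i] <- x
  }
  out
}

set.seed(42)
model <- make_normal_model(mean = 3, sd = 1)
draws <- sample_random_walk(model, initial = 0, n_steps = 5000, step_sd = 1)
# Discard the first 500 draws as burn-in before summarising
posterior_mean <- mean(draws[-(1:500)])
```

Because the sampler depends only on `log_density`, a different sampler could be swapped in without touching the model, and a different model could be passed to the same sampler.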

A flavour of mcstate2

Creating a community

  • Organizing a workshop in the autumn to gather user feedback for interface and API improvement and resource development
  • User input is crucial in enhancing toolkit usability and effectiveness
  • Need to build a community
  • Contact me if interested: m.baguelin@imperial.ac.uk


  • COVID-19 pandemic highlighted the critical need for rapid, reproducible pipelines for the modelling of epidemics
  • odin, dust, mcstate, and orderly developed to enhance teaching, understanding, production, and reproducibility of population-dynamic models
  • Some modelling groups worldwide have already embraced these packages
  • We are working on paying down the technical debt accumulated during the pandemic

Our ambition

  • Improving efficiency and reproducibility of infectious disease modelling
  • Positioning the toolkit as a key global resource for population dynamic modelling
  • Transforming how infectious diseases and epidemic threats are modelled, leading to better public health outcomes worldwide

