Getting started

Sebastian Lequime



nosoi aims to provide a modular framework to conduct epidemiological simulations under a wide range of scenarios, varying from relatively simple to highly complex and parametric. This flexibility allows nosoi to take into account the effect of numerous covariates, taken individually or in the form of interactions, on the epidemic process, without having to use advanced mathematics to formalize the model into differential equations. At its core, nosoi generates a transmission chain (a link between hosts), the foundation of every epidemic process, allowing to reconstruct in great details the history of the epidemic.

nosoi is an agent-based model, which means it is centered on individuals, here called “hosts”, that enter the simulation when they get infected. nosoi is also stochastic, and thus relies heavily on probabilities, mainly 4 core probabilities. It operates on a discretized time.

Critical assumption of nosoi

nosoi assumes that the maximum number of hosts infected during a simulation is orders of magnitude smaller than the total exposed population. This means that, currently, it does not take into account building herd immunity using the simulated epidemic results (although a proxy can be used, as it will be discussed in several of the companion tutorials).

Essential building blocks: 3 probabilities and 2 numbers

At its core, nosoi can be summarized by 3 probabilities and 2 numbers at a specific time point:

Each of these probabilities and numbers are computed during the simulation. In a very simple scenario, each one is a constant. In more complex models supported by nosoi, these probabilities could depend on a host’s parameters (e.g. genetics), dynamic parameters (for how long the host has been infected), environmental parameters (related to the host’s location), or the particular period of the simulation (e.g. seasons, mitigation campaign,…). All those parameters can be taken into account either individually, or for the population as a whole.

What happens at each time step?

Time is discretized in nosoi. Each time step follows the same pattern for each host:

After these five propagation steps, the simulation moves to the next time step.

Setting up the core functions

nosoi runs under a series of user-defined probabilities and numbers (see General principles ). Each follows the same principles to be set up. We provide here a detailed explanation on how to set up a function correctly so that it can be used in the simulator. This will apply to the five core functions already mentioned: pExit, pMove, sdMove, nContact and pTrans.

Expected output

Every function which name starts with a p (i.e. pExit, pMove and pTrans) should return a single probability (a number between 0 and 1).

nContact should return a positive natural number (positive integer).

sdMove should return a real number (keep in mind this number is related to your coordinate space).

Functions’ arguments


A core function in nosoi should always explicitly depend on the time t (e.g. pExit(t)), even if t is not used within the body of the function. For instance, to return a single constant value of 0.08, the function should be expressed as:

p_Function  <- function(t){0.08}

The argument t represents the time since the host’s initial infection, and can be included in the body of the function, to model a time-varying probability or number. For instance, a quantity depending on the following function will evolve in time as a logistic distribution with parameters \(\mu\) 10 and \(s\) 2:

p_Function  <- function(t){plogis(t,10,2)}


The argument prestime can be used to represent the absolute time of the simulation, shared by all the hosts (as opposed to the relative time since infection t, which is individual-dependent). For instance, the following function can be used to produce a periodic pattern:

p_Function  <- function(t,prestime){(sin(prestime/12)+1)/2} and current.env.value

The arguments (discrete structure) or current.env.value (continuous structure) can be used to represent the location of the host (only if a structured population is used), as shown below:

p_Function  <- function(t,{
  if("C"){return(1)}} #discrete (between states "A","B" and "C")

p_Function  <- function(t,current.env.value){current.env.value/100} #continuous

For more details on how to set up the influence of the structure, we refer to the tutorials on discrete and continuous structure.


The argument host.count can be used to represent the number of hosts present at a given location (only for a structured population), as in the following:

p_Function  <- function(t,, host.count){
  if("B" & host.count < 300 ){return(0.5)}
  if("B" & host.count >= 300 ){return(0)}
  if("C"){return(1)}} #discrete (between states "A","B" and "C")

p_Function  <- function(t,current.env.value,host.count){(current.env.value-host.count)/100} #continuous

Extra Parameters

Any of the parameters used in these functions can be themselves dependent on the individual host (in order to include some heterogeneity between host). For instance, the \(\mu\) parameter of a logistic distribution can be determined for each individual host by another function to be specified:

p_Function  <- function(t,pFunction.param1){plogis(t,pFunction.param1,2)}

Where for instance \(\mu\) can be sampled from a normal distribution (mean = 10 and sd = 2):

p_Function_param1 <- function(x){rnorm(x,mean=10,sd=2)} #sampling one parameter for each infected individual

Notice here that the function is expressed as a function of x instead of t. x is present in the body of the function as the number of draws to make from the distribution.

Every parameter function you specify should be gathered into a list, where the function determining the parameter for each individual (here, p_Function_param1) has to be named according to the name used in p_Function for this parameter (here, pFunction.param1).

p_Function_parameters  <- list(pFunction.param1 = p_Function_param1)

Combining arguments

We have previously shown that you can combine the time since infection t with other parameters such as or prestime. In fact, you can combine as many arguments as you want, making a function dependent on the time since infection, current location, present time and individual host-dependent parameters. They however need to respect a specific order to be correctly parsed by the simulator: first t, then prestime, then (discrete) or current.env.value (continuous) and finally individual host-dependent parameters.

p_Function  <- function(t, prestime,, pFunction.param1, pFunction.param2,...){}

Going further

Once your core functions are ready, you can provide everything nosoi needs to run a simulation. A series of tutorials will guide you in how to set up nosoi depending on your case, both for single host and dual host scenarios:

  1. Spread of a pathogen in a homogeneous population (no structure) of hosts, a “simple” scenario.
  2. Spread of a pathogen in a structured (discrete) population of hosts.
  3. Spread of a pathogen in a structure (continuous) population of hosts.

To get basic statistics or approximated phylogenetic trees out of your simulations output, you can have a look at this page. To visualize your simulations, you can have a look at these few examples.

A series of practical examples are also available: