Support vector-valued flows with `mp_discrete_stoch()`? #288

papsti · 2025-03-24T14:17:34Z

how difficult would it be to support vector-valued flows with mp_discrete_stoch()?

for instance, the current implementation of the sir_age starter model leverages vector-valued flows to easily stratify the basic SIR by age (or metapop structure) as described here, but i wouldn't currently be able to generate discrete stochastic equations as noted here.

i'm not sure if the concerns with doing this are computational or mathematical or both? could we spell them out a bit to discuss feasibility of this feature?

being able to easily go from simulating deterministically to stochastically would be a strong incentive for me to recode the spec for a model i'm currently using heavily. this would be most useful for me if it could be done in the next month or two, but would still be useful beyond.

The text was updated successfully, but these errors were encountered:

stevencarlislewalker · 2025-03-24T16:12:01Z

(previous comment deleted ... I was on the wrong track earlier)

Your question has really helped me think @papsti . Please let me know if you have any questions about what I wrote below so I can clarify it better in my own mind.

The General Issue

Branching flows and stratification are two kinds of model structure that interact when using mp_discrete_stoch, but that do not interact when using the default mp_euler (or the mp_rk4) method. I believe we will get a similar interaction issue with mp_hazard

Assumptions to Bound Scope

All state variables have either n elements or 1 element
The elements of length-n state vectors line up (e.g., by age, by location, etc ...)
Flowing from state vector A to vector B means that there is one flow between each of the ith elements of A and B
The per-capita rate of flow between A and B needs to have either n elements or 1 element

Note that these are genuinely restrictive assumptions. For example, it is not possible to have S stratified by age and I stratified by both age and symptom status. To stratify by symptom status you need to have more vectors, S, Im, and Is say.

The Problem

With these assumptions the reulermultinom engine function will fail because it takes two arguments that cannot be fitted into these assumptions.

A scalar valued size variable (i.e., the size of the from compartment)
A vector valued rate variable (i.e., the vector of per-capita rates of flowing to each of the compartments that the from compartment flows to -- if there is more than one element in this vector we have a branching flow)

Because the state variables are already vector valued, it would require 'fancy' bookkeeping to have a single vector that kept track of both stratification of state variables and of branching of flows. In other words we are trying to use vectors for two things: (1) representing grouping structure and (2) representing branching of flows.

Proposed Solution

To make reulermultinom work under the assumptions above, we need to make a different argument signature for reulermultinom:

A vector valued size variable (i.e., the vector of from compartments)
m vector valued rate arguments, one vector for each branch of the branching flow

This is about a day of focused work and will break the existing reulermultinom function so I'll need to figure out a back-compatibility strategy as well (so maybe more than a day?).

Visualizing the Proposal

Consider this branched flow involving vector-valued state variables, S, Im, and Is.

graph LR;
    S-->I
8000
m;
    S-->Is;

In the above proposal, this vector-valued diagram expands to the following element-wise diagram.

graph LR;
    S_1-->Im_1;
    S_1-->Is_1;
    S_2-->Im_2;
    S_2-->Is_2;

Note that if we try to stratify the I vector by both age-group (1,2) and symptom status (m,s) we will specify this vector-valued diagram that does not describe branching.

graph LR;
    S-->I;

Because the vector-valued diagram doesn't describe branching, the corresponding element-wise diagram would also not describe branching. We would have to have something like this.

graph LR;
    S_1_m-->I_1_m;
    S_1_s-->I_1_s;
    S_2_m-->I_2_m;
    S_2_s-->I_2_s;

What these pictures show is that each vector-valued diagram translates into parallel element-wise diagrams that all share the same branching pattern. For example, lets look at one branched flow in one of the element-wise diagrams above.

graph LR;
    S_1-->Im_1;
    S_1-->Is_1;

Note that all states are from stratum _1. This is the restriction. We need all states in each (possibly branched) flow to be from the same stratum.

In short, I'm proposing to use (1) different elements of vectors to represent strata and (2) different vectors to represent alternative branches through a diagram.

What About Examples without Branching Flows?

You may be wondering, @papsti , but my example does not involve branching flows! The problem is that the second argument of the reulermultinom engine function expects a vector where each element does describe branching. But vectors in the example describe stratification, and so this breaks the machine.

Complexities and Down the Road

If we all like this proposal, we will need to create new flow functions (i.e., not just mp_per_capita_flow and friends). Here are some examples.

We will need a way to flow within vectors (e.g., S_1 to S_2 to represent, say, movement among spatial locations). Maybe something like mp_among_strata_flow?

We will also need a way to handle movement from one specific element in one specific vector to a specific element in another vector.

The above makes the user handle the functional form of the state-dependent per-capita flow rates on their own. This could get confusing if there are multiple types of stratification (age-by-geog-by-immunity). The user will need to do things like Kronecker products, which I would rather not force them to do. So probably what we need are helpers for specifying the per-capita flow rate arguments. What I am thinking is something like this:

foi_vec = mp_state_dependence("beta * I / N", ...)
mp_per_capita_flow("S", "E", foi_vec, "infection")

Where the ... somehow specifies how to treat this expression with beta, I, and N representing differently shaped tensors. Sigh ... honestly, in the end we are going to need fancy bookkeeping but presented in an interface that is easy to learn.

The stuff in this down-the-road section is what paralyzes me when designing a stratification interface. But I do think that @papsti's use case is going to be common enough that we should just implement it as proposed above.

stevencarlislewalker · 2025-03-24T17:13:12Z

Here is a good minimal example to use when working on this stuff.

library(macpan2); library(tidyverse)
spec = mp_tmb_model_spec(
    during = list(
        mp_per_capita_flow("S", "Im", "(1 - alpha) * M %*% ((Im + Is) / N)", "infection_mild")
      , mp_per_capita_flow("S", "Is", "alpha * M %*% ((Im + Is) / N)", "infection_severe")
    )
  , default = list(
      S  = c(98, 198)
    , Im = c(1, 1)
    , Is = c(1, 1)
    , M = matrix(c(0.2, 0.1, 0.1, 0.25), 2, 2)
    , alpha = 0.1
    , N = 300
  )
)
(spec
  |> mp_simulator(50, mp_state_vars(spec))
  |> mp_trajectory()
  |> mutate(grp = as.factor(row))
  |> ggplot()
  + geom_line(aes(time, value, colour = grp))
  + facet_wrap(~matrix, scales = 'free', ncol = 1)
  + theme_bw()
)
(spec
  |> mp_discrete_stoch()
  |> mp_expand()
  |> mp_print_during()
)

papsti · 2025-03-24T18:23:47Z

thank you for writing all of this out---that helped enormously. i appreciate the complexities of this problem when you're thinking of asking macpan2 to almost handle stratification for you, and i agree this requires careful consideration.

i think your proposal for my usecase is a sound first step. i have an iteration on the idea that might simplify the problem.

i think the issue becomes really difficult in general because we want macpan to respect constraints with branching flows, but the rate is treated as a monolith, and we're not leveraging the fact that the user is actually sp 8000 ecifying how the branching should occur in the rate expression. perhaps this insight could inspire a more useful spec, particularly when setting up equations with mp_discrete_stoch().

for instance, in the very helpful example above, the rates into the Im and Is subcategories are identical, except that they are branched with alpha and 1-alpha. would it help in setting up the call to reulermultinom() to have the spec differentiate between the "base rate" M %*% ((Im + Is) / N) and the branching factors (alpha, 1-alpha)?

if so, perhaps there could be something like

mp_per_capita_branched_flow(
   from = "S", to = c("Im", "Is")
   rate = "M %*% ((Im + Is) / N)",
   branch_with = c("alpha", "1-alpha") # here the length and order should match the to vector, and should sum to 1
   abs_rate = c("infection_m", "infection_s") # here too we could ask the user to specify two distinct names, to ensure that we're not relying on macpan's internal indexing, which we might want to reserve for another kind of stratification (age, patches, etc)
)

stevencarlislewalker · 2025-03-24T18:33:47Z

I like this idea. It would involve adding another flow function, but that might be worth it. One nice thing for development time is that we ask the user to tell us how many branches there are, which will likely be useful for specifying the internal logic. Thanks.

papsti · 2025-03-24T19:32:01Z

sounds good. for the record, the model i seek to recode has a number of branched flows, so this is extremely relevant to me currently.

stevencarlislewalker · 2025-03-24T20:16:27Z

I am wondering if the mp_per_capita_branched_flow function should be mp_per_capita_related_flows and look more like this.

mp_per_capita_related_flows(
   from = "S", to = c("Im", "Is")
   intermediate_expressions = list(foi ~ M %*% ((Im + Is) / N)),
   rates = c("foi * alpha", "foi * (1-alpha)")
   abs_rate = c("infection_m", "infection_s")
)

This interface could have benefits for #267.

stevencarlislewalker · 2025-04-23T20:38:38Z

Related to #188.

stevencarlislewalker self-assigned this Mar 24, 2025

stevencarlislewalker added this to Roadmap Mar 24, 2025

stevencarlislewalker moved this to 📋 Backlog in Roadmap Mar 24, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support vector-valued flows with `mp_discrete_stoch()`? #288

Support vector-valued flows with `mp_discrete_stoch()`? #288

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Support vector-valued flows with mp_discrete_stoch()? #288

Support vector-valued flows with mp_discrete_stoch()? #288

Comments

Uh oh!

The General Issue

Assumptions to Bound Scope

The Problem

Proposed Solution

Visualizing the Proposal

What About Examples without Branching Flows?

Complexities and Down the Road

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Support vector-valued flows with `mp_discrete_stoch()`? #288

Support vector-valued flows with `mp_discrete_stoch()`? #288