So this is actually about yesterday’s work, which went on until quite late into the night. (Well, by my standards; it probably wasn’t THAT late.)
Status
The situation is as follows:
- I implemented basically everything, including a rudimentary Genetic Algorithm (GA) that tries out different “policies” (varying Beta and Mu per node), to see whether the GA can find better populations.
- “Better” means, for now and quite obviously in this context, the smallest possible spread of a disease (infection) over a network after a set number of time steps.
- We have a preset network (I ran tests with a few).
- We have a preset “base level” Beta and Mu.
- Using a binary choice of nodes, some of the nodes benefit from a reduction of their Beta or an increase of their Mu values. The binary choice represents an individual for the GA.
- The GA: We generate (at first at random) a generation of such individuals, and then calculate each individual’s fitness (one full SIS simulation per individual). Then we select some of the better individuals, generate offspring from them, create a new population, and iterate.
- To be noted: I don’t want each individual to be optimized for one preset of initially infected nodes, so I run 100 simulations with 100 DIFFERENT initial sets of infected nodes (albeit always at 20% prevalence).
- The Constraint: An obvious choice would be to keep individuals with ALL Betas and Mus improved, BUT we preset a “budget” for the number of influences one can have. For a network of 10 nodes, we can act on 20 variables (each binary: one for Beta and one for Mu, times the number of nodes). The Budget is set to a value BELOW 20 in the example, so that we now look for individuals that CAN’T choose a perfect setup.
- The code: Not critically optimized, but the SIS simulations run in C++ (with R for the setup) on multiple CPUs (7, in fact), and it still takes about a minute or so for the above setup to run…
- The goal: I was hoping that after a few generations (say 50 or 100), with 100 individuals per generation, each running its complete set of simulations, the GA would come up with individuals that noticeably reduce the spread.
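As a rough illustration (not the actual C++/R code), the setup described in the list above could be sketched in Python as follows. Everything named here is an assumption made for the example: the function names, the adjacency-list network, the synchronous update rule, Beta read as the susceptibility of the node being infected, and the `beta_factor` / `mu_factor` values, since the notes do not say by how much Beta is reduced or Mu increased.

```python
import random

def simulate_sis(adj, beta, mu, infected, steps, rng):
    """Discrete-time SIS with per-node beta and mu; returns the final infected fraction."""
    infected = set(infected)
    for _ in range(steps):
        nxt = set()
        for node in range(len(adj)):
            if node in infected:
                # The node recovers with probability mu[node], otherwise it stays infected.
                if rng.random() >= mu[node]:
                    nxt.add(node)
            else:
                # Each infected neighbour gets one chance to infect this node (assumed rule).
                for nb in adj[node]:
                    if nb in infected and rng.random() < beta[node]:
                        nxt.add(node)
                        break
        infected = nxt
    return len(infected) / len(adj)

def random_individual(n_nodes, budget, rng):
    """2 * n_nodes bits: first half = 'reduce Beta', second half = 'increase Mu'."""
    bits = [0] * (2 * n_nodes)
    for i in rng.sample(range(2 * n_nodes), budget):
        bits[i] = 1
    return bits

def apply_policy(individual, base_beta, base_mu, beta_factor=0.5, mu_factor=1.5):
    """Turn the binary choices into per-node Beta and Mu (the factors are hypothetical)."""
    n = len(individual) // 2
    beta = [base_beta * (beta_factor if individual[i] else 1.0) for i in range(n)]
    mu = [min(1.0, base_mu * (mu_factor if individual[n + i] else 1.0)) for i in range(n)]
    return beta, mu

def fitness(individual, adj, base_beta, base_mu,
            steps=100, runs=100, prevalence=0.2, seed=0):
    """Mean final infected fraction over `runs` random initial infected sets (lower is better)."""
    rng = random.Random(seed)  # same seed for every individual, see the note below
    beta, mu = apply_policy(individual, base_beta, base_mu)
    n = len(adj)
    k = max(1, round(prevalence * n))
    total = 0.0
    for _ in range(runs):
        start = rng.sample(range(n), k)
        total += simulate_sis(adj, beta, mu, start, steps, rng)
    return total / runs
```

One design choice worth flagging: seeding every fitness evaluation identically means all individuals start from the same random stream, which may reduce the noise when comparing them. Whether that is desirable here is a judgment call, not something the original setup specifies.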
The cold shower
Well, at this point, after many tests, code reviews, etc., it appears… I failed. So far, quite miserably so: the “thing” simply doesn’t improve across generations, and the overall spread is in fact quite stable… But that was all yesterday.
Surely I should review the code. Actually, I’m not happy with how individuals (parents) are selected across generations; maybe the selection is not aggressive enough (maybe I could simply keep the overall best individuals as parents, or use other known algorithms, to be continued…).
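To make the “more aggressive” idea concrete, here is a minimal sketch of one possible generation step using truncation selection with elitism: only the best few individuals are allowed to breed, and the very best are copied over unchanged. This is one well-known option, not the selection scheme currently in the code. The names `repair` and `next_generation`, and the parameters `n_parents`, `n_elite` and `p_mut`, are all hypothetical.

```python
import random

def repair(bits, budget, rng):
    """Switch off random 1-bits until the individual respects the budget."""
    ones = [i for i, b in enumerate(bits) if b]
    for i in rng.sample(ones, max(0, len(ones) - budget)):
        bits[i] = 0
    return bits

def next_generation(population, scores, budget, rng,
                    n_parents=10, n_elite=2, p_mut=0.02):
    """Build a new population; a lower score (less spread) is better."""
    # Rank individuals by score only, best first.
    ranked = [ind for _, ind in sorted(zip(scores, population), key=lambda p: p[0])]
    parents = ranked[:n_parents]                        # truncation selection
    new_pop = [list(ind) for ind in ranked[:n_elite]]   # elites survive unchanged
    while len(new_pop) < len(population):
        a, b = rng.sample(parents, 2)
        child = [rng.choice(pair) for pair in zip(a, b)]                     # uniform crossover
        child = [1 - bit if rng.random() < p_mut else bit for bit in child]  # bit-flip mutation
        new_pop.append(repair(child, budget, rng))
    return new_pop
```

With a deterministic fitness, elitism guarantees that the best score never gets worse from one generation to the next. With a noisy fitness such as averaged SIS runs, that guarantee only holds for the measured scores, so a lucky individual can still hang on. Note also that `repair` only enforces the upper limit; children may end up using less than the full budget.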
But also, I woke up this morning wondering if the approach was sensible:
Can one improve the protection of a network by acting on the Beta and Mu of only certain nodes?
And that might be a key realisation. I mean, I intuitively KNOW that if you protect nodes with high centrality from infection (lower Beta) instead of some other node, your overall infection containment capacity SHOULD go up (that is, the overall spread SHOULD go down).
Well yes, but… For 100 iterations (I’ve tried 300 too), with a varying set of initially infected nodes, and using probabilities? Would it make enough of a difference to choose which nodes to affect? Can you get consistent enough improvements?
Or is the search space maybe in fact so SPARSE and the improvements of choosing which nodes (not how much) to affect so ineffective that the GA can’t seem to improve?
Is the problem with my implementation (very well could be!), with the parameters (quite probably so too) or simply with the approach (in which case, back to the drawing board…)?
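One way to separate “the GA is broken” from “there is no signal to find” would be to skip the GA entirely and measure the effect directly: protect the highest-degree nodes, protect the same number of random nodes, and compare the spread over paired runs. The sketch below does that under loud assumptions: degree stands in for centrality, only Beta is changed, and `beta_factor`, the update rule and all function names are made up for the example.

```python
import random
import statistics

def simulate_sis(adj, beta, mu, infected, steps, rng):
    """Discrete-time SIS with per-node beta and mu; returns the final infected fraction."""
    infected = set(infected)
    for _ in range(steps):
        nxt = set()
        for node in range(len(adj)):
            if node in infected:
                if rng.random() >= mu[node]:      # no recovery this step
                    nxt.add(node)
            elif any(nb in infected and rng.random() < beta[node] for nb in adj[node]):
                nxt.add(node)                     # infected by some neighbour
        infected = nxt
    return len(infected) / len(adj)

def top_degree_nodes(adj, k):
    """The k nodes with the most neighbours (ties broken by node index)."""
    return sorted(range(len(adj)), key=lambda v: -len(adj[v]))[:k]

def spread_with_protection(adj, protected, base_beta, base_mu, seed,
                           steps=100, prevalence=0.2, beta_factor=0.5):
    """One SIS run where the protected nodes get a reduced Beta (hypothetical factor)."""
    rng = random.Random(seed)
    prot = set(protected)
    n = len(adj)
    beta = [base_beta * (beta_factor if v in prot else 1.0) for v in range(n)]
    mu = [base_mu] * n
    start = rng.sample(range(n), max(1, round(prevalence * n)))
    return simulate_sis(adj, beta, mu, start, steps, rng)

def targeted_vs_random(adj, k, base_beta, base_mu, runs=100, seed=0):
    """Mean and standard error of (random-protection spread - hub-protection spread)."""
    picker = random.Random(seed)
    hubs = top_degree_nodes(adj, k)
    diffs = []
    for r in range(runs):
        others = picker.sample(range(len(adj)), k)
        # Same seed for both arms, so both start from the same initial infected set.
        diffs.append(spread_with_protection(adj, others, base_beta, base_mu, seed=r)
                     - spread_with_protection(adj, hubs, base_beta, base_mu, seed=r))
    return statistics.mean(diffs), statistics.stdev(diffs) / len(diffs) ** 0.5
```

A positive mean that is large compared to its standard error would suggest that node choice matters and the GA should, in principle, be able to find it. A mean that is indistinguishable from zero would point towards the parameters or the approach rather than the GA code.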
I just don’t know, but as the progress was “relevant”, I wanted to make a note of it.