Project Log: Day 3


Today will be about identifying WHAT I need to CHANGE, and WHERE, in my implementation of:

  • a SIS model (graph-based epidemiology, discrete time steps)
  • network generation

To do so, I am rescuing my original code from the “Complex Networks” exercise on the topic of Epidemiology. It was a mix of R and C++, and I seem to have located the right function. Note that it might NOT be the best base model: SIRS might be better, perhaps with continuous time, or some other configuration. But that’s for future consideration; I need to focus on initial progress right now.

So for my objectives, I will need to modify that function so that:

  • The recovery probability is no longer a single number (mu), but rather a VECTOR, with one number per node.
  • Similarly, the infection probability (from an infected neighbour), the Beta parameter, will also become a vector.

Both vectors will have length n, the order of the graph. In this case, each node (vertex) is one computer in a network. To keep it simple to begin with, I will first allow two possible Beta values and two possible mu values. Beta represents how well a node is protected from infection (well protected, i.e. a low Beta, or not so much), and mu how likely we are to detect the infected node (very likely, a high mu, or not so much, a low mu). A lot will happen later on these two aspects, per node, with rationales loosely based on real-world phenomena, but that’s not needed yet. And my dissertation tutor insisted I should go step by step anyway, and simpler is better (I concur :D).
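A minimal sketch of what the modified step could look like, written in Python for readability rather than in the original R/C++ (which is not reproduced here). The function name `sis_step`, the adjacency-dictionary representation, the synchronous update, and the choice to index Beta by the susceptible node are all assumptions made for illustration.

```python
import random

def sis_step(adj, infected, beta, mu, rng):
    """One synchronous discrete-time SIS step with per-node probabilities.

    adj      : dict mapping node index -> list of neighbour indices
    infected : set of currently infected node indices
    beta[v]  : probability that susceptible node v is infected by EACH
               infected neighbour (assumption: indexed by the receiving node)
    mu[v]    : probability that infected node v recovers in this step
    rng      : a random.Random instance, so runs can be reproduced
    """
    next_infected = set()
    for v in adj:
        if v in infected:
            # An infected node stays infected unless it recovers.
            if rng.random() >= mu[v]:
                next_infected.add(v)
        else:
            # Each infected neighbour is an independent infection attempt.
            for u in adj[v]:
                if u in infected and rng.random() < beta[v]:
                    next_infected.add(v)
                    break
    return next_infected

# Small illustrative network: a ring of 6 computers (hypothetical values).
n = 6
adj = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

# Two possible Beta values and two possible mu values, one entry per node.
beta = [0.1 if v % 2 == 0 else 0.5 for v in range(n)]  # well / poorly protected
mu = [0.6 if v < n // 2 else 0.2 for v in range(n)]    # well / poorly monitored

rng = random.Random(42)
state = {0}  # a single initially infected node
for t in range(10):
    state = sis_step(adj, state, beta, mu, rng)
    print(t, sorted(state))
```

Passing the random generator explicitly keeps each run reproducible, which should help when comparing many runs per configuration.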

That’s already two new sets of values per simulation, whereas before both were preset parameters, the same for every node of the network in any given run. So I may need to increase the number of runs for each configuration to be able to take meaningful measurements. But we’ll see what we can learn there.

As for the “network generation”, I’ll keep the network fixed for the above. But what I really want is to be able to generate networks with more or fewer Applications (each represented as a group of servers), and more or less segmentation (e.g. one DMZ? Each Application in its own subnet? Each Application itself potentially segmented into three subnets?). Plus, a company network is expected to be quite static anyway (I could start with a manually generated graph, and it need not be big…).
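To make the segmentation idea concrete, here is a rough Python sketch of one possible generator. Everything in it is an assumption for illustration: the name `build_application`, the node naming scheme, the choice to fully connect servers inside a subnet, and the single “gateway” link between consecutive subnets. It is one way to encode “more or less segmentation”, not a settled model.

```python
def build_application(app_id, servers_per_subnet, n_subnets):
    """Build one Application as an undirected graph (dict: node -> set of neighbours).

    Assumptions (illustrative only):
      * servers inside the same subnet can all reach each other;
      * consecutive subnets are joined by one link between their first servers;
      * servers_per_subnet >= 1 and n_subnets >= 1.
    """
    adj = {}
    gateways = []
    for s in range(n_subnets):
        nodes = [f"app{app_id}-net{s}-srv{i}" for i in range(servers_per_subnet)]
        for u in nodes:
            adj[u] = {v for v in nodes if v != u}  # full mesh inside the subnet
        gateways.append(nodes[0])
    # Chain the subnets together through their first server.
    for a, b in zip(gateways, gateways[1:]):
        adj[a].add(b)
        adj[b].add(a)
    return adj

def count_links(adj):
    """Number of undirected links (each one is stored twice in adj)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2

# Same six servers, with and without segmentation.
flat = build_application(0, servers_per_subnet=6, n_subnets=1)
segmented = build_application(1, servers_per_subnet=2, n_subnets=3)
print(count_links(flat), count_links(segmented))  # 15 links vs 5 links
```

With the same number of servers, the segmented version has far fewer links, which is the kind of difference the “amount of segmentation” variable would have to capture.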

However, in the long term, my idea is that the “amount of segmentation” will become a variable in itself. I expect the amount of segmentation (if I manage to represent it “acceptably well”) to have a direct impact on how an infection spreads across my network. Again, I need not go too fast, there is time, but that doesn’t mean I cannot work on some code for that future need.

I already have some basic code that allows me to “merge” two graphs without losing existing connections, so maybe I will generate “Application graphs”, and later see how I can merge them in a representative manner to put together a “company network” as a set of connected Applications.
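The existing merge code is not shown in this log, so the following is only a Python sketch of the general idea, assuming graphs are stored as dictionaries mapping each node to a set of neighbours. The names `merge_graphs` and `connect`, and the example node names, are hypothetical.

```python
def merge_graphs(g1, g2):
    """Union of two undirected graphs (dict: node -> set of neighbours).

    Nodes with the same name are treated as the same node, and every
    existing connection from either graph is kept. Inputs are not modified.
    """
    merged = {}
    for g in (g1, g2):
        for node, nbrs in g.items():
            merged.setdefault(node, set()).update(nbrs)
            for nb in nbrs:
                # Keep the adjacency symmetric even if an input was not.
                merged.setdefault(nb, set()).add(node)
    return merged

def connect(graph, u, v):
    """Add one undirected link between u and v (creating nodes if needed)."""
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

# Two tiny "Application graphs" (hypothetical examples).
app_a = {"web-0": {"web-1"}, "web-1": {"web-0"}}
app_b = {"db-0": {"db-1"}, "db-1": {"db-0"}}

company = merge_graphs(app_a, app_b)
connect(company, "web-0", "db-0")  # one link between the two Applications
for node in sorted(company):
    print(node, sorted(company[node]))
```

How many links to add between Applications, and between which servers, is exactly the “representative manner” question that remains open.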

(Later on, I shall introduce employees’ computers as a separate subnet, probably with its own rules, but for now I’m aiming for simple but functioning… Another idea to work on is software components and their vulnerabilities, maybe even time-based… Yet another complication that need not happen this instant.)

So much to be done still. Oh well.