Intro
I have been presenting my project, then on PTO, and then discussing improvements with my tutor.
Here about the improvements.
In summary my implementation of the Genetic Algorithm was way too simplistic (functional, but not great).
Variance in individuals
I believe I mentioned that in a past entry, but anyhow. One thing that was simply wrong was to use too few repetitions in a Monte-Carlo setting.
That introduced variance in the calculated fitness of the individuals, which could make the GA sometimes find an individual was better than it in fact should have been estimated.
This was discovered when comparing results with an alternate method (MMCA, a mean-field approach) that did not require as many iterations.
Fixing that would of course mean to SLOW things down further in the MC setting, but on the other hand the results would be more… Correct/valid.
Tournaments
So up until my last version, I had reused code from a simple exercise about GA. There the choice of candidate parents for the upcoming offspring generation used a basic sampling of all individuals in random pairs, so that all individuals had a chance to reproduce. Then for each pair you would take the best of the two individuals and that would become one parent.
This has two issues: you only get 50% of a generation to reproduce. And you could (and did) have “bad individuals” still present (some pairs would probably be between two individuals of low fitness), and so that kept some genetic diversity but progress would be slow.
I have now reviewed the approach of tournaments in GA and added a variable (which I fix for all simulations, in fact) for tournament pressure, and that has made a lot of difference! That is, along with fixing the next point…
children cross-over and pruning
Another simplistic (but quite inefficient) part of the algorithm would take two parents and do a one-point crossover to generate two children.
Which in principle is perfectly fine, BUT in my case I have to respect a budget.
The simplest way to ensure the budget is respected is to prune any “ilegal” children, those over budget.
However that leaves us with the rest of children that are at OR UNDER budget.
This won’t help improve, as going under budget contributes NEGATIVELY to our goals.
About mutation
Mutation would have the same problem, the way it was implemented, as did the crossover, and could produce individuals over budget. Instead then of choosing random genes across the whole gene pool of children (which was simple and I thought rather elegant), I instead have had to now choose random individual children (in proportion of mutation pressure), and then choose random PAIRS of genes (but one from the 1s, one from the 0s) and switch them, thereby ensuring I’m not going over (nor under) budget upon mutation.
Conclusions
Even with all those issues the GA performed, it just performed poorly. But many children generated were ilegal as per my budget, leaving fewer than one would wish, and of the rest they could come from far from optimal parents. The selection process was not efficient.
After introducing the necessary corrections and improvements, and even though things are slower PER generation, the GA minimizes MUCH faster, converges earlier to what appears to be better fitness in fact, and so iterates for fewer generations, compensating a bit the MC additional repetitions.
I am getting closer to the deadline for the Project, and these additional results will force me to rewrite a good chunk of my dissertation, but for better results, so I can’t complain!