ALLAN JAMES EDGAR

Learning and the Genetic Algorithm



This project was inspired by my university computational neuroscience class. I wanted to see if I could emulate a problem-solving tool that our brain uses frequently (at least some of the time): memory. I simulated 2 robots playing a game of cards in C. Memory is a new concept in computing, so you can send my royalty cheques to... wait, my lawyer just called, it is NOT. Either way, I used "memory" to keep a tally of moves played that led to losses vs victories. In any given round, with that history, I used a Naive Bayes classifier to predict a win/loss probability for each move and picked the highest. The bot with memory played about evenly against the one lacking it at the beginning, but after several rounds started to turn the tide. I then tried highly trained memory vs zero-trained memory, with predictable results. Several permutations of bot-x vs bot-y, with varying states of training, didn't prove anything that couldn't be intuited, but it was fun to implement. I am proud to say that no bot, with any amount of training, could beat my "superior intellect". I did, however, only deal with relatively simple learning algorithms in this test. More sophisticated methods, such as a neural network, would have spanked me, if not converged on chance, I'm sure.
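To make the memory idea concrete, here is a minimal sketch of how such tallies can drive a Naive Bayes prediction. The move encoding, counts, and function names are my own assumptions for illustration, not the original code:

#define NUM_MOVES 13  /* assumption: one candidate move per card rank */

/* Tallies accumulated across played rounds: the bot's "memory". */
static int win_count[NUM_MOVES];
static int loss_count[NUM_MOVES];
static int total_wins, total_losses;

/* Naive Bayes with Laplace smoothing: score each move by comparing
   P(move|win) * P(win) against P(move|loss) * P(loss), and return
   the move with the highest posterior probability of a win. */
int best_move(void)
{
    int best = 0;
    double best_p = -1.0;
    for (int m = 0; m < NUM_MOVES; m++) {
        double p_m_win  = (win_count[m]  + 1.0) / (total_wins   + NUM_MOVES);
        double p_m_loss = (loss_count[m] + 1.0) / (total_losses + NUM_MOVES);
        double p_win    = (total_wins + 1.0) / (total_wins + total_losses + 2.0);
        double p_loss   = 1.0 - p_win;
        double p = (p_m_win * p_win) / (p_m_win * p_win + p_m_loss * p_loss);
        if (p > best_p) { best_p = p; best = m; }
    }
    return best;
}

/* After a round, record each move the bot played under the outcome. */
void record(int move, int won)
{
    if (won) { win_count[move]++;  total_wins++;  }
    else     { loss_count[move]++; total_losses++; }
}

Each round the bot calls best_move(), plays, and feeds the outcome back through record(), so the posterior sharpens as the tally grows.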

After this I approached the problem with another biomimetic algorithm: the genetic algorithm. I made a set of "input to action" instructions (we'll call these genes) and arranged them randomly into vectors of length 8 (we'll call these chromosomes). For example, the zeroth vector's first instruction might be: given input[0]=3 and input[1]=2, do not play. It would have 7 other instructions to that effect. I then made 20 of these random vectors, which we'll call our population. Obviously the 8 instructions that form a chromosome do not cover the whole problem space. Instructions that were called for, yet not covered, were set to random. You'll see later that this leads to some unintended but interesting outcomes. There are no exact rules for how to implement this in general, but it is trying to emulate genetic evolution, hence the esoteric names. It is waaaaayyyy beyond my scope to go deep into this class of algorithms but if you are interested, this is a great resource.
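In code, the representation might look something like this. The struct layout, the two-input match condition, and the binary play/no-play action are my guesses at a minimal encoding, not the original:

#include <stdlib.h>

#define GENES_PER_CHROMOSOME 8
#define POPULATION_SIZE 20

/* One gene: a single "input to action" rule. */
typedef struct {
    int input0, input1;  /* the inputs this rule matches, e.g. card values */
    int action;          /* what to do: 0 = do not play, 1 = play */
} Gene;

/* One chromosome: a strategy made of 8 such rules. */
typedef struct {
    Gene genes[GENES_PER_CHROMOSOME];
} Chromosome;

static Chromosome population[POPULATION_SIZE];

/* Look up an action; inputs not covered by any rule fall back to random,
   which is what produces the unintended outcomes mentioned above. */
int act(const Chromosome *c, int input0, int input1)
{
    for (int g = 0; g < GENES_PER_CHROMOSOME; g++)
        if (c->genes[g].input0 == input0 && c->genes[g].input1 == input1)
            return c->genes[g].action;
    return rand() % 2;  /* uncovered case: random action */
}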

In this example we go by sets of 8 moves against opponents that respond randomly. After each set of moves the win/loss ratio is determined and we select some of the best performers by "tournament selection". The winners of this selection are considered the survivors from the population. Next comes the crossover step: as in real life, we cross over genes between 2 random "parents" from the survivors and create a "child". It's strange, I know. I'm still concerned about the time I searched "how to kill a parent (process) without killing the child (process)". Anywayyy, rinse and repeat until you create a new population the same size as the previous one. But that's not all, we also simulate mutations: at a low rate we randomly change some data, to simulate the introduction of novel genetic material into the "gene pool". This gives us a full, mutated population to restart the loop anew. There are a plethora of halting criteria (fitness thresholds, improvement plateaus) for this loop, but in this case I just chose one hundred loops (sketched in code below).

Despite its simplicity this yielded interesting results. The bots that played with the final strategies were considerably better than not only random strategies but even those that exited at only 50 loops. Here's the interesting part mentioned earlier. The trained strategies tended to focus primarily on the extremes (high or low card draws), where randomness played the least role and an instruction was most impactful. It almost seems like they developed heuristics shaped by the constraints on their capacity to learn. Like we do. When finally paired against some of their own final populations they performed relatively equivalently, and even eked out a few victorious rounds against me.
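Here's roughly what that generational loop looks like, building on the Gene/Chromosome sketch above. The tournament size, the mutation rate, and the uniform per-gene crossover are choices I'm assuming for illustration, and play_set() is a stand-in for the eight-move match against a random opponent:

#include <stdlib.h>

#define GENERATIONS 100
#define TOURNAMENT_SIZE 3
#define MUTATION_RATE 0.01

extern double play_set(const Chromosome *c);  /* returns win/loss ratio */

/* Tournament selection: sample a few chromosomes, keep the fittest. */
static const Chromosome *tournament(const double *fitness)
{
    int best = rand() % POPULATION_SIZE;
    for (int i = 1; i < TOURNAMENT_SIZE; i++) {
        int k = rand() % POPULATION_SIZE;
        if (fitness[k] > fitness[best]) best = k;
    }
    return &population[best];
}

void evolve(void)
{
    static Chromosome next[POPULATION_SIZE];
    double fitness[POPULATION_SIZE];

    for (int gen = 0; gen < GENERATIONS; gen++) {
        /* Evaluate: each chromosome plays a set of 8 moves. */
        for (int i = 0; i < POPULATION_SIZE; i++)
            fitness[i] = play_set(&population[i]);

        /* Breed a full replacement population. */
        for (int i = 0; i < POPULATION_SIZE; i++) {
            const Chromosome *mum = tournament(fitness);
            const Chromosome *dad = tournament(fitness);

            /* Crossover: take each gene from one parent or the other. */
            for (int g = 0; g < GENES_PER_CHROMOSOME; g++)
                next[i].genes[g] = (rand() % 2) ? mum->genes[g]
                                                : dad->genes[g];

            /* Mutation: at a low rate, overwrite a gene's action. */
            for (int g = 0; g < GENES_PER_CHROMOSOME; g++)
                if ((double)rand() / RAND_MAX < MUTATION_RATE)
                    next[i].genes[g].action = rand() % 2;
        }
        for (int i = 0; i < POPULATION_SIZE; i++)
            population[i] = next[i];
    }
}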

The first method provided a classification and the second was an optimisation. They could both be used in different ways to play the game better, and the demarcation is not as binary as I previously thought: the classifier output was used to optimise a strategy, and the optimiser classified stratagems and optimised over the available options. This WAS more of an exercise to familiarise myself with C and its family of languages, but it was enlightening in and of itself.