Coding the weasel

I recently took up reading yet another of Richard Dawkins’ works on evolutionary biology, “The Blind Watchmaker”, something I had neglected since I reread “The Extended Phenotype”. This led me to think about, and eventually program a version of, the much-famed “weasel program”. It has endured as much scrutiny as it has received praise, but it is such an eloquent example of the principles of natural selection that I felt compelled to explore it once more.

“The Extended Phenotype” is my personal favorite book of all time, and I struggle to see it being toppled from my favorites podium any time soon. I like it so much not only for its unrivaled insights into evolutionary biology, or its unique and paraphrased invocation of game theory in explaining evolution in terms of strategies, but also for the philosophical framework it presents for thinking about any topic from a completely novel perspective.

“The Blind Watchmaker”, however, surprised me with the insights it presented (I had expected little, in order to avoid disappointment after “The Extended Phenotype”), and for me it came at exactly the right time, as I had recently begun to dabble in a little light mathematical modeling and computer programming.

Within the first half of the book, Richard Dawkins invokes his computer programming skills to showcase the beautifully basic principles by which natural selection can assemble complex structures from simple components. In addition to that, he alludes to the subjective nature of complexity, a matter which sent me into a catatonic state of introspective reasoning for some undefined stretch of time.

Among his examples was the seemingly popular one of evolving a sentence from the individual components it is assembled from, its letters. The program is called the weasel program because its target is a line from a William Shakespeare play, reading “Methinks it is like a weasel”.

The program starts by selecting 28 random characters from the alphabet plus the “space”. The sentence is then copied, with each position having a certain chance of mutating to any one of the other characters (the remaining 25 letters of the alphabet and the “space”). Before a copy is made, the program checks how closely the sentence resembles the target sentence, letter by letter. If the letter at position x in the sentence is identical to the letter at position x in the target sentence, it becomes immune to mutation and is “locked” in that position.
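The locked procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original script: the function name `locked_weasel`, the uppercase 27-character alphabet, the `seed` parameter and the default `mutation_rate` of 1.0 are all choices made here for the example.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"      # 28 characters
ALPHABET = string.ascii_uppercase + " "      # 26 letters plus the space


def locked_weasel(target=TARGET, alphabet=ALPHABET, mutation_rate=1.0, seed=None):
    """Evolve a random sentence towards `target`, locking matched positions.

    Returns one sentence per generation, starting with the random one.
    `mutation_rate` must be greater than 0, otherwise the loop never ends.
    """
    rng = random.Random(seed)
    sentence = [rng.choice(alphabet) for _ in target]
    history = ["".join(sentence)]
    while "".join(sentence) != target:
        for i, wanted in enumerate(target):
            if sentence[i] == wanted:
                continue  # this position already matches, so it is locked
            if rng.random() < mutation_rate:
                # mutate to any one of the *other* characters
                others = [c for c in alphabet if c != sentence[i]]
                sentence[i] = rng.choice(others)
        history.append("".join(sentence))
    return history


if __name__ == "__main__":
    run = locked_weasel(seed=1)
    print(f"reached the target after {len(run) - 1} generations")
    print(run[0])
    print(run[-1])
```

Because matched positions are skipped before any mutation is attempted, a sentence can only ever move closer to the target, which is why the locked version finishes quickly.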

I wrote a simple little Python script that does exactly what is described above. I subsequently wrote a not-so-simple version of it in R, to satisfy my curiosity as to “how to”.

With this I took the liberty of using both uppercase and lowercase letters to form a library of 53 characters, including the “space”. This was the result for the locked example.

Locked example of evolution of the sentence to its target sequence.

This result was obtained using a 100% mutation rate for every position whose letter is not identical to the letter at the corresponding position in the target sentence.

Thereafter I pondered, as any scientist would at this point: if the target sequence was defined in advance, did we really witness the assembly of something complex from simple sub-units? I would like to point out that, fundamentally, the principle is still illustrated, but not to the satisfaction of many out there.

I then took to programming an unlocked example, again in R, as I was more familiar with matrix notation in R than in Python at that stage. R is not as friendly with string manipulation as Python is, but every problem has a solution.

The program is structured to have the following properties:

  • A sentence with the same number of characters as the target is generated randomly from the library of 53 characters. The target sentence can consist of any sequence of letters.
  • For this example, it consisted of the sentence: “Methinks It Is Like a Weasel”, printed with the capitals in place.
  • The parent sentence then produces 5 copies of itself, one letter at a time, with each letter having a 5% chance of being replaced by a random character from the character library.
  • The 5 sentences or “progeny” are each evaluated for the fraction of positions at which they match the target sentence.
  • The “progeny” with the highest resemblance to the target sequence is chosen as the new parent and the cycle is repeated.
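The steps above can be sketched in Python as follows. This is a minimal sketch and not the original R program: the names `resemblance`, `copy_with_mutation` and `unlocked_weasel`, the `seed` parameter and the `max_generations` cap are assumptions introduced for the example. The parent is not carried over into the next generation, so a run can also lose letters it had already found.

```python
import random
import string

TARGET = "Methinks It Is Like a Weasel"
LIBRARY = string.ascii_letters + " "   # 52 letters plus the space = 53 characters


def resemblance(sentence, target):
    """Fraction of positions at which `sentence` matches `target`."""
    return sum(a == b for a, b in zip(sentence, target)) / len(target)


def copy_with_mutation(parent, library, rate, rng):
    """Copy `parent` one letter at a time; each letter is replaced by a
    random library character with probability `rate`."""
    return "".join(
        rng.choice(library) if rng.random() < rate else ch for ch in parent
    )


def unlocked_weasel(target=TARGET, library=LIBRARY, n_progeny=5, rate=0.05,
                    seed=None, max_generations=100_000):
    """Return (generations used, final parent).

    Stops early when the parent equals the target; otherwise returns
    whatever parent exists when the (assumed) generation cap is reached.
    """
    rng = random.Random(seed)
    parent = "".join(rng.choice(library) for _ in target)
    for generation in range(max_generations):
        if parent == target:
            return generation, parent
        progeny = [copy_with_mutation(parent, library, rate, rng)
                   for _ in range(n_progeny)]
        # the progeny with the highest resemblance becomes the new parent
        parent = max(progeny, key=lambda s: resemblance(s, target))
    return max_generations, parent


if __name__ == "__main__":
    generations, final = unlocked_weasel(n_progeny=100, seed=42)
    print(generations, final)
```

Whether a run reaches the target before the cap depends heavily on `n_progeny` and `rate`, which is why both are left as parameters. The usage line uses 100 progeny, as in the Wikipedia example algorithm; with a brood of only 5 a run may stall well short of the target.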

This is the result:

Mutation of random characters to target sequence.

Unlocked example of evolution of the sentence to its target sequence.

An interesting observation from this result is that the correct sequence evolves (over many more generations than for the locked example) even though the selection coefficients attributed to the progeny of each generation are completely arbitrary. We chose to allocate the selection coefficients in a way that allows for evolution towards the target sequence, but by allocating the lowest selection coefficient to the progeny with the highest resemblance, we could probably evolve an opposite sentence to our target, whatever that would look like.
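One way to explore the inverted selection idea is to make the selection rule itself a parameter. The sketch below is hypothetical: the names `matches` and `evolve`, and the choice of 200 generations, are assumptions for illustration. Passing Python's built-in `max` selects towards the target, while passing `min` selects away from it.

```python
import random
import string

TARGET = "Methinks It Is Like a Weasel"
LIBRARY = string.ascii_letters + " "   # 53 characters, including the space


def matches(sentence, target=TARGET):
    """Number of positions at which `sentence` agrees with `target`."""
    return sum(a == b for a, b in zip(sentence, target))


def evolve(pick, generations=200, n_progeny=5, rate=0.05, seed=0):
    """Run the unlocked scheme, using `pick` (e.g. max or min) to choose
    the next parent from the progeny, and return the final parent."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(LIBRARY) for _ in TARGET)
    for _ in range(generations):
        progeny = [
            "".join(rng.choice(LIBRARY) if rng.random() < rate else ch
                    for ch in parent)
            for _ in range(n_progeny)
        ]
        parent = pick(progeny, key=matches)
    return parent


if __name__ == "__main__":
    toward = evolve(max)   # highest resemblance wins
    away = evolve(min)     # lowest resemblance wins
    print(matches(toward), repr(toward))
    print(matches(away), repr(away))
```

Note that selecting for the lowest resemblance does not single out one “opposite” sentence: every sentence that shares no position with the target scores equally low, so the run simply drifts among such sentences.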

The validity of this coding approach is open for debate, should there be any resistance, but it was written according to the example algorithm at the bottom of the wiki/Weasel_program page.

I found it immensely educational to program these examples, and I will pursue the next phase of recreational programming: evaluating how to manipulate the selection coefficients with respect to the environment in which any sentence finds itself, relative to the frequencies of rival sentences, along the lines of game theory, if it is conceivable to code such an example. I hope it is not a bridge too far.