Coding the weasel

I recently took up reading yet another of Richard Dawkins’ works on evolutionary biology, “The Blind Watchmaker”, something I have neglected since I reread “The Extended Phenotype”. This led me to thinking about, and eventually programming a version of, the much famed “weasel program”. It has endured as much scrutiny as it has received praise, but it is such an eloquent example of the principles of natural selection that I felt compelled to explore it once more.

The Extended Phenotype is my personal favorite book of all time, and I struggle to see it being toppled from my favorites podium any time soon. I like it so much not only for its unrivaled insights into evolutionary biology, or its unique and paraphrased invocation of game theory in explaining evolution in terms of strategies, but also for the philosophical framework it presents for thinking about any topic from a completely novel perspective.

The Blind Watchmaker, however, surprised me with the insights it presented (having expected little, in order to avoid disappointment after “The Extended Phenotype”), and for me it came at exactly the right time, as I had recently begun to dabble in a little light mathematical modeling and computer programming.

Within the first half of the book, Richard Dawkins invokes his computer programming skills to showcase the beautifully basic principles by which natural selection can assemble complex structures from simple components. In addition to that, he alludes to the subjective nature of complexity, a matter which set me in a catatonic state of introspective reasoning for some undefined stretch of time.

Among his examples was the seemingly popular example of the evolution of a sentence from the individual components it is assembled from: its letters. The program is called the weasel program, as it stems from a quote in a William Shakespeare play, reading “Methinks it is like a weasel”.

The program starts by selecting 28 random characters from the alphabet (including the “space”). The sentence is then copied, with each position having a certain chance of mutating to any one of the remaining 25 letters of the alphabet or the “space”. Before a copy is made, the program checks how closely the sentence resembles the target sentence, letter by letter. If the letter at position x in the sentence is identical to the letter at position x in the target sentence, it becomes immune to mutation and is “locked” in that position.
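The locked procedure described above can be sketched in a few lines of Python. This is a minimal sketch rather than the original script: the function names, the uppercase 27-character alphabet, the fixed seed and the cap on generations are all assumptions made for illustration.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"   # 28 characters
ALPHABET = string.ascii_uppercase + " "   # 26 letters plus the space


def random_sentence(length, rng):
    """Build the starting sentence from randomly chosen characters."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def copy_with_locking(sentence, target, rate, rng):
    """Copy a sentence position by position.

    A position that already matches the target is locked and never mutates.
    Any other position mutates, with probability `rate`, to one of the
    remaining 26 characters.
    """
    copied = []
    for current, wanted in zip(sentence, target):
        if current == wanted or rng.random() >= rate:
            copied.append(current)
        else:
            copied.append(rng.choice([c for c in ALPHABET if c != current]))
    return "".join(copied)


def evolve_locked(target=TARGET, rate=1.0, seed=0, max_generations=100_000):
    """Repeat the copy step until the target is reached (or the cap is hit)."""
    rng = random.Random(seed)
    sentence = random_sentence(len(target), rng)
    generation = 0
    while sentence != target and generation < max_generations:
        sentence = copy_with_locking(sentence, target, rate, rng)
        generation += 1
    return generation, sentence


if __name__ == "__main__":
    generations, final = evolve_locked()
    print(generations, final)
```

Because matched positions can never be lost, every generation is at least as close to the target as the one before it, which is why this variant converges so quickly.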

I wrote a simple little Python script that does exactly what is described above. I subsequently wrote a not so simple version of it in R, to satisfy my curiosity as to “how to”.

With this I took the liberty of using both uppercase and lowercase letters to form a library of 53 characters, including the “space”. This was the result for the locked example.


Locked example of evolution of the sentence to its target sequence.

This result was obtained using a 100% mutation rate whenever the letter in a given position is not identical to the letter in the corresponding position in the target sentence.

Thereafter I was pondering, as any scientist would at this point: if the target sequence was defined in advance, did we really witness the assembly of something complex from simple sub-units? I would like to point out that, fundamentally, the principle is still illustrated, but not to the satisfaction of many out there.

I then took to programming an unlocked example, again in R, as I was more familiar with the matrix notation in R than I was with Python at that stage. R is not as friendly with string manipulation as Python is, but every problem has a solution.

The program is structured to have the following properties:

  • A sentence with the same number of characters as the “target” is generated randomly from the library of 53 characters. The target sentence can consist of any sequence of letters.
  • For this example, it consisted of the sentence: “Methinks It Is Like a Weasel”, printed with the capitals in place.
  • The current sentence (the parent) then produces 5 copies of itself, one letter at a time, with each letter having a 5% chance of incorporating a random character from the character library.
  • The 5 sentences or “progeny” are each evaluated for the fraction to which they resemble the target sentence.
  • The “progeny” with the highest resemblance to the target sequence is chosen as the new parent and the cycle is repeated.
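The properties listed above translate into roughly the following Python. This is only a sketch of the listed steps, not the R program itself: the function names, the seed and the generation cap are assumptions, and it assumes the parent does not compete with its own progeny, which the list does not specify. How quickly (and whether) a run reaches the target depends strongly on the number of progeny and the mutation rate, so the loop is capped rather than assumed to converge.

```python
import random
import string

LIBRARY = string.ascii_letters + " "   # 52 letters plus the space = 53 characters
TARGET = "Methinks It Is Like a Weasel"


def resemblance(sentence, target):
    """Fraction of positions at which the sentence matches the target."""
    matches = sum(a == b for a, b in zip(sentence, target))
    return matches / len(target)


def copy_with_mutation(parent, rate, rng):
    """Copy letter by letter; each letter is replaced by a random
    library character with probability `rate`."""
    return "".join(
        rng.choice(LIBRARY) if rng.random() < rate else letter
        for letter in parent
    )


def evolve_unlocked(target=TARGET, progeny=5, rate=0.05,
                    max_generations=10_000, seed=0):
    """Select the best of `progeny` mutated copies in every generation."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(LIBRARY) for _ in range(len(target)))
    for generation in range(max_generations):
        if parent == target:
            return generation, parent
        brood = [copy_with_mutation(parent, rate, rng) for _ in range(progeny)]
        # The most similar copy becomes the next parent; nothing is locked.
        parent = max(brood, key=lambda child: resemblance(child, target))
    return max_generations, parent


if __name__ == "__main__":
    generations, final = evolve_unlocked()
    print(generations, final, resemblance(final, TARGET))
```

Replacing `max` with `min` in the selection step gives the reversed selection scheme, in which the least similar copy is the one that reproduces.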

This is the result:

Mutation of random characters to target sequence.

Unlocked example of evolution of the sentence to its target sequence.

An interesting observation from this result is that the correct sequence evolves (over many more generations than for the locked example) even though the selection coefficients attributed to the progeny of each generation are assigned completely arbitrarily. We chose to allocate the selection coefficients in a way that allows for evolution to the target sequence, but by allocating the lowest selection coefficient to the progeny with the highest resemblance, we could probably evolve an opposite sentence to our target, whatever that would look like.

The validity of this coding approach is open for debate, should there be any resistance, but it was written according to the example algorithm on the bottom of the wiki/Weasel_program page.

I found it immensely educational to write these programs, and I will pursue the next phase of recreational programming to evaluate how to manipulate the selection coefficients with respect to the environment in which any sentence finds itself, relative to the frequencies of rival sentences, along the lines of game theory, if it is conceivable to code such an example. I hope it is not a bridge too far.

On the evolution of immortality

The average human lifespan has significantly increased in the last couple of hundred years, prompting suspicion of a potential evolutionary trend towards living longer.

I have heard the argument before that human interference in the natural progression of disease and disability is affecting the Darwinian “survival of the fittest”, and consequently is likely to influence evolution in favor of a genetically ‘weaker’ human species. That argument has merit only when predicated on the inaccurate assumption that the individual or group (population) is the fundamental unit of selection. [Dawkins, 1976; Dawkins, 1982]

Can we however reasonably expect our descendants to keep getting older every generation, or is there likely to be an upper limit for maximum age for humans?

To postulate an answer to the question at hand, we will have to delve into the technicalities regarding the evolution of longevity, though I suspect that my (possibly feeble) attempt at a thought experiment, might only answer questions to which we already know the answers.

First, I would like to suggest that we define the term longevity, for its use herein, as the length of time that extends beyond reproductive age, expressed as a proportion of total life expectancy (or lifespan). I use it interchangeably with ‘length of time beyond reproductive age’.

Is it likely that genes ‘for’ longevity will have an increased selection coefficient compared to their rival allele(s)?

For there to be an increase in life-expectancy, due to selection in favor of an increased longevity proportion, we have to assume for the purpose of this argument, that average time to reproduction remains constant, in order to isolate for the selection coefficient regarding a ‘longevity’ gene/allele. We also assume the existence of an allele that confers some sort of increased lifespan due to increased longevity (increased proportion of life-expectancy after reproduction).

Selection pressure that would favor the propagation of a gene ‘for’ increased longevity, is linked to the phenotypic effect that this gene is likely to have on the propagation of copies of itself into future generations. The presence of an allele ‘for’ longevity, should in principle favor the increase of such an allele in future populations, at the expense of its rival.

Since we have stated that the average age of reproduction is not affected, and this gene is strictly for increased lifespan beyond reproductive age, we can safely assume that the gene ‘for’ longevity has no direct influence over copies of itself being present in its progeny. The only way by which such a gene can increase the inclusive fitness of copies of itself is by ensuring an increased survival probability of such a gene (and therefore of individuals who carry a copy of this gene) in future generations. If there is more time after reproduction in which the principal investment of energy goes to ensuring survival of progeny, then such a gene will benefit from an increased longevity fraction. If energy is no longer invested in reproduction, then it makes evolutionary sense to invest energy in ensuring the survival of offspring, which carry copies of the genes of their parents.

A gene ‘for’ longevity will therefore have a higher selection coefficient than its rival allele, all other things being equal.

However, it is safe to assume that there is an evolutionarily stable state for longevity, based on the costs of developing such a trait. In principle, if there were no increased costs associated, then organisms would tend to evolve to gain immortality. But lifespan after reproduction is likely to be optimized for the ‘minimum time beyond reproduction required to ensure survival of copies of genes into the next generation’. This is necessarily bound to reproductive age. A gene ‘for’ longevity will increase the probability of its survival if it can ensure that its progeny survive until reproductive age. It will thereafter have exactly half the benefit if it can assist its grandchildren to reach reproductive age. Whatever arbitrary value of inclusive fitness a gene ‘for’ longevity can have will be halved in each subsequent generation, reducing the evolutionary benefit of survival after reproductive age with the passing of each generation.

The chance that a parent’s children will contain this ‘increased longevity’ gene is ½, and that for the grandchildren is only ¼. The value of a parent assisting the grandchildren to survive is therefore only half that of assisting the children. A gene ‘for’ longevity will gain double the advantage of assistance if both its parents and grandparents are assisting it in reaching sexual maturity, and therefore has twice the (arbitrary) fitness value for surviving into the next generation, compared to its rival alleles, which incur no such advantage. From the offspring’s point of view, it might seem beneficial for parents to survive for long periods of time after reproduction.
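The halving argument above is simple enough to write down as arithmetic. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the starting benefit of 8 units is arbitrary, and it assumes ordinary diploid sexual reproduction with no inbreeding, so that each generation halves the chance of carrying a given allele.

```python
from fractions import Fraction


def relatedness(generations):
    """Chance that a descendant `generations` steps down the line
    carries a particular allele of the ancestor (1/2 per generation)."""
    return Fraction(1, 2) ** generations


def discounted_benefit(benefit, generations):
    """Value, to the allele, of helping a descendant that many steps away."""
    return benefit * relatedness(generations)


if __name__ == "__main__":
    # Arbitrary benefit of 8 units for helping a relative reach maturity.
    for steps, name in [(1, "children"), (2, "grandchildren"),
                        (3, "great-grandchildren")]:
        print(name, relatedness(steps), discounted_benefit(8, steps))
```

The benefit shrinks geometrically with every generation, whereas the costs of staying alive longer do not shrink at all, which is the imbalance the argument relies on.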

However, the advantages of increased longevity are balanced by the costs of diverting resources away from reproduction in order to increase lifespan thereafter. Off the top of my head, it would require the evolution of better policing systems, genetically speaking, to ward off age-related diseases (cancer, Alzheimer’s etc.). It would also require the co-evolution of better copying fidelity for somatic cell genes. Resources for reaching maximal physical health at reproductive age would have to be diverted to ensure better mechanical tenacity at advanced age. We can stop here, assuming the list is very incomplete, and that whatever other factors need to be considered will add to the burden of costs. We can also assume that costs will increase rapidly with time, whereas benefit will decrease radically with each passing generation.

What then if we assume, that there is an established equilibrium for longevity, governed by the average reproduction age? This has already been shown to be the case, both in previous research and for reasons mentioned in the above argument. It has been described more accurately here.

Would an increased average time to reproduction lead to an increase in lifespan? Would the artificial selection pressure induced by humans, reserving the capacity to procreate until much later than the average, lead to an increased average reproductive age, and therefore increased lifespan?

At first glance, the argument above would suggest that increased average reproductive age would indeed lead to increased longevity. Though I would like to get into the details of selection governing such a potential increase, this communication is already at its limits with respect to readability due to my ramblings over technicalities.

I will simply state that an allele for ‘increased age of reproduction’, under such artificial selection, is required to have an increased propagation potential (selection coefficient) compared to shorter reproductive cycles in order to stabilize itself in a population. Waiting longer to have children would, in this instance, have to increase the number of descendants from longer reproductive cycles relative to the number produced from shorter reproduction intervals. This has associated with it a number of costs, such as increased resource requirements to reach reproductive age. It has been suggested for this reason that life-expectancy is negatively correlated with reproductive age.

I would like to add another component to this line of thought: the Constructal Law. Recently published in this article is a mathematical model describing the correlation between body size, distance traveled during a lifetime and, of course, life-expectancy. Whether reproductive cycles and life-expectancy are a product of organism size, or organism size is a product of either one, or a combination, of the aforementioned components, remains a discussion best reserved for a future opportunity.

The Constructal Law essentially states that the larger a moving body, the longer its lifespan and the greater the distance traveled during its life. If my logic serves me correctly at this juncture, then increased life-expectancy will be associated with increased average human size, though I am certain that we speciated our way to within the current limits of our (human) body size distribution very long ago.

If you are wondering whether there is a reasonable chance that humans will one day live to exceed 100 years on average, then the answer should be no.

And if you are still inclined to answer yes, then the following is something to consider. If such a genetic mutation does happen to occur, one that causes a change in the very roots of embryology, one that will increase body size, increase time to reproduction and therefore increase lifespan, it is likely that its carriers would not be referred to as humans (by our current criteria), as a result of speciation. It would be the evolutionary equivalent of primates having predicted that humans would evolve as a more intelligent descendant from an ancient common ancestor. What the latter paradoxical statement is really implying is that, if this were to occur, current modern humans would probably only be the common ancestor for that line of evolution.

The Scientific Discipline

Wherever there exists ambition to exceed, there is an accompanying potential for failure, directly proportional to the level of difficulty associated with completion of such a challenge. As Sam Harris proposed in his stimulating read, “The Moral Landscape”, there exist two definite extremes of the human condition, which he termed ‘the good life’ and ‘the bad life’, and which are connected by a continuum of (the subjective association of) ‘good’ and ‘bad’ components. Theoretically speaking, and forgive me for paraphrasing, all individuals aspire to the good life, whatever their subjective interpretation of that good life is.

I do find it ironic however, that rationally minded individuals (whom, as a scientist, I pride myself in thinking I am surrounded by) can be so abducted from the rationality of scientific inquiry when posed with aspirations of a personal nature. One individual can produce the most phenomenal research covering years, and even decades, yet struggle to maintain something as simple as a diet for the duration of one week.

Having recently re-read “The Extended Phenotype” by Richard Dawkins, I have come to the conclusion that there is, as a matter of fact, another side to the Necker cube. All of us are posed with challenges that, at times, seem too large for the aspirations of even the most disciplined amongst ourselves. It is along the lines of the philosophical, and scientifically feasible, model discussed in “The Extended Phenotype” that I decided to restructure the way in which I approach specific personal challenges.

In life, you are provided with variables and constants. Constants I like to think of as all things which I have no ability or capacity to control: they may not produce repeatable output, yet they can still be regarded as constantly beyond my control. For instance, as a lecturer, you accept as a constant that all undergraduate students are trying to pass by making the minimum required effort; in other words, they seek a low cost-to-benefit ratio. Along the lines of evolutionary biology, positive selection of any trait is a product of the cost of evolving that trait relative to the benefit such a trait would produce, within the direct milieu that trait finds itself in. In short, students are lazy because the system incentivizes laziness.

Now, I am not here to solve the riddle of student education, for that extends far beyond both my expertise and capabilities. I am however proposing a variation on the outlook of student laziness, or more importantly, lack of achievement in our personal lives.

What if discipline is not (always) a product of motivation, strong will and endurance, but simply an unavoidable byproduct of a system that incentivizes success, relative to a perception that is conducive to success in that particular environment? What if, instead of working up the energy and motivation to maintain your diet/exercise routine, you could take a step back and design for yourself a system in which you are most likely to succeed? (Though I have had some success with this approach in my personal life, I am hardly a specialist on this subject matter.)

For everything that is propagated into the next generation, there exists a fundamental selection, if represented by competing versions of the same component. Richard Dawkins proposed the meme theory of selection as far back as 1976, in his book “The Selfish Gene”. In accordance with this proposal, I would like to set forth the following parameters for an example of an exercise routine.

When presented with a choice of either performing a component of your exercise routine or, let’s say, watching television, you have alternatives for the use of a unit of time, each measured by arbitrary values of fitness, related to cost and benefit. What these units of fitness are, although measured arbitrarily, is not irrelevant, as their extent will serve as the basis for selection. Which of these activities one is likely to pursue, and is likely to keep pursuing for the remainder of, perhaps, a calendar week, is linked to the selection coefficient for each activity. So we satisfy the first criterion of evolution by selection, in producing alternative components from which to select.

The second criterion for evolution by selection is propagation into the next generation. If we imagine any specific activity chosen at the beginning of a week as an arbitrary commitment to a line of evolution, then the initial selection of either activity should in principle serve as the higher probability decision for all subsequent choices between those activities. This, of course, if the popular “I’ll start next week” attitude is anything to go by. It also assumes the third factor governing selection in favor of either activity, discussed hereafter.

The probability of propagation of any component (perception/meme) into the next generation is subject to the selection coefficients of each alternative, within the conditions in which it finds itself. Therefore, if becoming overweight and unfit is the milieu in which either choice, exercise vs television, finds itself, then going outdoors for a jog is very incompatible with achieving that aim. In addition to the global milieu, there exist interactions with other components (activities) that influence the selection coefficients for each one of our proposed alternatives. Watching television likely has a selectional advantage over exercise if the food of choice is McDonald’s and the choice of drink is a sugary soft drink. As Richard Dawkins brilliantly proposed, some memes (genes) have a higher probability of being selected when present in a mix of memes that positively influences their propagation.

This third component is where I would like to make a distinction between constants and variables in this model of selection between activities, if there can be such a thing. If we consider the first two components as constants, so that in any conventional system (which is likely to be the majority) watching television has a higher inclusive fitness than exercise, for the reasons mentioned above, then the third component can be considered a variable. If we can succeed in designing a system where healthy eating and exercise have a higher selectional advantage than the alternative, then decisions in favor of healthy eating and exercise should follow naturally. If running a marathon is the global objective, then all selection pressures should on average favor the activity which is most conducive to achieving the objective, i.e. exercise. And, speaking as a scientist, the following should be generally true: if we require 8, then 4 + 3 will not suffice.

There is no reason not to see the obvious analogies with the psychology of motivation and discipline, yet seeing that my training is limited to biology and my experience limited to introspection, this rational approach has served me well. I’ll leave with this thought: For every individual that fails, there exists a system that allows him to fail.