4. Comparisons With Observation

We shall compare the script theory with some examples of primate social intelligence, particularly of vervet monkeys. Some examples which can be analysed in script terms are:

4.1 Using Kin Relations

Cheney and Seyfarth (1990) have made a series of observations on vervet monkeys, using hidden loudspeakers to replay various types of call of specific individuals to others in the group, in their natural surroundings. In one of these experiments, they replayed the screams of infant vervets to groups of females, including the infant's own mother and controls.

Vervets can recognise the calls of individuals in their troop, and mothers generally go to help their infant if a scream indicates that juvenile play has got too rough. As expected, the mothers consistently paid more direct attention to the replayed calls of their own infants than did the controls. More interestingly, when a particular infant's call was replayed, the control females would look towards that infant's mother, often before the mother herself had responded.

Dasser (1987) has shown in laboratory conditions that monkeys know kin relations of others in their group. The control females' reaction shows that they can combine this knowledge with other general knowledge (that mothers respond to their children's calls), to anticipate who will respond in a particular case.

Factual knowledge of a kin relation can be embodied in a script, such as that in figure 6(a), which states that Shelley is Profumo's child. This is a script in the mind of some other monkey (not Shelley or Profumo). I shall not discuss here how they learn these relations, although a script-based account can be given.

Figure 6: (a) A factual script, which says that Shelley is the child of Profumo; (b) A typical incident of an infant screaming, and some individual paying attention; (c) the general rule which can be learnt from such incidents.

These kinship fact scripts are so important that, we suppose, they are continually and automatically unified in with the script of the current scene - so that whenever any monkey observes Shelley, he or she automatically includes the fact that Profumo is Shelley's mother in the script. A typical scene of an infant screaming, and some individual going to help, would be encoded as in the script of figure 6(b). The knowledge of who is the infant's mother has been automatically included, by unifying a directly observed script with the script of figure 6(a).

After a monkey has observed several scenes like figure 6(b), with different infant/mother pairs, taking the script intersection of these will give the rule script of 6(c) - that when any infant screams, his mother pays attention.

Suppose that the control animals in Cheney and Seyfarth's experiment had learnt the factual script of figure 6(a) - that Profumo is Shelley's mother - and the rule script of figure 6(c). Hearing Shelley's scream, they made a script "Shelley screams"; unified in the script 6(a) "Shelley is Profumo's child"; and then unified in the rule script 6(c) "Mothers pay attention to their infants' screams", to correctly deduce "Profumo will pay attention". Thus they looked towards Profumo with this expectation.

4.2 Habituation to calls

Cheney and Seyfarth have shown that if a vervet habituates to a call from a certain individual, it does not thereby habituate to the same call given by different individuals, or to completely distinct calls from the same individual. There is habituation to similar calls given by the same individual, but 'similarity' depends on the denotation of the call, rather than acoustic similarity.

We can use the script theory, first to give an account of the meanings of calls, and then to describe the learning processes which (a) give calls their meaning to vervets, and (b) explain some of the habituation effects described by Cheney and Seyfarth.

Consider two different calls - vervets' 'wrr' and 'chutter' calls - which are acoustically distinct but tend to be given in similar circumstances, when members of another group are seen. There must be at least two scripts associated with each call - one script which causes monkeys to make the call when appropriate, and another script which they use when hearing it. For the 'wrr' call, these scripts are shown in figure 7.

Figure 7: (a) The script which causes a monkey to utter a 'wrr' call when seeing a monkey from a different group; (b) the script activated in another monkey's mind when she hears the call

Figure 7(a) is the simplest script which could cause a monkey to utter a 'wrr' call on seeing a monkey from another group. Slots which are in effect 'executive commands' from the SIM to other cognitive subsystems, to cause a monkey to do something, are marked with a '*'. Thus the *call slot in figure 7(a) is a command slot which causes the monkey to give a 'wrr' call.

Figure 7(b) is the simplest possible script which could enable a monkey to understand the meaning of a call - to convert a perception that the call has occurred to an expectation of an alien monkey.

The meaning of the 'wrr' call in a monkey group depends on both these scripts existing in the brains of all monkeys; for the 'wrr' call to serve as a useful communication, these two scripts must stay in line, associating the call with the same referent. The same applies to any other call. If, for instance, a call-giving script depended on one stimulus, whereas the call hearing script for the same call mentioned another, that call would systematically mislead, and so might not enhance vervets' survival.

We might hypothesise that both scripts are innate, and that natural selection has ensured that they stay in line, with the same meaning. However, there is an alternative hypothesis, that at least the 'hearing' script of figure 7(b) is learnt; we can investigate that alternative.

A vervet will observe many occasions when some other vervet gives a 'wrr' call, and a member of another group is present. By forming scripts of these occasions, and intersecting them together as described in section 2, she will learn just the script of figure 7(b). If, for every call, the 'hearing' script analogous to that of figure 7(b) is learnt rather than innate, this guarantees that the meaning of each call in its two scripts stays in line.
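A minimal sketch of the intersection step may help here. It assumes, purely for illustration, that each observed occasion is flattened to a dictionary of slot/value pairs, and that intersection keeps only the pairs common to every example; the slot names are hypothetical, and the fuller mechanism of section 2 (which works on structured scripts) is not reproduced.

```python
from functools import reduce

def intersect(script_a, script_b):
    """Keep only the slot/value pairs that both scripts share."""
    return {slot: value for slot, value in script_a.items()
            if script_b.get(slot) == value}

# Hypothetical observed occasions; slot names and values are illustrative only.
episodes = [
    {"heard_call": "wrr", "caller": "Brutus",  "alien_monkey_seen": True, "time": "morning"},
    {"heard_call": "wrr", "caller": "Cassius", "alien_monkey_seen": True, "time": "noon"},
    {"heard_call": "wrr", "caller": "Caesar",  "alien_monkey_seen": True, "time": "morning"},
]

hearing_script = reduce(intersect, episodes)
print(hearing_script)
# {'heard_call': 'wrr', 'alien_monkey_seen': True}
```

The incidental details (who called, what time it was) differ between occasions and drop out, leaving only the association between the call and its referent.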

Given that the 'understanding' script 7(b) can be learnt (and if 'wrr' calls are made, will be useful to the monkey) then one hypothesis is that the 'calling' script 7(a) is innate, and evolved through kin selection effects. (A more complex case, where the calling script is not innate, is analysed in the next section).

Then if a 'wrr' call from a particular individual (eg Brutus) is repeatedly played in circumstances when no monkey from another group is present (ie when the call is misleading), the same learning mechanism will lead the hearer to learn a more specialised rule script - that when the caller is Brutus, no monkey from another troop is present. A monkey can learn, and use, both the general script of figure 7(b) and the exception script at the same time.

Figure 8: (a) Script for innate fear of birds; (b) Script for giving an alarm call.

This can give rise to the habituation effects observed by Cheney and Seyfarth. It gives a simple account of the observations that:

(a) A monkey can habituate to a particular call by a particular individual

(b) Habituation to one call by one individual does not cause habituation to the same call by other individuals

(c) Habituation to one call by one individual does not cause habituation to completely different calls by the same individual.
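Observations (a) to (c) can be illustrated with a small sketch in which a general script and a learned exception coexist. The code assumes that, when several rule scripts match, the one with the most conditions wins; this tie-breaking rule, the dictionary encoding and the slot names are assumptions made for the example rather than part of the theory as stated.

```python
# Each rule script is (conditions, expectation). More conditions = more specific.
wrr_general = ({"heard_call": "wrr"}, {"alien_monkey_present": True})
wrr_brutus  = ({"heard_call": "wrr", "caller": "Brutus"}, {"alien_monkey_present": False})
eagle_alarm = ({"heard_call": "eagle alarm"}, {"raptor_present": True})

def expect(scene, rules):
    """Return the expectation of the most specific rule whose conditions all hold."""
    matching = [r for r in rules
                if all(scene.get(slot) == value for slot, value in r[0].items())]
    if not matching:
        return None
    return max(matching, key=lambda r: len(r[0]))[1]

rules = [wrr_general, wrr_brutus, eagle_alarm]

# (a) habituated to Brutus's 'wrr': no alien monkey is expected
print(expect({"heard_call": "wrr", "caller": "Brutus"}, rules))
# (b) the same call from another individual is still believed
print(expect({"heard_call": "wrr", "caller": "Cassius"}, rules))
# (c) a completely different call from Brutus is still believed
print(expect({"heard_call": "eagle alarm", "caller": "Brutus"}, rules))
```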

The final observation - that habituation to 'wrr' leads to habituation to 'chutter', which is acoustically distinct but has a very similar referent - can be understood within the script theory, but not so simply; possible explanations depend on some detailed considerations and parameters.

Whenever presented with data which are consistent with one 'target' script, there is some tendency to learn more general scripts (which include the target script) at the same time. So if, for instance, there is a 'wrr or chutter' script which is only slightly more general than a 'wrr' script - if, for instance, 'wrr' and 'chutter' are subclasses of the same class of call - there will be a strong tendency to learn or habituate to the more general script.

In this way (or others) the theory can be made to accommodate this last finding, rather than giving an immediate and satisfying account of it.

In general, however, the script theory gives a fairly satisfactory and economical theory of the evolution, learning and use of vervet monkey calls. It gives a minimal computational theory of the meaning of the calls, without, for instance, having to postulate that vervets represent the knowledge of others or intend to influence the knowledge of others - or even intend to influence the behaviour of others. Vervet meaning may be much simpler than human language meaning.

4.3 Learning Alarm calls

The adult vervet's "eagle alarm" call is highly specific, given only on seeing those raptors which prey on vervets. Young monkeys' eagle alarm calls are initially non-specific - being triggered by any bird, not just predators. However, they soon learn to be specific - long before they have seen enough predator attacks to learn directly which species are predators; it appears that they learn from the responses of older peers, who ignore their false alarms (Seyfarth & Cheney 1980).

To analyse this in the script theory, we need to postulate several different scripts, some of which are innate. Assume that:

(a) Vervets are born with an innate fear of birds. This is summarised in the script of figure 8a, where the slot *fear is a command slot (from the SIM to the monkey's autonomic nervous system, endocrine system and so on) to show the symptoms of fear.

(b) Being fearful in the presence of a bird leads a vervet innately to utter an 'eagle alarm' call. This is summarised in the innate script of figure 8b, which also contains an executive command to give the call.

(c) Any fear reaction is enhanced or diminished by knowing whether one's peers are fearful; this is summarised by the scripts of figure 9. These say "If your peers are frightened, you should be too" and "If your peers are not frightened, you need not be".

Figure 9: a pair of scripts instructing a primate to show fear (or not) depending on whether his peers are showing fear

For a young vervet, scripts 8a and 8b will together lead it to give eagle alarm cries to any bird - as observed.

However, as it grows, it observes instances in which a martial eagle appears and its peers are very frightened, and other instances in which, for instance, a vulture appears and its peers are not at all frightened. Combining these instances by script intersection, it will learn the scripts of figure 10 - that martial eagles always frighten its peers and vultures do not.

Figure 10: Learned scripts to the effect that (a) martial eagles always inspire fear in one's peers (b) vultures do not.

These learnt scripts can then unify with the script of figure 9 to alter the monkey's own level of fear appropriately [1]. For a vulture, the anticipation of an un-scary situation will damp the fear reaction enough to suppress the alarm call; for a martial eagle, the reverse will occur.
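The interaction of these scripts can be sketched numerically. Every number below is an illustrative assumption (the text gives no values for fear levels or thresholds), and 'unfamiliar bird' is a hypothetical stand-in for any species about which no script has yet been learnt; the sketch only shows how a graded fear level, adjusted by the learned expectation about peers, could switch the alarm call on or off.

```python
# All numbers are illustrative assumptions, not values from the text.
INNATE_BIRD_FEAR = 0.6                       # figure 8a: innate fear on seeing any bird
PEER_ADJUSTMENT = {True: 0.3, False: -0.4}   # figure 9: peers frightened / not frightened
ALARM_THRESHOLD = 0.5                        # figure 8b: fear above this triggers the call

# Figure 10: learned expectations of whether peers are frightened by a species.
peers_frightened_by = {"martial eagle": True, "vulture": False}

def gives_alarm(bird):
    """Return True if the resulting fear level is high enough to trigger the call."""
    fear = INNATE_BIRD_FEAR
    if bird in peers_frightened_by:          # a learned script exists for this species
        fear += PEER_ADJUSTMENT[peers_frightened_by[bird]]
    return fear > ALARM_THRESHOLD

for bird in ["martial eagle", "vulture", "unfamiliar bird"]:
    print(bird, gives_alarm(bird))
```

With these assumed values the young animal's default is to call at any bird, and only a learned 'peers not frightened' script suppresses the call.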

This gives a script-based analysis of how vervets learn, from their peers' reactions, which birds are worth fearing. The explanation is not unique; we could devise alternative explanations, and express them too in the script notation. It is not entirely black-and-white; it depends on graded quantities such as 'level of fear' and on how different scripts influence this quantity.

It also leaves some questions open. Suppose a group of vervets became unnecessarily afraid of some harmless bird - would this fear be propagated socially from generation to generation for ever? There must also be mechanisms whereby, in the long term, the real predatory habits of birds influence vervets' fear of them.

The proposed script mechanism makes specific predictions as to how long it will take a young vervet to learn that a given species of bird is (or is not) feared. It predicts how many examples (typically a rather small number) must be observed to reliably learn a script such as that in figure 10a or 10b. We can start to compare these numbers with observations.

4.4 Rank and Alliances

In most primate groups there is a defined rank ordering of animals, which determines access to key resources, for feeding, reproduction, shelter and so on. The effects of rank are greatly complicated by alliances (Harcourt 1988), either permanent (eg based on matrilineal kin relations) or temporary; if one has a high-ranking ally one may, for short periods, be able to enjoy some of the privileges of high rank oneself. In most monkey groups, individuals of lower rank attempt to form alliances with those of higher rank, for instance by grooming them. We can describe many aspects of this behaviour in script terms.

The relative rank of two individuals defines how they interact with one another in a large number of ways - which one gives way to the other, and so on. So there are many ways in which a primate, observing two others together, can judge which one is of higher rank. A typical rank-judging script is shown in figure 11a. A typical fact about individual ranks, which can be learnt using this script [2], is shown in the script of figure 11b. If Cassius retreats from Caesar, then Caesar must out-rank Cassius.

Figure 11: (a) A rule script which can be used to learn about rank from behaviour; (b) a typical fact about rank which can be learnt in this way.

In this respect, learning about rank is much like learning about kin relations. Some rank-determining scripts, such as that in figure 11a, may be innate; but other similar scripts, describing other accompaniments of rank, may be learnt.

The rank facts such as that in figure 11b are very important for a monkey. In a troop of N monkeys, there are N(N-1)/2 rank facts to know, and it might be a disadvantage for an individual to have to learn every one of them by observation; it might take a long time to observe all the necessary dyadic interactions. As has been discussed by several authors (Cheney & Seyfarth 1990; d'Amato and Colombo 1988) it would be useful to use the fact that rank is transitive; if A out-ranks B and B out-ranks C, then A out-ranks C. This general rule is easily represented in a script, shown in figure 12.

Figure 12: a script which expresses the fact that rank is transitive.

In this way, a monkey could determine the ranks of all the members of his or her group with comparatively few observations.
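The saving can be made concrete with a short sketch. It assumes rank facts are stored as ordered pairs (higher, lower) and applies the transitivity rule repeatedly until nothing new is derived; the function name and the pair encoding are hypothetical, and the animals are labelled A to D as in the statement of the rule above.

```python
from itertools import product

def transitive_closure(outranks):
    """Apply 'A>B and B>C implies A>C' repeatedly until no new facts appear."""
    facts = set(outranks)
    while True:
        derived = {(a, d) for (a, b), (c, d) in product(facts, facts) if b == c}
        if derived <= facts:
            return facts
        facts |= derived

# Three observed dyadic interactions in a troop of four.
observed = {("A", "B"), ("B", "C"), ("C", "D")}
known = transitive_closure(observed)

n = 4
print(len(observed), "observations yield", len(known), "of", n * (n - 1) // 2, "rank facts")
# 3 observations yield 6 of 6 rank facts
```

In the best case, N-1 well-chosen observations suffice in place of N(N-1)/2.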

We suppose that the facts of rank, such as that in figure 11b, are, like the facts of kin, so important that they are continually, automatically combined with the visible facts of the current scene (by script unification), so that rank-dependent rule scripts can then be applied. It seems likely that monkeys have many scripts enabling them to judge rank, to know when and how to challenge it, to make alliances, to stop others making alliances, to exploit alliances and to call for help, and to know when it is worth helping an ally.

Monkeys may have an innate goal script - to try to increase their own rank - and many learned scripts to call upon to achieve it. Gaining rank is an autonomous goal within the SIM itself, rather than a goal defined by some other behavioural system.

4.5 Primate Emotional Responses

If emotion is regarded as a set of bodily responses (endocrine, expression, posture, vocalisation...) ensuing from a cognitive appraisal of the present situation, this is largely an appraisal of the social situation and its possibilities. Therefore many emotions arise from appraisal of the current situation by scripts in the SIM. In this view, many rule scripts (both innate and learned) result in emotional responses. When the current script matches the rule script, it is unified with it, causing the response.

We can give a script-based account of many aspects of primate emotion, including, for instance, the attachment behaviour which Bowlby (1969) noted is common across many primate species, including man. For instance, one initially puzzling aspect of attachment behaviour is the fact that infants of many species seem to show stronger and more persistent attachment behaviour to a parent who rejects them, than to a more loving parent.

As Bowlby (1980) has described, the attachment response (a goal to be close to a caregiver) is enhanced in situations of stress and anxiety. This serves a sound evolutionary purpose, because those situations (eg when a predator is near) are just the situations when a caregiver is likely to be most useful. This can be described by an innate script, shown in figure 13b. On the other hand, a rejecting parent is likely to cause anxiety. This (also innate) reaction is described by the script of figure 13a.

Figure 13: A script description of anxious attachment; (a) Parental rejection leads to anxiety (b) Anxiety leads to the goal of being close to a parent.

The two scripts of (13a) and (13b) combine to give the observed effect; rejecting parental behaviour leads the infant to cling to the parent. Note that they do not combine directly by script unification, as the slot *anxiety on the first script is a command slot which sets off the bodily symptoms of anxiety, while the slot 'anxiety' on the second script refers to perceiving those bodily symptoms; the two slots are distinct, and do not unify together. The chain of cause and effect runs through the body.

In this way, script theory could be used to build a principled computational model of emotional response (innate and learned) in typical primates such as monkeys, before going on to tackle the much more complex emotional responses (in chimps and mankind) which ensue when one appraises not only the actual situation, but also what others may think about it.

4.6 Tactical Deception

Amongst the most suggestive evidence for primate social intelligence are reports of deception, where primates appear deliberately to mislead one another. These reports are open to a wide variety of interpretations, from full-blown 'theory of mind' accounts through to basic behavioural accounts. The theory of this paper gives a framework in which possible accounts of some incidents of deception can be framed, without invoking a theory of mind, for comparison with alternative accounts.

Byrne and Whiten (1990) define tactical deception as 'acts from the normal repertoire of the animal, deployed such that another individual is likely to misinterpret what the acts signify, to the advantage of the agent'. By compiling data from many observers, Byrne and Whiten (1988, 1990, 1992) have built up a strong body of evidence that this kind of behaviour is widespread in some primate species, rare in others. It is most common in Cercopithecines (vervets, macaques and baboons) and in the great apes, particularly chimps.

Byrne and Whiten group their 253 reports of tactical deception into classes, depending on the evidence in the report. In level-0 reports, interpretations other than tactical deception are possible; for level-1 incidents the evidence for tactical deception outweighs competing explanations, and finally level-2 deception 'implies that the primate can represent the mental states of others' - which requires a primate theory of mind, and so does not fall within the scope of this theory. Reports of level-2 deception are almost entirely confined to the great apes. I therefore assume, for the moment, that great apes have some capacity to represent the mental states of others, so that their deceptions should probably not be analysed in the simple script-based terms of this theory. However, we may use scripts to analyse deception in the Cercopithecinae (for which Byrne and Whiten report 45 incidents of deception at level-1 and above), assuming (as in previous examples) that the cercopithecines use simple scripts without representing others' mental states.

Byrne (1993) has analysed several of these incidents in a production-rule formalism. Typical of these is his analysis of report no. 104, where a juvenile baboon, to get a food item (a deep growing corm, partially dug out by an adult of rank below his own mother), screamed as if hurt, so his mother came and chased away the adult; when both were out of sight the juvenile then continued to dig out the corm. Byrne proposes a production rule of the form:

(need to remove A) & (mother dominant-to A) & (mother out-of-sight) => (scream).

Usually Byrne's production rules are of this form, (pattern) => (procedure), or (X) => (do Y), whereas scripts are of the form (pattern) & (do procedure) => (consequence), or (X) & (do Y) => (Z) (in the rule script form (cause) => (effect)); a more declarative form of knowledge, but one which will lead to the same action if Z is a desirable consequence. Apart from this small difference, it is straightforward to translate from Byrne's production rules to rule scripts, or vice versa. Thus all the production rule analyses of tactical deception have closely equivalent script forms.
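One way to picture the translation is sketched below. It rests on an assumption that goes slightly beyond the text: that the 'need' term in a production rule's pattern corresponds to the consequence slot of the equivalent rule script. The class names, the 'need: ' prefix and the consequence label 'A removed' are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProductionRule:        # (X) => (do Y)
    pattern: frozenset
    action: str

@dataclass(frozen=True)
class RuleScript:            # (X) & (do Y) => (Z)
    pattern: frozenset
    action: str
    consequence: str

def to_production(script):
    """Fold the consequence into the pattern as a need: (X & need Z) => (do Y)."""
    return ProductionRule(script.pattern | {"need: " + script.consequence}, script.action)

def to_script(rule, consequence):
    """Lift the named need out of the pattern to become the script's consequence."""
    return RuleScript(rule.pattern - {"need: " + consequence}, rule.action, consequence)

# Report no. 104 in script form (the consequence label is an assumed paraphrase).
script = RuleScript(frozenset({"mother dominant-to A", "mother out-of-sight"}),
                    "scream", "A removed")
rule = to_production(script)
print(sorted(rule.pattern), "=>", rule.action)
print(to_script(rule, "A removed") == script)   # the round trip recovers the script
```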

The script learning theory makes interesting predictions about the learning of this script (or its equivalent production rule). First, the juvenile could not have learnt the script from just one previous incident, or lucky accident. At least two previous 'accidental' successes are needed.

Second, we may ask: how is the qualification 'mother out of sight' learned as part of the rule? Does the baboon need explicit negative evidence (that when mother is present, the trick does not work) to learn the full rule script?

Following the previous discussion of attachment behaviour, we expect that presence or absence of its mother is a very important variable, always represented in a young baboon's factual scripts. We might also expect that when its mother is absent, it has a greater tendency to scream - giving it more opportunities to learn this rule script. But why should it not learn a more general rule script, which has no qualification 'mother absent'? A priori, the simpler rule without the qualification is more likely to be true, by the 'Occam's Razor' weighting of the prior probabilities.

Suppose the juvenile has three successful examples, when mother was absent and the trick worked. The script intersection mechanism projects out all the common information in these examples, including the fact 'mother absent'. This more specific rule 'explains' more about these three examples, and so is favoured over a simpler alternative without the qualification (in spite of the smaller prior probability of the more complex rule). The more positive examples accumulate, the more the specific, qualified rule is favoured. This enables it to learn the specific rule, without over-generalising, in the absence of explicit negative evidence.
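The trade-off between prior and explanatory power can be illustrated with a toy calculation. It assumes a simple Bayesian comparison in which the qualified rule predicts 'mother absent' with certainty in each successful example, while the unqualified rule leaves it to a base rate; the prior odds of 1/4 and the base rate of 1/2 are arbitrary illustrative values, not parameters of the script learning theory.

```python
from fractions import Fraction

def posterior_odds(n_examples, prior_odds, p_absent):
    """Odds of the qualified rule over the unqualified one after n successes,
    all observed with mother absent."""
    # The qualified rule entails 'mother absent' in every example (likelihood 1);
    # the unqualified rule assigns it only the base rate p_absent.
    return prior_odds * (1 / p_absent) ** n_examples

prior = Fraction(1, 4)   # assumed Occam penalty against the more complex rule
base = Fraction(1, 2)    # assumed base rate of 'mother absent'

for n in range(1, 5):
    print(n, posterior_odds(n, prior, base))
```

Under these assumed values the odds double with each positive example, so the qualified rule overtakes the simpler one at the third example even though it started with the smaller prior.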

Pieces of explicit negative evidence - examples of 'mother present, trick failed' - are consistent with the specific rule, but do not actually help the baboon to learn it. Only if it experienced some 'mother present, trick worked' examples would there be any tendency to learn the more general, unqualified, rule instead; and this is unlikely to happen, as its mother could see the trick.

Finally, the learning theory helps us to analyse why primate tactical deception is tactical - why it cannot be used more regularly with success. If, on some occasions, the mother can gain evidence that the third party whom she attacked was actually 'innocent', these examples would lead her to habituate to her child's distress call, just as in the discussion of 4.2; the learning theory tells us how many examples are needed. It tells us not only how primates can learn to cry 'wolf', but also how their peers can learn to ignore them - all without needing any theory of mind.

4.7 Innate and Learned Scripts

In some of the previous examples, we have postulated certain innate scripts, as a basis from which script learning can begin. This might seem to be an uncontrolled process; could we not postulate as many innate scripts as we wanted, and perhaps even do without any script learning in the theory? Fortunately this is not the case; there are firm evolutionary grounds to limit the number and complexity of innate scripts. Every script has an information content (typically 20 - 200 bits); if it is to be an innate script, this requires at least that much extra innate information in the design of the brain. Such extra design information can only accumulate, through selection, at a very slow rate. This places a lower bound on the time required to evolve new innate scripts.

In (Worden 1995b) I derived a speed limit for evolution, which bounds the rate at which useful new genetic information, expressed in the phenotype, can accumulate through natural selection. This leads to a quantitative relation between (1) the information content of a script (2) the selective advantage of having it innate, rather than having to learn it, and (3) the minimum number of generations needed to evolve it as an innate script.

If a certain selection pressure leads to differential survival rates of ±D percent per generation, then the evolutionary response to this selection pressure can accumulate useful new information in the phenotype only at a rate of dG/dn bits per generation, where approximately

dG/dn ≤ D/80 (4.1)

For instance, a selection pressure which leads to variances in survival rate of ±10% can accumulate useful new genetic information in the phenotype at a rate of not more than 1/8 bit per generation [3]. This means that the minimum number of generations N needed to evolve an innate script with information content B bits, which gives a selective advantage of D percent, must obey

N > 80 B/D (4.2)

Probably the simpler scripts involve around 20-50 bits of information; so under an 8% selection pressure, these would take at least 200 generations to evolve as innate scripts. For universal, species-dependent facts (such as those in figures 7 and 8) 200 generations is not a long time; these scripts might well be part of the innate makeup of the brain of any vervet monkey.

However, the scripts of figure 10, since they each mention a particular species of bird, and must depend on the specific sensory cues for that species, probably have an information content of 100 bits or more; and the differential survival value of knowing (from birth) that one species of bird is harmless is probably more like 1% than 10% (as we saw in 4.3, there are ways to learn such scripts, and an innate script only gives extra fitness at ages before this learning can take place). So to make the 'vultures are harmless' script innate would take of the order of 8,000 generations. Since primates often depend on their flexibility to colonise new habitats (where different predators prevail), an 8,000 generation evolution time is often too slow; predator-dependent scripts must be learned.
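The figures quoted in this section follow directly from relations (4.1) and (4.2), as the short check below shows. The function name is hypothetical; the inputs are the bit counts and selection pressures given in the text.

```python
def min_generations(bits, advantage_percent):
    """Lower bound on generations from relation (4.2): N > 80 * B / D."""
    return 80 * bits / advantage_percent

print(10 / 80)                   # (4.1) at D = 10: 0.125 bits per generation
print(min_generations(20, 8))    # simplest scripts, 8% pressure: 200.0
print(min_generations(50, 8))    # upper end of the 20-50 bit range: 500.0
print(min_generations(100, 1))   # figure 10 scripts, 1% pressure: 8000.0
```

The bound scales linearly with information content and inversely with selective advantage, so a script that is both larger and less advantageous is penalised twice over.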

The evolutionary speed limit therefore gives a well-defined criterion for the dividing line between innate and learned scripts. It leads us to expect that a few simple general scripts are innate, but that complex, habitat-specific or group-specific scripts must be learnt.

5. Testing the Theory

From the above examples, script theory seems to be in broad agreement with the evidence. Scripts have the descriptive power to express the kinds of social knowledge which most primates show; the learning mechanism enables them to learn rule scripts rapidly, as primates do; and the mechanism of script unification provides enough inferential power to do the kinds of social reasoning which primates apparently do.

Yet these examples, on their own, leave much to be desired. We can devise a set of scripts, inferences and learning sets to account for each example - but what does this add to what we already knew? Does it bring any new insights, or will it simply adapt itself as required to each new observation? What data might prove the theory wrong?

The test of the script theory comes not as we devise new scripts to account for each new observation, but when the same scripts appear repeatedly in accounts of different behaviour. (We began to see this in section 4, in the links between call habituation, attachment behaviour and tactical deception.) At that point, the precise computational basis of the theory constrains us, to stop us handwaving or bending the theory ad hoc to account for each new fact. It can then start making definite predictions, which can be proved wrong.

To make these tests, we need first to construct (for some well-studied species) a set of scripts which accounts - to a first approximation - for most of the social behaviour we observe. This would mean constructing the sum of social knowledge for a species; a sort of Primate Social Encyclopaedia expressed in scripts. For a species such as the vervet monkey this might involve of the order of 20 - 50 innate scripts and 100 - 300 learned scripts.

For each innate script, there should be a plausible account of the selection pressure which gave rise to it; and for each learned script, we should be able to observe the examples from which an individual can learn it. So constructing the encyclopaedia is not an unconstrained exercise of invention; in itself it is a useful test of the theory. Doing so will define what nodes, slots and values are needed for the construction of scripts.

This will define a framework and parameters, within which we can consider some specific aspect of social behaviour - such as predator alarm calls, or competition for food - which is describable using only a few (preferably simple) scripts. For that aspect we can use the theory to predict what is learnable, and how fast; and to devise new tests of the theory.

6. Discussion

6.1 Computational Theories of Primate Social Intelligence

Formal computational descriptions of primate social intelligence have been proposed by Byrne (1993), Shultz (1991) and Schmidt and Marsella (1991). Shultz and Schmidt and Marsella are mainly concerned with the higher-order problems of recognising agency and others' plans within a primate theory of mind, rather than the first-order problem of primate social intelligence (without a theory of mind). Only Byrne addresses this issue, in a production rule formalism, so I shall only discuss his work.

Scripts are very similar in spirit to production rules; and as shown in section 4.6, we can make a close equivalence between scripts and production rules for describing any particular observation. The script theory differs from Byrne's production rule formalism mainly by having a worked-out theory of learning, tailored to the social domain, which Byrne's production rules do not yet have - but could be extended to have. Alternatively, as in section 4.6, we can simply translate the script learning theory into production rule terms, assuming that any near-optimal theory of production rule learning must have approximately this form.

6.2 Scripts in Human Cognition

The introduction and discussion of scripts by Schank and Abelson (1977) and observations of others (eg Bower et al 1979; Graesser et al 1980) have built up a wealth of evidence that some form of script-like information structure is an important component of human social cognition. In particular Nelson (1978,1985; Nelson & Gruendel 1981) has studied the development of script structures in childhood and its close relation to the development of language.

As noted in the introduction, this computational model has much in common with the models discussed by Holland, Holyoak, Nisbett and Thagard (1986) in their framework for induction. Like their models, it combines elements of scripts, mental models and rule systems, paying attention to how rules are induced and modified through experience. Several other features of the q-morphism models of Holland et al. are shared in this model - in particular, the induction of default hierarchies of rules, rule competition, and the use of statistical criteria of variability to decide when a new rule is supported by the evidence. However, this model does not share some of the mechanisms which they postulate, such as the learning of 'inference rules' and analogies. This difference is justified by the fact that their model is designed to account for human cognition, whereas this is a minimal theory, to model the social cognition of primates such as vervet monkeys, which is expected to be much simpler than human cognition.

The evidence for scripts in mankind provides an important corroboration of the idea explored in this paper, that scripts are important in general primate social cognition. At the same time, however, the human evidence is harder to interpret because of two very important, and largely human-specific, complications - the existence of a well-developed theory of mind in mankind, and language. Both of these give the growing child an enormous advantage over other primates in forming and using scripts, and therefore complicate any analysis of script learning and use. That is why the examples used in this paper (section 4) have concentrated on primates which have neither a theory of mind nor language; they form a simpler test case in which the basic script mechanism can be studied. This basic script theory then forms a starting point from which the later developments - of a primate theory of mind, and language - can be discussed (Worden 1995a).

6.3 Computational Models of Learning

The script learning theory is an example of concept induction - inducing some complex concept or structure (in this case, rule scripts for social causal regularities of a primate group) from examples (in this case, an individual's social history, expressed in factual scripts).

Concept induction has been extensively studied in the literature of AI and machine learning over many years (Michalski 1986), and much of this work is directly comparable with the script learning theory. Broadly, one can discern two main flavours of concept induction work - approaches based on computational heuristics, and approaches based on a mathematical analysis of performance.

The space of possible concepts is typically very large, and many computational heuristics have been devised to arrive rapidly at interesting parts of this space. Typical of these are the 'information gain' heuristics embodied in algorithms such as ID3 (Quinlan 1986), which builds up a decision tree from its root by putting the largest information gains nearest the root, and in many conceptual clustering methods (e.g. Fisher 1987; Lebowitz 1986). While Mitchell (1980) has shown that any induction method needs to have some form of inductive bias (towards some parts of the concept space rather than others) if it is to do useful learning, the bias built into these heuristics is not always transparent. A drawback of heuristic methods is that they give no simple guarantee of performance; often one must simply try out the method on sets of 'typical' data to see how it performs. For instance, setting the bias towards simple concepts (Occam's Razor) too strongly may lead to over-generalisation. Nevertheless, techniques quite similar to the script intersection method of finding likely rule scripts have been extensively explored.
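As an illustration of the 'information gain' heuristic mentioned above, the following minimal Python sketch scores each attribute by the reduction in label entropy that a split on it gives, which is the quantity ID3 uses to choose the attribute at each node. The toy social data, the attribute names (caller_is_kin, time_of_day) and the function names are hypothetical; they are not taken from Quinlan (1986) or from the script theory.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(examples, labels, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    total = len(labels)
    # Group the labels by the value each example takes for this attribute.
    groups = {}
    for example, label in zip(examples, labels):
        groups.setdefault(example[attribute], []).append(label)
    # Expected entropy remaining after the split, weighted by group size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(examples, labels, attributes):
    """Attribute with the largest information gain (the one placed nearest the root)."""
    return max(attributes, key=lambda a: information_gain(examples, labels, a))


# Hypothetical toy data: does an adult female respond to an infant's scream?
examples = [
    {"caller_is_kin": True, "time_of_day": "am"},
    {"caller_is_kin": True, "time_of_day": "pm"},
    {"caller_is_kin": False, "time_of_day": "am"},
    {"caller_is_kin": False, "time_of_day": "pm"},
]
labels = ["helps", "helps", "ignores", "ignores"]

print(best_attribute(examples, labels, ["caller_is_kin", "time_of_day"]))
```

In this toy case kinship separates the two outcomes completely while time of day carries no information, so the heuristic places kinship at the root. The bias is implicit in the scoring function, which is the lack of transparency noted above.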

Neural nets and other reinforcement learning techniques tend to have a very weak inductive bias (Denker et al 1987), and so to be very slow learners - much slower than the fast social learning seen in primates. Primate evolution has clearly gone a long way to provide the required inductive bias; the problem is to know just what inductive bias has been built in by evolution.

Other approaches to concept learning start not from a plausible heuristic but from a mathematical analysis of the performance required. Much work in this vein uses Valiant's (1984) framework for Probably Approximately Correct learning, or pac-learning. This framework defines a sub-class of the concept space (a restrictive bias) explicitly, and then analyses the number of training examples needed to find (with high probability) a concept which correctly classifies new examples (with high probability). However, the pac-learning framework is a worst-case analysis - guaranteeing performance for any concept in the sub-class, and any probabilistic mix of training examples, and for any consistent learning algorithm (Haussler et al 1994). For this reason, its predicted learning times (its sample complexity) tend to be over-pessimistic (Buntine 1990).
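To give a feel for the kind of sample-complexity statement this framework produces, the sketch below evaluates the standard textbook bound for the simplest case: a finite concept class of size |H| and any learner that outputs a concept consistent with the training examples. In that case m >= (1/epsilon) (ln|H| + ln(1/delta)) examples suffice for error at most epsilon with probability at least 1 - delta. This is an illustrative special case, not the specific bounds derived by Valiant (1984) or Haussler et al (1994); the function name and the example numbers are assumptions.

```python
import math


def pac_sample_bound(num_hypotheses, epsilon, delta):
    """Sufficient number of examples for a consistent learner over a finite class.

    Standard bound: m >= (ln|H| + ln(1/delta)) / epsilon.
    epsilon is the tolerated error rate, delta the tolerated failure probability.
    """
    if num_hypotheses < 1:
        raise ValueError("num_hypotheses must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(num_hypotheses) + math.log(1.0 / delta)) / epsilon)


# Hypothetical numbers: about a million candidate concepts,
# 10% tolerated error, 95% confidence.
print(pac_sample_bound(2 ** 20, epsilon=0.1, delta=0.05))
```

Because the bound must hold for every concept in the class and every distribution of examples, it grows with the size of the whole class, even when the concepts actually encountered are easy to learn. This is one way to see why such estimates tend to be pessimistic.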

One can see intuitively, and it can be shown mathematically (Worden 1995c), that natural selection tends to optimise average performance, rather than worst-case performance; it is average learning performance which determines lifetime survival. A monkey which failed to learn some 'worst case' rule script, but learnt most scripts rather well, would do better than one which handled the worst case at the cost of slower learning of many other scripts. So the measure of performance in pac-learning analyses is not appropriate for this problem.

Average performance is optimised by Bayesian methods, where the inductive bias towards some concept (or rule script) is defined by a prior probability for different sets of rule scripts to hold in the habitat. Evolution effectively builds some moderately realistic model of these priors into the species' brain. To learn the best set of rule scripts means to find the peak of the posterior probability, in the light of the factual scripts. The Bayesian approach to learning is also well represented in the ML literature. One important example is Anderson's (1990) Rational Analysis, which uses an approach similar to this one to analyse several human problem-solving tasks, and classical conditioning, with some success, but which has not been applied to learning of structures as complex as rule scripts. Anderson and Matessa (1991) have applied this rational approach to human categorisation; the success of their comparisons illustrates two points:

(1) The theoretical optimality of the Bayesian approach does in practice lead to good performance - at least as good as the many heuristic approaches which have been used for the same problem.

(2) If it did prove to be necessary to include categorisation directly within social learning, then it would be fairly straightforward to combine Anderson and Matessa's (1991) model of categorisation with this model of scripts, as they are both Bayesian - by defining joint prior probabilities over a larger space.
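The idea of learning as finding the peak of the posterior probability can be sketched in a few lines of Python. The example is a deliberately tiny toy under stated assumptions: two hypothetical candidate rules, each reduced to a single probability that the mother responds to her infant's scream, with made-up priors and observations. It is not the script learning algorithm itself, only an illustration of a maximum a posteriori choice.

```python
import math


def posterior(priors, outcome_probs, observations):
    """Posterior probability of each hypothesis given boolean observations.

    priors:        hypothesis name -> prior probability (assumed to sum to 1)
    outcome_probs: hypothesis name -> P(outcome is True), strictly between 0 and 1
    observations:  list of booleans, one per observed incident
    """
    log_post = {}
    for name, prior in priors.items():
        p = outcome_probs[name]
        log_lik = sum(math.log(p if seen else 1.0 - p) for seen in observations)
        log_post[name] = math.log(prior) + log_lik
    # Normalise in log space to avoid underflow with many observations.
    peak = max(log_post.values())
    norm = sum(math.exp(v - peak) for v in log_post.values())
    return {name: math.exp(v - peak) / norm for name, v in log_post.items()}


def map_hypothesis(priors, outcome_probs, observations):
    """Hypothesis at the peak of the posterior."""
    post = posterior(priors, outcome_probs, observations)
    return max(post, key=post.get)


# Hypothetical priors: a regularity is assumed less likely a priori than none.
priors = {"mother_responds": 0.2, "no_regularity": 0.8}
outcome_probs = {"mother_responds": 0.9, "no_regularity": 0.5}

# Hypothetical social history: the mother responded in 9 of 10 incidents.
observations = [True] * 9 + [False]

print(map_hypothesis(priors, outcome_probs, []))            # before any evidence
print(map_hypothesis(priors, outcome_probs, observations))  # after ten incidents
```

With no evidence the prior decides the answer; after ten incidents the evidence outweighs the prior and the rule is adopted. The prior plays the role of the inductive bias discussed above.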

Haussler et al (1994) have developed a unified framework within which both Bayesian and pac-learning performance bounds can be derived as ends of a spectrum. Although the case they analyse (learning boolean-valued functions from concentrated 'pure' training data) is not as complex as script learning (learning probabilistic scripts from noisy training data 'diluted' in many irrelevant scripts) the results they derive at the Bayesian end of the spectrum are broadly extensible to this case - showing how script learning can be fitted into the general framework of guaranteed-performance learning.

Therefore the script learning mechanism is closely related to a number of existing computational learning theories, both heuristic and mathematically-based; but since previous approaches have not been explicitly designed for optimum fitness in the social learning problem, it is not identical to any of them.

6.4 Neat Theories Versus Piecemeal Theories

The theory proposed here is a tight, concise computational theory; scripts are very simple information structures, and three basic operations on them (intersection, inclusion and unification) support all the learning and inference needed in the theory. However, one might wonder whether such simple 'neat' mechanisms can really be the basis of primate social behaviour, or whether some more piecemeal account is more valid. Perhaps different bits of social intelligence evolved at different times in different ways - a neural net here, a reflex circuit there - without the tight coherent structure I propose. Script theory may seem more of a computer scientist's theory than a biologist's; would not a larger, looser theory be more biologically plausible?

Arguments in support of a small neat theory are:

1. High Performance Demands Tight Design: A large, loose theory would, I believe, discount both the direct evidence that primate social cognition is so flexible and powerful, and the evolutionary argument that 50 million years of intense social competition must have made it so.

Whatever the beginnings of primate social intelligence, evolution has honed it into a faculty with great representational power, fast learning and flexible inference. To be this powerful, social cognition must be coherent and consistent; it should not contradict itself when faced with some new problem, as a loose, ad hoc design might do. Script theory can be shown to be self-consistent, and to be a near-optimal solution to the problem of social cognition.

We have abundant evidence that when really high performance is required, nature chooses simple, precise designs - such as the optical design of the eye, or the protein-encoding in DNA. While it may be hard to discern such simplicity in the primate brain, we should at least think it possible that social intelligence is based on a simple, spare mechanism such as the script theory, which demonstrably gives the high performance (eg fast learning) which we observe in primates.

2. It is understandable and testable: Scripts can be easily envisaged, and their information content understood; the key operations of script intersection and unification are easily done by hand. So incisive tests of the script theory, as discussed in the previous section, are feasible.

In contrast, a theory which relied on an ad hoc collection of neural nets and specific mechanisms, tied together in arbitrary ways, would be much more difficult to envisage and test. It could always be bent to accommodate new data.

3. It is Occam's candidate: Script theory is designed to be the simplest possible cognitive model which can account for the data and, so far, it seems to be descriptively adequate. Occam's Razor requires us to consider simple theories first; so we should try to test this theory and prove it wrong before developing more complex ones. The ways in which script theory fails may be the clues to building a better theory.

4. It may be the origin of human symbol processing: The human mind has a powerful symbol processing capability; the main evidence for this is our remarkable and unique faculty of language. There is evidence that language, like the script theory, uses neat, powerful operations on tree-like information structures (eg syntax trees). While language is clearly much more powerful, it is possible that the basic symbolic script operations of primate social intelligence - as described in this paper - were extended first to the primate theory of mind, then to human symbol processing and language.

You may still feel that such a concise computational theory must somehow belittle the great richness of primate social behaviour. There are three reasons why it does not - first, the script theory is itself capable of generating quite complex learning and behaviour; second, the SIM interacts with other parts of the brain in complex ways to produce the behaviour we see; and third, we need to extend the theory to give a primate 'theory of mind' for higher apes and mankind.

Those are the arguments for favouring a tight, concise theory such as this over any looser, piecemeal theory. I hope readers are persuaded to try using scripts to express their own observations and ideas of primate social behaviour.


References

Anderson, J. R. (1990) The Adaptive character of thought, Lawrence Erlbaum Associates

Anderson, J. R. and M. Matessa (1991) A rational analysis of categorisation, Machine Learning, proceedings of the seventh international workshop (ML90)

Bower, G. H., J. B. Black and T. J. Turner (1979) Scripts in memory for text. Cognitive Psychology 11, 177-220

Bowlby, J. (1969) Attachment and Loss 1: Attachment, Hogarth, London

Buntine, W. (1990) A theory of learning classification rules. PhD thesis, University of Technology, Sydney

Byrne, R.W. and A. Whiten (1988) Machiavellian Intelligence: Social intelligence and the evolution of intellect in monkeys, apes and humans, Clarendon Press

Byrne, R.W. and A. Whiten (1990) Tactical deception in primates: the 1990 database. Primate report 27, 1-101

Byrne, R.W. and A. Whiten (1992) Cognitive evolution in primates: evidence from tactical deception, Man 27, 609-627

Byrne, R. W. (1993) A formal notation to aid analysis of complex behaviour: understanding the tactical deception of primates, Behaviour 127 (3-4) 231 - 246

Charniak, E. and McDermott, D. (1989) Introduction to Artificial Intelligence

Cheney, D.L. and R.M.Seyfarth (1990) How monkeys see the world, University of Chicago Press

Clocksin, W. F. and Mellish C. S. (1979) Programming in Prolog

D'Amato, M. and Colombo, M. (1988) Representation of serial order in monkeys (Cebus Apella) J. Exp. Psychol. Anim. Behav. Proc. 14 131-9

Dasser, V. (1987) A Social Concept in Java Monkeys, Animal Behaviour 36, 225-30

Denker, J., Schwarz, D., Wittner, B., Solla, S., Howard, R., Jackel, L. and Hopfield, J. (1987) Automatic learning, rule extraction and generalisation. Complex Systems 1: 877-922

Dennett, D. C. (1983) The Intentional Stance, Behavioral and Brain Sciences 3, 343-350

de Waal, F. (1982) Chimpanzee politics: power and sex among apes, Johns Hopkins University Press

Dickinson, A. (1980) Contemporary animal learning theory, Cambridge University Press

Fisher, D. (1987) Knowledge acquisition via incremental conceptual clustering, Machine Learning 2:139-172

Fodor, J. A. (1983) The Modularity of Mind, MIT Press, Cambridge, Mass.

Fodor, J. A. (1987) Psychosemantics, MIT Press, Cambridge, Mass.

Fodor J. and Z. Pylyshyn (1988) Connectionism and Cognitive Architecture, Cognition 28, 3-71

Graesser, A. C., S. B. Woll, D. J. Kowalski, and D. A. Smith (1980) Memory for typical and atypical actions in scripted activities. Journal of experimental psychology: human learning and memory 6(5) 503-515

Harcourt, A. H. (1988) Alliances in Contests and Social Intelligence, in Machiavellian Intelligence: Social intelligence and the evolution of intellect in monkeys, apes and humans, ed. Byrne, R.W. and A. Whiten, Clarendon Press

Haussler, D., M. Kearns and R. E. Schapire (1994) Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension, Machine Learning 14, 83-113

Hinde, R. A. (1982) Ethology, Fontana

Holland, J. H., K. J. Holyoak, R. E. Nisbett and P. R. Thagard (1986) Induction: Processes of Inference, Learning and Discovery, MIT Press, Cambridge, Mass.

Humphrey, N. K. (1976) The Social Function of Intellect, in Growing Points in Ethology, ed. P. P. G. Bateson and R. A. Hinde, Cambridge

Lebowitz, M. (1986) Concept learning in a rich domain: generalisation-based memory, in Machine Learning: an artificial intelligence approach, Vol. II, R. S. Michalski, J. G. Carbonell and T. M. Mitchell (eds), Morgan Kaufmann, Los Angeles

Jackendoff, R. A. (1992) Languages of the Mind: Essays on Mental Representation, MIT Press.

Jolly, A. (1966) Lemur Social Behaviour and Primate Intelligence, Science 153, 501-6

Jolly, A. (1985) The Evolution of Primate Behaviour, 2nd edition, Macmillan, New York

Johnson-Laird, P. N. (1983) Mental Models, Cambridge University Press, Cambridge

Judge, P. G. (1982) Redirection of Aggression Based on Kinship in a Captive Group of Pigtail Macaques, International Journal of Primatology, 3, 301

Kummer, H. (1967) Tripartite Relations in Hamadryas Baboons, in Social Communication Among Primates, ed. S. A. Altmann, University of Chicago Press.

Marr, D. H. (1982) Vision, W. H. Freeman

Michalski, R. S. (1986) Understanding the nature of learning: issues and research directions, in Machine Learning: an artificial intelligence approach, Vol. II, R. S. Michalski, J. G. Carbonell and T. M. Mitchell (eds), Morgan Kaufmann, Los Angeles

Mitchell, T. M. (1980) The need for biases in learning generalisations, Rutgers University technical report, reprinted in Readings in Machine Learning (1990), J. W. Shavlik and T. G. Dietterich (eds), Morgan Kaufmann, San Mateo, Calif.

Nelson, K. (1978) How young children represent knowledge of their world in and out of language. In R. S. Siegler (ed) Children's thinking: what develops? Erlbaum, Hillsdale, N.J.

Nelson, K. (1985) Making sense: the acquisition of shared meaning, Academic press, N. Y.

Nelson, K. and J. M. Gruendel (1981) Generalised event representations: basic building blocks of cognitive development. In A. Brown and M. Lamb (eds) Advances in developmental psychology (vol 1) Erlbaum, Hillsdale, N. J.

Premack, D. and Woodruff, G. (1978) Does the Chimpanzee Have a Theory of Mind? Behavioural and Brain Sciences 3, 111-32

Quinlan, J. R. (1986) Induction of decision trees, Machine learning 1, 81-106

Rumelhart, D. E. (1991) The architecture of mind: a connectionist approach, in M. I. Posner, ed., Foundations of Cognitive Science, MIT Press, Cambridge Mass.

Schank, R.C. and R.P.Abelson (1977) Scripts, Plans, Goals and Understanding: an Inquiry into Human Knowledge Structures, Lawrence Erlbaum Associates, Hillsdale, New Jersey

Schank, R. C. (1982) Dynamic memory: a theory of reminding and learning in computers and people, Cambridge University Press, Cambridge, UK.

Schmidt, C. F. and Marsella, S. C. (1991) Planning and plan recognition from a computational point of view, in Natural theories of mind: evolution, development and simulation of everyday mindreading, A. Whiten, ed., Blackwell, Oxford

Shultz, T. R. (1991) From agency to intention: a rule-based, computational approach, in Natural theories of mind: evolution, development and simulation of everyday mindreading, A. Whiten, ed., Blackwell, Oxford

Seyfarth R. M. and Cheney D. L. (1980) The Ontogeny of Vervet Monkey Alarm Calling Behaviour: A Preliminary Report, Z. Tierpsychol. 54, 37-56

Smuts, B. (1985) Sex and friendship in Baboons, Aldine, Chicago

Valiant, L. G. (1984) A theory of the learnable, Communications of the ACM, 27, 1134-1142

Vera, A. and Simon, H. A. (1993) Situated Action: A Symbolic Interpretation, Cognitive Science, 17:49-59

Worden, R.P. (1995a) The Primate Theory of Mind (Paper in draft)

Worden, R.P. (1995b) A Speed Limit for Evolution, Journal of Theoretical Biology 176, 137-152

Worden, R.P. (1995c) An Optimal Yardstick for Cognition (published in the electronic journal Psycoloquy)


Footnotes

[1] That is, without necessarily seeing the level of fear of one's peers, merely predicting their level of fear is enough to alter one's own level of fear.

[2] This example works in just the same way as the "Profumo is Shelley's mother" example of section 4.1.

[3] The bound is approximate, and holds for the average rate over many generations; for details see (Worden 1995b).