The fact that some verbs have alternating argument structures (as in give me the book and give the book to me) has given rise to a fascinating puzzle in language acquisition: children use these alternations productively, and although they sometimes over-generalise, they eventually correct themselves.
The account of alternating verb argument structures in this theory draws heavily on Pinker's (1989) account - in particular, using his analysis of the various verb meaning structures essentially unchanged. It differs in the following ways:
The core of this account is the use of m-scripts to embody the non-linguistic knowledge of possible alternations.
When the same external situation can be construed in two different ways - when it can be represented by two different but related scripts - there is an m-script which describes the relation between the two scripts. This m-script is a function from scripts to scripts, which can be applied by m-unification. Applying it to one construal delivers the other as result, in either direction. Such m-scripts are a part of our general, pre-linguistic, social intelligence, helping us to represent and reason about social situations [4.1, 4.2].
Consider, for instance, the locative alternation, as in the pair of sentences John sprayed paint on the wall and John sprayed the wall with paint, whose meaning scripts are S1 and S2 respectively. S1 denotes that John acted on the paint, making it move onto the wall, and S2 denotes that he changed the state of the wall, using the paint as an instrument. We recognise that these two different meanings can be exchanged, and the verb spray has two interchangeable argument structures.
Levin and Rappaport (1986) have identified 152 locative verbs, some of which have both locative forms, and some do not. Pinker (1989) has shown how these verbs can be grouped into 'narrow conflation classes' based on semantic criteria, so that some classes have the alternation, and others do not.
The m-script A which relates the two construals S1 and S2 is shown in figure 5.3. This m-script embodies the locative alternation in meaning structures; its two branches contain the respective meaning structures as proposed by Pinker, translated into the script notation.
The left-hand branch will match a script such as S1, of the form "?A propels ?B on a path which ends in a spatial relationship ?R with object ?C", while the right-hand branch matches a script such as S2, of the form "?A acts on ?C, using ?B as an instrument, with effect that ?C becomes covered in spatial relationship ?R".
A can be used to transform reversibly between S1 and S2. If A is m-unified with S1, so that the left branch of A matches S1, then the right branch of the result will be S2; writing this as a function, S2 = A(S1). Similarly, m-unification in the reverse direction recovers S1 from S2.
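The reversible mapping between the two construals can be illustrated with a minimal sketch. It assumes that scripts can be approximated by nested dictionaries and that m-unification can be approximated by ordinary pattern matching with `?` variables; the role names (`theme`, `path_end`, `instrument` and so on) and the function names are hypothetical, chosen only for this illustration, and are not the notation of figure 5.3.

```python
# Assumption: scripts are nested dicts; strings beginning with "?" are variables.
# Plain pattern matching stands in here for m-unification.

def match(pattern, script, bindings):
    """Match pattern against script, extending bindings; return None on failure."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings and bindings[pattern] != script:
            return None
        return {**bindings, pattern: script}
    if isinstance(pattern, dict):
        if not isinstance(script, dict) or pattern.keys() != script.keys():
            return None
        for key in pattern:
            bindings = match(pattern[key], script[key], bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == script else None


def fill(pattern, bindings):
    """Instantiate a pattern by substituting its bound variables."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        return bindings[pattern]
    if isinstance(pattern, dict):
        return {key: fill(value, bindings) for key, value in pattern.items()}
    return pattern


# Simplified stand-ins for the two branches of the alternation m-script.
LEFT = {"act": "propel", "agent": "?A", "theme": "?B",
        "path_end": {"relation": "?R", "object": "?C"}}
RIGHT = {"act": "change_state", "agent": "?A", "patient": "?C",
         "instrument": "?B", "result": {"relation": "?R"}}


def apply_alternation(script):
    """Return the other construal, whichever branch the input matches."""
    for source, target in ((LEFT, RIGHT), (RIGHT, LEFT)):
        bindings = match(source, script, {})
        if bindings is not None:
            return fill(target, bindings)
    return None  # the script matches neither branch


# "John sprayed paint on the wall"
construal_motion = {"act": "propel", "agent": "John", "theme": "paint",
                    "path_end": {"relation": "on", "object": "wall"}}
# "John sprayed the wall with paint"
construal_state = apply_alternation(construal_motion)
```

Applying `apply_alternation` to `construal_state` returns `construal_motion` again, which is the sense in which the mapping works in either direction.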
This m-script A which embodies the locative alternation can be learnt by the same mechanism of m-intersection which is used for language learning - by observing a number of occasions in which scripts X1 and X2 are alternative construals of the same situation, combining X1 and X2 below a common script node, and m-intersecting these learning examples together.
Although the mechanism for learning A is very like the language learning mechanism, it is in fact completely non-linguistic. A child with no vocabulary can learn A by observing situations, making the two construals, and m-intersecting the results. It is a piece of general social knowledge, learnt by the mechanisms which evolved to learn such things before language existed. Knowing A has nothing to do with linguistic competence.
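As a rough illustration of how such an m-script might emerge from examples, the sketch below assumes that m-intersection keeps whatever structure two learning examples share and replaces differing participants with variables. That is an assumption made for this illustration only; the full definition of m-intersection is given elsewhere in the theory, and the dictionary encoding, role names and `?V` variable names are hypothetical.

```python
def m_intersect(a, b, table):
    """Generalise two scripts: keep shared structure, abstract the differences.

    `table` maps each pair of differing values to one variable, so the same
    pair of participants gets the same variable wherever it recurs.
    """
    if a == b:
        return a
    if isinstance(a, dict) and isinstance(b, dict) and a.keys() == b.keys():
        return {key: m_intersect(a[key], b[key], table) for key in a}
    pair = (repr(a), repr(b))
    if pair not in table:
        table[pair] = "?V%d" % len(table)
    return table[pair]


# Each learning example joins two construals of one observed situation
# below a common node (here the keys "left" and "right").
example_1 = {
    "left": {"act": "propel", "agent": "John", "theme": "paint", "end": "wall"},
    "right": {"act": "change_state", "agent": "John",
              "patient": "wall", "instrument": "paint"},
}
example_2 = {
    "left": {"act": "propel", "agent": "Mary", "theme": "water", "end": "plants"},
    "right": {"act": "change_state", "agent": "Mary",
              "patient": "plants", "instrument": "water"},
}

learnt = m_intersect(example_1, example_2, {})
```

In `learnt`, the agent of both branches is the same variable, and the theme of the left branch is the same variable as the instrument of the right branch. No words are involved at any point, in keeping with the claim that this learning is non-linguistic.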
Figure 5.3: M-script A which embodies the locative alternation for one of its narrow conflation classes.
It is not yet clear how broad or narrow are the alternation m-scripts which we learn. Do we learn just one broad m-script to do all locative alternations, or do we learn several narrow ones for different types of locative alternation ? I shall assume for simplicity that we learn just one broad alternation m-script, like figure 5.3 above.
What then distinguishes verbs which do alternate argument structures, and verbs which do not ? The answer relates to the question of what elements of meaning are commonly encoded in verbs (and therefore justify coining a new verb). English transitive verbs may encode something about the path of a thing caused to move (as in the left branch of figure 5.3) and they may encode something about the resultant state of a direct object (as in the right branch of figure 5.3).
As Pinker (1989) has noted, the verbs which alternate do both of these at the same time. Spray the wall with paint tells us both about the manner of motion of the paint and the final state of the wall; so spray alternates. Fill the bucket with water tells us about the final state of the bucket, not about the manner of motion of the water; so fill does not alternate. In order to alternate, a verb must say something about its direct object (beyond what is said by some simpler verb such as move or change) in both argument configurations.
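The criterion can be stated as a small sketch. The two boolean features and the frame labels are hypothetical simplifications introduced for this illustration; the theory itself expresses the same information in the verb's meaning scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbMeaning:
    name: str
    specifies_motion: bool     # says how the moved thing travels
    specifies_end_state: bool  # says how the affected object ends up


def licensed_frames(verb):
    """A frame is licensed only if the verb says something about its direct object."""
    frames = []
    if verb.specifies_motion:
        frames.append("V theme onto goal")   # e.g. spray paint on the wall
    if verb.specifies_end_state:
        frames.append("V goal with theme")   # e.g. spray the wall with paint
    return frames


def alternates(verb):
    return len(licensed_frames(verb)) == 2


spray = VerbMeaning("spray", specifies_motion=True, specifies_end_state=True)
fill = VerbMeaning("fill", specifies_motion=False, specifies_end_state=True)
```

Here `alternates(spray)` holds, while `fill` is licensed only in the "V goal with theme" frame.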
With this background, we can understand some empirical findings about children's use of alternating verbs:
(G1) Alternate argument structures for the same verb are learnt early and without confusion: (Pinker 1989) The basic learning process is proposed to be conservative - simply learning each alternate form from examples, by the usual primary learning process [3.8, 3.9]. For many verbs, both alternate forms will be learnt this way.
(G2) Alternations of argument structure are in broad classes, yet respect narrow-range rules : (Levin & Rappaport 1986; Pinker 1989) The broad-range classes include all verbs whose meaning scripts can be alternated (by m-unifying with the broad, construal-changing m-script [4.1]), to make an alternate construal of the same situation. These verbs have two `object-like' arguments (e.g. the wall and the paint, in the examples above). In each construal, one is the direct object and the other is an adjunct.
However, for many verbs in such a class, in one of the construals the verb says nothing interesting about its direct object, beyond what is already said by some simpler verb. Without its adjunct, in this construal it is a rather useless verb. In these cases there is no alternation; for instance, fill the glass with water cannot become fill the water into the glass, because fill the water says nothing more (about the water) than move the water. The narrow conflation classes (Pinker 1989) are defined by this semantic criterion.
In language generation, the adjunct meaning is stripped off first, so children never have occasion to say fill the water into the glass; for the meaning without the adjunct, the simpler verb move would be sufficient [2.3].
There may also be a uniqueness principle which prevents children from storing an m-script for the forbidden alternation, because it has the same meaning as move.
(G3) Children use the alternations productively: There is widespread evidence that children coin new alternations, for the dative, locative and causative alternations (Pinker 1989). Gropen et al (1991) have shown that children will use either alternate form for novel locative verbs, with preferences depending on the detailed meaning.
Two general (broader than single-word) m-scripts are involved:
Thus the children in Gropen et al's (1991) experiment may use (a) an alternation m-script to get the meaning they observe into the required form, where they can then (b) apply the broad `argument structure' m-script to infer that a novel verb keat has an argument structure (which they use in production) for that meaning.
However, it is consistent with the general `bottom-up' nature of learning in this theory that children are rather conservative in their learning and application of these argument structure m-scripts, as is observed.
(G4) Children make over-extensions, but correct them: While children are fairly conservative, nevertheless they do apply the alternations productively, and sometimes over-generalise (particularly the causative) in examples like Daddy giggled me.
In this theory, all such errors are corrected by gathering negative evidence, when children internally generate a part-sentence and compare it with what an adult said [3.10]. In this way, they observe that `Where I would have said Joe giggled her, Mummy said Joe made her giggle'. After gathering enough examples, the child learns that giggle has no causative alternation.
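A minimal sketch of this bookkeeping follows. It assumes only that mismatches between the child's silent prediction and the adult's form are tallied per verb and alternation; the class, its method names and the threshold value are hypothetical, since the text says only that `enough' examples are gathered.

```python
from collections import Counter


class NegativeEvidenceStore:
    """Tallies cases where the child's silent prediction differs from the adult form."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical count of "enough examples"
        self.mismatches = Counter()

    def observe(self, verb, frame, predicted, heard):
        # Only a mismatch counts as negative evidence against (verb, frame).
        if predicted != heard:
            self.mismatches[(verb, frame)] += 1

    def is_blocked(self, verb, frame):
        return self.mismatches[(verb, frame)] >= self.threshold


store = NegativeEvidenceStore(threshold=3)  # 3 is an arbitrary illustrative value
for _ in range(3):
    store.observe("giggle", "causative", "Joe giggled her", "Joe made her giggle")
```

After these observations `store.is_blocked("giggle", "causative")` is true, while a verb with no recorded mismatches remains free to alternate.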
(G5) There are `idiosyncratic' non-alternators, which children learn : In spite of the fact that most alternator verbs can be understood as belonging to Pinker's narrow semantic classes, giving a semantic guide to alternation, nevertheless we expect that over time the frontier between alternation and non-alternation may change. At any time there are bound to be a few `idiosyncratic' verbs whose alternation (or not) seems to have no good semantic basis. Compare `give' and `donate'; or for the causative alternation, `He burped the baby' versus `he cried the baby'. Some alternators are just a matter of usage.
Children need to learn these eccentric non-alternators. They can use the mechanism of negative evidence [3.10] to do this.
(G6) Children learn passives of `action' verbs before others: Cross-linguistic data on acquisition of the passive shows children acquiring it fairly early in many languages: in English, passives appear late in the third year (Bowerman 1973), and they appear earlier in some languages where the passive is prominent - e.g. Sesotho, in the second year (Demuth 1989, 1990). However, it seems that in both these languages, children acquire the passive form for prototypic action verbs (hit, kissed) earlier than for non-action verbs (like or surpass) (Maratsos, Kuczaj and Chalkley 1979; Pinker, Lebeaux and Frost 1987; Demuth 1990).
In Pinker's analysis, which this theory follows, the passive is another (rather broad-range) alternation. It is so broad that it seems it must be learnt in some productive form; we could not learn all passives by hearing each verb in passive form several times. Like the other alternations, it rests on a piece of pre-linguistic knowledge - roughly that `If A does X to B, then B goes into a state of having had X done to him' - which can be expressed as an m-script. Therefore the mechanisms for learning it, for making productive generalisations of the passive, and for unlearning the passives of non-passivisable verbs, are all as described above.
It is likely that children can acquire the pre-linguistic m-script knowledge that `If A does X to B, then B goes into a state of having had X done to him' more easily and earlier for action verbs (where X may make a visible difference to B) than they can for others; this would account for the earlier acquisition of action passives.
In considering the use of pronouns and long-range movement phenomena, we come close to the limits of this theory as currently formulated [2.6]. Its account of these phenomena goes beyond the basic m-unification process of generation and production, requiring extra processes (to resolve pronoun identities, quantifier scopes and gap identities) going on in parallel with m-unification, to construct a sentence meaning. We need to make extra `procedural' assumptions about how people do this, before analysing how they learn to do it.
However, there are still some useful insights, and the challenge should not be ducked. A focus of interest in learnability of anaphora, pronouns and long-range movement has been the status of universal constraints, such as Chomsky's principles A, B and C. If these are universal, does the child need to learn them at all ? What must she learn in order to use them ? It turns out that in this theory, there are reasons to expect universal constraints, with consequences for learning. First we sketch how the phenomena are handled in language use.
First consider anaphora. The m-script for a reflexive pronoun such as himself has a form like a noun m-script - with no trump links, but with a right-hand branch entity describing a male of unknown identity. This identity is represented by a variable such as `?A'. There is an `instruction' slot on the m-script (which becomes an instruction slot in the meaning script) to be activated as soon as the entity is m-unified into any verb meaning scene, as one of the verb's arguments. At that point, the identity `?A' is (by instruction) to be equated with the identity of the agent of the verb meaning scene. So in Charlie hurt himself, immediately after the m-script for hurt is applied, the identity `?A' is set to Charlie's identity - giving the required effect. This implements Principle A.
Third person pronouns such as him are treated in the same way, but now the instruction is that the variable identity must not be unified with the verb agent - or with any other entity immediately beneath the verb meaning script. It must, at some time later, be unified with the identity of some other entity, usually from outside this script. This achieves the effects of Principle B. However, finding the right entity to unify with the variable entity requires some heuristic search of likely `nearby' entities.
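The two instructions can be sketched as follows. This is a simplified illustration, not the implemented program: entities are plain strings, only the verb's agent is treated as a local entity (the text also excludes other entities immediately beneath the verb meaning script), and the `most recent first' search order is a hypothetical stand-in for the heuristic search of nearby entities.

```python
def resolve_patient(agent, patient, discourse):
    """Return the identity of the patient of a verb scene.

    agent:     identity of the verb's agent
    patient:   a name, or the word "himself" / "him"
    discourse: identities mentioned earlier, most recent last (hypothetical)
    """
    if patient == "himself":
        return agent                   # reflexive: equate with the local agent
    if patient == "him":
        for candidate in reversed(discourse):
            if candidate != agent:     # pronoun: never the local agent
                return candidate
        return None                    # no eligible referent found yet
    return patient                     # ordinary noun: identity already known


reflexive = resolve_patient("Charlie", "himself", ["Bob"])
pronoun = resolve_patient("Charlie", "him", ["Bob", "Charlie"])
```

With these inputs `reflexive` is Charlie (as in Charlie hurt himself) and `pronoun` is Bob, the nearest earlier entity that is not the agent.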
Next consider relative pronouns like `who'. In the expression the boy who liked fish, the sequence of m-unifications in understanding is :
(1) the boy who liked fish
(2) [entity 1] who liked fish
(3) [entity 1] who liked [entity 2]
(4) [entity 1] who [event]
(5) [entity 3]
In the step from (3) to (4), the m-script for like (which normally requires both agent and patient arguments) is m-unified without an agent argument - consuming a `gap'.
Relative pronoun m-scripts like who carry an instruction, which says in effect `look for any variable identity inside the [event] script, and unify it with the identity of the initial [entity]'. Thus the boy becomes the agent in the liked scene, as required.
Had the phrase been the boy who I thought liked fish, as before the liked scene has a variable identity for its agent; and because the instruction is to look for any variable identity inside the event scene, even when liked is nested inside thought, its agent is equated to the boy - resolving a long-range dependency.
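The search instruction carried by who can be sketched as a depth-first walk over the event script. The dictionary encoding, the `?gap` marker and the function name are assumptions made for this illustration, and no barriers to the search are modelled.

```python
def bind_gap(event, identity):
    """Depth-first search for the first variable identity; bind it in place.

    Returns True if a gap was found and bound, otherwise False.
    """
    for role, value in event.items():
        if value == "?gap":
            event[role] = identity
            return True
        if isinstance(value, dict) and bind_gap(value, identity):
            return True
    return False


# "the boy who I thought liked fish": the liked scene lacks an agent,
# and is nested inside the thought scene.
liked = {"verb": "like", "agent": "?gap", "patient": "fish"}
thought = {"verb": "think", "agent": "I", "content": liked}

found = bind_gap(thought, "the boy")
```

Because the search descends through nested scenes, the same procedure serves for the boy who liked fish and for the longer-range case.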
Finally consider quantifiers such as every, each, and so on. The meanings of quantified sentences are captured by attaching quantifier slots to script nodes, and equating certain identity slots within these script trees (as bound variables); for instance, in every boy loves his mother, the quantifier `every ?X' appears above a script meaning tree for boy ?X loves ?X's mother. To produce this meaning, a quantifier slot meaning `every' must move up the script tree (from where it was attached to the `boy' entity) and three identity slots ?X must be equated.
Handling any of these phenomena requires significant procedural extensions to the basic m-unification method of language understanding. They all require some kind of heuristic tree-searching procedures to be triggered by specific word m-scripts. Corresponding extensions are required for generation. I have implemented these extensions in the program which handles a fragment of English, and they work - that is, they do a fairly good job on most typical sentences. They do not yet handle all the hard examples studied by linguists.
It is a plausible assumption that a small number of these tree-searching and matching procedures are an innate pre-linguistic part of our social intelligence; representing things another agent does to himself, or searching for unknown entities, or quantifying over individuals, may often be required in social reasoning. Some language universals can be understood as innate limitations of the tree-crawling operations - for instance, that they cannot cross certain barriers (Chomsky 1986) in the script trees. We also assume that instructions to do these procedures can be attached to certain script nodes by word m-scripts.
To make these learnable, we require extensions to the learning theory. Possibly whenever the child finds a variable identity in a meaning script (with little else known about the entity) this automatically triggers a trial of some of the `search and match' procedures; and if one of them works, the instruction to do it is attached to an appropriate node. Then m-intersection of these learning examples will leave an `instruction' slot on some node of the word m-script, as appropriate for `himself' or `him' or `who'.
That is a sketch of an extended learning theory which I have not fully developed. To summarise:
However, even with just this sketch, we can start to understand some learning phenomena:
(H1) Anaphors and pronouns have complementary binding domains: Across many languages, it seems broadly that anaphors (reflexives) must be bound within some domain, and pronouns must be free within the same domain - while the extent of the domain may vary across languages (Kapur et al 1993). The m-script theory enables us to understand two aspects of this phenomenon: its existence and its learnability.
However, we need to assume that there is some way of specifying, in the m-script for a pronoun or reflexive, what script domain is to be searched for possible referents, and what possible referents within it are eligible; and that this specifiable information is also learnable by the m-intersection mechanism (or possibly an extension of it).
(H2) Reflexives are used correctly before pronouns: This has been established by observations in English, Chinese, Dutch, .....
Even though languages differ in their use of reflexives (e.g. in some languages they may be bound to subjects or objects, in other languages only to subjects), in all languages the reflexive must be bound within some domain (= subtree of the meaning script) while pronouns must be bound outside it. Therefore finding the right referent for a reflexive is always an easier task than finding a pronoun referent, and is likely to provide more successful positive learning examples early on, when the child's limited linguistic knowledge makes it hard for him to construct large meaning scripts. So we expect reflexives to be learnt first.
It is even possible that fully learning the pronoun rules requires a knowledge of reflexives, so that Principle B is discovered from negative evidence - that adults use reflexives (where the child may silently generate a pronoun), gathering negative evidence as in [3.9].
(H3) Pronoun reference principles have irregular edges: All attempts to systematise the rules for pronoun reference seem to have an unsatisfactory feel about them - that we will always be discovering some new exceptions and complexities; and that in any case, usages will change over time. It seems wrong that the rules of pronoun usage should have to have neat edges, just because of restrictions in our learning theory.
In this theory, as it is possible to learn from negative evidence, any idiosyncratic fact of usage (e.g. that pronoun X cannot be identified with entities of a certain type in certain circumstances) can in principle be learned in this way - removing an artificial constraint to `neat edges'. We need to assume that the negative evidence mechanism extends to the rather special `referent finding' information attached to pronoun m-scripts.
(H4) Some constraints on long-range movement are known from an early age: For instance, de Villiers has shown how children from age three onwards can recognise that in `When did she say she ripped her dress?' the when may refer to the saying or the ripping; but that in `When did she say how she ripped her dress?' it can refer only to the saying. They do this at an age where, it seems, they cannot possibly have gathered enough evidence about this (rather rare) type of question to have learnt the constraint. However, equally perplexing, in the example above many young children will answer the nested how question rather than the intended when question; this is so in several languages. De Villiers, Roeper and others have explored a host of similar issues in laboratory experiments, revealing fascinating limits to children's performance.
If we frame the procedural, tree-crawling aspects of the m-script theory in certain `obvious' ways, then some of these constraints emerge naturally as things not doable by the mechanisms. However, I have not developed a single, compelling and economical account of how the tree-crawling operations are done, and it is quite possible that other formulations would make `forbidden' operations feasible. This is an area for further investigation.
(H5) Ross's Island Constraints are always obeyed: Two of Ross's (1967) island constraints are that relative clauses and coordinate constructs act as barriers to wh-movement. This forbids sentences such as `What did you see the man that wore _ ?' and `I like fish and chips. What do you like _ and chips ?'. It seems that children almost never violate these constraints in spontaneous speech; and De Villiers and Roeper (1995) have confirmed experimentally that relative clauses act as barriers to wh-movement.
In the m-script theory, when a gap is encountered in a relative clause, the m-script for the verb of the clause is applied with a missing argument entity. This leads to an unidentified entity (with a variable identity slot) in the verb meaning scene. Later, after the m-script for the relative pronoun is applied, another variable identity (from the pronoun meaning script) must be equated with this identity; there is a tree-search of the meaning script below the relative pronoun to find the appropriate entity to equate (so allowing long-range movement).
Although there is not space to give details here, both coordinate constructs and intermediate relative clauses interfere directly with this tree-search, making it infeasible in any simple form; so it is likely that these barrier phenomena (forbidding wh-movement from relative and coordinate constructs) arise from innate, pre-linguistic constraints on the tree-searching process. In this case, we would not expect children ever to violate the barrier constraints.
It is estimated that nearly half the world's population is functionally bilingual, and that most of these are `native speakers' of their two languages (Wolck 1987,1988). So bilingual language acquisition is not a peripheral, esoteric issue; it is a concern for any learning theory.
(I1) Two or three languages can be learnt simultaneously: Many bilinguals are exposed to both languages from birth, and acquire them at normal rates. There seems to be no limit on how different are the two (or sometimes three) languages acquired. This presents difficulties for some theories - for instance, for Principles and Parameters theories - where multiple sets of parameters must be postulated, losing much of the attractiveness of the theory.
In the m-script theory, acquisition of multiple languages is straightforward; no change to the mechanisms is required. Hearing sentences in one language, children learn the m-scripts for words, and so learn the syntax and semantics of the language; similarly for the other language. Apart from the word m-scripts, there is very little to learn. Two separate sets of m-scripts are acquired, and children use social/pragmatic knowledge to decide which m-scripts to use on which occasion. There is no extra difficulty even if the two sets of m-scripts embody very different grammars.
(I2) Children learn overlapping vocabularies for two languages: It was once believed that bilingual children go through an early `one language' stage in which, for instance, they do not learn words with the same meaning in both languages. Following more detailed and careful studies, this is no longer believed; young bilinguals' vocabularies contain pairs of words with the same meaning, from the earliest stages.
As was described above, the m-script theory can account for the lack of synonyms in languages without requiring a uniqueness principle, but there are independent grounds, both theoretical and empirical, for supposing that there is in fact a uniqueness principle at work in language learning.
This raises the question of how bilinguals learn words in two languages with the same meaning. It is likely that they learn very early a socially-conditioned `language context flag' which serves to distinguish the lexical entries for their two languages, and allows them to have distinct storage locations.
There is evidence that bilingual children distinguish their two languages phonetically very early, before they know any words; so throughout the course of word acquisition, this phonetic distinction can define two distinct contexts, and word meanings can then carry an implicit `context flag' which effectively breaks the uniqueness.
(I3) There is no evidence for a single grammatical system early in learning two languages: The Single System hypothesis (Volterra & Taeschner 1978) proposed that bilingual children initially acquire just a single syntactic system along with a single vocabulary. Over the years, evidence has accumulated against this hypothesis, leading instead to the Separate Development Hypothesis (SDH), that bilingual children acquire two syntactic systems from the start. There is currently little or no evidence against the SDH, and much evidence for it (de Houwer 1993).
This theory inevitably predicts the Separate Development Hypothesis. Since the syntax of each language is entirely embodied in the m-scripts of its words [2.3], as the child acquires the words for each language, she must inevitably acquire each syntax as well.
(I4) The course of bilingual language learning is very similar to the course of monolingual learning : All evidence suggests that the bilingual child learns both languages by going through just the same stages as a monolingual child, in the same order. This is so even for children bilingual in a spoken language and a sign language.
This result again is an inevitable (if vanilla-flavoured) prediction of the m-script theory, where the course of language acquisition is defined by the acquisition of word m-scripts. There is no good reason why a bilingual child should acquire the m-scripts for words of one language in any different sequence from that of a single-language child; so the sequence of language development stages is expected to be the same. (In a theory with a separate syntax-learning component, the presence of two syntaxes might well delay syntax relative to word acquisition.)
(I5) Code-switching is done most frequently with nouns: Bilingual children are generally sensitive to the needs of their listeners, using the appropriate language and only mixing the two when a mixed language context is appropriate (e.g. with another bilingual). When doing this, they code-switch for a variety of reasons, often within sentences. It is observed that code-switching occurs most with nouns.
Since (unlike other parts of speech such as verbs) noun m-scripts have zero `valency', requiring no other meaning elements in order to m-unify, they are the most easily substitutable elements in a language. The presence or absence of case inflections is the only impediment to noun-switching, and even that can be tacked on.
(I6) Neighbouring languages do not completely intermix: It is a little-remarked puzzle that neighbouring languages, which may share a long `frontier' of bilinguals, do not just diffuse into one another like two gases in a bottle. Why does this not happen ? Why are French and Dutch still distinct ?
In terms of this theory, if word m-scripts can propagate freely through generations by the m-intersection learning mechanism, why do they not propagate freely across language frontiers and beyond, intermixing the two languages ?
Part of the answer is social and behavioural - that people view their language as part of their identity, and often will not use words from other languages in order to be `one of us' rather than `one of them'. This is reflected in the fact that language change often diffuses outwards from areas of high socio-economic status - in a direction where the `social identity' resistance is least. However, this is probably not the whole story.
The m-scripts of a language evolve so as to align themselves into domains of regularity - sets of words with common syntactic structure, which work well together. They do this because any word which does not conform to the common structure will thereby be less usable [4.4] and will tend to die out. This same selection pressure therefore acts as a kind of `immune system' for a language, rejecting alien words which do not fit into the regular structures; so French word m-scripts cannot freely diffuse into a Dutch-speaking area. As with code switching, this resistance to alien forms is weakest for nouns. The domains of regularity extend over geographic space as well as over the space of meanings.
(I7) Creoles form very rapidly from Pidgins : When people of many different languages are thrown together, they soon develop a pidgin, which is a second language for all of them. The pidgin has very limited syntactic resources, and is very inefficient for communicating complex meanings. However, within as little as one generation, there starts to form a creole which is a very different language, and is the first language for its speakers. It is only learnt by children, while adults stick with the pidgin. The creole has a regular and productive syntax, and gives its speakers much better means to express complex meanings.
What is remarkable is the rapidity and reliability of the formation of creoles. This has been interpreted as evidence for a human biological endowment for language (e.g. Pinker 1994), and more specifically for a particular `bioprogram' form of language (Bickerton ). It may be evidence for such a large conjecture; however, it is also (much more directly) evidence for the rapidity and robustness of the language learning mechanism.
Productive creole forms can only spread rapidly through a population if they can start from small beginnings - if every learner can learn a construct, even at the stage when only a few speakers use it; when the signal may be masked by a lot of noise.
The learning mechanism of this theory evolved to learn useful social regularities in a noisy social milieu [4.1]; fierce social selection pressure over 20 million years has honed it to a highly robust Bayesian form, which can pick out a syntactic regularity even when it is diluted 20:1 or more in spurious noise [3.4]. It is just this rapid, robust learning that would be required to form a creole from a pidgin in a very few generations.
(I8) Creoles use simple analytic forms to express meanings : The remarkable resemblances between creoles in different parts of the world are evidence (in spite of controversy about the extent of influence of the `superstrate' languages) for some common processes going on in their formation. Bickerton (1984) has argued that this common influence is the `bioprogram' - an innate endowment for language which biases towards a certain form of language, seen clearly in the creoles.
In this theory, the interpretation is different, and can be best understood in the `m-script evolution' picture [4.3]. Each language is a population of word m-scripts, and each word m-script replicates through a population of speakers by the learning mechanism. Language change is a process of evolution and selection of word m-scripts, analogous to natural selection of species - but operating much faster. The criteria for `fitness' of a particular word m-script are various, and include:
In just the same way, there are many different criteria for fitness of a species, and the criteria vary over time. When a piece of land has been devastated, certain fast-growing plant species can re-invade it rapidly, and thrive for a few generations until a more typical diverse ecology, with slow and fast-growing species, gradually establishes itself.
A pidgin is the language equivalent of devastated land, inhabited by only the most primitive word m-scripts. The development of a creole depends on which m-scripts can most rapidly invade and colonise this linguistic waste land. I suggest that, while for most mature languages the key factors determining word m-script fitness are (2) and (3), for a creole the dominant fitness criterion is just (1) - ease of learning. Ease of learning determines speed of invasion of the waste land. It is this altered form of m-script competition, rather than any innate bias in the human language endowment, which leads to the distinctive form of creoles.
Creoles are notable for the use of analytic forms which convey meanings in simple, separate chunks. For instance, tense, mood and aspect are conveyed by separate particles, rather than being built into verb morphology. Negation is conveyed by a simple pre-verbal particle.
The speed of learning in this theory is determined by the `six clean examples' heuristic [3.4]. If, for instance, past tense is conveyed by a separate morpheme for every verb, it can in principle be learnt from six good examples of past tense sentences - which will be heard much sooner than six examples of a particular verb in the past tense, as would be required if the past tense were built irregularly into verbs; and much faster than the secondary learning process, which would be required to learn a productive verb inflection for the past tense [3.11].
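The arithmetic behind this comparison can be sketched as follows. This is only an illustration, not part of the theory: the two rates (`past_rate`, the fraction of heard sentences that are usable past-tense examples, and `verb_share`, the share of those that use one particular verb) are hypothetical numbers chosen for the example, and the sketch assumes that every counted sentence is a `clean' example arriving independently of the others.

```python
from fractions import Fraction

EXAMPLES_NEEDED = 6  # the 'six clean examples' heuristic [3.4]


def expected_sentences(rate):
    """Mean number of sentences heard before EXAMPLES_NEEDED examples arrive,
    if each sentence is independently a usable example with probability `rate`
    (the mean of a negative binomial waiting time)."""
    return EXAMPLES_NEEDED / rate


# Hypothetical rates, for illustration only.
past_rate = Fraction(1, 5)    # 1 sentence in 5 is a clean past-tense example
verb_share = Fraction(1, 50)  # 1 past-tense sentence in 50 uses a given verb

# Analytic marking: any past-tense sentence is an example of the particle.
analytic = expected_sentences(past_rate)

# Irregular marking: only past-tense sentences with this verb count.
per_verb = expected_sentences(past_rate * verb_share)

print(analytic, per_verb, per_verb / analytic)
```

Under these assumed rates the separate particle needs about 30 sentences and the irregular form of a single verb about 1500; the ratio is simply the inverse of the verb's share, whatever the overall rate of past-tense sentences.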
Children show a bias to these analytic forms, just because they are more easily learnable (Slobin 1985). The bias towards analytic forms, for easy learning, is even stronger in creoles.
(I9) Tense, Mood and Aspect are expressed in order TMA for creoles, MTA for most languages : Bickerton (1981) described a core Tense/Mood/Aspect system for creoles, which consists of pre-verbal particles marking A=[+nonpunctual] aspect, M=[+irrealis] mood and T=[+anterior] tense. These particles can be used in any combination, but always appear pre-verbally in the order TMA-verb. While individual creoles can have extensions to this scheme, the basic core TMA scheme applies to a high proportion of creoles of diverse origin (Arends et al 1994). However, for most mature languages this is not the preferred order; in a cross-linguistic survey Bybee (1985) found the dominant order to be MTA-verb, with tense closer than mood to the verb stem in 88% of cases. Why do creoles systematically differ from mature languages in this way ?
The difference can be understood as arising from three factors:
The creole irrealis marker seems to denote a binary distinction between an event that has actually happened and one which has not (which may be future, conditional, imagined, etc.). The tense marker denotes tense not relative to the time of speech, but relative to the event under discussion. Therefore it seems likely that, in a sentence meaning structure:
In this case, aspect is most closely entwined in the verb meaning, next comes mood, and last is tense. In a sense, aspect is inside the meaning of a verb, mood is on the surface of the meaning, and tense is outside it. Aspect is like the shape of an object, and tense is like its position relative to something else; shape is clearly a more intrinsic part of the object itself.
For learnability, the semantically closest meanings should be expressed nearest the verb stem. This accounts for the TMA order of the creole verb-modifying particles; but then why do mature languages tend to have MTA order ?
In most languages, the irrealis side of the realis/irrealis distinction is expanded into a wide range of modalities - of desire, intent, potential, obligation, and so on. This range of meanings cannot be expressed by a simple binary slot on the top node of the verb scene. As we saw for English modal verbs [5.6], the modalities are expressed by more complex script structures, exemplified by the m-script for can in figure 5.X. Fred can swim is expressed by a meaning script which can be roughly paraphrased as `Fred possesses an ability for Fred to be in a swimming scene'; Fred should swim can be expressed as `Fred has an obligation ...', and so on.
These more complex scripts can express a wide range of modalities, as mature languages require; but in these meaning scripts, the type of modality (obligation, ability, ...) is detached from the verb meaning scene, and is now closer to the entity node for the agent. Therefore, for ease of learning, complex modalities should be expressed in MTA order - as they are in most mature languages.
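The ordering argument can be made concrete with a small sketch. It assumes, purely for illustration, that each marker can be given a single number for its distance from the verb meaning scene in the meaning script; the numbers below are hypothetical and only their relative order matters. The function `preverbal_order` is not part of the theory; it simply places the semantically closest marker nearest the verb stem.

```python
def preverbal_order(distance_from_verb):
    """Order pre-verbal markers so that the marker semantically closest to
    the verb meaning sits nearest the verb stem (i.e. last before the verb)."""
    markers = sorted(distance_from_verb, key=distance_from_verb.get, reverse=True)
    return "".join(markers) + "-verb"


# Hypothetical distances: aspect inside the verb meaning, binary irrealis
# mood on its surface, tense outside it.
creole = {"A": 0, "M": 1, "T": 2}

# With complex modalities, mood is detached from the verb scene and sits
# nearer the agent's entity node, so it is now further away than tense.
mature = {"A": 0, "T": 2, "M": 3}

print(preverbal_order(creole))  # TMA-verb
print(preverbal_order(mature))  # MTA-verb
```

The same placement rule yields both orders; only the position of mood in the meaning script differs between the two cases.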
This pervasive difference in structure between creoles and mature languages arises from the different range of meaning scripts they can express, combined with a script structure/learnability correlation.
Table 5.1 implies that 15 of the 101 comparisons with data require some extra assumptions, beyond the core theory, to get agreement with the data. The distinction between core theory and extra assumptions is clearly rather arbitrary; if some extra assumption is involved in the account of several pieces of data, and so starts to achieve some economy of hypothesis, then it may be incorporated in the core theory.
However, if for the moment we take the `core theory' to be that described in sections 2-4 of the paper, we can summarise the extra assumptions which are needed to get agreement with various pieces of data:
| | DATA | EXTRA ASSUMPTION |
| B19 | Children over-extend some words in production after having used them correctly | Secondary learning forms new nodes in the subsumption graph storage of word meanings, which make it easier in production to confuse words below that node. |
| B21 | Word meanings change by metaphor and metonymy | Meaning scripts have a `semantic field' slot which defines how they map onto other cognitive models (eg spatial). Changing the value of this slot creates metaphors. |
| B25 | Gender has little to do with sex | We can freely add a `sex' slot to the meaning scripts for inanimate objects. |
| C10 | Languages have regularities captured in X-bar syntax | Meaning scripts are more complex than the examples in this paper, in a way that supports semantic distinctions between different X-bar levels. |
| D6 | In agglutinating languages, inflections are learnt from the outside inwards | Scripts with a time-order constraint `A immediately follows B' are easier to learn (or to use in production) than scripts with a time order constraint `A is somewhere within B'. |
| D7 | In agglutinating languages, children make no errors in ordering affixes | There is a value `not yet defined' (as opposed to `unknown') for inflection-controlled slots on meaning scripts, which prevents reassembly of agglutinated inflections in the wrong order. |
| D9 | There is transient over-regularisation of irregular forms | There is a procedural mechanism whereby a negative rule `form X never occurs' can block the application of a productive rule which gives form X. |
| D10 | English noun plurals and past tense verbs are over-regularised with low frequency | Several possible accounts: e.g. the irregular exception overrides the general rule, except when tense information is not well-enough defined in the child's meaning script to trigger the overrule (when tense is an afterthought). |
| D11 | Specific Language Impairment affects regular morphology | SLI is a deficit in the secondary learning mechanism. |
| E8 | The rare `promise' control structure is learnt more slowly | Several possible accounts: e.g. rarity of the form in adult speech, or that the meaning depends on `theory of mind' variables which cannot be directly observed. |
| F4 | Children often fail to invert subjects and auxiliaries in Wh-questions | Children learn and apply a broad m-script which implies that any statement or question can be freely extracted from forms such as `tell me' (e.g. tell me why you did it). |
| F5 | Complement verbs are sometimes overtensed | When children are learning auxiliaries and other complement-taking verbs, they have not yet mastered the morphology of the complement verbs. |
| H1 | Anaphors and pronouns have complementary binding domains | Each reflexive or pronoun m-script has attached information which defines what parts of the meaning script may be searched for referents, and what referents are eligible; this attached information is learnable. |
| H3 | Pronoun reference principles have irregular edges | As for (H1) above; and the attached `referent-finding' information is learnable from negative evidence. |
| H4 | Some constraints on long-range movement are known from an early age | Tree-crawling procedures for matching wh-gaps are innate, and have intrinsic constraints which forbid certain kinds of long-range movement. |
Table 5.2: extra assumptions used in comparisons with data
Several of these extra assumptions concern the form of the script meaning representation; further systematic investigation of script meaning structures may clarify some of these assumptions. Other assumptions centre on procedural aspects of the theory, particularly for language production.