5.4 Morphology

The issue of closed-class morphemes and over-regularisation has been a test case for theories of language acquisition. The key evidence under debate is the fact that children learn regular and irregular morphology (for instance, of the English past tense) in a `U-shaped curve' - first learning inflections of both regular and irregular verbs correctly, then learning the productive rule and over-regularising the irregular forms, then finally (and slowly) removing errors in the irregular forms.

One difficulty has been that the over-regularisation error can only be corrected by the use of negative evidence - to establish that forms like `hitted' or `goed' cannot be used. The consensus is now that explicit negative evidence (e.g. correction by adults) cannot account for the correction (Marcus 1993); and many theories are reluctant to use implicit negative evidence - for fear of ruling out other rare, but legal, forms. The Bayesian theory of learning can use implicit negative evidence [3.6, 3.10], so can in principle account for the correction of over-regularisation - but as we shall see, some extra assumptions are needed to do so.

(D1) Individual word morphology is learnt before productive morphology rules: It seems clear across all languages that individual word morphologies are known before any productive rule is known.

The morphology of individual words is learnt by the primary learning process [3.9], which can start as soon as word learning begins. Productive regularities, which allow children to coin regular inflections they have never heard before, are learnt by the secondary process, which takes the individual word morphology m-scripts as its input [3.11]. This secondary process must be able to segment the inflection from the root, which it does by the normal segmentation learning process [3.7, 3.8].

Therefore in this theory, productive morphology cannot be learnt before a set of (typically six or more) individual word morphologies have been learnt.

(D2) Productive regular inflections can be learnt even if regular forms are not in a majority : In the English past tense forms which children are initially exposed to, irregular forms are a majority, yet the regular past tense rule is learnt.

The Bayesian learning mechanism is very robust, able to learn a regularity even if the learning examples which obey it are in a minority [3.3, 3.4]. Therefore even when the majority of a child's known verbs have irregular past forms, the regular inflection can still be learnt (by the secondary learning process) from the minority; the rule will be learnt provided it gives a more economical account of the data as a whole - that is, if more than about six verbs conform to it.

Then, when the child is asked to produce a past form of a new verb (where he has heard only the present form), the regular inflection rule, being the only known way to produce a past form of this verb, will be used.

(D3) We learn which dimensions of meaning are encoded by inflectional morphemes : An inflection can convey meaning in several different dimensions, such as gender, case, number, tense, or modality - along any one or several of these together. The task of learning which dimensions are signified, and how, is complicated by syncretism; an inflectional system may describe several dimensions at once, but particular inflections in that system may collapse one or more dimensions.

The learning examples for the primary learning process (to learn individual inflected words) have all possible dimensions of meaning represented within them; sex, number, semantic role etc. will all appear in the examples (as values of some slot on some node) because the child may put anything she observes about the scene into the right-hand branch of her learning example [3.7, 3.8]. If an inflection has no connection with some dimension of meaning (some slot), these learning examples will have randomly distributed, contradictory values of the slot, and so script intersection will remove the slot [2.1]. If, however, the inflection denotes some value of a slot, that value will survive the script intersection, to become part of the meaning of the inflected form.

So if a particular inflection denotes one, two or more dimensions of meaning, those pieces of meaning will all be learnt as part of individual word inflections, and so (by the secondary process) will then be learnt in the productive inflection rule. The learning of multi-dimensional inflectional paradigms (as proposed in Pinker's (1984) theory) emerges naturally from the general word learning process.
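The way script intersection filters out irrelevant dimensions of meaning can be illustrated with a minimal sketch. It assumes, purely for illustration, that each learning example is flattened to a dictionary of slot values (real m-scripts are tree structures [2.1]); the slot names, the scenes and the function name intersect_slots are hypothetical.

```python
def intersect_slots(examples):
    """Keep only the slot/value pairs shared by every learning example."""
    common = dict(examples[0])
    for example in examples[1:]:
        # A slot survives only if this example has the same value for it.
        common = {slot: value for slot, value in common.items()
                  if example.get(slot) == value}
    return common


# Hypothetical scenes observed while hearing nouns with a plural inflection.
examples = [
    {"number": "plural", "sex": "female", "role": "agent"},
    {"number": "plural", "sex": "male", "role": "patient"},
    {"number": "plural", "sex": "male", "role": "agent"},
]

print(intersect_slots(examples))  # {'number': 'plural'}
```

The slots with contradictory values (sex, role) drop out, while the one value the inflection reliably denotes survives.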

(D4) The speed of learning of inflectional morphology varies between languages: The age at which children first apply inflections productively to new stems varies across languages. Turkish children make productive inflections by the age of two, whereas in other languages such as English (Kim et al 1994), children make productive inflections only from age three.

There is probably no single explanation of this fact in the theory, because it arises from a variety of factors - phonological prominence, regularity, communicative need, agglutinative versus synthetic and so on. One of these factors is analysed in (D5) below.

Slobin (1982) cites 12 possible reasons why the inflectional morphology of Turkish may be expected to be learnt particularly early - that Turkish inflectional morphemes are (1) postposed, (2) syllabic, (3) stressed, (4) obligatory, (5) tied to the noun, (6) consistent with the verb-final typology of Turkish, (7) ordered plural-possessive-case, (8) nonsynthetic in their mapping of function into form, (9) expressing only grammatical roles, not pragmatic information, (10) exceptionless, (11) applied consistently to all pro-forms, and (12) distinct (no homonyms).

It is consistent with the learning model of this theory that at least the factors (2), (3), (4), (5), (8), (9), (10), (11) and (12) should all lead to faster, more reliable learning; these effects could be modelled quantitatively, although I have not done so. So we have a basis for understanding the different speeds of learning productive inflectional morphology.

(D5) Productive inflections are learnt faster in agglutinating languages than synthetic languages : For instance, Slobin (1982) finds that affixes are learnt faster in Turkish (agglutinating) than in Serbo-Croat (fusional). In English, the more features encoded in a grammatical morpheme, the slower the child comes to use it (Brown 1973; de Villiers and de Villiers 1973, Pinker 1981).

Consider an agglutinating language in which there are typically N inflectional affixes to a stem, each with two possible forms; compare this with a synthetic language, having just one affix with 2^N possible forms. Of the order of six individual words (complete with inflections) are needed for the secondary learning process to learn one productive regular inflection rule [3.11].

Then for the agglutinating language, about 12 word m-scripts must be known to drive the learning of one regular inflection; but for the synthetic language, about 6·2^(N+1) word m-scripts must be known, which may be much larger than 12. Thus in the agglutinating language, learning of regular inflections can start sooner, because the required number of learning examples is gathered sooner.

(D6) In agglutinating languages, inflections are learnt from the outside inwards: Inflectional and derivational morphemes nearest the word stem are the last to be used productively. There is a fairly natural explanation of this in the theory - that the child first learns words with several morphemes attached, by the primary learning process [3.9], then m-intersects several of these together to strip off the outer inflectional morpheme by the secondary learning process [3.11], and then by further application of the secondary process (a tertiary process?) strips off the next inner morpheme, and so on.

However, while this is a plausible account within the theory, it is not unique; alternatives which do not make the right prediction are also possible. It seems that m-intersection is powerful enough to extract some `inner' inflectional morpheme straight away by secondary learning - so why does it not sometimes do so? In the verbs of Semitic languages, there are only `inner' inflections between the consonants, and these must be learnt somehow.

I suspect that the true account involves a mixture of factors, including:

Why do languages change in this way? There are many possible reasons. For instance, if tightly bound elements of meaning are not separated, this makes for more reliable communication in the presence of noise (see also (C13)).

(D7) In agglutinating languages, children make no errors in ordering affixes: (Slobin 1985) This looks at first like a simple enough prediction of the learning theory; if affixes are learnt from the outside in, as above, there seems to be no mechanism available in the theory to mislearn their order. However, it is not that simple, and it seems that an extra assumption (albeit a fairly innocent one) is needed to account for children's lack of affix-ordering errors in production.

Consider a hypothetical language in which the `inner' affix defines number and the outer affix defines case. The m-script for a particular case affix must pass across the number slot unmodified from its left branch to its right branch, so that the number information from the inner affix is not lost. It does this by a variable slot value [number:?N] on the entity node in left and right branches, which is easily induced from the learning examples. The m-script for the inner `number' affix needs no such variable slot value to pass across the case, as the case slot in the meaning will have been stripped off by the case affix.

In generation, these m-scripts can be used in two different ways: either they can be `pre-unified' with the noun stem to form an m-script for the full inflected noun before use, or each affix m-script and the stem m-script can be m-unified with the SMS on the fly. In either case, to prevent the two affix m-scripts from being used in the wrong order, the case affix m-scripts must have the constraint of requiring some value of the number slot to be defined, although it does not matter which value it is. As it stands, a variable identity such as ?N does not impose this constraint; we need to postulate a special kind of variable value which does. With this extra assumption, the affix-ordering constraint can be learnt and reliably applied.
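The difference between an ordinary variable and the postulated `must be defined' variable can be shown in a toy sketch. It assumes that an entity meaning is a flat dictionary of slots and that a case affix simply adds a case slot; the function name apply_case_affix, the flag require_bound_number and the slot values are hypothetical, and m-unification proper is not modelled.

```python
def apply_case_affix(entity, require_bound_number):
    """Add a case slot, passing any number slot through unchanged."""
    if require_bound_number and "number" not in entity:
        # The special variable rejects an entity whose number is undefined.
        return None
    return {**entity, "case": "accusative"}


stem = {"id": "dog"}                        # bare noun stem, no number yet
with_number = {**stem, "number": "plural"}  # stem plus the inner number affix

# An ordinary variable ?N also matches an undefined number slot,
# so the case affix could attach straight to the stem (wrong order).
print(apply_case_affix(stem, require_bound_number=False))

# The special variable blocks that derivation but allows the right order.
print(apply_case_affix(stem, require_bound_number=True))
print(apply_case_affix(with_number, require_bound_number=True))
```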

(D8) Irregular forms are initially learnt correctly : It is well established that before the dip of the U-shaped curve, irregular forms are used correctly (Marcus et al 1992; Marcus 1995).

Before any regular morphology rules are learnt, irregular past forms such as hit or went are not different in any way from regular forms such as counted or waited; so they are all equally learnt correctly by the primary process, direct from primary learning examples [3.9].

(D9) There is transient over-regularisation of irregular forms: The dip in the U-shaped curve is now well established for many types of over-regularisation of irregular forms, such as English past tenses (Marcus et al 1992) and English noun plurals (Marcus 1995).

As soon as the child has learnt around 6 examples of the regular inflection, she may start to learn the productive rule by the secondary mechanism [3.11]; but she will subject the rule to a test of statistical significance before acquiring it, so possibly rather more than 6 examples are needed.

A regular inflectional rule, once learnt, may be applied to all words of a class (e.g. to all verbs, to form the past tense); at this point, the child knows two ways to produce a past form of an irregular verb: (a) Use the previously learnt irregular form m-script (e.g. went), or (b) m-unify together the stem m-script for go with the inflection m-script for -ed to produce an m-script goed, which is then used for generation in the usual way.

Possible reasons why the over-regularised form is only used sporadically are discussed under (D10) below.

To correct the error, the child uses the primary learning process to gather negative evidence [3.10]. That is, when learning any words from a sentence where an adult said went, the child may do both partial sentence understanding of words heard, and partial generation from the inferred meaning script (and in so doing, silently generate goed); so on these occasions the child may observe that “I would have said goed where an adult said went”. These observations constitute negative evidence for goed, and so can be used by the Bayesian learning process to learn a negative rule - that “the regular past inflection of go never happens” (a negative rule is one with rule probability near zero).
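How implicit negative evidence can drive a rule probability towards zero may be illustrated with a standard Beta-Bernoulli update. This is only a sketch: the uniform prior and the function name posterior_mean are assumptions made here, and the Bayesian analysis of [3.6] may differ in detail. Each `opportunity' is an occasion on which the child would have said goed; the form is never actually heard.

```python
def posterior_mean(times_form_heard, opportunities):
    """Posterior mean of a Bernoulli rate under a uniform Beta(1, 1) prior."""
    return (times_form_heard + 1) / (opportunities + 2)


# The over-regularised form is never heard, however many chances there are.
for opportunities in (0, 5, 20, 100):
    estimate = posterior_mean(0, opportunities)
    print(opportunities, round(estimate, 3))
```

Under these assumptions the estimated probability falls only as 1/(n + 2), so many opportunities are needed before it is close to zero.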

If it is so easy to gather negative evidence, why does over-regularisation take a long time to eradicate ? There are three reasons:

  1. The Bayesian analysis shows that it may require a large number of learning examples to gain confidence in the exception rule [3.6].
  2. Over-regularisation is rare in the first place. Therefore a child does not `silently' generate over-regularisations very often; she does not use all her opportunities to gather negative evidence.
  3. As above, rate of learning depends on the rate of errors. If x denotes degree of belief in an over-regularisation, its rate of change (through unlearning) is governed by an equation of the form dx/dt = -λx, where λ is a positive rate constant; so x is expected to undergo slow exponential decay.
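The decay equation in point 3 has the closed-form solution x(t) = x(0) exp(-rate × t), where rate is the constant in that equation. The sketch below evaluates it; the parameter values are illustrative assumptions only, since the theory does not fix them.

```python
import math


def belief(t, x0=1.0, rate=0.1):
    """Closed-form solution of dx/dt = -rate * x with x(0) = x0."""
    return x0 * math.exp(-rate * t)


def half_life(rate):
    """Time taken for the degree of belief to fall to half its value."""
    return math.log(2) / rate


# A small rate constant (an assumed value) gives a long half-life.
print(half_life(0.1))
print(belief(half_life(0.1)))
```

Because the rate constant is tied to the rate of errors, a child who rarely over-regularises has a small constant and hence a long half-life for the erroneous form.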

Therefore this theory can broadly account for the slow correction of over-regularisations.

However, the theory does not yet fully define the mechanism whereby a negative rule (the rule that a certain form does not occur, in places where the child can apparently generate it by a productive rule) actually prevents the productive rule from being applied. To have a detailed account of how the prevention happens, we need to build a more detailed procedural model of the language production process. This could be done, but would involve extra assumptions about how language production works.

(D10) English noun plurals and past tense verbs are over-regularised with low frequency: Both for English past tenses and English noun plurals it is now known (Marcus et al 1992; Marcus 1995) that the over-regularisation is made only at low frequencies (less than 10%) even in the trough of the U-shaped curve.

The m-script theory does not give any unique reason for this low frequency of over-regularisation, but is not inconsistent with it; several accounts of this fact can be devised. The probabilistic nature of the theory (both for learning and for production) does not force it to make a categorical prediction that one or the other form will be used exclusively at any stage. Some possible components of an explanation are:

None of these accounts are particularly crisp or compelling, but the finding of sporadic over-regularisation certainly does not contradict a prediction of the theory.

(D11) Specific Language Impairment affects regular morphology: Although Specific Language Impairment (SLI) is probably not a homogeneous disorder, because of a lack of clear identificatory criteria (Fletcher & Ingham 1993), nevertheless it does seem to show a strong tendency for impaired morphology, often in the absence of other clear deficits. This is so particularly in the cases of familial SLI studied by Gopnik et al.

Because of the particular link with morphology, there is a possible interpretation in this theory, that SLI is an impairment of the secondary learning process [3.11] which is used to learn productive inflectional morphology. Although the primary and secondary processes are not entirely distinct - being points along a spectrum - it seems quite possible for the secondary end of the spectrum to be preferentially impaired by a genetic disorder.

Evidence on SLI in several languages suggests that phonological salience of morphology also plays a role - for instance, Italian SLI children show rather few deficits of morphology (Leonard et al 1987, 1992); but the `secondary learning' interpretation of SLI cuts across this distinction in interesting ways:

An `SLI = impaired secondary learning' picture seems broadly consistent with the rather confusing data. It also makes an interesting prediction - that SLI children should make fewer over-productivity errors (of several types, not just of morphology) than other children, when these follow from secondary learning - for instance, in alternating verb argument structures.

(D12) High use of closed-class morphemes at 20 months leads to low use at 28 months: Bates et al. (1988) found in their longitudinal study of 27 English-speaking children that the children who made highest use of closed-class morphemes at 20 months tended to be those who used them least at 28 months. This puzzling fact illustrates the difficulty of interpreting simple measures of child language.

The interpretation given by Bates et al (1988) for their findings can be couched in the m-script theory as follows: most of the closed-class forms produced at 20 months are unanalysed - part of larger rote-learned forms. These are narrow, context-specific m-scripts, learnt by the primary learning process [3.9]. Children who use more of these forms tend to analyse their input less into individual word m-scripts, and so are less ready for the secondary learning process [3.11] which must precede the productive use of closed-class morphemes seen at 28 months; thus there is an inverse correlation between use of closed-class morphemes at 20 and 28 months.

(D13) Ergative and accusative case markers are initially under-extended : In languages which use case marking to distinguish the two main arguments of a simple transitive verb, the case marking scheme is either nominative/accusative (in which case the accusative, rather than the nominative, is marked) or ergative/absolutive (where the ergative subject is marked). For intransitive verbs, the subject (being either absolutive or nominative) is not marked in either type of language. The markings of ergative or accusative achieve economy, in each case only marking one out of the three commonest verb arguments.

Children learn these markings early and fairly reliably, and do not significantly over-extend them (e.g. in Mayan Quiche: Pye 1979, 1980). However, they do tend to under-extend the use of both the accusative and ergative markers, initially using them correctly only in simple `manipulative action scenes' (Slobin 1985) where the `who does what to what' is very clear; only later extending them to other transitive verbs such as `see', or `call-out'.

This initial under-extension has been observed in Russian accusative markers (Gvozdev 1949), and ergative markers in Kaluli (Schieffelin 1985). In Kaluli, the ergative marker is more reliably used with past-tense verbs, and is more likely to be omitted for future or negated verbs.

These facts have a fairly simple and direct explanation in the m-script theory. The key roles in transitive and intransitive verb scenes are marked by two distinct slots: an `actor' slot which defines who initiated the action (= the subject for both transitive and intransitive verbs) and a `changed state' slot which defines which entity undergoes a change of state (object for a transitive verb, subject for an intransitive verb). Nominative/accusative markings are linked in m-scripts to the `actor' slot, while ergative/absolutive case markings are linked just to the `changed state' slot [4.6].

Since these case markings correlate directly to simple, easily observable aspects of the meaning structure, they can be learnt early and reliably [3.4] (whether for individual nouns by the primary learning process [3.7], or productively by the secondary process [3.11]); and once learnt, there is no reason to expect any over-extension of either the accusative marker (= `did not initiate the action') or the ergative marker (= `did not undergo a change of state').

However, we would expect these markings to be used reliably only in cases where it is semantically clear who initiates the action and what undergoes a change of state. For manipulative action verbs, it is clear, especially in the past tense (where the state change has definitely happened); but for a verb like `see' it is less clear, for a future verb it is less clear, and for a negated verb it is particularly unclear. In all these cases, therefore, the child is likely to take the safe course of omitting any marker, as is observed.

So this form of under-extension arises not (as some others do) from the child initially learning a meaning which is too narrow and specific. The meaning is learnt correctly from the start, but the conditions for applying it are not always obvious.

5.5 Complementation and Control

Many English verbs such as want, tell, try and seem take a sentential complement, in which the subject is omitted. The rules for filling in the missing subject need to be acquired, and children do this rapidly and reliably; they start using complement-taking verbs quite soon after their first verbs, and make very few errors of control of the missing subject (Pinker 1984).

Learning of control relations occurs straightforwardly in the m-script theory, as a part of the normal m-intersection primary word learning process [3.9]. To see this, we need to see how control relations are represented and used in the m-scripts for complement-taking verbs.

The m-script for a typical control verb, wants, is shown in figure 5.1. In its left branch, it requires an animate entity scene (the person who wants something), the sound wants, and an event scene - the event that he wants to happen, which is the complement. In Charlie wants to eat cake this event scene will have been derived by m-unifying the complement verb to eat cake without a subject.

The right-hand meaning branch of wants represents the wanter (the entity defined by a trump link from the entity in the left branch) having a desire that a certain scene take place (the scene defined by a second trump link, from the complement event scene in the left branch; so this event is inserted into the full meaning). The control relation specifies who is the agent in this desired scene. The identity of this agent is given by a variable identity ?A, which is the same variable as the identity of the wanter. So in Charlie wants to eat cake, the m-unification automatically sets the identity ?A to Charlie, in all the places where it occurs - ensuring that it is Charlie who does the eating in the desired scene.

How is the wants m-script learned ? In the primary learning process, the child collects a small number of learning examples - where she hears the words and infers the intended meaning by other means - and m-intersects them together [3.8]. In each learning example, the identity of the agent in the desired scene is the same as the identity of the wanter. The first step in m-intersection is script intersection; this automatically detects equal slot values at different places in a script, and represents them by a shared variable value [2.1]. So the m-intersection automatically detects the control relation and embodies it in the m-script.
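The detection of a shared identity by script intersection can be sketched as follows. The sketch assumes that each learning example is flattened to a dictionary keyed by slot path; the path names, the example scenes and the function name shared_identity_slots are hypothetical, and the tree structure of real scripts [2.1] is ignored.

```python
from itertools import combinations


def shared_identity_slots(examples):
    """Return pairs of slot paths whose values are equal in every example."""
    paths = sorted(examples[0])
    return [(a, b) for a, b in combinations(paths, 2)
            if all(example[a] == example[b] for example in examples)]


# Hypothetical learning examples for the same-subject use of wants.
examples = [
    {"wanter.id": "Charlie", "desired.agent.id": "Charlie", "desired.action": "eat"},
    {"wanter.id": "Lucy", "desired.agent.id": "Lucy", "desired.action": "sleep"},
    {"wanter.id": "Linus", "desired.agent.id": "Linus", "desired.action": "eat"},
]

print(shared_identity_slots(examples))  # [('desired.agent.id', 'wanter.id')]
```

The one pair of slots that agrees in every example is the pair that a shared variable such as ?A would link, which is the control relation.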

For Charlie wants Lucy to go away, there is a second, distinct wants m-script which consumes two entity scenes and the complement event scene in its left branch - and similarly, by a shared variable identity, ensures that it is Lucy who goes away, in the desired scene. These two m-scripts are learnt and used entirely separately.

(E1) Children acquire some complement-taking verbs early: Empirically, complement-taking verbs are not the first verbs learned, but they follow on soon after; several different types of complement-taking verb (subject-equi, object-equi, and raising to subject) are acquired within a few weeks of each other, typically at an age range 18 - 24 months (Pinker 1984; Tomasello 1992). They then form an important part of the child's language ability.

A complement-taking verb such as wants cannot be acquired until the child has enough knowledge of a few other verbs to apply each of them, in comprehension of the complement, without a subject. As soon as she has, the complement-taking verbs can be acquired straightforwardly by the primary learning process, as described above. Thus complement-taking verbs cannot be amongst the first verbs learned, but they can follow very soon after.

(E2) Children make few errors of control with complement-taking verbs: There is widespread evidence that (with the exception of tough-movement and the exceptional verb promise; Chomsky 1969) for most complement-taking verbs, children make very few errors in assigning the subject of the controlled verb. They seem to learn this correctly from the very start of learning complement-taking verbs (Pinker 1984; Sherman & Lust 1993).

The equality of identity slots, between the wanter and the agent in the wanted scene, will hold in every learning example which the child gathers when learning the m-script for wants. The script intersection learning process discovers this equality of identities [2.1], and encodes it as a shared variable identity value [id:?A] in both the `possessing' scene and the subordinate `what is desired' scene [3.8]. The shared variable identity slot embodies the control relation: whenever the word m-script for wants is used in generating or understanding a sentence, this control relation will be enforced by m-unification [2.3].

Learning a shared identity is a very basic and necessary part of the primate social learning mechanism [4.1], and is robustly built into the m-intersection word learning mechanism; so it is not something we would expect children to get wrong. Once learned, this shared variable will ensure that the child only uses the word where the same identity - e.g. of wanter and of agent in the wanted scene - is intended. So the m-script theory predicts that such errors of control are very rare, as is found empirically.

(E3) `Tough-movement' complements are acquired more slowly: Carol Chomsky (1969) reported that children do make errors of control in tough-movement verbs, such as Donald is hard to see, at much later ages - up to age 10 they may think that Donald is doing the seeing. However, the control relations for these verbs may involve long-range dependencies (as in Donald is hard to fool yourself you like), so cannot be learned by the direct mechanism outlined above. Since there can be arbitrary levels of embedding, there is no fixed place on the meaning script where the shared identity may occur; so a shared variable identity cannot be learned by script intersection of learning examples.

Learning to fix long-range dependencies requires different learning mechanisms, which generally require more semantic and pragmatic knowledge, and take much longer to master.

(E4) Children make mistakes in inflection of embedded verbs: Pinker (1984) has summarised evidence that, while children are using control relations very reliably, at the same time they are making frequent errors of inflection of the embedded verb - either using the bare form, or sometimes over-tensing.

Children can start to learn complement-taking verbs at a very early stage, when they have learnt just a few other verbs by the primary learning process [3.9] - possibly before they have started on the secondary learning process which will give them productive control over verb inflections [3.11]. In this case, it is not surprising that a patchy knowledge of verb inflections prevents them from learning which complement-taking verbs require which inflections - and they learn m-scripts which are very permissive about complement verb inflection.

(E5) Verbs acquired initially without complementisers take some time to acquire them: Bloom et al (1984) observed that early-acquired complement-taking verbs - which require the complementiser to but were acquired without it - carry on being used without the complementiser, some time after other later-acquired verbs are used correctly with to. The complementiser is not retro-fitted to all verbs which require it at the same time.

For the early complement-taking verbs, which are acquired before to is recognised either as a preposition or a complementiser, it seems that the sound to is either ignored or incorporated in the verb sound, as in wanna . Recognition of to as a preposition may help its recognition elsewhere as a significant sound. Once to is recognised in the sound stream, we would expect it to be correctly acquired in the m-scripts of new complement-taking verbs which require it, as is observed.

Its slower incorporation in the early complement-takers can be accounted for much like the over-regularisation of morphology. For each `old' verb, when he starts to hear to, the child can learn a new m-script which has the complementiser - but initially regards the old and new m-scripts as alternatives. It is only later, by accumulating negative evidence [3.10] of the form `where I would have used want, an adult said want to', that he learns that the form without complementiser is not used. This may require a lot of learning examples, which (as for inflectional over-regularisation) may take some time to gather [3.6].

(E6) Verbs with optional complementisers are correctly learned: A challenge for theories which use logical, all-or-nothing learning `choice points' is to frame the choice criteria so they can both (a) correct early learning errors or over-generalisations, and also (b) learn that for some verbs, such as help, the complementiser to is optional. It is hard to formulate a categorical learning rule which manages both these at the same time (Pinker 1984). For the Bayesian learning theory, this presents no difficulty, as it accumulates evidence incrementally without discrete choice points:

Optional complementisers: Examples of help both with and without the complementiser to will be heard; so two distinct m-scripts (with and without the complementiser) are learnt by the usual primary mechanism [3.9]. As both forms continue to be heard, no negative evidence will accumulate to eliminate either of them [3.6]. Both forms will persist.

Obligatory complementisers: If some verb with an obligatory complementiser is erroneously learnt without it (e.g. before the child has segmented the complementiser), then as soon as the child is sensitive to the complementiser, negative evidence against the no-complementiser form will accumulate [3.10], leading to its unlearning.

(E7) The `wanna' contraction is not made over a gap: Crain (1991) has shown that while children readily contract want to as wanna in questions such as (a) `What do you want to_ eat _ ?', they do not contract it in questions of the form (b) `Who do you want _ to brush your hair?', because the trace of the missing subject in the complement (shown as _ ) separates want from to.

As described above, there are two distinct m-scripts for want. One, for same-subject wishes, matches partially understood sentence fragments of the form <[entity] want to [event]>; while the second, for other-subject wishes, matches <[entity] want [entity 2] to [event]>.

The same-subject want is used in (a); since it is learned from examples in which want and to are contiguous, they can be learned as an undivided, contractible sound wanna. The other-subject want is used in (b); but since it is learned from examples in which want and to are separated by the other subject, they must be learned uncontracted; so they are much more likely to be said uncontracted.
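The contrast between the two frames can be made concrete in a small sketch. It assumes that each stored frame is a flat list of tokens and that contraction is available only when want and to are adjacent in the stored frame; the constant names and the function can_contract are hypothetical.

```python
# Two hypothetical stored frames for want, as flat token templates.
SAME_SUBJECT = ["[entity]", "want", "to", "[event]"]
OTHER_SUBJECT = ["[entity]", "want", "[entity 2]", "to", "[event]"]


def can_contract(template):
    """Contraction needs 'want' immediately followed by 'to' in the frame."""
    return any(first == "want" and second == "to"
               for first, second in zip(template, template[1:]))


print(can_contract(SAME_SUBJECT))   # True
print(can_contract(OTHER_SUBJECT))  # False
```

In question (b) the other-subject frame is the one in use, so even though the second entity is not pronounced next to want, the stored frame still keeps want and to apart.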

(E8) The rare `promise' control structure is learnt more slowly: This phenomenon does not have any neat and crisp explanation in the theory, but nevertheless can be understood. A clear analysis of the various issues has been given by Pinker (1984). It seems that in this case, the true account may be rather multi-faceted and messy, including factors such as:

In summary, the key facts of acquisition of control relations are accounted for directly and naturally in the m-script theory.

5.6 Auxiliaries

The English auxiliary system is complex and irregular, a product of fairly recent language change. It is a challenge for any learning theory to explain how the child learns to use modal auxiliaries (such as can, could, will, must, and should), perfect have, progressive be, and passive be - including the complex and irregular rules about how they can, and cannot, be used together and in questions. As Baker (1979) and Pinker (1984) have noted, any simple syntactic theory which makes broad generalisations tends to over-generate many illegal forms.

In this theory, the account of `what is to be learned' about auxiliaries is very close to Pinker's (1984) account - except that while Pinker retains separate phrase structure rules, in this theory the phrase structure rules are built into the verb m-scripts.

Pinker proposes that the allowed uses of each auxiliary verb are learnt in `paradigms' with two key dimensions - the verb morphology (which can have four values: infinitive, finite, perfect participle, and progressive participle) and sentence modality (which also has four possible values: neutral, inverted, negative and emphatic).

Each auxiliary selects rigidly for just one morphology of the verb in its complement (infinitive for can, perfect participle for have, etc.); but the auxiliaries also differ in their own possible verb morphologies, as summarised by Pinker (1984):

               Infinitive   Perf-part   Prog-part   Finite
Can            -            -           -           can
Have (perf)    have         -           -           has
Be (progr)     be           been        -           is
Be (passive)   be           been        being       is
Walk           walk         walked      walking     walks

The entries in this table (if learnable) would account for many of the regularities about which combinations of auxiliaries are allowed, such as John must have left, or forbidden, such as John must can leave.
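How gaps in the table rule out illegal combinations can be sketched in Python. This is a minimal illustration, not Pinker's or this theory's mechanism. It makes several assumptions: must is given the same entries as can (it is not in the table), progressive be is taken to select the progressive participle (covered only by `etc.' in the text), passive be is omitted because the passive participle has no column here, and the names FORMS, SELECTS and realise are hypothetical.

```python
# Morphological forms of each verb; None marks a gap in the paradigm.
# 'must' is an assumed entry, patterned on 'can'.
FORMS = {
    "can":       {"inf": None,   "perf": None,     "prog": None,      "fin": "can"},
    "must":      {"inf": None,   "perf": None,     "prog": None,      "fin": "must"},
    "have_perf": {"inf": "have", "perf": None,     "prog": None,      "fin": "has"},
    "be_prog":   {"inf": "be",   "perf": "been",   "prog": None,      "fin": "is"},
    "walk":      {"inf": "walk", "perf": "walked", "prog": "walking", "fin": "walks"},
}

# The single complement morphology that each auxiliary selects.
SELECTS = {"can": "inf", "must": "inf", "have_perf": "perf", "be_prog": "prog"}


def realise(chain):
    """Spell out a chain of verbs (the first one finite).

    Returns the list of word forms, or None if some required form
    is a gap in the table or a non-auxiliary is given a complement.
    """
    words = []
    required = "fin"
    for verb in chain:
        if required is None:          # previous verb takes no verbal complement
            return None
        form = FORMS[verb][required]
        if form is None:              # gap in the paradigm: combination blocked
            return None
        words.append(form)
        required = SELECTS.get(verb)  # None for an ordinary verb such as walk
    return words


print(realise(["must", "have_perf", "walk"]))  # ['must', 'have', 'walked']
print(realise(["must", "can", "walk"]))        # None: can has no infinitive
```

The forbidden sequence fails simply because the infinitive cell for can is empty; no separate prohibition is needed.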

Auxiliaries and normal verbs also differ in the sentence modalities they can take part in. For the `finite' verb morphology these are (again from Pinker 1984):

           Neutral   Inverted   Negative   Emphatic
Can        can       can        can't      can
Walk       walks     -          -          -
Do         -         does       doesn't    does
Need       -         need       need       -
Better     better    -          better     better
Used to    used to   -          -          -

The irregularity of the tables shows the irregularity of the auxiliary system. But if these tables, and their entries, can be learnt without over-generalisation, then auxiliaries can be used correctly.

To see how the table entries are learnt in the m-script theory, consider the word m-script for a typical English auxiliary, can, shown in figure 5.2.

Figure 5.2: m-script for `can'

The left branch of this structure defines the syntax of the word can, which requires three input scenes. In left-to-right order, scene (1) describes an entity (the subject), scene (2) has just the sound `can', and scene (3) describes an event (the meaning of the verb complement).

The right branch describes the meaning of can - that the subject of can has an ability to partake in an event scene (the legs of the man) as agent. The shared variable identity `?A' embodies the control relation - that the entity appearing before `can' in the sentence (scene (1)) is the same as the owner of the ability, and the same as the agent of the event.

In understanding the emphatic sentence `My friend can see you', the sequence of events is:

For generation, the m-unifications would go in reverse order, right to left. Therefore we can find all the information required for Pinker's paradigms in the m-script for can (emphatic):

  1. The slot (vfo:inf) on scene 3 means the subordinate verb (see above) must be infinitive.
  2. The slot (vfo:indic) on scene 4 in the right branch means that can itself delivers a finite indicative result.
  3. The slot (smo:emph) means that the result is emphatic.

Therefore the can m-script defines one table entry in the multi-dimensional paradigm table.
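The reading of a paradigm entry off the slots can be sketched as follows. Since figure 5.2 is not reproduced here, the nested-dictionary layout is only an assumed approximation of the m-script. The slot names vfo and smo and the identity `?A' come from the text, while the key names, the placement of (smo:emph) on scene 4, and the function paradigm_entry are illustrative assumptions.

```python
# Assumed, simplified stand-in for the emphatic 'can' m-script.
CAN_EMPHATIC = {
    "left": [  # the three input scenes, in left-to-right order
        {"scene": 1, "class": "entity", "id": "?A"},
        {"scene": 2, "sound": "can"},
        {"scene": 3, "class": "event", "vfo": "inf", "agent": "?A"},
    ],
    # Result scene; putting smo here is an assumption of this sketch.
    "right": {"scene": 4, "vfo": "indic", "smo": "emph", "owner": "?A"},
}


def paradigm_entry(script):
    """Collect the three paradigm-relevant slots from an m-script dictionary."""
    complement = next(s for s in script["left"] if s.get("class") == "event")
    result = script["right"]
    return {
        "complement_morphology": complement["vfo"],
        "own_morphology": result["vfo"],
        "modality": result["smo"],
    }


print(paradigm_entry(CAN_EMPHATIC))
```

Each separately learnt m-script would contribute one such entry, which is why no table needs to be stored apart from the scripts themselves.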

Auxiliary m-scripts like that for can are learnt by the primary learning mechanism described in section 3 - collecting learning examples (SMS) in which emphatic can is the only unknown word, and m-intersecting them together. The full meaning and constraints of emphatic can - including the paradigm slots above, and the trump links - are projected out by this process.

Note that the m-intersection also learns the control relation for the auxiliary (that if X can do something, it is X doing it) just as it does for verbs like want; the shared identity `?A' appears on the three nodes of the m-script, because in every learning example, the same individual appears on those three nodes.
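The way shared slots and shared identities fall out of intersecting examples can be illustrated with a toy Python sketch. It is a deliberately flat stand-in for m-intersection (which is defined elsewhere in this work over script structures), not an implementation of it. The node names, the example data and the function intersect_examples are all hypothetical.

```python
from itertools import combinations


def intersect_examples(examples):
    """Toy stand-in for m-intersection over {node: {slot: value}} examples.

    Returns the slot/value pairs common to every example, and the pairs
    of nodes that carry the same individual in every example.
    """
    nodes = set(examples[0])
    for ex in examples[1:]:
        nodes &= set(ex)

    common = {}
    for node in sorted(nodes):
        shared = set(examples[0][node].items())
        for ex in examples[1:]:
            shared &= set(ex[node].items())
        # The particular individual is not part of the learnt script.
        common[node] = {k: v for k, v in shared if k != "id"}

    with_id = [n for n in sorted(nodes) if all("id" in ex[n] for ex in examples)]
    same_identity = [
        (a, b) for a, b in combinations(with_id, 2)
        if all(ex[a]["id"] == ex[b]["id"] for ex in examples)
    ]
    return common, same_identity


# Two invented learning examples for emphatic 'can'.
EXAMPLES = [
    {"subject": {"id": "friend", "class": "entity"},
     "owner": {"id": "friend", "class": "entity"},
     "agent": {"id": "friend", "class": "entity"},
     "complement": {"class": "event", "vfo": "inf", "verb": "see"}},
    {"subject": {"id": "dog", "class": "entity"},
     "owner": {"id": "dog", "class": "entity"},
     "agent": {"id": "dog", "class": "entity"},
     "complement": {"class": "event", "vfo": "inf", "verb": "swim"}},
]

common, same_identity = intersect_examples(EXAMPLES)
print(common["complement"])
print(same_identity)
```

In this sketch the particular verb (see, swim) drops out because it varies between examples, while the infinitive slot and the three-way identity survive because they hold in every example.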

The interrogative form of can is learnt as a separate m-script, with different constituent order on the left branch. So three m-scripts are learnt for can - neutral, interrogative, and emphatic.

Therefore the English auxiliary system is acquired by learning a large-ish number of independent m-scripts - filling out just the entries in the paradigms which are heard in adult speech.

(F1) Highly irregular English auxiliaries are learnt reliably: Children learn auxiliary verbs fairly early, and do so reliably (Pinker 1984). They obey Pinker's selection rules, summarised above, from the start. These rules can be encapsulated in m-scripts for each English auxiliary, with the verb morphology and sentence modality reflected as semantic slots in the meaning scripts.

The m-scripts for auxiliaries can be learnt by the usual primary learning mechanism, from (of the order of) six examples each [3.1]. It is not hard for a child to gather six learning examples for each auxiliary verb; from then on, she can use it reliably and effectively in comprehension or production.

(F2) Over-generalisation of auxiliaries does not occur: Children rarely seem to make auxiliary errors of the `John must can go' variety which would follow from over-generalisations - either amongst auxiliaries, or between auxiliaries and other verbs.

One may ask - having correctly learnt the auxiliaries by primary learning, does the theory predict that children might form broader generalisations by secondary learning, which could lead to over-generation? In the Bayesian learning theory, any further generalisation (learnt by the secondary process) must pass a test of statistical significance [3.2, 3.11]. While I have not worked out the numbers for English auxiliaries, it seems almost certain that any broader generalisation would not pass this test, given the small number of auxiliaries and the absence of any `true' regularity across them (apart from the similarity of all modal auxiliaries, which might well be learned).

(F3) Errors of Auxiliary control almost never occur: For instance, children are never observed to say `he can see' meaning `he can be seen'. In this theory, the control relations of auxiliaries are learnt just like the control relations of other complement-taking verbs; if two entity nodes have the same identity in all learning examples, then the equal identity of those entity nodes is assured in the learned result, discovered by m-intersection [2.2]. Learning shared identity values is a core aspect of the learning theory, needed to learn any verb; so it is not surprising that it works reliably for auxiliary control relations.

(F4) Children often fail to invert subjects and auxiliaries in Wh-questions: Compared with virtually non-existent errors of control, the significantly-occurring failures of inversion in wh-questions such as `how you did that?' call for explanation. I have not found any neat or decisive account, but nevertheless the error can be understood. Compare the sentences:

(a) He told her he was hungry.

(b) He was hungry.

(c) Tell me who did it.

(d) Who did it?

(e) Tell me where you hid it.

(f) *Where you hid it?

From the comparison of (a) and (b), or (c) and (d), it may seem to the child that any sentence or question can be freely nested inside matrix verbs like `He told her', `I know', or `Tell me'; this knowledge can be embodied in a simple m-script, learned by m-intersection. Such an m-script, given (e), would then license (f). In this account, non-inversion of auxiliaries is just a special case of non-inverted questions (like (f), which involves no auxiliaries). To correct this error, the usual slow process of accumulating negative evidence [3.10] is required.

(F5) Complement verbs are sometimes overtensed: Pinker (1984) notes a significant level of over-tensing errors in complement verbs after auxiliaries, as in Can you broke those? - particularly with the auxiliary do.

Again, there is no single neat or satisfying account of this effect, but there are several possible contributory factors. One of these, as in (E4) and following Pinker (1984), is that when the child is first learning auxiliaries, he has not yet mastered regular verb morphology - and so cannot reliably distinguish between tensed and infinitive forms of all verbs. This means that constraints such as `can always requires an infinitive verb' may not be learned reliably at this stage - and even if they are, the finite and infinitive forms of some verbs may be confused. This agrees with the finding that overtensing errors are more common for irregular verbs.

5.7 Alternating Verb Argument Structures