Words apart: when one word becomes two

As any person working with language knows, the list of words from which we build our sentences is not a fixed one but rather is in a state of constant flux. Words (or lexemes in linguists’ terminology) are constantly being borrowed (such as ‘sauté’ from French), coined (such as ‘brexit’ from a blend of ‘Britain’ and ‘exit’) or lost (such as ‘asunder’, a synonym for ‘apart’). These happen all the time. However, two more logical processes exist that can alter the total number of entries in the dictionary of our language. Occasionally, lexemes may also merge, if two or more become one; or split, if one becomes two. These more exotic cases constitute a window into the fascinating workings of the grammar. In this blog I will present the story of one of these splitting events. It involves the Spanish verb saber, from Latin sapiō.

The verb’s original meaning must have been ‘taste’ in the sense of ‘having a certain flavour’, as in the sentence “Marmite tastes awful”. At some point it also began to be used figuratively to mean ‘come to know something’, not only by means of the sense of taste but also for knowledge arrived at by means of other senses. It is interesting that in the Germanic languages it seems that it was sight rather than taste that was traditionally used in the same way. Consider, for instance, the common use, in English, of the verb ‘see’ in contexts like “I see what you mean”, where it is interchangeable with ‘know’. Whether the choice of source verb can be explained by the differences between traditional Mediterranean and Anglo-Saxon cuisines I’d rather not suggest for fear of deportation.

In any case, what must once have been a figurative use of the verb ‘taste’ became at some point the default way of expressing ‘know’. These are the two main senses of saber in contemporary Spanish and of its equivalents in most other Romance languages. The question I ask here is: do speakers of Spanish today categorize this as one word with two meanings? Or do they feel they are two different words that just happen to sound the same? There may be a way to tell.

In Spanish, unlike in English, a verb can take dozens of different forms. The shape of a verb changes depending on who is doing the action of the verb, whether the action is a fact or a wish etc. Thus, for example, speakers of Spanish say yo sé ‘I know’ but tú sabes ‘you know’. They also use one form (so-called ‘indicative’) in sentences like yo veo que tú sabes inglés ‘I see that you know English’ but a different form (so-called ‘subjunctive’) in yo espero que tú sepas inglés ‘I hope that you know English’. The Real Academia Española, the prescriptive authority in the Spanish language, has ruled that, because saber is a single verb, it should have the same forms (sé, sabes etc.) regardless of its particular sense. Speakers, however, have trouble abiding by this rule, which is probably the reason why the need for a rule was felt in the first place. My native speaker intuition, and that of other speakers of Spanish, is that the verb may have a different form depending on its sense:

Forms of Spanish saber (forms starting with sab– in light gray, forms starting with sep– in dark gray)

The most obvious explanation for why this change could happen is that, when the two main senses of saber drifted sufficiently away from each other, speakers ceased to make the generalization that they were part of the same lexeme. When this happened, the necessity to have the same forms for the two meanings of saber disappeared. But why sepo?

Because cannibalism is on the wane (also in Spain) we hardly ever speak about how people taste. As a result, the first and second person forms of saber (e.g. irregular sé) are only ever encountered by speakers under their meaning ‘know’. Because of this, they do not count as evidence for language users’ deduction of the full array of forms of saber. This meant that the first and second person forms of saber₂ ‘taste’, when needed (imagine someone saying sepo salado ‘I taste salty’ after coming out of the sea), had to be formed on the fly on evidence exclusive to its sense ‘taste’ (i.e. third persons and impersonal forms):

Because of the evidence available to speakers, at first sight it might seem strange that this ‘fill-in-the-gaps’ exercise did not result in the apparently more regular 1SG indicative form sabo. This would have resulted in a straightforward indicative vs subjunctive distinction in the stem. The chosen form, however, makes more sense when one observes the patterns of alternation present in other Spanish verbs:

Verbs that have a difference in the stem in the third person forms between indicative and subjunctive (cab- vs quep- or ca- vs caig-) overwhelmingly use the form of the subjunctive also in the formation of the first person singular indicative. This is a quirk of many Spanish verbs. It appears that, by sheer force of numbers, the pattern is spotted by native speakers and occasionally extended to other verbs which, like saber, look like they could well belong in this class.
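
For readers who like to see a pattern spelled out, the analogy can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it handles -er verbs whose third person singular indicative ends in -e and whose subjunctive ends in -a, and the function names are made up for this example. It is not a claim about how speakers actually compute forms.

```python
def first_sg_by_analogy(subj_3sg: str) -> str:
    """Build the 1SG indicative on the subjunctive stem (the caber/caer pattern)."""
    if not subj_3sg.endswith("a"):
        raise ValueError("expected a 3SG subjunctive form ending in -a")
    return subj_3sg[:-1] + "o"


def first_sg_regular(ind_3sg: str) -> str:
    """Build the 1SG indicative on the indicative stem (the 'regular' guess)."""
    if not ind_3sg.endswith("e"):
        raise ValueError("expected a 3SG indicative form ending in -e")
    return ind_3sg[:-1] + "o"


# Third person singular forms: (indicative, subjunctive).
VERBS = {
    "caber": ("cabe", "quepa"),
    "caer": ("cae", "caiga"),
    "saber 'taste'": ("sabe", "sepa"),
}

for verb, (ind_3sg, subj_3sg) in VERBS.items():
    print(verb, "| analogy:", first_sg_by_analogy(subj_3sg),
          "| regular guess:", first_sg_regular(ind_3sg))
```

For caber and caer the analogical route gives the forms speakers actually use (quepo, caigo), while the 'regular' route does not. Applied to saber ‘taste’, the same route gives sepo rather than sabo.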

In this way, the tiny change from sé to sepo allows us linguists to see that patterns like those of caber and caer are part of the grammatical knowledge of speakers and are not simply learnt by heart for each verb. In addition, it gives us crucial evidence to conclude that, today, there are in Spanish not one but two different verbs whose infinitive form is saber. Much like the T-Rex in Jurassic Park, we linguists can sometimes only see some things when they ‘move’.

A daggy blog post

One of the most ubiquitously Australian words is the word dag. A word known and loved by basically any Aussie.

Fig. 1 – The classic daggy-dad weekend look

It’s a light-hearted insult referring to someone who is unfashionable or socially awkward, basically a bit of a dork (Fig 1). But like most insults in Australian English it’s also used affectionately as a term of endearment (what does this say about how Australians relate to each other?). Typically in these cases, it is used to convey a sense of regard for the unashamedness of the dag in question – to express the lovable quality of someone who is just oblivious to certain social norms.

Fig. 2 – An actual dag.

However, the origins of this word are anything but loveable. According to the popular story (which appears to be supported by Macquarie Dictionary and The Australian National Dictionary), this usage is derived from the older meaning (attested in 1891) of the word dag to refer to a matted clot of wool and dung that forms around a sheep’s bum (Fig 2). By 1967 something ‘dirty and unkempt’ could be referred to as daggy and by the 1980s we were using the word for Figure 2 for the unfashionable yet loveable dad in Figure 1.

As an Australian, I am proud of my dagginess and am pleased to know our daggy little word has a pretty gross origin.



A plurality of plurals

Of all the world’s languages, English is the most widely learnt by adults. Although Mandarin Chinese has the highest number of speakers overall, owing to the huge size of China’s population, second-language speakers of English outnumber those of Mandarin by more than three to one.

Considering that the majority of English speakers learn the language in adulthood, when our brains have lost much of their early plasticity, it’s just as well that some aspects of English grammar are pretty simple compared to other languages. Take for example the way we express the plural. With only a small number of exceptions, we make plurals by adding a suffix –s to the singular. The pronunciation differs depending on the last sound of the word it attaches to – compare the ‘z’ sound at the end of dogs to the ‘s’ sound at the end of cats, and the ‘iz’ at the end of horses – but it varies in a consistently predictable way, which makes it easy to guess the plural of an English noun, even if you’ve never heard it before.
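
The predictability of the rule can be illustrated with a short Python sketch. It is a simplification: the input is the final sound of the noun written as a rough phonemic symbol (not its spelling), the sets of sounds are my own approximate lists, and the function name is invented for this example.

```python
# Final sounds are rough phonemic symbols, not spellings.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}   # as in horse, rose, bush, church, judge
VOICELESS = {"p", "t", "k", "f", "θ"}          # as in cup, cat, book, cliff, moth


def plural_allomorph(final_sound: str) -> str:
    """Return the pronunciation of the plural suffix after the given final sound."""
    if final_sound in SIBILANTS:
        return "iz"   # horses
    if final_sound in VOICELESS:
        return "s"    # cats
    return "z"        # dogs, and nouns ending in any other voiced sound or a vowel


for noun, final_sound in [("cat", "t"), ("dog", "g"), ("horse", "s")]:
    print(noun, "->", plural_allomorph(final_sound))
```

Three short checks on a single sound are enough to cover the regular cases, which is what makes the plural of an unfamiliar English noun so easy to guess.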

That’s not the case in every language. Learners of Greek, for example, have to remember about seven common ways of making plurals. Sometimes knowing the final sounds of a noun and its gender makes it possible to predict the plural, but other times learners simply have to memorise what kind of plural a noun has: for example pateras ‘father’ and loukoumas ‘doughnut’ both have masculine gender and singulars ending in –as, but in Standard Greek their plurals are pateres and loukoumathes respectively.

This is similar to how English used to work. Old English had three very common plural suffixes, -as, -an and –a, as well as a number of less common types of plural (some of these survive marginally in a few high-frequency words, including vowel alternations like tooth~teeth and zero-plurals like deer). The modern –s plural descends from the suffix –as, which originally was used only for a certain group of masculine nouns like stān, ‘stone’ (English lost gender in nouns, too, but that’s a subject for another blog post).

How did the -s plural overtake these competitors to become so overwhelmingly predominant in English? Partly it was because of changes to the sounds of Old English as it evolved into Middle English. Unstressed vowels in the last syllables of words, which included most of the suffixes which expressed the gender, number and case of nouns, coalesced into a single indistinct vowel known as ‘schwa’ (written <ə>, and pronounced like the ‘uh’ sound at the beginning of annoying). Moreover, final –m came to be pronounced identically to –n. This caused confusion between singulars and plurals: for example, Old English guman ‘to a man’ and gumum ‘to men’ both came to be pronounced as gumən in Middle English. It also caused confusion between two of the most common noun classes, the Old English an-plurals and the a-plurals. As a result they merged into a single class, with -e in the singular and -en in the plural.
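
The merger of guman and gumum can be replayed with a toy Python sketch of the two sound changes just described. It assumes, for simplicity, that the word is written one letter per sound, has at least two syllables, and has an unstressed final syllable; the function name and vowel list are invented for this example.

```python
VOWELS = set("aeiouy")


def reduce_ending(word: str) -> str:
    """Apply final -m > -n, then reduce the vowel of the last syllable to schwa.

    Assumes a word of two or more syllables whose final syllable is unstressed.
    """
    if word.endswith("m"):
        word = word[:-1] + "n"
    # Walk backwards to the last vowel and replace it with schwa.
    for i in range(len(word) - 1, -1, -1):
        if word[i] in VOWELS:
            return word[:i] + "ə" + word[i + 1:]
    return word


for form in ["guman", "gumum"]:
    print(form, "->", reduce_ending(form))
```

Both forms come out as gumən, so the ending no longer tells singular from plural.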

This left Middle English with two main types of plural, one with –en and one with –(e)s. Although a couple of the former type remain to this day (oxen and children), the suffix –es was gradually generalised until it applied to almost all nouns, starting in the North of England and gradually moving South.

A similar kind of mass generalisation of a single strategy for expressing a grammatical distinction is often seen in the final stages of language death, as a community of speakers transitions from a minority to a majority language as their mother tongue. Nancy Dorian has spent almost 50 years documenting the dying East Sutherland dialect of Scots Gaelic as it is supplanted by English in three remote fishing villages in the Scottish Highlands. In one study the Gaelic speakers were divided into fluent speakers and ‘semi-speakers’, who used English as their first language and Gaelic as a second language. Dorian found that the semi-speakers tended to overgeneralise the plural suffix –an, applying it to words for which fluent speakers would have used one of another ten inherited strategies for expressing plural number, such as changing the final consonant of the word (e.g. phũ:nth ‘pound’, phũnčh ‘pounds’), or altering its vowel (e.g. makh ‘son’, mikh ‘sons’).

But why should the last throes of a dying language bear any resemblance to the evolution of a thriving language like English? A possible link lies in second language acquisition by adults. At the same time as these changes were taking place, English was undergoing intense contact with Scandinavian settlers who spoke Old Norse. During the same period English shows many signs of Old Norse influence. In addition to many very common words like take and skirt (which originally had a meaning identical to its native English cognate shirt), English borrowed several grammatical features of Scandinavian languages, such as the suffix –s seen in third person singular present verbs like ‘she blogs’ (the inherited suffix ended in –th, as in ‘she bloggeth’), and the pronouns they, their and them, which replaced earlier hīe, heora and heom. Like the extension of the plural in –s, these innovations appeared earliest in Northern dialects of English, where settlements of Old Norse speakers were concentrated, and gradually percolated South during the 11th to 15th centuries.

It’s possible that English grammar was simplified in some respects as a consequence of what the linguist Peter Trudgill has memorably called “the lousy language-learning abilities of the human adult”. Research on second-language acquisition confirms what many of us might suspect from everyday experience, that adult learners struggle with inflection (the expression of grammatical categories like ‘plural’ within words) and prefer overgeneralising a few rules rather than learning many different ways of doing the same thing. In this respect, Old Norse speakers in Medieval England would have found themselves in a similar situation to semi-speakers of East Sutherland Gaelic – when confronted with a number of different ways of expressing plural number, it is hard to remember for each noun which kind of plural it has, but simple to apply a single rule for all nouns. After all, much of the complexity of languages is unnecessary for communication: we can still understand children when they make mistakes like foots or bringed.




One of the peculiar habits that strikes a foreign visitor to a restaurant in the US (alongside heaps of ice in your drink and the sneaky habit of leaving sales tax off the price) is that menus typically list main course dishes as ‘entrees’. But ‘entrée’ is a French word that means something like ‘entry’ or ‘entrance’, so shouldn’t it be the same thing as appetizer or hors-d’oeuvre or starter? It seems like some fundamental misunderstanding of the term, like the rectangular chocolate ‘croissants’ shamelessly marketed outside of France.

