Reindeer = rein + deer?

In linguists’ jargon, a ‘folk etymology’ refers to a change that brings a word’s form closer to some easily analyzable meaning. A textbook example is the transformation of the word asparagus into sparrowgrass in certain dialects of English.

Although the concept is clear in theory, it is not always easy to decide whether a given case counts as ‘folk etymology’. One which has incited heated coffee-time discussion in our department is the word reindeer. The word comes ultimately from Old Norse hreindyri, composed of hreinn ‘reindeer’ and dyri ‘animal’. In present-day English, some native speakers conceive of the word reindeer as composed of two meaningful parts: rein + deer. This is something which, in the Christian tradition at least, does make a lot of sense. Given that the most prominent role of reindeer in the West is to serve as Santa’s means of transport, an allusion to ‘reins’ is unsurprising. This makes the hypothesis of folk etymology plausible.

When one explores the issue further, however, things are not that clear. The equivalent words in other Germanic languages are often the same (e.g. German Rentier, Dutch rendier, Danish rensdyr etc.) even though the element ren does not refer to the same thing as in English. However, unlike in English, another way of referring to Rudolf is indeed possible in some of these languages that omits the element ‘deer’ altogether: German Ren, Swedish ren, Icelandic hreinn, etc.

Another thing that may be relevant is the fact that the word ‘deer’ has narrowed its meaning in English to refer just to a member of the Cervidae family and not to any living creature. Other Germanic languages have preserved the original meaning ‘animal’ for this word (e.g. German Tier, Swedish djur).

Since reindeer straightforwardly descends from hreindyri, it may seem that, despite the change in the meaning of the component words, we have no reason to believe that the word was altered by folk etymology at any point. However, the story is not that simple. Words that contained the diphthong /ei/ in Old Norse do not always appear with the same vowel in English. Contrast, for example, ‘bait’ [from Norse beita] and ‘hail’ [from heill] with ‘bleak’ [from bleikr] and ‘weak’ [from veikr]. An orthographic reflection of the same fluctuation can be seen in the different pronunciation of the digraph ‘ei’ in words like ‘receive’ and ‘Keith’ vs ‘vein’ and ‘weight’. It is, thus, not impossible that the preexistence of the word rein in (Middle) English tipped the balance towards the current pronunciation of reindeer over an alternative one like “reendeer”. Also, had the word not been analyzed by native speakers as a compound of rein+deer, it is not unthinkable that the vowels may have become shorter in current English (consider the case of breakfast, etymologically descending from break + fast).

So, is folk etymology applicable to reindeer? The dispute rages on. Some of us don’t think that folk etymology is necessary to explain the fate of reindeer. That is, the easiest explanation (in William of Occam’s sense) may be to say that the word was borrowed and merely continued its overall meaning and pronunciation in an unrevolutionary way.

Others are not so sure. The availability of “fake” etymologies like rein+deer (or even rain+deer before widespread literacy) seems “too obvious” for native speakers to ignore. The suspicion of ‘folk etymology’ might be aroused by the presence of a few mild coincidences such as the “right” vowel /ei/ instead of /i:/, the fact that the term was borrowed as reindeer rather than just rein as in some other languages [e.g. Spanish reno] or by the semantic drift of deer exactly towards the kind of animal that a reindeer actually is. These are all factors that seem to conspire towards the analyzability of the word in present-day English but which would have to be put down to coincidence if they just happened for no particular reason and independently of each other. Even if no actual change had been implemented in the pronunciation of reindeer, the morphological-semantic analysis of the word has definitely changed from its source language. Under a laxer definition of what folk etymology actually is, that could on its own suffice to label this a case of folk etymology.

There seems to be, as far as we can see, no easy way out of this murky etymological and philological quagmire that allows us to conclude whether a change in the pronunciation of reindeer happened at some point due to its analyzability. To avoid endless and unproductive discussion one sometimes has to know when to stop arguing, shrug and write a post about the whole thing.

Tongue twisters

Today I offer links to three international recipes: from Germany we have Kabeljau mit gebratener Blutwurst, Rosenkohl und Lakritzsauce (‘cod with pan-fried blood sausage, brussels sprouts and licorice sauce’), from France Cabillaud à la nage de réglisse (‘cod in licorice sauce’), and from Spain we have Lomo de bacalao en salsa de regaliz con juliana de judias verdes (‘filet of cod in licorice sauce with  julienned green beans’).

We will report later on the Morph cook-off challenge, once we scare up some participants and tasters. In the meanwhile, take note of what all these recipes have in common: cod and licorice. While I can’t for the life of me fathom why anyone would think to combine them on a plate, they do share something. Not culinarily, but linguistically. Let’s look at the words for these two ingredients as written in the recipes. They’re each vaguely similar across all three languages, but in a way which is hard to put your finger on. The word for ‘cod’ in all three languages has a [k] (or [c] – they’re pronounced the same) and a [b], but the order switches between German and French on the one hand, and Spanish on the other. Similarly with ‘licorice’, where the place of [l] and [r] switch between German on the one hand and French and Spanish on the other:

         ‘cod’       ‘licorice’
German   Kabeljau    Lakritz
French   cabillaud   réglisse
Spanish  bacalao     regaliz

All neatly lined up here for comparison:

         ‘cod’   ‘licorice’
German   k b     l r
French   c b     r l
Spanish  b c     r l

This looks like an example of metathesis, where two sounds in a word swap places, as in English comfort versus comfortable, where the [t] and [r] switch places in pronunciation if not spelling (for those of us who pronounce the [r] at all, that is).
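For readers who like to check such things mechanically, here is a minimal sketch of the comparison above. It is only an illustration: the function name consonant_order is made up for this post, and treating written c as k is a crude assumption that happens to work for these particular words, not a general rule of French or Spanish spelling.

```python
def consonant_order(word: str, targets: str) -> str:
    """Return the target letters in the order they first appear in the word."""
    # Assumption for this toy example only: written 'c' stands for the sound [k].
    normalised = word.lower().replace("c", "k")
    positions = {t: normalised.find(t) for t in targets}
    # Ignore any target letter that does not occur in the word at all.
    present = [t for t in targets if positions[t] != -1]
    return "".join(sorted(present, key=lambda t: positions[t]))


# 'cod': compare the order of [k] and [b]
for word in ("Kabeljau", "cabillaud", "bacalao"):
    print(word, consonant_order(word, "kb"))

# 'licorice': compare the order of [l] and [r]
for word in ("Lakritz", "réglisse", "regaliz"):
    print(word, consonant_order(word, "lr"))
```

Two words show the metathesis pattern when they contain the same pair of consonants but the function returns them in opposite orders.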

Metathesis as a gastronomic selling point may need a bit of refinement, but it does make for some curious word histories. The case of ‘licorice’ is fairly clear. It started out as Greek glykyrrhīza ‘sweet root’ and was borrowed into Latin as liquiritia, where it is believed that the first part got slightly mangled because people thought it had something to do with liquor (an example of folk etymology). The Latin word was borrowed into Old High German as lakerize or lekerize, which is where the Modern German word comes from. Meanwhile, in Old French, Latin’s daughter language, the word ended up as licorece, which then made its way into English. It was after this that French made the switch to ricolece, swapping [l] and [r], whose first part again got mangled to réglisse through another bout of folk etymology, because people thought it had something to do with règle ‘ruler’ (since licorice will have been sold in the form of ruler-like bars).

The word ‘cod’ remains something of a mystery. The German and French words were both borrowed from Dutch, first attested (in Latin sources) as cabellauwus, represented in contemporary Dutch as kabeljauw. Spanish bacalao is not attested before 1500, and it is generally agreed that the spread of this word was due to Basque fishermen. But whether kabeljauw morphed into bacalao or vice versa, nobody knows. Equally, it could all be coincidence, and the resemblance between the two words is just chance, a point of view that gains some mild support from the fact that bacalao and its ilk refer to a salted fish, whereas kabeljauw and its cousins refer to the fresh fish. This is how Dutch ends up with two words, kabeljauw and bakkeljauw: the first being its native word, the second borrowed from Portuguese bacalhau in the former Dutch colony of Suriname and transported to the Netherlands with Surinamese immigrants, used to refer to a salted and dried fish (not necessarily cod). I have yet to see both on a menu, let alone combined in a single dish, but the search has only started.

(Sources: Etymologisch Woordenboek van het Nederlands, Etymologisches Wörterbuch des Deutschen, Dictionnaire électronique de l’Académie Française.)

Today’s vocabulary, tomorrow’s grammar

If an alien scientist were designing a communication system from scratch, they would probably decide on a single way of conveying grammatical information like whether an event happened in the past, present or future. But this is not the case in human languages, which is a major clue that they are the product of evolution, rather than design. Consider the way tense is expressed in English. To indicate that something happened in the past, we alter the form of the verb (it is cold today, but it was cold yesterday), but to express that something will happen in the future we add the word will. The same type of variation can also be seen across languages: French changes the form of the verb to express future tense (il fera froid demain, ‘it will be cold tomorrow’, vs il fait froid aujourd’hui, ‘it is cold today’).

The future construction using will is a relatively recent development. In the earliest English, there was no grammatical means of expressing future time: present and future sentences had identical verb forms, and any ambiguity was resolved by context. This is also how many modern languages operate. In Finnish huomenna on kylmää ‘it will be cold tomorrow’, the only clue that the sentence refers to a future state of affairs is the word huomenna ‘tomorrow’.

How, then, do languages acquire new grammatical categories like tense? Occasionally they get them from another language. Tok Pisin, a creole language spoken in Papua New Guinea, uses the word bin (from English been) to express past tense, and bai (from English by and by) to express future. More often, though, grammatical words evolve gradually out of native material. The Old English predecessor of will was the verb wyllan, ‘wish, want’, which could be followed by a noun as direct object (in sentences like I want money) as well as another verb (I want to sleep). While the original sense of the verb can still be seen in its German cousin (Ich will schwimmen means ‘I want to swim’, not ‘I will swim’), English will has lost it in all but a few set expressions like say what you will. From there it developed a somewhat altered sense of expressing that the subject intends to perform the action of the verb, or at least, that they do not object to doing so (giving us the modern sense of the adjective ‘willing’). And from there, it became a mere marker of future time: you can now say “I don’t want to do it, but I will anyway” without any contradiction.

This drift from lexical to grammatical meaning is known as grammaticalisation. As the meaning of a word gets reduced in this way, its form often gets reduced too. Words undergoing grammaticalisation tend to gradually get shorter and fuse with adjacent words, just as I will can be reduced to I’ll. A close parallel exists in the Greek verb thélō, which still survives in its original sense ‘want’, but has also developed into a reduced form, tha, which precedes the verb as a marker of future tense. Another future construction in English, going to, can be reduced to gonna only when it’s used as a future marker (you can say I’m gonna go to France, but not *I’m gonna France). This phonetic reduction and fusion can eventually lead to the kind of grammatical marking within words that we saw with French fera, which has arisen through the gradual fusion of earlier facere habet ‘it has to do’.

Words meaning ‘want’ or ‘wish’ are a common source of future tense markers cross-linguistically. This is no coincidence: if someone wants to perform an action, you can often be reasonably confident that the action will actually take place. For speakers of a language lacking an established convention for expressing future tense, using a word for ‘want’ is a clever way of exploiting this inference. Over the course of many repetitions, the construction eventually gets reinterpreted as a grammatical marker by children learning the language. For similar reasons, another common source of future tense markers is words expressing obligation on the part of the subject. We can see this in Basque, where behar ‘need’ has developed an additional use as a marker of the immediate future:

ikusi    behar    dut
see      need     aux
‘I need to see’ / ‘I am about to see’

This is also the origin of the English future with shall. This started life as Old English sceal, ‘owe (e.g. money)’. From there it developed a more general sense of obligation, best translated by should (itself originally the past tense of shall) or must, as in thou shalt not kill. Eventually, like will, it came to be used as a neutral way of indicating future time.

But how do we know whether to use will or shall, if both indicate future tense? According to a curious rule of prescriptive grammar, you should use shall in the first person (with ‘I’ or ‘we’), and will otherwise, unless you are being particularly emphatic, in which case the rule is reversed (which is why the fairy godmother tells Cinderella ‘you shall go to the ball!’). The dangers of deviating from this rule are illustrated by an old story in which a Frenchman, ignorant of the distinction between will and shall, proclaimed “I will drown; nobody shall save me!”. His English companions, misunderstanding his cry as a declaration of suicidal intent, offered no aid.
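The prescriptive rule is simple enough to write down as a tiny function. This is just a sketch of the rule as stated above, not a description of how anyone actually speaks; the name wallis_auxiliary and its parameters are invented for illustration.

```python
def wallis_auxiliary(person: int, emphatic: bool = False) -> str:
    """Pick 'shall' or 'will' according to the prescriptive rule.

    person: 1, 2 or 3 (grammatical person of the subject)
    emphatic: True if the speaker is being particularly emphatic
    """
    # Plain future: 'shall' with I/we, 'will' with everything else.
    plain = "shall" if person == 1 else "will"
    if not emphatic:
        return plain
    # Emphatic use reverses the choice.
    return "will" if plain == "shall" else "shall"


print("You", wallis_auxiliary(2, emphatic=True), "go to the ball!")
print("I", wallis_auxiliary(1, emphatic=True), "drown; nobody",
      wallis_auxiliary(3, emphatic=True), "save me!")
```

Under the rule, the Frenchman's sentence comes out as the emphatic version, which is exactly how his companions are said to have understood it.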

This rule was originally codified by John Wallis in 1653, and repeated with increasing consensus by grammarians throughout the 18th and early 19th centuries. However, it doesn’t appear to reflect the way the words were actually used at any point in time. For a long time shall and will competed on fairly equal terms – shall substantially outnumbers will in Shakespeare, for example – but now shall has given way almost entirely to will, especially in American English, with the exception of deliberative questions like shall we dance? You can see below how will has gradually displaced shall over the last few centuries, mitigated only slightly by the effect of the prescriptive rule, which is perhaps responsible for the slight resurgence of shall in the 1st person from approximately 1830-1920:

Until the eventual victory of will in the late 18th century, these charts (from this study) actually show the reverse of what Wallis’s rule would predict: will is preferred in the 1st person and shall in the 2nd, while the two are more or less equally popular in the 3rd person. Perhaps this can be explained by the different origins of the two futures. At the time when will still retained an echo of its earlier meaning ‘want’, we might expect it to be more frequent with ‘I’, because the speaker is in the best position to know what he or she wants to do. Likewise, when shall still carried a shade of its original meaning ‘ought’, we might expect it to be most frequent with ‘you’, because a word expressing obligation is particularly useful for trying to influence the action of the person you are speaking to. Wallis’s rule may have been an attempt to be extra-polite: someone who is constantly giving orders and asserting their own will comes across as a bit strident at best. Hence the advice to use shall (which never had any connotations of ‘want’) in the first person, and will (without any implication of ‘ought’) in the second, to avoid any risk of being mistaken for such a character, unless you actually want to imply volition or obligation.

How do we know when? The story behind the word “sciatica”

My right arm has been bothering me lately. The nerve has become inflamed by a pinching at the neck, creating a far from desirable situation. When trying to explain the condition to a friend, I compared it to sciatica, but of the arm. I am not here to bore you with my ills, however, but to tell you a story precisely about that word, sciatica. You may wonder what is so special about it. It is true that it has a weird spelling with sc, just like science, and that it sounds a little bit like a fancy word, having come directly from Latin and retaining that funny vowel a at the end which not many words in English have. But more than that, the word sciatica gives us a crucial clue about changes which have transformed the way the English language sounds.

English is a funny language. Of all the European languages, it has changed the most in the last thousand years, and this is particularly apparent in its vowels. In the late Middle Ages, starting perhaps sometime in the mid-14th century, the lower classes in England started changing the way they pronounced the long vowels they had inherited from earlier generations. Some have even claimed that the upper class at the time, whose ability to use French had started to peter out in the 15th century, felt that one way they could make themselves stand out from the middle classes was by changing their way of speaking a bit. To do this, they took up the ‘bad’ habits of the lower classes and started pronouncing things the way the lower classes would. But in adopting the pronunciation of the lower classes, they also made it sound ‘refined’ to the ears of the middle classes, so that the middle classes also started to adopt the new pronunciation… and so the mess started.

Pairs of words like file and feel, or wide and weed, have identical consonants, differing purely in their vowels. They are also spelled differently: file and wide are written with <i…e>, while feel and weed are written with <ee>. The tricky part comes when you want to tell another person in writing how these words are pronounced. To do that one normally makes a comparison with other familiar words – for example, you could tell them ‘feel rhymes with meal’ –  but what do you do if the other person doesn’t speak English? In order to solve this problem, linguists in the late 19th century invented a special alphabet called the ‘International Phonetic Alphabet’ or ‘IPA’, in which each character corresponds to a single sound, and every possible sound is represented by a unique character. The idea was that this could function as a universal spelling system that anyone could use to record and communicate the sounds of different languages without any ambiguity or confusion. For file and wide, the Oxford English Dictionary website now gives two transcriptions in IPA, one in a standardised British and the other in standardised American: Brit. /fʌɪl/ & /wʌɪd/ (US /faɪl/ & /waɪd/). For feel and weed, we have Brit. /fiːl/ & /wiːd/ (US /fil/ & /wid/). So, in spelling, <i…e> represents /ʌɪ/ (or /aɪ/) and <ee> represents /iː/ (or /i/). But why is this so?

The answer lies in the spelling itself, which is a tricky thing, as we all know, and took many centuries to be fixed the way it is now. English spelling is a good example of a writing system where a given letter does not always correspond to one particular sound. There is no rule from which you can work out that wifi is pronounced as /wʌɪfʌɪ/ (or /waɪfaɪ/) – you know it simply because you have heard it pronounced and seen it written <wifi>. This is not obvious to other people whose native language is not English: as a native Spanish speaker, when I first saw the word wifi written somewhere, the first pronunciation that came to my mind was /wifi/ (like ‘weefee’) but not /wʌɪfʌɪ/.

Contemporary English spelling very much reflects the way people pronounced things at the end of the Middle Ages. So words like file and wide were pronounced with the vowel represented in IPA as <iː>, which today can be heard in words like feel and weed. At that time, the letter <i> (along with its variant <y>) represented the sound /iː/. The words feel and weed, on the other hand, were pronounced with the vowel represented in IPA by <eː>, sounding something like the words fell and wed, but a little longer. Most of the words that in the English of the Middle Ages were pronounced with the long vowels /iː/ and /eː/ are now pronounced with the diphthong /ʌɪ/ (or /aɪ/) and the vowel /iː/ (or /i/), respectively. These changes were part of a massive overhaul of the English vowel system known as the ‘Great Vowel Shift’, so-called because it affected all long vowels – of which there were quite a few – and it took centuries to complete. Some even claim that it’s still taking place. But if we fail to update our spelling as pronunciation changes, how can we tell when this shift happened? That is when the word sciatica comes in.

The word sciatica is now pronounced as /sʌɪˈatᵻkə/ (US /saɪˈædəkə/). Because of the spelling <i> in ‘sci…’, we know that the word would have been pronounced something like /siːˈatika/ (‘see-atica’) when it was introduced into English from Latin by doctors, who at that time still used Latin as the language of exchange in their science. But sciatica is not a very common English word, and does not even sound naturally English. So unless you are a doctor or a very educated person, there is a high chance of getting the spelling wrong. In a letter to her husband John in 1441, Margaret Paston wrote the following about a neighbour: “Elysabet Peverel hath leye seke xv or xvj wekys of þe seyetyka” – “Elisabeth Peverel has lain sick 15 or 16 weeks of the sciatica”. While my sympathies go to Elisabeth Peverel as I write this, the interesting thing here is the way the word sciatica is written by Margaret Paston, as seyetyka. Here the spelling with <ey> tells us a nice story: that the diphthongisation of Medieval /iː/ into something like /eɪ/ had already happened in 1441. Because of that word we know that Margaret Paston, her husband, and poor Elysabet Peverel not only said /seɪˈatikə/ but also /feɪl/, /weɪd/ and /teɪm/, rather than /fiːl/, /wiːd/ and /tiːm/, even if they still wrote them the old way with an <i> as file, wide and time, just as we do nowadays. From this we can also deduce by the laws of sound change that the other long vowels had also started to change their pronunciation, so that these people were already pronouncing feel and weed in the modern way, despite spelling them the old way with an <e>.

This mouthful of a word sciatica is thus the first word in the entire history of English to tell us about the Great Vowel Shift. It is true that its story doesn’t ease the pain that its meaning evokes, but at least it makes it easier to deal with it by entertaining the mind…

 

Guarantee and warranty: two words for the price of one

By and large, languages avoid having multiple words with the same meaning. This makes sense from the point of view of economy: why learn two words when one will do the job?

But occasionally there are exceptions, such as warranty and guarantee. This is one of several synonymous or near-synonymous pairs of words in English conforming to the same pattern – another example is guard and ward. The variants with gu- represent early borrowings from Germanic languages into the Romance languages descended from Latin. At the time these words were borrowed, the sound w had generally developed into v in Romance languages, but it survived after g, in the descendants of a few Latin words like lingua ‘tongue, language’. So when Romance speakers adapted Germanic words to the sounds of their own language, gu was the closest approximation they could find to Germanic w.

This is why French has some words like guerre ‘war’, where gu- corresponds to w- in English (this word may have been borrowed because the inherited Latin word for war, bellum, had become identical to the word for ‘beautiful’). Later, some of the words with gu- were borrowed back into English, which is why we have both borrowed guard and inherited ward. According to one estimate, 28.3% of the vocabulary of English has been borrowed from French (figures derived from actual texts rather than dictionaries come in even higher at around 40%), a debt that we have recently started repaying in earnest with loans like le shopping and le baby-sitting. This is all to the consternation of the Académie française, which aims to protect the French language from such barbarisms, as evidenced by the Dire, ne pas dire (‘say, don’t say’) section of the Académie’s website advising Francophones to use homegrown terms like contre-vérité instead of anglicisms like fake news.

In fact, warranty and guarantee reflect not one but two different waves of borrowing: the first from Norman French, which still retained the w- sound, likely through the influence of Scandinavian languages spoken by the original Viking invaders of Normandy, and the second from later French, in which w- had become gu-. Multiple layers of borrowing can also be seen in words like castle, from Latin castellum via Norman French, and chateau, borrowed from later French, in which Latin c- had developed a different pronunciation.

Incidentally, Norman French is still spoken not only in Normandy but also in the Channel Islands of Guernsey, Jersey and Sark. The Anglo-Norman dialect of the island of Alderney died out during World War II, when most of the island’s population was evacuated to the British mainland, although efforts are underway to bring it back.

Words apart: when one word becomes two

As any person working with language knows, the list of words from which we build our sentences is not a fixed one but rather is in a state of constant flux. Words (or lexemes in linguists’ terminology) are constantly being borrowed (such as ‘sauté’ from French), coined (such as ‘brexit’ from a blend of ‘Britain’ and ‘exit’) or lost (such as ‘asunder’, a synonym for ‘apart’). These happen all the time. However, two more logical processes exist that can alter the total number of entries in the dictionary of our language. Occasionally, lexemes may also merge, if two or more become one; or split, if one becomes two. These more exotic cases constitute a window into the fascinating workings of the grammar. In this blog I will present the story of one of these splitting events. It involves the Spanish verb saber, from Latin sapiō.

The verb’s original meaning must have been ‘taste’ in the sense of ‘having a certain flavour’, as in the sentence “Marmite tastes awful”. At some point it also began to be used figuratively to mean ‘come to know something’, not only by means of the sense of taste but also for knowledge arrived at by means of other senses. It is interesting that in the Germanic languages it seems that it was sight rather than taste that was traditionally used in the same way. Consider, for instance, the common use, in English, of the verb ‘see’ in contexts like “I see what you mean”, where it is interchangeable with ‘know’. Whether the source verb can be explained by the differences between traditional Mediterranean and Anglo-Saxon cuisines I’d rather not suggest for fear of deportation.

In any case, what must have been once a figurative use of the verb ‘taste’ became at some point the default way of expressing ‘know’. These are the two main senses of saber in contemporary Spanish and of its equivalents in most other Romance languages. The question I ask here is: do speakers of Spanish today categorize this as one word with two meanings? Or do they feel they are two different words that just happen to sound the same? There may be a way to tell.

In Spanish, unlike in English, a verb can take dozens of different forms. The shape of a verb changes depending on who is doing the action of the verb, whether the action is a fact or a wish etc. Thus, for example, speakers of Spanish say yo sé ‘I know’ but tú sabes ‘you know’. They also use one form (so-called ‘indicative’) in sentences like yo veo que tú sabes inglés ‘I see that you know English’ but a different form (so-called ‘subjunctive’) in yo espero que tú sepas inglés ‘I hope that you know English’. The Real Academia Española, the prescriptive authority in the Spanish language, has ruled that, because saber is a single verb, it should have the same forms (sé, sabes etc.) regardless of its particular sense. Speakers, however, have trouble abiding by this rule, which is probably the reason why the need for a rule was felt in the first place. My native speaker intuition, and that of other speakers of Spanish, is that the verb may have a different form depending on its sense:

Forms of Spanish saber (forms starting with sab– in light gray, forms starting with sep– in dark gray)

The most obvious explanation for why this change could happen is that, when the two main senses of saber drifted sufficiently away from each other, speakers ceased to make the generalization that they were part of the same lexeme. When this happened, the necessity to have the same forms for the two meanings of saber disappeared. But why sepo?

Because cannibalism is on the wane (also in Spain) we hardly ever speak about how people taste. As a result, the first and second person forms of saber (e.g. irregular sé) are only ever encountered by speakers under their meaning ‘know’. Because of this, they do not count as evidence for language users’ deduction of the full array of forms of saber. This meant that the first and second person forms of saber₂ ‘taste’, when needed (imagine someone saying sepo salado ‘I taste salty’ after coming out of the sea), had to be formed on the fly on evidence exclusive to its sense ‘taste’ (i.e. third persons and impersonal forms):

Because of the evidence available to speakers, at first sight it might seem strange that this ‘fill-in-the-gaps’ exercise did not result in the apparently more regular 1SG indicative form sabo. This would have resulted in a straightforward indicative vs subjunctive distinction in the stem. The chosen form, however, makes more sense when one observes the patterns of alternation present in other Spanish verbs:

Verbs that have a difference in the stem in the third person forms between indicative and subjunctive (cab- vs quep- or ca- vs caig-) overwhelmingly use the form of the subjunctive also in the formation of the first person singular indicative. This is a quirk of many Spanish verbs. It appears that, by sheer force of numbers, the pattern is spotted by native speakers and occasionally extended to other verbs which, like saber, look like they could well belong in this class.
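The generalisation that speakers seem to be making can be sketched in a few lines of code. This is a deliberately simplified toy: the function name and the idea of passing in bare stems are assumptions made for illustration, and the pattern is a strong tendency in Spanish, not an exceptionless law.

```python
def first_singular_indicative(indicative_stem: str, subjunctive_stem: str) -> str:
    """Guess the 1SG present indicative from the two third-person stems.

    If the subjunctive stem differs from the indicative stem, the 1SG
    indicative is built on the subjunctive stem; otherwise there is only
    one stem to choose from.
    """
    if subjunctive_stem != indicative_stem:
        stem = subjunctive_stem
    else:
        stem = indicative_stem
    return stem + "o"


print(first_singular_indicative("cab", "quep"))   # caber 'fit'
print(first_singular_indicative("ca", "caig"))    # caer 'fall'
print(first_singular_indicative("sab", "sep"))    # saber 'taste'
print(first_singular_indicative("cant", "cant"))  # cantar 'sing', no alternation
```

Fed only the third person stems of saber ‘taste’, the same rule that yields quepo and caigo also yields sepo rather than sabo.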

In this way, the tiny change from sé to sepo allows us linguists to see that patterns like those of caber and caer are part of the grammatical knowledge of speakers and are not simply learnt by heart for each verb. In addition, it gives us crucial evidence to conclude that, today, there are in Spanish not one but two different verbs whose infinitive form is saber. Much like the T-Rex in Jurassic Park, we linguists can sometimes only see some things when they ‘move’.

A daggy blog post

One of the most ubiquitously Australian words is the word dag. A word known and loved by basically any Aussie.

Fig. 1 – The classic daggy-dad weekend look

It’s a light-hearted insult referring to someone who is unfashionable or socially awkward, basically a bit of a dork (Fig 1). But like most insults in Australian English it’s also used affectionately as a term of endearment (what does this say about how Australians relate to each other?). Typically in these cases, it is used to convey a sense of regard for the unashamedness of the dag in question – to express the lovable quality of someone who is just oblivious to certain social norms.

Fig. 2 – An actual dag.

However, the origins of this word are anything but lovable. According to the popular story (which appears to be supported by Macquarie Dictionary and The Australian National Dictionary), this usage is derived from the older meaning (attested in 1891) of the word dag to refer to a matted clot of wool and dung that forms around a sheep’s bum (Fig 2). By 1967 something ‘dirty and unkempt’ could be referred to as daggy, and by the 1980s we were using the word for Figure 2 for the unfashionable yet lovable dad in Figure 1.

As an Australian, I am proud of my dagginess and am pleased to know our daggy little word has a pretty gross origin.

A plurality of plurals

Of all the world’s languages, English is the most widely learnt by adults. Although Mandarin Chinese has the highest number of speakers overall, owing to the huge size of China’s population, second-language speakers of English outnumber those of Mandarin by more than three to one.

Considering that the majority of English speakers learn the language in adulthood, when our brains have lost much of their early plasticity, it’s just as well that some aspects of English grammar are pretty simple compared to other languages. Take for example the way we express the plural. With only a small number of exceptions, we make plurals by adding a suffix –s to the singular. The pronunciation differs depending on the last sound of the word it attaches to – compare the ‘z’ sound at the end of dogs to the ‘s’ sound at the end of cats, and the ‘iz’ at the end of horses – but it varies in a consistently predictable way, which makes it easy to guess the plural of an English noun, even if you’ve never heard it before.
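To show just how predictable the rule is, here is a minimal Python sketch that picks the pronunciation of the plural suffix from the last sound of the noun. The function name and the simplified sound labels (ordinary letters rather than proper phonetic symbols) are assumptions made for this example, and the handful of irregular plurals is ignored.

```python
# Final sounds are written with simplified labels, not real phonetic symbols
# (an assumption made to keep the sketch short).
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}  # hissing and hushing sounds
VOICELESS = {"p", "t", "k", "f", "th"}         # voiceless sounds ("th" as in myth)


def plural_suffix(final_sound: str) -> str:
    """Return the pronunciation of plural -s after the given final sound."""
    if final_sound in SIBILANTS:
        return "iz"  # horses, buses, churches
    if final_sound in VOICELESS:
        return "s"   # cats, books, cliffs
    return "z"       # dogs, bees and every other voiced sound


for noun, final in [("dog", "g"), ("cat", "t"), ("horse", "s")]:
    print(noun, "->", plural_suffix(final))
```

Three short branches cover every regular noun, including ones the learner has never met before.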

That’s not the case in every language. Learners of Greek, for example, have to remember about seven common ways of making plurals. Sometimes knowing the final sounds of a noun and its gender makes it possible to predict the plural, but other times learners simply have to memorise what kind of plural a noun has: for example pateras ‘father’ and loukoumas ‘doughnut’ both have masculine gender and singulars ending in –as, but in Standard Greek their plurals are pateres and loukoumathes respectively.

This is similar to how English used to work. Old English had three very common plural suffixes, -as, -an and –a, as well as a number of less common types of plural (some of these survive marginally in a few high-frequency words, including vowel alternations like tooth~teeth and zero-plurals like deer). The modern –s plural descends from the suffix –as, which originally was used only for a certain group of masculine nouns like stān, ‘stone’ (English lost gender in nouns, too, but that’s a subject for another blog post).

How did the -s plural overtake these competitors to become so overwhelmingly predominant in English? Partly it was because of changes to the sounds of Old English as it evolved into Middle English. Unstressed vowels in the last syllables of words, which included most of the suffixes which expressed the gender, number and case of nouns, coalesced into a single indistinct vowel known as ‘schwa’ (written <ə>, and pronounced like the ‘uh’ sound at the beginning of annoying). Moreover, final –m came to be pronounced identically to –n. This caused confusion between singulars and plurals: for example, Old English guman ‘to a man’ and gumum ‘to men’ both came to be pronounced as gumən in Middle English. It also caused confusion between two of the most common noun classes, the Old English an-plurals and the a-plurals. As a result they merged into a single class, with -e in the singular and -en in the plural.
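The merger of guman and gumum can be traced with a small Python sketch of the two sound changes just described. It assumes plain spellings without length marks, words of more than one syllable with stress on the first, and a function name invented for this illustration. The order in which the two steps are applied is a convenience of the code, not a claim about historical chronology.

```python
VOWELS = set("aeiouy")


def reduce_ending(word: str) -> str:
    """Apply the two changes to a polysyllabic word with initial stress."""
    # Step 1: final -m comes to be pronounced like -n.
    if word.endswith("m"):
        word = word[:-1] + "n"
    # Step 2: the vowel of the final (unstressed) syllable becomes schwa.
    for i in range(len(word) - 1, -1, -1):
        if word[i] in VOWELS:
            return word[:i] + "ə" + word[i + 1:]
    return word  # no vowel found, leave the word unchanged


for form in ["guman", "gumum"]:
    print(form, "->", reduce_ending(form))
```

Both forms come out as gumən, so the ending no longer tells singular from plural.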

This left Middle English with two main types of plural, one with –en and one with –(e)s. Although a couple of the former type remain to this day (oxen and children), the suffix –es was gradually generalised until it applied to almost all nouns, starting in the North of England and gradually moving South.

A similar kind of mass generalisation of a single strategy for expressing a grammatical distinction is often seen in the final stages of language death, as a community of speakers transitions from a minority to a majority language as their mother tongue. Nancy Dorian has spent almost 50 years documenting the dying East Sutherland dialect of Scots Gaelic as it is supplanted by English in three remote fishing villages in the Scottish Highlands. In one study the Gaelic speakers were divided into fluent speakers and ‘semi-speakers’, who used English as their first language and Gaelic as a second language. Dorian found that the semi-speakers tended to overgeneralise the plural suffix –an, applying it to words for which fluent speakers would have used one of ten other inherited strategies for expressing plural number, such as changing the final consonant of the word (e.g. phũ:nth ‘pound’, phũnčh ‘pounds’), or altering its vowel (e.g. makh ‘son’, mikh ‘sons’).

But why should the last throes of a dying language bear any resemblance to the evolution of a thriving language like English? A possible link lies in second language acquisition by adults. At the same time as these changes were taking place, English speakers were in intense contact with Scandinavian settlers who spoke Old Norse, and the English of this period shows many signs of Old Norse influence. In addition to many very common words like take and skirt (which originally had a meaning identical to its native English cognate shirt), English borrowed several grammatical features of Scandinavian languages, such as the suffix –s seen in third person singular present verbs like ‘she blogs’ (the inherited suffix ended in –th, as in ‘she bloggeth’), and the pronouns they, their and them, which replaced earlier hīe, heora and heom. Like the extension of the plural in –s, these innovations appeared earliest in Northern dialects of English, where settlements of Old Norse speakers were concentrated, and gradually percolated South during the 11th to 15th centuries.

It’s possible that English grammar was simplified in some respects as a consequence of what the linguist Peter Trudgill has memorably called “the lousy language-learning abilities of the human adult”. Research on second-language acquisition confirms what many of us might suspect from everyday experience, that adult learners struggle with inflection (the expression of grammatical categories like ‘plural’ within words) and prefer overgeneralising a few rules rather than learning many different ways of doing the same thing. In this respect, Old Norse speakers in Medieval England would have found themselves in a similar situation to semi-speakers of East Sutherland Gaelic – when confronted with a number of different ways of expressing plural number, it is hard to remember for each noun which kind of plural it has, but simple to apply a single rule for all nouns. After all, much of the complexity of languages is unnecessary for communication: we can still understand children when they make mistakes like foots or bringed.

Entree

One of the peculiar habits that strikes a foreign visitor to a restaurant in the US (alongside heaps of ice in your drink and the sneaky habit of leaving sales tax off the price) is that menus typically list main course dishes as ‘entrees’. But ‘entrée’ is a French word that means something like ‘entry’ or ‘entrance’, so shouldn’t it be the same thing as appetizer or hors-d’oeuvre or starter? It seems like some fundamental misunderstanding of the term, like the rectangular chocolate ‘croissants’ shamelessly marketed outside of France.

A sorry excuse for Surrey

It has recently come to my attention that my vowels are weird. This was pointed out to me by a fellow American colleague who declared that, unexpectedly, we do not say Surrey the same way, and that my pronunciation has a “weird” vowel. I’ve already experienced confusion more than once from locals when I utter the word, and it’s enough to make me a little self-conscious.

I was already vaguely aware that Californians do some strange things with vowels. A bit of online digging revealed that as a San Francisco Bay Area native, I can blame my weird vowels on the Northern California vowel shift (outlined by Penny Eckert). This sound change is what makes the surfer’s (and my) way of saying duuuude so distinctive (the vowel is fronted). My international friends make fun of the way I say “aw, man!”. Here, man for me becomes something like /miyn/. Even I have to admit it sounds pretty funny.

I like to think I am relatively aware of linguistic behavior, but as this experience showed me, we as linguists may not be as well-equipped as we think to recognize our own quirks.