Double trouble treble

You’ll get in trouble if you drink a tripel, the strong pale ale brewed by the most hipster of monks, the Trappists.

The Lowlands are the Hoxton of Europe

Tripels have three times the strength (around 8-10% ABV) of the standard table beer historically consumed by the monks themselves. This enkel or ‘single’ beer was traditionally not available outside the cloisters, while the dubbel (a double-strength dark brown beer made with caramelized beet sugar) was sold to provide income for the monastery. Although the term enkel is no longer in common beer parlance (it is on the cusp of a comeback), dubbel and tripel have held their ground. It is generally thought that the tripel takes its name from its threefold strength, though it is sometimes claimed instead that it has three times the malt of a regular brew. A quadrupel is VERY strong.

As we have seen already in this blog when counting sheep in Slovenian and yams in Ngkolumbu, means for the expression of quantities and multiplication are often linguistically fascinating. Not least the doublet treble and triple, which originate from the same etymological source.

The Latin word triplus ‘threefold, triple’ first entered English via Old French treble. Not satisfied with claiming the space previously occupied by the Old English adjective þrifeald ‘threefold’, it turned up again by the 15th century as the adjective triple.

This triad of modifiers (threefold, treble and triple) exemplifies some of the pathways by which lexical synonymy can come about. The first word was formed through a compounding process (i.e. the numeral three forming a new word with the multiplicative form –fold), the second entered the language through direct borrowing, and the third through a second wave of borrowing (either from Old French triple or Latin triplus).

We don’t just find words competing to express the same meaning, but also parts of words. The –fold element of threefold, tenfold and manifold, and the –plus of triplus, are argued to have developed from the same Proto-Indo-European root *pel ‘to fold’. To complicate things even further, the now obsolete treblefold was attested between the 14th and 16th centuries. Words, it seems, like to fight for the same space, and can sometimes be incestuous.

Since entering English over 500 years ago, triple and treble have staked out different paths, but retained similar meanings in at least some of their manifestations, as explored by Catherine Soanes on the OxfordWords blog. In terms of frequency, triple is the stronger twin (or is it a triplet? quadruplet?), ending up triumphant with around 6 times more occurrences in the Oxford English Corpus.
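If you fancy checking that kind of arithmetic yourself, a corpus frequency comparison boils down to token counting. Here is a minimal Python sketch using an invented mini-corpus (the real figures come from the Oxford English Corpus, which this toy obviously does not reproduce):

```python
from collections import Counter

# Invented mini-corpus standing in for real corpus data.
tokens = ("triple jump triple word score treble clef triple threat "
          "treble damages triple crown triple bypass").split()

counts = Counter(tokens)
print(counts["triple"], counts["treble"])  # → 5 2
```

In a real study you would of course tokenise more carefully (lemmatising, handling case and punctuation), but the principle is the same: count occurrences of each form and compare.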

But treble has some resilience. Although the official Scrabble board has double and triple word scores, treble word scores are occasionally referred to on the net (albeit erroneously, or in a devil-may-care way), such as in Charlie Brooker’s article on how to cheat at Scrabble. I even found a ‘threefold word score’ on a Scrabble knock-off site. Lawyers at the ready!

This demonstrates that these adjectives really are semantically interchangeable for the most part, even though their distributions are not identical.

The take-home? While not every monastery sells the same tripel, they will all get you drunk.

Werewolves

Hallowe’en will soon be upon us, so it is only right we turn our attention to monsters. Consider the werewolf. It’s a wolf, sort of, as the name indicates, but what’s a were? The usual assumption is that it’s a leftover of an older word meaning ‘man’ that fell completely out of fashion by the 14th century. As a result we have what looks like a compound word, except that one of the parts doesn’t have any meaning on its own. Perhaps not, but that hasn’t stopped people from squeezing some value out of it nonetheless: if a werewolf is a person who turns into a wolf — or at any rate, part person, part wolf — then a were-bear is a mixture of person and bear, and so on down to were-turtles.

Actually, people don’t seem to be that literal-minded when it comes to word meanings, if the various were-creatures in circulation are any evidence. The monster from “Wallace & Gromit: The Curse of the Were-Rabbit” is not half-human, half-rabbit, but more just kind of a monster rabbit, with a thicker pelt. (Visually calqued, I suspect, from the not-particularly wolf-like wolfman of the wolfman movies featuring Lon Chaney Jr.)

And were-fleas, to the extent that they exist, appear to be carriers of lycanthropism rather than human/insect conglomerates. None of this is yet reflected in the Oxford English Dictionary’s entry on were– (you need a subscription for that but it’s free if you have a UK public library card!). Give it a few decades more maybe.

Strangely, words for werewolf in other languages share a propensity for being compounds made up of ‘wolf’ plus some other completely opaque element. The first part of Czech vlkodlak is vlk, which means ‘wolf’, but dlak on its own is not an independent word. (Not in Czech at any rate, but in the related language Slovenian the equivalent word volkodlak is clearly made up of volk ‘wolf’ and dlaka, which means ‘hair’ or ‘fur’.) And the French werewolf, loup-garou, has the word for ‘wolf’ in it (loup), but garou is not an independent word (other than being an unrelated homonym meaning ‘flax-leaved daphne’). That part seems to have been our very own Germanic word werewolf borrowed at an early date (earliest attestation as garwall from the 12th century). Both of these have, like werewolf, given rise to further monstrous hybrids like Czech prasodlak, from prase ‘pig’, or the French cochon-garou.

In fact, Czech and French have gone one step further than English. Though I just wrote that dlak and garou were not words, that was being a bit pedantic. Neither of them is listed in the authoritative Academy dictionaries of Czech and French, but nonetheless they do seem to have split off from their host body, rather as happened — if we can be permitted to mix monster metaphors — to the hero of 1959’s “The Manster” (a.k.a. “The Split”).

For example, this Czech website tells us about vlkodlaci i jiní dlaci ‘werewolves and other were-creatures’ (dlaci is the plural of dlak), and in French the phrase courir le garou ‘run the garou’ used, at least, to be in circulation, meaning basically ‘go around at night being a werewolf’. That use in turn apparently spawned a verb garouter, meaning much the same thing. The curse lives on.

Optimal Categorisation: How do we categorise the world around us?

People love to categorise! We do this on a daily basis, consciously and subconsciously. When we are confronted with something new we try and figure out what it is by comparing it to something we already know. Say, for instance, I saw something flying through the air – I may think to myself that the object is a bird, or I may say it is a plane based on my previous experiences of birds and planes. Of course the object may turn out to be something completely new, perhaps even superman!

Is it a bird? Is it a plane? No it’s Superman!

Our love of classification runs deep in scientific enquiry. Botanists and zoologists classify plants and animals into different taxonomies. Even the humble linguist loves to classify – is this new word a noun or a verb? What about the new word zoodle that was recently added to the Merriam-Webster dictionary? Is it a thing? Or an action? Can I zoodle something or is it something I can pick up and touch? Well apparently zoodle is a noun which means ‘a long, thin strip of zucchini that resembles a string or narrow ribbon of pasta’. To be honest, I love eating zoodles, though until now I never knew what they were called!

The way people classify entities around them has become encoded in the different languages we speak in many different ways. The most obvious example springs to mind when we learn a new language like French or German and are confronted with a grammatical gender system. French has two genders – masculine and feminine. But German has three – masculine, feminine and neuter. Other languages can have many more gender distinctions. Fula, a language spoken in west and central Africa, has twenty different gender categories!

So what exactly are grammatical gender systems and how are they realised in different languages? Gender systems categorise nouns into different groups and tend to appear not on the noun itself, but on other elements in the phrase. In German, nouns are split into three different gender categories – masculine, feminine and neuter. The gender of a noun is shown by using different articles (the word ‘the’ or ‘a’) and sometimes by changing the ending of an adjective, but never on the noun itself. Thus the word for ‘the’ in German is either der, die or das depending on whether the noun in the phrase is masculine, feminine or neuter.

(1)        der       Mann
              the       man

(2)        die        Frau
              the       woman

(3)        das       Haus
              the       house

This is called ‘agreement’ as the adjectives and articles must agree with the gender of the noun. In a language with gender, each noun typically can only occur in one gender category.
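The pattern in (1)-(3) can be sketched as a simple lookup: gender is a fixed lexical property of each noun, and the article form follows from it. The dictionaries below are a toy illustration covering just the three example nouns, not a real German lexicon:

```python
# Gender is a lexical property of each noun; the article agrees with it.
GENDER = {"Mann": "masculine", "Frau": "feminine", "Haus": "neuter"}
ARTICLE = {"masculine": "der", "feminine": "die", "neuter": "das"}

def with_article(noun):
    """Return the noun together with its agreeing definite article."""
    return f"{ARTICLE[GENDER[noun]]} {noun}"

print([with_article(n) for n in ["Mann", "Frau", "Haus"]])
# → ['der Mann', 'die Frau', 'das Haus']
```

The key point the sketch captures is that each noun maps to exactly one gender, so the agreeing form is fully determined once you know the noun.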

Not every language has a grammatical gender system, but such systems are highly pervasive: around 40% of all languages have one. English is quite a poor example when it comes to gender. There is no real gender agreement in English, with the exception of pronouns. We have to say: Bill walked into the grocer’s. He bought some apples, where the pronoun he must agree with the gender of the noun that was previously mentioned. English uses he, she and it as its only markers of gender agreement.

Languages behave differently in how they allocate nouns to the different genders, which can be very baffling for language learners! Why is chair feminine in French (la chaise) but masculine in German (der Stuhl)? How a language allocates nouns to its gender categories can seem somewhat arbitrary, with the exception of the words for women and men, which fall into the feminine and masculine genders respectively: the only semantically obvious choices.

But wait! If you thought the English gender system was dull, think again! A couple of months ago my piano was being restored and when it was being moved back into the lounge the piano movers kept saying: “pull her a little bit more” and “turn her this way”. The movers used the female pronouns to describe the piano. In English, countries, pianos, ships and sometimes even cars use the feminine pronouns.

Grammatical gender isn’t the only way languages classify nouns. Some languages use words called classifiers to categorise nouns. Classifiers are similar to English measure terms, which categorise the noun in terms of its quantity, such as ‘sheet of paper’ vs. ‘pack of paper’ or ‘slice of bread’ vs. ‘loaf of bread’. Classifiers are found in languages all over the world and are able to categorise nouns depending on the shape, size, quantity or use of the referent, e.g. ‘animal kangaroo’ (alive) vs. ‘meat kangaroo’ (not alive). Classifier systems are very different to gender systems as nouns in a language with classifiers can appear with different classifiers depending on what property of the noun you wish to highlight. There are many different types of classifier systems, but to keep things short I am just going to talk about possessive classifiers, which are mainly found in the Oceanic languages, spoken in the South Pacific.

When an item is in someone’s possession, English uses possessive pronouns to say who the item belongs to. For instance, if I say ‘my coconut’, the possessive pronoun is my. In many Oceanic languages a noun can occur with different forms for the word my depending on how the owner intends to use it. For instance, the Paamese language, spoken in Vanuatu, has four possessive classifiers. I could use the ‘drinkable’ classifier if I was talking about my coconut that I was going to drink. I would use the ‘edible’ classifier if I was going to eat my coconut. I would use the classifier for ‘land’ if I was talking about the coconut growing in my garden. Finally, I could use the ‘manipulative’ classifier if I was going to use my coconut for some other purpose – perhaps to sit on!

(4)        ani                   mak
              coconut           my.drinkable
              ‘my coconut (that I will drink)’

(5)        ani                   ak
              coconut           my.edible
              ‘my coconut (that I will eat)’
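The contrast between (4) and (5) can be sketched much like the gender example, except that here the form is chosen by intended use rather than by a fixed class of the noun, so the same noun can combine with different ‘my’ forms. Only the two attested Paamese forms from the glosses above are included:

```python
# Possessive classifiers: the 'my' form depends on intended use,
# not on a fixed property of the noun itself.
MY = {"drinkable": "mak", "edible": "ak"}

def possess(noun, use):
    """Combine a noun with the 'my' form for the intended use."""
    return f"{noun} {MY[use]}"

print(possess("ani", "drinkable"))  # → ani mak
print(possess("ani", "edible"))    # → ani ak
```

Notice the structural difference from the gender sketch: there the noun determined the form; here the speaker chooses it.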

Why do languages have different ways of categorising nouns? How do these systems develop and change over time? Are gender systems easier to learn than classifier systems? Are gender and classifiers completely different systems? Or is there more similarity to them than meets the eye? These are some of the big questions in linguistics and psychology. We are excited to start a new research project at the Surrey Morphology Group, called ‘Optimal Categorisation: the origin and nature of gender from a psycholinguistic perspective’, which seeks to answer these fundamental questions. Over the next three years we will talk more about these fascinating categorisation systems, explain our experimental research methods, introduce the languages and speakers under investigation, and share our findings via this blog. Just look out for the ‘Optimal Categorisation’ headings!

The cat’s mneow: animal noises and human language

As is well known, animals on the internet can have very impressive language skills: cats and dogs in particular are famous for their near-complete online mastery of English, and only highly trained professional linguists (including some of us here at SMG) are able to spot the subtle grammatical and orthographic clues that indicate non-human authorship behind some of the world’s favourite motivational statements.

Recent reports suggest that some of our fellow primates have also learnt to engage in complex discourse: again, the internet offers compelling evidence for this.

But sadly, out in the real world, animals capable of orating on philosophy are hard to come by (as far as we can tell). Instead, from a human point of view, cats, dogs, gorillas etc. just make various kinds of animal noises.

Why write about animals and their noises on a linguistics blog? Well, one good answer would be: the exact relationship between the vocalisations made by animals, on the one hand, and the phenomenon of human spoken language, on the other, is a fascinating question, of interest within linguistics but far beyond it as well. So a different blog post could have turned now to discuss the semiotic notion of communication in the abstract; or perhaps the biological evolution of language in our species, complete with details about the FOXP2 gene and the descent of the larynx.

But in fact I am going to talk about something a lot less technical-sounding. This post is about what could be called the human versions of animal noises: that is, the noises that English and other languages use in order to talk about them, like meow and woof, baa and moo.

At this point you may be wondering whether there is much to be gained by sitting around and pondering words like moo. But what I have in mind here is this kind of thing:

These are good fun, but they also raise a question. If pigs and ducks are wandering around all over the world making pig and duck noises respectively, then how come we humans appear to have such different ideas about what they sound like? Oink cannot really be mistaken for nöff or knor, let alone buu. And the problem is bigger than that: even within a single language, English, frogs can go both croak and ribbit; dogs don’t just go woof, but they also yap and bark. These sound nothing like each other. What is going on? Are we trying to do impressions of animals, only to discover that we are not very good at it?

Before going any further I should deal with a couple of red herrings (to stick with the zoological theme). For one thing, languages may appear to disagree more than they really do, just because their speakers have settled on different spelling conventions: a French coin doesn’t really sound all that different from an English quack. And sometimes we may not all be talking about the same sound in the first place. Ribbit is a good depiction of the noise a frog makes if it happens to belong to a particular species found in Southern California – but thanks to the cultural influence of Hollywood, ribbit is familiar to English speakers worldwide, even though their own local frogs may sound a lot more croaky. Meanwhile, it is easy to picture the difference between the kind of dog that goes woof and the kind that goes yap.

But even when we discount this kind of thing, there are still plenty of disagreements remaining, and they pose a puzzle bound up with linguistics. A fundamental feature of human language, famously pointed out by Saussure, is that most words are arbitrary: they have nothing inherently in common with the things they refer to. For example, there is nothing actually green about the sound of the word green – English has just assigned that particular sound sequence to that meaning, and it’s no surprise to find that other languages haven’t chosen the same sounds to do the same job. But right now we are in the broad realm of onomatopoeia, where you might not expect to find arbitrariness like this. After all, unlike the concept of ‘green’, the concept of ‘quack’ is linked to a real noise that can be heard out there in the world: why would languages bother to disagree about it?


First off, it is worth noticing that not all words relating to animal noises work in the same way. Think of cock-a-doodle-doo and crow. Both of these are used in English of the distinctive sound made by a cockerel, and there is something imitative about them both. But there is a difference between them: the first is used to represent the sound itself, whereas the second is the word that English uses to talk about producing it. That is, as English sees it, the way a cock crows is by ‘saying’ cock-a-doodle-doo, and never vice versa. Similarly, the way that a dog barks is by ‘saying’ woof. The representations of the sounds, cock-a-doodle-doo and woof, are practically in quotation marks, as if capturing the animals’ direct speech.

This gives us something to run with. After all, think about the work that words like crow and bark have to do. As they are verbs, you need to be able to change them according to person (they bark but it barks), tense, and so on. So regardless of their special function of talking about noises, they still have to operate like any other verb, obeying the normal grammar rules of English. Since every language comes with its own grammatical requirements and preferences about how words can be structured and manipulated (that is, its own morphology), this can explain some kinds of disparity across languages. For example, what we onomatopoeically call a cuckoo is a kukushka in Russian, featuring a noun-forming element -shka, which makes the word easier to deal with grammatically – but also makes it sound very Russian. Maybe it is this kind of integration into each language that makes these words sound less true to life and more varied from one language to another?

This is a start, but it must be far from the whole story. Animal ‘quotes’ like woof and cock-a-doodle-doo don’t need to interact all that much with English grammar at all. Nonetheless, they are clearly the English versions of the noises we are talking about:

And as we’ve already seen, the same goes for quack and oink. So even when it looks like we might just be ‘doing impressions’ of non-linguistic sounds, every language has its own way of actually doing those impressions.

Reassuringly, at least we are not dealing with a situation of total chaos. Across languages, duck noises reliably contain an open a sound, while pig noises reliably don’t. And there is widespread agreement when it comes to some animals: cows always go moo, boo or similar, and sheep are always represented as producing something like meh or beh – this is so predictable that it has even been used as evidence for how certain letters were pronounced in Ancient Greek. So languages are not going out of their way to disagree with each other. But this just sharpens up the question. For obvious biological reasons, humans can never really make all the noises that animals can. But given that people the world over sometimes converge on a more or less uniform representation for a given noise, why doesn’t this always happen?

In their feline wisdom, the cats of the Czech Republic can give us a clue. Like sheep, cats sound pretty similar in languages across the globe, and in Europe they are especially consistent. In English, they go meow; in German, it is miau; in Russian, myau; and so on. But in Czech, they go mňau (= approximately mnyau), with a mysterious n-sound inside. The reason is that at some point in the history of Czech, a change in pronunciation affected every word containing a sequence my, so that it came out as mny instead. Effectively, for Czech speakers from then on, the option of saying myau like everyone else was simply off the table, because the language no longer allowed it – no matter what their cats sounded like.
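A sound change of this kind operates as a blanket rewrite rule applied across the whole lexicon. Here is a toy Python sketch of the Czech development, using ‘mny’ to stand for mň; the word forms are illustrative, not attested Czech spellings:

```python
# Sound change as a rewrite rule: every sequence "my" becomes "mny".
def sound_change(word):
    return word.replace("my", "mny")

print(sound_change("myau"))  # → mnyau
print(sound_change("kava"))  # unchanged: contains no "my" sequence
```

The point is that the rule is blind to meaning: it applies to the cat noise just as mechanically as to any other word, which is how the cats of the Czech Republic ended up going mňau.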

What does this example illustrate? First of all – as well as a morphology, each language has a phonology (sound structure), which constrains its speakers tightly: no language lets people use all the sounds they are physically able to make, and even the available sounds are only allowed to join up in certain combinations. So each language has to come up with a way of dealing with non-linguistic noises which will suit its own idea of what counts as a legitimate syllable. Moo is one thing, but it’s harder to find a language that allows syllables resembling the noise a pig makes… so each language compromises in its own way, resulting in nöff, knor, oink etc., none of which capture the full sonic experience of the real thing.

And second – things like oink, woof and mňau really must be words in the full sense. They aren’t just a kind of quotation, or an imitation performed off the cuff; instead they belong in a speaker’s mental dictionary of their own language. That is why, in general, they have to abide by the same phonological rules as any other word. And that also explains where the arbitrariness comes in: as with any word, language learners just notice that that is the way their own community expresses a shared concept, and from then on there is no point in reinventing the wheel. You don’t need to try hard to get a duck’s quack exactly right in order to talk about it – as long as other people know what you mean, the word has done its job.

So what speakers might lose in accuracy this way, they make up for in efficiency, by picking a predetermined word that they know fellow speakers will recognise. Only when you really want to draw attention to a sound is it worth coming up with a new representation of it and ignoring the existing consensus. To create something truly striking, perhaps you need to be a visionary like James Joyce, who wrote the following line of ‘dialogue’ for a cat in Ulysses, giving short shrift to English phonology in the process:

–Mrkgnao!


What’s the good of ‘would of’?

As schoolteachers the English-speaking world over know well, the use of of instead of have after modal verbs like would, should and must is a very common feature in the writing of children (and many adults). Some take this as an omen of the demise of the English language, and would perhaps agree with Fowler’s colourful assertion in A Dictionary of Modern English Usage (1926) that “of shares with another word of the same length, as, the evil glory of being accessory to more crimes against grammar than any other” (though admittedly this use of of has been hanging around for a while without doing any apparent harm: this study finds one example as early as 1773, and another almost half a century later in a letter of the poet Keats).

According to the usual explanation, this is nothing more than a spelling mistake. Following ‘would’, ‘could’ etc., the verb have is usually pronounced in a reduced form as [əv], spelt would’ve, must’ve, and so on. It can even be reduced further to [ə], as in shoulda, woulda, coulda. This kind of phonetic reduction is a normal part of grammaticalisation, the process by which grammatical markers evolve out of full words. Given the famous unreliability of English spelling, and the fact that these reduced forms of have sound identical to reduced forms of the preposition of (as in a cuppa tea), writers can be forgiven for mistakenly inferring the following rule:

‘what you hear/say as [əv] or [ə], write as of’.
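The inferred rule can itself be sketched as a rewrite over the reduced spellings. This is a toy regex illustration of the misspelling, with a deliberately small list of modals:

```python
import re

MODALS = r"\b(would|could|should|must|might)"

def misspell(text):
    # would've -> would of; woulda -> would of
    text = re.sub(MODALS + r"'ve\b", r"\1 of", text)
    text = re.sub(MODALS + r"a\b", r"\1 of", text)
    return text

print(misspell("You should've seen it; I coulda sworn."))
# → You should of seen it; I could of sworn.
```

Note that the rule only fires after a modal, which foreshadows the asymmetry discussed below: have in the dogs have been fed is left alone.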

But if it’s just a spelling mistake, this use of ‘of’ is surprisingly common in respectable literature. The examples below (from this blog post documenting the phenomenon) are typical:

‘If I hadn’t of got my tubes tied, it could of been me, say I was ten years younger.’ (Margaret Atwood, The Handmaid’s Tale)

‘Couldn’t you of – oh, he was ignorant in his speech – couldn’t you of prevented it?’ (Hilary Mantel, Beyond Black)

Clearly neither these authors nor their editors make careless errors. They consciously use ‘of’ instead of ‘have’ in these examples for stylistic effect. This is typically found in dialogue to imply something about the speaker, be it positive (i.e. they’re authentic and unpretentious) or negative (they are illiterate or unsophisticated).


These examples look like ‘eye dialect’: the use of nonstandard spellings that correspond to a standard pronunciation, and so seem ‘dialecty’ to the eye but not the ear. This is often seen in news headlines, like the Sun newspaper’s famous proclamation “it’s the Sun wot won it!” announcing the surprise victory of the Conservatives in the 1992 general election. But what about sentences like the following from the British National Corpus?

“If we’d of accepted it would of meant we would have to of sold every stick of furniture because the rooms were not large enough”

The BNC is intended as a neutral record of the English language in the late 20th century, containing 100 million words of carefully transcribed and spellchecked text. As such, we expect it to have minimal errors, and there is certainly no reason it should contain eye dialect. As Geoffrey Sampson explains in this article:

“I had taken the of spelling to represent a simple orthographic confusion… I took this to imply that cases like could of should be corrected to could’ve; but two researchers with whom I discussed the issue on separate occasions felt that this was inappropriate – one, with a language-teaching background, protested vigorously that could of should be retained because, for the speakers, the word ‘really is’ of rather than have.”

In other words, some speakers have not just reinterpreted the rules of English spelling, but the rules of English grammar itself. As a result, they understand expressions like should’ve been and must’ve gone as instances of a construction containing the preposition of instead of the verb have:

Modal verb (e.g. must, would…) + of + past participle (e.g. had, been, driven…)

One way of testing this theory is to look at pronunciation. Of can receive a full pronunciation [ɒv] (with the same vowel as in hot) when it occurs at the end of a sentence, for example ‘what are you dreaming of?’. So if the word ‘really is’ of for some speakers, we ought to hear [ɒv] in utterances where of/have appears at the end, such as the sentence below. To my mind’s ear, this pronunciation sounds okay, and I think I even use it sometimes (although intuition isn’t always a reliable guide to your own speech).

I didn’t think I left the door open, but I must of.

The examples below from the Audio BNC, both from the same speaker, are transcribed as of but clearly pronounced as [ə] or [əv]. In the second example, of appears to be at the end of the utterance, where we might expect to hear [ɒv], although the amount of background noise makes it hard to tell for sure.

 “Should of done it last night when it was empty then” (audio) (pronounced [ə], i.e. shoulda)

(phone rings) “Should of.” (audio) (pronounced [əv], i.e. should’ve)

When carefully interpreted, writing can also be a source of clues on how speakers make sense of their language. If writing have as of is just a linguistically meaningless spelling mistake, why do we never see spellings like pint’ve beer or a man’ve his word? (Though we do, occasionally, see sort’ve or kind’ve). This otherwise puzzling asymmetry is explained if the spelling of in should of etc. is supported by a genuine linguistic change, at least for some speakers. Furthermore, have only gets spelt of when it follows a modal verb, but never in sentences like the dogs have been fed, although the pronunciation [əv] is just as acceptable here as in the dogs must have been fed (and in both cases have can be written ‘ve).
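The asymmetry is also easy to check mechanically: search for of immediately after a modal and you find the nonstandard spelling; search for it in partitive or auxiliary contexts and you don’t. A toy sketch with invented example sentences (a real check would run over a corpus like the BNC):

```python
import re

# Invented examples standing in for corpus lines.
lines = [
    "He must of left already.",
    "They would of told us.",
    "She drank a pint of beer.",
    "The dogs have been fed.",
]

modal_of = re.compile(r"\b(would|could|should|must|might)\s+of\b", re.I)
hits = [line for line in lines if modal_of.search(line)]
print(hits)  # → ['He must of left already.', 'They would of told us.']
```

The pattern deliberately requires a preceding modal, mirroring the claim that of never replaces have elsewhere.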

If this nonstandard spelling reflects a real linguistic variant (as this paper argues), this is quite a departure from the usual role of a preposition like of, which is typically followed by a noun rather than a verb. The preposition to is a partial exception, because while it is followed by a noun in sentences like we went to the party, it can also be followed by a verb in sentences like we like to party. But with to, the verb must appear in its basic infinitive form (party) rather than the past participle (we must’ve partied too hard), making it a bit different from modal of, if such a thing exists.

She must’ve partied too hard

Whether or not we’re convinced by the modal-of theory, it’s remarkable how often we make idiosyncratic analyses of the language we hear spoken around us. Sometimes these are corrected by exposure to the written language: I remember as a young child having my spelling corrected from storbry to strawberry, which led to a small epiphany for me, as that was the first time I realised the word had anything to do with either straw or berry. But many more examples slip under the radar. When these new analyses lead to permanent changes in spelling or pronunciation we sometimes call them folk etymology, as when the Spanish word cucaracha was misheard by English speakers as containing the words cock and roach, and became cockroach (you can read more about folk etymology in earlier posts by Briana and Matthew).

Meanwhile, if any readers can find clear evidence of modal of with the full pronunciation as [ɒv], please comment below! I’m quite sure I’ve heard it, but solid evidence has proven surprisingly elusive…

Reindeer = rein + deer?

In linguists’ jargon, a ‘folk etymology’ refers to a change that brings a word’s form closer to some easily analyzable meaning. A textbook example is the transformation of the word asparagus into sparrowgrass in certain dialects of English.

Although clear in theory, it is not easy to decide whether ‘folk etymology’ is called for in other cases. One which has incited heated coffee-time discussion in our department is the word reindeer. The word comes ultimately from Old Norse hreindyri, composed of hreinn ‘reindeer’ and dyri ‘animal’. In present-day English, some native speakers conceive of the word reindeer as composed of two meaningful parts: rein + deer. This is something which, in the Christian tradition at least, does make a lot of sense. Given that the most prominent role of reindeer in the West is to serve as Santa’s means of transport, an allusion to ‘reins’ is unsurprising. This makes the hypothesis of folk etymology plausible.

When one explores the issue further, however, things are not that clear. The equivalent words in other Germanic languages are often the same (e.g. German Rentier, Dutch rendier, Danish rensdyr etc.) even though the element ren does not refer to the same thing as in English. However, unlike in English, another way of referring to Rudolf is indeed possible in some of these languages that omits the element ‘deer’ altogether: German Ren, Swedish ren, Icelandic hreinn, etc.

Another thing that may be relevant is the fact that the word ‘deer’ has narrowed its meaning in English to refer just to a member of the Cervidae family and not to any living creature. Other Germanic languages have preserved the original meaning ‘animal’ for this word (e.g. German Tier, Swedish djur).

Since reindeer straightforwardly descends from hreindyri, it may seem that, despite the change in the meaning of the component words, we have no reason to believe that the word was altered by folk etymology at any point. However, the story is not that simple. Words that contained the diphthong /ei/ in Old Norse do not always appear with the same vowel in English. Contrast, for example, ‘bait’ [from Norse beita] and ‘hail’ [from heill] with ‘bleak’ [from bleikr] and ‘weak’ [from veikr]. An orthographic reflection of the same fluctuation can be seen in the different pronunciation of the digraph ‘ei’ in words like ‘receive’ and ‘Keith’ vs ‘vein’ and ‘weight’. It is thus not impossible that the preexistence of the word rein in (Middle) English tipped the balance towards the current pronunciation of reindeer over an alternative one like “reendeer”. Also, had the word not been analyzed by native speakers as a compound of rein+deer, it is not unthinkable that the vowels might have become shorter in current English (consider the case of breakfast, etymologically descending from break + fast).

So, is folk etymology applicable to reindeer? The dispute rages on. Some of us don’t think that folk etymology is necessary to explain the fate of reindeer. That is, the easiest explanation (in William of Occam’s sense) may be to say that the word was borrowed and merely continued its overall meaning and pronunciation in an unrevolutionary way.

Others are not so sure. The availability of “fake” etymologies like rein+deer (or even rain+deer before widespread literacy) seems “too obvious” for native speakers to ignore. The suspicion of ‘folk etymology’ might be aroused by the presence of a few mild coincidences such as the “right” vowel /ei/ instead of /iː/, the fact that the term was borrowed as reindeer rather than just rein as in some other languages [e.g. Spanish reno] or by the semantic drift of deer exactly towards the kind of animal that a reindeer actually is. These are all factors that seem to conspire towards the analyzability of the word in present-day English but which would have to be put down to coincidence if they just happened for no particular reason and independently of each other. Even if no actual change had been implemented in the pronunciation of reindeer, the morphological-semantic analysis of the word has definitely changed from its source language. Under a laxer definition of what folk etymology actually is, that could on its own suffice to label this a case of folk etymology.

There seems to be, as far as we can see, no easy way out of this murky etymological and philological quagmire that allows us to conclude whether a change in the pronunciation of reindeer happened at some point due to its analyzability. To avoid endless and unproductive discussion one sometimes has to know when to stop arguing, shrug and write a post about the whole thing.

Today’s vocabulary, tomorrow’s grammar

If an alien scientist were designing a communication system from scratch, they would probably decide on a single way of conveying grammatical information like whether an event happened in the past, present or future. But this is not the case in human languages, which is a major clue that they are the product of evolution, rather than design. Consider the way tense is expressed in English. To indicate that something happened in the past, we alter the form of the verb (it is cold today, but it was cold yesterday), but to express that something will happen in the future we add the word will. The same type of variation can also be seen across languages: French changes the form of the verb to express future tense (il fera froid demain, ‘it will be cold tomorrow’, vs il fait froid aujourd’hui, ‘it is cold today’).

The future construction using will is a relatively recent development. In the earliest English, there was no grammatical means of expressing future time: present and future sentences had identical verb forms, and any ambiguity was resolved by context. This is also how many modern languages operate. In Finnish huomenna on kylmää ‘it will be cold tomorrow’, the only clue that the sentence refers to a future state of affairs is the word huomenna ‘tomorrow’.

How, then, do languages acquire new grammatical categories like tense? Occasionally they get them from another language. Tok Pisin, a creole language spoken in Papua New Guinea, uses the word bin (from English been) to express past tense, and bai (from English by and by) to express future. More often, though, grammatical words evolve gradually out of native material. The Old English predecessor of will was the verb wyllan, ‘wish, want’, which could be followed by a noun as direct object (in sentences like I want money) as well as another verb (I want to sleep). While the original sense of the verb can still be seen in its German cousin (Ich will schwimmen means ‘I want to swim’, not ‘I will swim’), English will has lost it in all but a few set expressions like say what you will. From there it developed a somewhat altered sense of expressing that the subject intends to perform the action of the verb, or at least, that they do not object to doing so (giving us the modern sense of the adjective ‘willing’). And from there, it became a mere marker of future time: you can now say “I don’t want to do it, but I will anyway” without any contradiction.

This drift from lexical to grammatical meaning is known as grammaticalisation. As the meaning of a word gets reduced in this way, its form often gets reduced too. Words undergoing grammaticalisation tend to gradually get shorter and fuse with adjacent words, just as I will can be reduced to I’ll. A close parallel exists in the Greek verb thélō, which still survives in its original sense ‘want’, but has also developed into a reduced form, tha, which precedes the verb as a marker of future tense. Another future construction in English, going to, can be reduced to gonna only when it’s used as a future marker (you can say I’m gonna go to France, but not *I’m gonna France). This phonetic reduction and fusion can eventually lead to the kind of grammatical marking within words that we saw with French fera, which has arisen through the gradual fusion of earlier facere habet ‘it has to do’.

Words meaning ‘want’ or ‘wish’ are a common source of future tense markers cross-linguistically. This is no coincidence: if someone wants to perform an action, you can often be reasonably confident that the action will actually take place. For speakers of a language lacking an established convention for expressing future tense, using a word for ‘want’ is a clever way of exploiting this inference. Over the course of many repetitions, the construction eventually gets reinterpreted as a grammatical marker by children learning the language. For similar reasons, another common source of future tense markers is words expressing obligation on the part of the subject. We can see this in Basque, where behar ‘need’ has developed an additional use as a marker of the immediate future:

ikusi    behar   dut

see       need     aux

‘I need to see’/ ‘I am about to see’

This is also the origin of the English future with shall. This started life as Old English sceal, ‘owe (e.g. money)’. From there it developed a more general sense of obligation, best translated by should (itself originally the past tense of shall) or must, as in thou shalt not kill. Eventually, like will, it came to be used as a neutral way of indicating future time.

But how do we know whether to use will or shall, if both indicate future tense? According to a curious rule of prescriptive grammar, you should use shall in the first person (with ‘I’ or ‘we’), and will otherwise, unless you are being particularly emphatic, in which case the rule is reversed (which is why the fairy godmother tells Cinderella ‘you shall go to the ball!’). The dangers of deviating from this rule are illustrated by an old story in which a Frenchman, ignorant of the distinction between will and shall, proclaimed “I will drown; nobody shall save me!”. His English companions, misunderstanding his cry as a declaration of suicidal intent, offered no aid.

This rule was originally codified by John Wallis in 1653, and repeated with increasing consensus by grammarians throughout the 18th and early 19th centuries. However, it doesn’t appear to reflect the way the words were actually used at any point in time. For a long time shall and will competed on fairly equal terms – shall substantially outnumbers will in Shakespeare, for example – but now shall has given way almost entirely to will, especially in American English, with the exception of deliberative questions like shall we dance? You can see below how will has gradually displaced shall over the last few centuries, mitigated only slightly by the effect of the prescriptive rule, which is perhaps responsible for the slight resurgence of shall in the 1st person from approximately 1830-1920:

Until the eventual victory of will in the late 18th century, these charts (from this study) actually show the reverse of what Wallis’s rule would predict: will is preferred in the 1st person and shall in the 2nd, while the two are more or less equally popular in the 3rd person. Perhaps this can be explained by the different origins of the two futures. At the time when will still retained an echo of its earlier meaning ‘want’, we might expect it to be more frequent with ‘I’, because the speaker is in the best position to know what he or she wants to do. Likewise, when shall still carried a shade of its original meaning ‘ought’, we might expect it to be most frequent with ‘you’, because a word expressing obligation is particularly useful for trying to influence the action of the person you are speaking to. Wallis’s rule may have been an attempt to be extra-polite: someone who is constantly giving orders and asserting their own will comes across as a bit strident at best. Hence the advice to use shall (which never had any connotations of ‘want’) in the first person, and will (without any implication of ‘ought’) in the second, to avoid any risk of being mistaken for such a character, unless you actually want to imply volition or obligation.

How do we know when? The story behind the word “sciatica”

My right arm has been bothering me lately. The nerve has become inflamed by a pinching at the neck, creating a far from desirable situation. When trying to explain the condition to a friend, I compared it to sciatica, but of the arm. I am not here to bore you with my ills, however, but to tell you a story precisely about that word, sciatica. You may wonder what is so special about it. It is true that it has a weird spelling with sc, just like science, and that it sounds a little bit like a fancy word, having come directly from Latin and retaining that funny vowel a at the end which not many words in English have. But more than that, the word sciatica gives us a crucial clue about changes which have transformed the way the English language sounds.

English is a funny language. Of all the European languages, it has changed the most in the last thousand years, and this is particularly apparent in its vowels. In the late Middle Ages, starting perhaps sometime in the mid-14th century, the lower classes in England started changing the way they pronounced the long vowels they had inherited from earlier generations. Some have even claimed that the upper class at the time, whose ability to use French had started to peter out in the 15th century, felt that one way they could make themselves stand out from the middle classes was by changing their way of speaking a bit. To do this, they took up the ‘bad’ habits of the lower classes and started pronouncing things the way the lower classes would. But in adopting the pronunciation of the lower classes, they also made it sound ‘refined’ to the ears of the middle classes, so that the middle classes also started to adopt the new pronunciation… and so the mess started.

Pairs of words like file and feel, or wide and weed, have identical consonants, differing purely in their vowels. They are also spelled differently: file and wide are written with <i…e>, while feel and weed are written with <ee>. The tricky part comes when you want to tell another person in writing how these words are pronounced. To do that one normally makes a comparison with other familiar words – for example, you could tell them ‘feel rhymes with meal’ – but what do you do if the other person doesn’t speak English? In order to solve this problem, linguists in the late 19th century invented a special alphabet called the ‘International Phonetic Alphabet’ or ‘IPA’, in which each character corresponds to a single sound, and every possible sound is represented by a unique character. The idea was that this could function as a universal spelling system that anyone could use to record and communicate the sounds of different languages without any ambiguity or confusion. For file and wide, the Oxford English Dictionary website now gives two transcriptions in IPA, one in a standardised British and the other in standardised American: Brit. /fʌɪl/ & /wʌɪd/ (US /faɪl/ & /waɪd/). For feel and weed, we have Brit. /fiːl/ & /wiːd/ (US /fil/ & /wid/). So, in spelling, <i…e> represents /ʌɪ/ (or /aɪ/) and <ee> represents /iː/ (or /i/). But why is this so?

The answer lies in the spelling itself, which is a tricky thing, as we all know, and took many centuries to be fixed the way it is now. English spelling is a good example of a writing system where a given letter does not always correspond to one particular sound. There is no rule from which you can work out that wifi is pronounced as /wʌɪfʌɪ/ (or /waɪfaɪ/) – you know it simply because you have heard it pronounced and seen it written <wifi>. This is not obvious to other people whose native language is not English: as a native Spanish speaker, when I first saw the word wifi written somewhere, the first pronunciation that came to my mind was /wifi/ (like ‘weefee’), not /wʌɪfʌɪ/.

Contemporary English spelling very much reflects the way people pronounced things at the end of the Middle Ages. So words like file and wide were pronounced with the vowel represented in IPA as /iː/, which today can be heard in words like feel and weed. At that time, the letter <i> (along with its variant <y>) represented the sound /iː/. The words feel and weed, on the other hand, were pronounced with the vowel represented in IPA by /eː/, sounding something like the words fell and wed, but a little longer. Most of the words that in the English of the Middle Ages were pronounced with the long vowels /iː/ and /eː/ are now pronounced with the diphthong /ʌɪ/ (or /aɪ/) and the vowel /iː/ (or /i/), respectively. These changes were part of a massive overhaul of the English vowel system known as the ‘Great Vowel Shift’, so-called because it affected all long vowels – of which there were quite a few – and it took centuries to complete. Some even claim that it’s still taking place. But if we fail to update our spelling as pronunciation changes, how can we tell when this shift happened? That is when the word sciatica comes in.

The word sciatica is now pronounced as /sʌɪˈatᵻkə/ (US /saɪˈædəkə/). Because of the spelling <i> in ‘sci…’, we know that the word would have been pronounced something like /siːˈatika/ (‘see-atica’) when it was introduced into English from Latin by doctors, who at that time still used Latin as the language of exchange in their science. But sciatica is not a very common English word, and does not even sound naturally English. So unless you are a doctor or a very educated person, there is a high chance of getting the spelling wrong. In a letter to her husband John in 1441, Margaret Paston wrote the following about a neighbour: “Elysabet Peverel hath leye seke xv or xvj wekys of þe seyetyka” – “Elisabeth Peverel has lain sick 15 or 16 weeks of the sciatica”. While my sympathies go to Elisabeth Peverel as I write this, the interesting thing here is the way the word sciatica is written by Margaret Paston, as seyetyka. Here the spelling with <ey> tells us a nice story: that the diphthongisation of Medieval /iː/ into something like /eɪ/ had already happened in 1441. Because of that word we know that Margaret Paston, her husband, and poor Elysabet Peverel not only said /seɪˈatikə/ but also /feɪl/, /weɪd/ and /teɪm/, rather than /fiːl/, /wiːd/ and /tiːm/, even if they still wrote them the old way with an <i> as file, wide and time, just as we do nowadays. From this we can also deduce by the laws of sound change that the other long vowels had also started to change their pronunciation, so that these people were already pronouncing feel and weed in the modern way, despite spelling them the old way with an <e>.

This mouthful of a word sciatica is thus the first word in the entire history of English to tell us about the Great Vowel Shift. It is true that its story doesn’t ease the pain that its meaning evokes, but at least it makes it easier to deal with it by entertaining the mind…
Guarantee and warranty: two words for the price of one

By and large, languages avoid having multiple words with the same meaning. This makes sense from the point of view of economy: why learn two words when one will do the job?

But occasionally there are exceptions, such as warranty and guarantee. This is one of several synonymous or near-synonymous pairs of words in English conforming to the same pattern – another example is guard and ward. The variants with gu- represent early borrowings from Germanic languages into the Romance languages descended from Latin. At the time these words were borrowed, the sound w had generally developed into v in Romance languages, but it survived after g in the descendants of a few Latin words like lingua ‘tongue, language’. So when Romance speakers adapted Germanic words to the sounds of their own language, gu was the closest approximation they could find to Germanic w.

This is why French has some words like guerre ‘war’, where gu- corresponds to w- in English (this word may have been borrowed because the inherited Latin word for war, bellum, had become identical to the word for ‘beautiful’). Later, some of the words with gu- were borrowed back into English, which is why we have both borrowed guard and inherited ward. According to one estimate, 28.3% of the vocabulary of English has been borrowed from French (figures derived from actual texts rather than dictionaries come in even higher at around 40%), a debt that we have recently started repaying in earnest with loans like le shopping and le baby-sitting. This is all to the consternation of the Académie française, which aims to protect the French language from such barbarisms, as evidenced by the dire, ne pas dire (‘say, don’t say’) section of the académie’s website advising Francophones to use homegrown terms like contre-vérité instead of anglicisms like fake news.

In fact, warranty and guarantee reflect not one but two different waves of borrowing: warranty came first, from Norman French, which still retained the w- sound (likely through the influence of Scandinavian languages spoken by the original Viking invaders of Normandy), while guarantee was borrowed later from the French of Paris, in which the sound had become gu-. Multiple layers of borrowing can also be seen in words like castle, from Latin castellum via Norman French, and chateau, borrowed from later French, in which Latin c- had developed a different pronunciation.

Incidentally, Norman French still survives not only in Normandy but also in the Channel Islands of Guernsey, Jersey and Sark. The Anglo-Norman dialect of the island of Alderney died out during World War II, when most of the island’s population was evacuated to the British mainland, although efforts are underway to bring it back.

A daggy blog post

One of the most quintessentially Australian words is dag, a word known and loved by basically any Aussie.

Classic daggy dad
Fig. 1 – The classic daggy-dad weekend look

It’s a light-hearted insult referring to someone who is unfashionable or socially awkward, basically a bit of a dork (Fig 1). But like most insults in Australian English it’s also used affectionately as a term of endearment (what does this say about how Australians relate to each other?). Typically in these cases, it is used to convey a sense of regard for the unashamedness of the dag in question – to express the lovable quality of someone who is just oblivious to certain social norms.

Ewww
Fig. 2 – An actual dag.

However, the origins of this word are anything but lovable. According to the popular story (which appears to be supported by Macquarie Dictionary and The Australian National Dictionary), this usage is derived from the older meaning (attested in 1891) of the word dag to refer to a matted clot of wool and dung that forms around a sheep’s bum (Fig 2). By 1967 something ‘dirty and unkempt’ could be referred to as daggy, and by the 1980s we were using the word for Figure 2 for the unfashionable yet lovable dad in Figure 1.

As an Australian, I am proud of my dagginess and am pleased to know our daggy little word has a pretty gross origin.