Double trouble treble

You’ll get in trouble if you drink a tripel, the strong pale ale brewed by the most hipster of monks, the Trappists.

The Lowlands are the Hoxton of Europe

Tripels have three times the strength (around 8–10% ABV) of the standard table beer historically consumed by the monks themselves. This enkel or ‘single’ beer was traditionally not available outside the cloisters, while the dubbel (a double-strength dark brown beer made with caramelized beet sugar) was sold to provide income for the monastery. Although the term enkel is no longer in common beer parlance (it is on the cusp of a comeback), dubbel and tripel have held their ground. It is generally thought that the tripel takes its name from its threefold strength, but it is also sometimes claimed that the name reflects its having three times the malt of a regular brew. A quadrupel is VERY strong.

As we have seen already in this blog when counting sheep in Slovenian and yams in Ngkolumbu, means for the expression of quantities and multiplication are often linguistically fascinating. Not least among them is the doublet treble and triple, two words that originate from the same etymological source.

The Latin word triplus ‘threefold, triple’ first entered English via Old French treble. Not satisfied with claiming the space previously occupied by the Old English adjective þrifeald ‘threefold’, it turned up again by the 15th century as the adjective triple.

This triad of modifiers (threefold, treble and triple) exemplifies some of the pathways by which lexical synonymy can come about. The first word was formed through a compounding process (i.e. the numeral three forming a new word with the multiplicative form –fold), the second entered the language through direct borrowing, and the third through a second wave of borrowing (either from Old French triple or Latin triplus).

We don’t just find words competing to express the same meaning, but also parts of words. The –fold element of threefold, tenfold and manifold, and the –plus of triplus, are argued to have developed from the same Proto-Indo-European root *pel ‘to fold’. To complicate things even further, the now obsolete treblefold was attested between the 14th and 16th centuries. Words, it seems, like to fight for the same space, and can sometimes be incestuous.

Since entering English over 500 years ago, triple and treble have staked out different paths, but retained similar meanings in at least some of their manifestations, as explored by Catherine Soanes on the OxfordWords blog. In terms of frequency, triple is the stronger twin (or is it a triplet? quadruplet?), ending up triumphant with around 6 times more occurrences in the Oxford English Corpus.

But treble has some resilience. Although the official Scrabble board has double and triple word scores, treble word scores are occasionally referred to on the net (albeit erroneously, or in a devil-may-care way), such as in Charlie Brooker’s article on how to cheat at Scrabble. I even found a ‘threefold word score’ on a Scrabble knock-off site. Lawyers to the ready!

This demonstrates that these adjectives really are semantically interchangeable for the most part, even though their distributions are not identical.

The take home? While not every monastery sells the same tripel, they will all get you drunk.



Hallowe’en will soon be upon us, so it is only right we turn our attention to monsters. Consider the werewolf. It’s a wolf, sort of, as the name indicates, but what’s a were? The usual assumption is that it’s a leftover of an older word meaning ‘man’ that fell completely out of fashion by the 14th century. As a result we have what looks like a compound word, except that one of the parts doesn’t have any meaning on its own. Perhaps not, but that hasn’t stopped people from squeezing some value out of it nonetheless: if a werewolf is a person who turns into a wolf — or at any rate, part person, part wolf — then a were-bear is a mixture of person and bear, and so on down to were-turtles.

Actually, people don’t seem to be that literal-minded when it comes to word meanings, if the various were-creatures in circulation are any evidence. The monster from “Wallace and Gromit: Curse of the Were-Rabbit” is not half-human, half-rabbit, but more just kind of a monster rabbit, with a thicker pelt. (Visually calqued, I suspect, from the not-particularly wolf-like wolfman of the wolfman movies featuring Lon Chaney Jr.)

And were-fleas, to the extent that they exist, appear to be carriers of lycanthropism rather than human/insect conglomerates. None of this is yet reflected in the Oxford English Dictionary’s entry on were– (you need a subscription for that but it’s free if you have a UK public library card!). Give it a few decades more maybe.

Strangely, words for werewolf in other languages share a propensity for being compounds made up of ‘wolf’ plus some other completely opaque element. The first part of Czech vlkodlak is vlk, which means ‘wolf’, but dlak on its own is not an independent word. (Not in Czech at any rate, but in the related language Slovenian the equivalent word volkodlak is clearly made up of volk ‘wolf’ and dlaka, which means ‘hair’ or ‘fur’.) And the French werewolf, loup-garou, has the word for ‘wolf’ in it (loup), but garou is not an independent word (other than being an unrelated homonym meaning ‘flax-leaved daphne’). That part seems to have been our very own Germanic word werewolf borrowed at an early date (earliest attestation as garwall from the 12th century). Both of these have, like werewolf, given rise to further monstrous hybrids like Czech prasodlak, from prase ‘pig’, or the French cochon-garou.

In fact, Czech and French have gone one step further than English. Though I just wrote that dlak and garou were not words, that was being a bit pedantic. Neither of them is listed in the authoritative Academy dictionaries of Czech and French, but nonetheless they do seem to have split off from their host body, rather as happened – if we can be permitted to mix monster metaphors – to the hero of 1959’s “The Manster” (a.k.a. “The Split”).

For example, this Czech website tells us about vlkodlaci i jiní dlaci ‘werewolves and other were-creatures’ (dlaci is the plural of dlak), and in French the phrase courir le garou ‘run the garou’ used to be in circulation, at least, meaning basically ‘go around at night being a werewolf’. That use in turn apparently spawned a verb garouter, meaning much the same thing. The curse lives on.

Optimal Categorisation: How do we categorise the world around us?

People love to categorise! We do this on a daily basis, consciously and subconsciously. When we are confronted with something new we try and figure out what it is by comparing it to something we already know. Say, for instance, I saw something flying through the air – I may think to myself that the object is a bird, or I may say it is a plane based on my previous experiences of birds and planes. Of course the object may turn out to be something completely new, perhaps even Superman!

Is it a bird? Is it a plane? No it’s Superman!

Our love of classification runs deep in scientific enquiry. Botanists and zoologists classify plants and animals into different taxonomies. Even the humble linguist loves to classify – is this new word a noun or a verb? What about the new word zoodle that was recently added to the Merriam-Webster dictionary? Is it a thing? Or an action? Can I zoodle something or is it something I can pick up and touch? Well, apparently zoodle is a noun which means ‘a long, thin strip of zucchini that resembles a string or narrow ribbon of pasta’. To be honest, I love eating zoodles, though until now I never knew what they were called!

The way people classify entities around them has become encoded in the different languages we speak in many different ways. The most obvious example comes when we learn a new language, like French or German, and are confronted with a grammatical gender system. French has two genders – Masculine and Feminine. But German has three – Masculine, Feminine and Neuter. Other languages can have many more gender distinctions. Fula, a language spoken in west and central Africa, has twenty different gender categories!

So what exactly are grammatical gender systems and how are they realised in different languages? Gender systems categorise nouns into different groups and tend to appear not on the noun itself, but on other elements in the phrase. In German, nouns are split into three different gender categories – masculine, feminine and neuter. The gender of a noun is shown by using different articles (the word ‘the’ or ‘a’) and sometimes by changing the ending of an adjective, but never on the noun itself. Thus the word for ‘the’ in German is either der, die or das depending on whether the noun in the phrase is masculine, feminine or neuter.

(1)        der       Mann
              the       man

(2)        die        Frau
              the       woman

(3)        das       Haus
              the       house

This is called ‘agreement’ as the adjectives and articles must agree with the gender of the noun. In a language with gender, each noun typically can only occur in one gender category.
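The agreement pattern in examples (1)–(3) can be sketched as a simple lookup. This is only a toy illustration: the three-noun lexicon and the function name are made up for the purpose, and the articles shown are the nominative singular forms only.

```python
# Toy lexicon: each noun belongs to exactly one gender category
# (only the three nouns from examples (1)-(3); an illustrative assumption).
NOUN_GENDER = {"Mann": "masculine", "Frau": "feminine", "Haus": "neuter"}

# The definite article (nominative singular) is chosen by the noun's gender.
DEFINITE_ARTICLE = {"masculine": "der", "feminine": "die", "neuter": "das"}


def definite_phrase(noun: str) -> str:
    """Return 'article noun', with the article agreeing with the noun's gender."""
    gender = NOUN_GENDER[noun]
    return f"{DEFINITE_ARTICLE[gender]} {noun}"


for noun in NOUN_GENDER:
    print(definite_phrase(noun))
```

Notice that the gender is stored with the noun but only shows up on the article, which is the point of the paragraph above.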

Not every language has a grammatical gender system, but such systems are highly pervasive, with around 40% of all languages having one. English is quite a poor example when it comes to gender. There is no real gender agreement in English, with the exception of pronouns. We have to say: Bill walked into the grocer’s. He bought some apples. Here the pronoun he must agree with the gender of the noun that was previously mentioned. English uses he, she and it as the only markers of gender agreement.

Languages behave differently in how they allocate nouns to the different genders, which can be very baffling for language learners! Why in French is chair feminine, la chaise, but in German it is masculine, der Stuhl? How a language allocates nouns to its gender categories can seem somewhat arbitrary. The words for women and men, which fall into the feminine and masculine genders, are the only semantically obvious choices.

But wait! If you thought the English gender system was dull, think again! A couple of months ago my piano was being restored and when it was being moved back into the lounge the piano movers kept saying: “pull her a little bit more” and “turn her this way”. The movers used the feminine pronouns to describe the piano. In English, countries, pianos, ships and sometimes even cars can take the feminine pronouns.

Grammatical gender isn’t the only way languages classify nouns. Some languages use words called classifiers to categorise nouns. Classifiers are similar to English measure terms, which categorise the noun in terms of its quantity, such as ‘sheet of paper’ vs. ‘pack of paper’ or ‘slice of bread’ vs. ‘loaf of bread’. Classifiers are found in languages all over the world and are able to categorise nouns depending on the shape, size, quantity or use of the referent, e.g. ‘animal kangaroo’ (alive) vs. ‘meat kangaroo’ (not alive). Classifier systems are very different to gender systems, as nouns in a language with classifiers can appear with different classifiers depending on what property of the noun you wish to highlight. There are many different types of classifier systems, but to keep things short I am just going to talk about possessive classifiers, which are mainly found in the Oceanic languages, spoken in the South Pacific.

When an item is in your possession we use possessive pronouns in English to say who the item belongs to. For instance if I say ‘my coconut’ – the possessive pronoun is my. In many Oceanic languages a noun can occur with different forms for the word my depending on how the owner intends to use it. For instance the Paamese language, spoken in Vanuatu, has four possessive classifiers, and I could use the ‘drinkable’ classifier if I was talking about my coconut that I was going to drink. I would use the ‘edible’ classifier if I was going to eat my coconut. I would use the classifier for ‘land’ if I was talking about the coconut growing in my garden. Finally, I could use the ‘manipulative’ classifier if I was going to use my coconut for some other purpose – perhaps to sit on!

(4)        ani                   mak
              coconut           my.drinkable
              ‘my coconut (that I will drink)’

(5)        ani                   ak
              coconut           my.edible
              ‘my coconut (that I will eat)’

Why do languages have different ways of categorising nouns? How do these systems develop and change over time? Are gender systems easier to learn than classifier systems? Are gender and classifiers completely different systems? Or is there more similarity to them than meets the eye? These are some of the big questions in linguistics and psychology. We are excited to start a new research project at the Surrey Morphology Group, called optimal categorisation: the origin and nature of gender from a psycholinguistic perspective, that seeks to answer these fundamental questions. Over the next three years we will talk more about these fascinating categorisation systems, explain our experimental research methods, introduce the languages and speakers under investigation, and share our findings via this blog. Just look out for the ‘Optimal Categorisation’ headings!

The cat’s mneow: animal noises and human language

As is well known, animals on the internet can have very impressive language skills: cats and dogs in particular are famous for their near-complete online mastery of English, and only highly trained professional linguists (including some of us here at SMG) are able to spot the subtle grammatical and orthographic clues that indicate non-human authorship behind some of the world’s favourite motivational statements.

Recent reports suggest that some of our fellow primates have also learnt to engage in complex discourse: again, the internet offers compelling evidence for this.

But sadly, out in the real world, animals capable of orating on philosophy are hard to come by (as far as we can tell). Instead, from a human point of view, cats, dogs, gorillas etc. just make various kinds of animal noises.

Why write about animals and their noises on a linguistics blog? Well, one good answer would be: the exact relationship between the vocalisations made by animals, on one hand, and the phenomenon of human spoken language, on the other, is a fascinating question, of interest within linguistics but far beyond it as well. So a different blog post could have turned now to discuss the semiotic notion of communication in the abstract; or perhaps the biological evolution of language in our species, complete with details about the FOXP2 gene and the descent of the larynx.

But in fact I am going to talk about something a lot less technical-sounding. This post is about what could be called the human versions of animal noises: that is, the noises that English and other languages use in order to talk about them, like meow and woof, baa and moo.

At this point you may be wondering whether there is much to be gained by sitting around and pondering words like moo. But what I have in mind here is this kind of thing:

These are good fun, but they also raise a question. If pigs and ducks are wandering around all over the world making pig and duck noises respectively, then how come we humans appear to have such different ideas about what they sound like? Oink cannot really be mistaken for nöff or knor, let alone buu. And the problem is bigger than that: even within a single language, English, frogs can go both croak and ribbit; dogs don’t just go woof, but they also yap and bark. These sound nothing like each other. What is going on? Are we trying to do impressions of animals, only to discover that we are not very good at it?

Before going any further I should deal with a couple of red herrings (to stick with the zoological theme). For one thing, languages may appear to disagree more than they really do, just because their speakers have settled on different spelling conventions: a French coin doesn’t really sound all that different from an English quack. And sometimes we may not all be talking about the same sound in the first place. Ribbit is a good depiction of the noise a frog makes if it happens to belong to a particular species found in Southern California – but thanks to the cultural influence of Hollywood, ribbit is familiar to English speakers worldwide, even though their own local frogs may sound a lot more croaky. Meanwhile, it is easy to picture the difference between the kind of dog that goes woof and the kind that goes yap.

But even when we discount this kind of thing, there are still plenty of disagreements remaining, and they pose a puzzle bound up with linguistics. A fundamental feature of human language, famously pointed out by Saussure, is that most words are arbitrary: they have nothing inherently in common with the things they refer to. For example, there is nothing actually green about the sound of the word green – English has just assigned that particular sound sequence to that meaning, and it’s no surprise to find that other languages haven’t chosen the same sounds to do the same job. But right now we are in the broad realm of onomatopoeia, where you might not expect to find arbitrariness like this. After all, unlike the concept of ‘green’, the concept of ‘quack’ is linked to a real noise that can be heard out there in the world: why would languages bother to disagree about it?


First off, it is worth noticing that not all words relating to animal noises work in the same way. Think of cock-a-doodle-doo and crow. Both of these are used in English of the distinctive sound made by a cockerel, and there is something imitative about them both. But there is a difference between them: the first is used to represent the sound itself, whereas the second is the word that English uses to talk about producing it. That is, as English sees it, the way a cock crows is by ‘saying’ cock-a-doodle-doo, and never vice versa. Similarly, the way that a dog barks is by ‘saying’ woof. The representations of the sounds, cock-a-doodle-doo and woof, are practically in quotation marks, as if capturing the animals’ direct speech.

This gives us something to run with. After all, think about the work that words like crow and bark have to do. As they are verbs, you need to be able to change them according to person (they bark but it barks), tense, and so on. So regardless of their special function of talking about noises, they still have to operate like any other verb, obeying the normal grammar rules of English. Since every language comes with its own grammatical requirements and preferences about how words can be structured and manipulated (that is, its own morphology), this can explain some kinds of disparity across languages. For example, what we onomatopoeically call a cuckoo is a kukushka in Russian, featuring a noun-forming element shka which makes the word easier to deal with grammatically – but also makes it sound very Russian. Maybe it is this kind of integration into each language that makes these words sound less true to life and more varied from one language to another?

This is a start, but it must be far from the whole story. Animal ‘quotes’ like woof and cock-a-doodle-doo don’t need to interact all that much with English grammar at all. Nonetheless, they are clearly the English versions of the noises we are talking about:

And as we’ve already seen, the same goes for quack and oink. So even when it looks like we might just be ‘doing impressions’ of non-linguistic sounds, every language has its own way of actually doing those impressions.

Reassuringly, at least we are not dealing with a situation of total chaos. Across languages, duck noises reliably contain an open a sound, while pig noises reliably don’t. And there is widespread agreement when it comes to some animals: cows always go moo, boo or similar, and sheep are always represented as producing something like meh or beh – this is so predictable that it has even been used as evidence for how certain letters were pronounced in Ancient Greek. So languages are not going out of their way to disagree with each other. But this just sharpens up the question. For obvious biological reasons, humans can never really make all the noises that animals can. But given that people the world over sometimes converge on a more or less uniform representation for a given noise, why doesn’t this always happen?

In their feline wisdom, the cats of the Czech Republic can give us a clue. Like sheep, cats sound pretty similar in languages across the globe, and in Europe they are especially consistent. In English, they go meow; in German, it is miau; in Russian, myau; and so on. But in Czech, they go mňau (= approximately mnyau), with a mysterious n-sound inside. The reason is that at some point in the history of Czech, a change in pronunciation affected every word containing a sequence my, so that it came out as mny instead. Effectively, for Czech speakers from then on, the option of saying myau like everyone else was simply off the table, because the language no longer allowed it – no matter what their cats sounded like.
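The Czech change can be pictured as a blanket rewrite applied to every word, cat noises included. The sketch below is a deliberate simplification: it works on the rough romanised spellings used in this post (myau, mnyau) rather than on real Czech orthography or phonetics, and the function name is made up.

```python
def apply_czech_change(word: str) -> str:
    """Rewrite every 'my' sequence as 'mny' (romanised approximation only)."""
    return word.replace("my", "mny")


# The cat noise shared with neighbouring languages is caught by the change...
print(apply_czech_change("myau"))
# ...while a word with no 'my' sequence is left alone.
print(apply_czech_change("moo"))
```

Because the rule applies across the board, there is no way to exempt the cat: once the language disallows the sequence, the old form is simply unavailable.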

What does this example illustrate? First of all – as well as a morphology, each language has a phonology (sound structure), which constrains its speakers tightly: no language lets people use all the sounds they are physically able to make, and even the available sounds are only allowed to join up in certain combinations. So each language has to come up with a way of dealing with non-linguistic noises which will suit its own idea of what counts as a legitimate syllable. Moo is one thing, but it’s harder to find a language that allows syllables resembling the noise a pig makes… so each language compromises in its own way, resulting in nöff, knor, oink etc., none of which capture the full sonic experience of the real thing.

And second – things like oink, woof and mňau really must be words in the full sense. They aren’t just a kind of quotation, or an imitation performed off the cuff; instead they belong in a speaker’s mental dictionary of their own language. That is why, in general, they have to abide by the same phonological rules as any other word. And that also explains where the arbitrariness comes in: as with any word, language learners just notice that that is the way their own community expresses a shared concept, and from then on there is no point in reinventing the wheel. You don’t need to try hard to get a duck’s quack exactly right in order to talk about it – as long as other people know what you mean, the word has done its job.

So what speakers might lose in accuracy this way, they make up for in efficiency, by picking a predetermined word that they know fellow speakers will recognise. Only when you really want to draw attention to a sound is it worth coming up with a new representation of it and ignoring the existing consensus. To create something truly striking, perhaps you need to be a visionary like James Joyce, who wrote the following line of ‘dialogue’ for a cat in Ulysses, giving short shrift to English phonology in the process:



What’s the good of ‘would of’?

As schoolteachers the English-speaking world over know well, the use of of instead of have after modal verbs like would, should and must is a very common feature in the writing of children (and many adults). Some take this as an omen of the demise of the English language, and would perhaps agree with Fowler’s colourful assertion in A Dictionary of Modern English Usage (1926) that “of shares with another word of the same length, as, the evil glory of being accessory to more crimes against grammar than any other” (though admittedly this use of of has been hanging around for a while without doing any apparent harm: this study finds one example as early as 1773, and another almost half a century later in a letter of the poet Keats).

According to the usual explanation, this is nothing more than a spelling mistake. Following ‘would’, ‘could’ etc., the verb have is usually pronounced in a reduced form as [əv], usually spelt would’ve, must’ve, and so on. It can even be reduced further to [ə], as in shoulda, woulda, coulda. This kind of phonetic reduction is a normal part of grammaticalisation, the process by which grammatical markers evolve out of full words. Given the famous unreliability of English spelling, and the fact that these reduced forms of have sound identical to reduced forms of the preposition of (as in a cuppa tea), writers can be forgiven for mistakenly inferring the following rule:

‘what you hear/say as [əv] or [ə], write as of’.

But if it’s just a spelling mistake, this use of ‘of’ is surprisingly common in respectable literature. The examples below (from this blog post documenting the phenomenon) are typical:

‘If I hadn’t of got my tubes tied, it could of been me, say I was ten years younger.’ (Margaret Atwood, The Handmaid’s Tale)

‘Couldn’t you of – oh, he was ignorant in his speech – couldn’t you of prevented it?’ (Hilary Mantel, Beyond Black)

Clearly neither these authors nor their editors make careless errors. They consciously use ‘of’ instead of ‘have’ in these examples for stylistic effect. This is typically found in dialogue to imply something about the speaker, be it positive (i.e. they’re authentic and unpretentious) or negative (they are illiterate or unsophisticated).


These examples look like ‘eye dialect’: the use of nonstandard spellings that correspond to a standard pronunciation, and so seem ‘dialecty’ to the eye but not the ear. This is often seen in news headlines, like the Sun newspaper’s famous proclamation “it’s the Sun wot won it!” announcing the surprise victory of the Conservatives in the 1992 general election. But what about sentences like the following from the British National Corpus?

“If we’d of accepted it would of meant we would have to of sold every stick of furniture because the rooms were not large enough”

The BNC is intended as a neutral record of the English language in the late 20th century, containing 100 million words of carefully transcribed and spellchecked text. As such, we expect it to have minimal errors, and there is certainly no reason it should contain eye dialect. As Geoffrey Sampson explains in this article:

“I had taken the of spelling to represent a simple orthographic confusion… I took this to imply that cases like could of should be corrected to could’ve; but two researchers with whom I discussed the issue on separate occasions felt that this was inappropriate – one, with a language-teaching background, protested vigorously that could of should be retained because, for the speakers, the word ‘really is’ of rather than have.”

In other words, some speakers have not just reinterpreted the rules of English spelling, but the rules of English grammar itself. As a result, they understand expressions like should’ve been and must’ve gone as instances of a construction containing the preposition of instead of the verb have:

Modal verb (e.g. must, would…) + of + past participle (e.g. had, been, driven…)

One way of testing this theory is to look at pronunciation. Of can receive a full pronunciation [ɒv] (with the same vowel as in hot) when it occurs at the end of a sentence, for example ‘what are you dreaming of?’. So if the word ‘really is’ of for some speakers, we ought to hear [ɒv] in utterances where of/have appears at the end, such as the sentence below. To my mind’s ear, this pronunciation sounds okay, and I think I even use it sometimes (although intuition isn’t always a reliable guide to your own speech).

I didn’t think I left the door open, but I must of.

The examples below from the Audio BNC, both from the same speaker, are transcribed as of but clearly pronounced as [ə] or [əv]. In the second example, of appears to be at the end of the utterance, where we might expect to hear [ɒv], although the amount of background noise makes it hard to tell for sure.

 “Should of done it last night when it was empty then” (audio) (pronounced [ə], i.e. shoulda)

(phone rings) “Should of.” (audio) (pronounced [əv], i.e. should’ve)

When carefully interpreted, writing can also be a source of clues on how speakers make sense of their language. If writing have as of is just a linguistically meaningless spelling mistake, why do we never see spellings like pint’ve beer or a man’ve his word? (Though we do, occasionally, see sort’ve or kind’ve). This otherwise puzzling asymmetry is explained if the spelling of in should of etc. is supported by a genuine linguistic change, at least for some speakers. Furthermore, have only gets spelt of when it follows a modal verb, but never in sentences like the dogs have been fed, although the pronunciation [əv] is just as acceptable here as in the dogs must have been fed (and in both cases have can be written ‘ve).

If this nonstandard spelling reflects a real linguistic variant (as this paper argues), this is quite a departure from the usual role of a preposition like of, which is typically followed by a noun rather than a verb. The preposition to is a partial exception, because while it is followed by a noun in sentences like we went to the party, it can also be followed by a verb in sentences like we like to party. But with to, the verb must appear in its basic infinitive form (party) rather than the past participle (we must’ve partied too hard), making it a bit different from modal of, if such a thing exists.

She must’ve partied too hard

Whether or not we’re convinced by the modal-of theory, it’s remarkable how often we make idiosyncratic analyses of the language we hear spoken around us. Sometimes these are corrected by exposure to the written language: I remember as a young child having my spelling corrected from storbry to strawberry, which led to a small epiphany for me, as that was the first time I realised the word had anything to do with either straw or berry. But many more examples slip under the radar. When these new analyses lead to permanent changes in spelling or pronunciation we sometimes call them folk etymology, as when the Spanish word cucaracha was misheard by English speakers as containing the words cock and roach, and became cockroach (you can read more about folk etymology in earlier posts by Briana and Matthew).

Meanwhile, if any readers can find clear evidence of modal of with the full pronunciation as  [ɒv], please comment below! I’m quite sure I’ve heard it, but solid evidence has proven surprisingly elusive…

No we [kæn]

If something bad happened to someone you hold in contempt, would you give a fig, a shit or a flying f**k? While figs might be a luxury food item in Britain, their historical status as something that is valueless or contemptible puts them on the same level as crap, iotas and rats’ asses for the purposes of caring.

In English, we have a wide range of tools for expressing apathy. But we don’t always agree on how to express it, and even use seemingly opposite affirmative and negative sentences to express very similar concepts.  Consider the confusing distinction between ‘I couldn’t care less’ vs. ‘I could care less’ which are used in identical contexts by British and American speakers of English to mean pretty much the same thing. This mind-boggling pattern makes sense when we realise that those cold-hearted people who couldn’t care less have a care-factor of zero, while the others don’t care much, but could do so even less, if necessary.

Putting aside such oddities, negation is normally crucial to interpreting a sentence – words like ‘not’ determine whether the rest of the sentence is affirmative or negative (i.e. whether you’re claiming it is true or false). Accordingly, languages tend to mark negation clearly, sometimes in more than one place within a sentence. One of the world’s most robust languages in this respect is Bierebo, an Austronesian language spoken in Vanuatu, where no fewer than three words for expressing negation are required at once (Budd 2010: 518):

Mara   a-sa-yal              re         manu  dupwa  pwel.
NEG1   3PL.S-eat-find   NEG2  bird     ANA      NEG3
‘They didn’t get to eat the bird.’

While marking negation three times might seem a little inefficient, this pales in comparison to the problems that arise when you don’t clearly indicate it at all. We only have to turn to English to see this at work, where the distinction between Received Pronunciation can [kæn] and can’t [kɑ:nt] is frequently imperceptible in American varieties where final /t/ is not released, resulting in [kæn] or [kən] in both affirmative and negative contexts.

You might think that once a word or affix or sound that indicates negation has been removed from a word, there isn’t anywhere else to go. But some Dravidian languages spoken in India really push the boat out in this respect. Instead of adding some sort of negative word or affix to an affirmative sentence to signal negation, the tense affix (past –tt or future -pp) is taken away, as shown by the contrast between literary Tamil affirmatives and negatives.

pati-tt-ēn                    pati-pp-ēn                  patiy-ēn
‘I learned’                  ‘I will learn.’               ‘I do/did/will not learn.’

This is highly unusual from a linguistic point of view, and it’s tempting to think that languages avoid this type of negation because it is difficult to learn or doesn’t make sense design-wise. But historical records show similar patterns have been attested across Dravidian languages for centuries. This demonstrates that inflection patterns of this kind can be highly sustainable when they come about – so we might be stuck with the can/can’t collapse for a while to come.

On prodigal loanwords

Most people at some point in their life will have heard someone remark on how their language X (where X is any language) is getting corrupted by other languages and generally “losing its X-ness”. Today I would like to focus on one aspect of the so-called corruption of languages by other languages — lexical borrowings – and show that it’s perhaps not that bad.

European French (at least the French advertised by the Académie Française) is certainly a language about which its speakers worry, so much so that there is even an institution in charge of deciding what is French and what is not (see Helen’s earlier post). A number of English-looking/sounding words now commonly used in spoken French have indeed been taken from English, but English first took them from French!

For instance, the word flirter ‘to court someone’ is obviously adapted from English to flirt and it has the same meaning in both languages. But the English word is the adaptation of the French word fleurette in the expression conter fleurette! The expression conter fleurette is no longer used (casually) in spoken French.

“How could the universe live without your beauty?” “I wonder how sincere he is…”

Other examples of English words borrowed from (parts of) French expressions which then get adapted into French are in (2).

Thus un rosbif is an adaptation into French of roast beef, which is itself an adaptation into English of the passive participle of the verb rostir “roast”, which later became rôtir in Modern French, and buef “ox/beef”, which later became boeuf in Modern French.

The word un toast comes from English toast with the meaning “piece of toasted bread”. The English word itself was borrowed from tostée, an Old French noun derived from the verb toster which is not used in Modern French. The word pédigré comes from English pedigree but this word is itself adapted from French pied de grue “crane foot”, describing the shape of junctions in genealogical trees.

Pied de grue ‘Crane foot’

Finally, the verb distancer is transitive in Modern French, which means that it requires a direct object: thus the sentence in (a) is good because the verb distancer “distance” has a direct object, the phrase la voiture blanche  “the white car”. By contrast, the construction in (b) is not acceptable (signified by the * symbol) because it lacks an object.

a. La voiture rouge a distancé la voiture blanche.
‘The red car distanced the white car.’
b. *La voiture rouge a distancé.

The (transitive) Modern French verb distancer comes from English to distance which itself is a borrowing from the no-longer-used Old French verb distancer which was uniquely intransitive with the meaning “be far” (that is, in Old French, distancer could only be used in a construction with no direct object).

Another instance is (3): the word tonnelle ‘bower, arbor’ was borrowed into English and became tunnel under the influence of the local pronunciation. The word tunnel was then borrowed by French to refer exclusively to …. wait for it … tunnels. Both words now subsist in French with different meanings.

Une tonnelle ‘a bower’, Un tunnel ‘a tunnel’

Other examples of words that were borrowed into English and ‘came back’ into French with a different meaning are in (4).

The ancestor of tennis is the jeu de paume, during which players would say tenez “there you go” as they were about to serve (at that time the final “z” was pronounced [z], which it no longer is in Modern French). This word was adapted into English and became tennis, which was then borrowed back into French to refer to the sport that jeu de paume evolved into.

Jeu de paume vs. tennis

The Middle French word magasin used to refer to a warehouse, a collection of things. This word was borrowed into English and came to refer to a collection of things on paper. The word magazine was then borrowed back into French with this new meaning.

The history of the word budget is also interesting. The word bouge used to mean “bag” and a small bag was therefore bougette (the -ette suffix is used as a diminutive, e.g. fourche “pitchfork” – fourchette “fork”). The word was borrowed into English where its pronunciation was “nativized” and it came to refer to a small bag of money. It was then borrowed back into French with the new meaning of “allocated sum of money”. Finally, ticket was borrowed from English which borrowed it from French estiquet, which referred to a piece of paper where someone’s name was written.

This happens in other languages of course. For instance, Turkish took the word pistakion ‘pistachio’ from (Ancient) Greek, and it became fistik. (Modern) Greek then borrowed this word back from Turkish, spelling it phistiki, still with the meaning ‘pistachio’.

The main lesson I draw from the existence of ‘prodigal loanwords’ is that one’s impressions of language corruption often lack the perspective to actually ground that impression in reality. A French speaker looking at flirter ‘flirt’ may think that this is another sign of the influence of English — and they would be right — without being aware that this is after all a French word fleurette just coming back home.

Do you know other examples of prodigal loanwords? Please, share by commenting on this post!

Linguistic problem? Call in a violin

Like brain surgeons, breakfast cooks and other professionals, linguists fall into two groups: believers and sceptics. Take the fact that wheat is singular in English and oats is plural. Believers are confident that there is a thoroughly good reason for differences like this, based on meaning. Sceptics aren’t easily convinced, and they talk shiftily about rules that once obtained but are since lost, partial regularities, conflicting motivations and simple exceptions. And things can get surprisingly heated, as in the linguistic skirmishes of the late 1980s and early 1990s, which centred on the discussion precisely of wheat and oats. (The feelings and the porridge have cooled sufficiently for it to be safe to mention these contentious nouns again.)

Many oats = much porridge

We talk about one or more scalpels or spatulas (these are count nouns), but we don’t usually count health, wealth or porridge (these are mass nouns). Mass nouns in English are typically singular, as indeed wheat is. So nouns like oats are unusual in being plural, and having no contrasting singular. They are known in the trade as pluralia tantum ‘plural only’. (In contrast, there are languages like Manam where all mass nouns are plural – they treat them all like oats.)

It’s not just mass nouns. We also find that there are nouns which we would expect to be ordinary count nouns which are actually pluralia tantum nouns in English. Examples include scissors, binoculars, trousers, slacks … The believers, who believe there must be a good reason for these nouns to behave in this way, argue as follows: It’s as we’d expect. These are all nouns whose referents have symmetrical parts (usually two, hence they are often called bipartites). Case proven.

But wait: bicycle has two significant parts, emphasised by its form in bi- (rather like binoculars). Why isn’t it subject to the generalization? Why isn’t it like binoculars? And while we’re on it, how about bigraph, shirt, duo and Bactrian camel? They all have two significant parts but are normal count nouns, just like letter, skirt, quartet and elephant.

Even so (say the believers) it’s not just English. French has les ciseaux (plural) ‘the scissors’, Russian has nožnicy (plural) ‘scissors’. These are pluralia tantum nouns – that can’t be a coincidence. And yet, sceptically speaking, French has le pantalon ‘the trousers’ and Russian has binokl ‘binoculars’, and both are regular count nouns with singular and plural.

There are indeed various “usual suspects”, which regularly show up as pluralia tantum nouns in different languages, with sufficient frequency to persuade the believers and yet with more than enough no-shows to leave the sceptics unconvinced.

To resolve the issue once and for all (!), we need:

  1. A new item (not one from the “usual suspects” list)
  2. which can have one significant part or more than one (so that we can evaluate the force of the semantic regularity)
  3. with two different terms, one plurale tantum and one not
  4. and comparable forms in different related languages

And then we shall have a clear prediction: more than one significant part >> plurale tantum noun, one significant part >> ordinary count noun. We could resolve the dispute. But where could we hope to find such a creature, outside the laboratory? Here a drum roll would be particularly apposite, for it is time for the entry of the Slavonic violins.

In the Balkans, the Slavs have a traditional instrument called the gusle, pictured below. You can hear someone playing it here. (This isn’t to be confused with the East Slavonic gusli, which is quite different, like a psaltery or small harp).

Serbian Gusle

Now the key (sorry) thing for us, is that the gusle in Serbia typically has one string (see the picture). Or rather have one string, since it’s a plurale tantum noun. Got that – so far, gusle, a plurale tantum noun, a traditional violin with one string. Similarly in Slovenian. But a normal singular in Macedonian and Bulgarian. There are different forms in dialects, but the message so far is one string, may be a plurale tantum noun or not.

But then of course there are all those romantic Slavonic symphonies. With classic violins, with four strings. What do they call those? Well, Slovenian, Macedonian and Serbo-Croat all have violina, and it’s a regular noun with singular and plural. Not looking too good for the believers here.

At this point, to be sure we’re conducting the research properly, it would be good to be certain that we’re talking about a classic violin, and just one (not a whole bank of them in a symphony orchestra). Well here a Nobel prize-winner comes to our aid. Ivo Andrić won the literature prize in 1961. He is famous for The Bridge on the Drina. But for us, we need the scene in the book in which two people are practising a Schubert sonatina. That’s one (classical) violin and one piano. Given the popularity of the novel, it’s been translated into most of the Slavonic languages, sometimes more than once. Moreover, to help things along here, there’s a handy resource, the Parasol site, which allows us to search the parallel translations (that’s von Waldenfels, Ruprecht and Meyer, Roland (2006-): ParaSol, a Corpus of Slavic and Other Languages. Available at Bern, Regensburg). As expected we find violina in Slovenian, Serbo-Croat and in Macedonian. Bulgarian is unique in having cigulka, but again it’s a regular noun with singular and plural. In the East Slavonic languages (Russian, Belarusian and Ukrainian) it is skripka (skrypka in Belarusian). A regular noun with singular and plural. But now in Polish we find the same root, skrzypce, but this is a plurale tantum noun. And yes, they all have four strings.

What about the keen concert-goers who speak Czech and Slovak? Well, they use housle and husle respectively. You can see, I think, where those terms come from, now applied to the classic violin, and yes, they are both pluralia tantum.

Didn’t Andrić mention gusle too? He did indeed, and gave it an important part (sorry) in his story. For the languages into which it is translated as an outside rather than local instrument it stays as a plurale tantum noun.

It gets better. The West Slavonic languages Upper and Lower Sorbian aren’t yet in the ParaSol corpus for this text, so we need to refer to dictionary sources. Stone (2002) gives three terms for ‘violin’ in Upper Sorbian: wiolina (a regular noun) and two pluralia tantum nouns husle and fidle. And it gets even better – the traditional Sorbian violin has three strings (see it here).

In a word, then, there are terms based on different roots, and they can be used of different instruments. But an instrument with four symmetrical parts is likely to be designated by a normal count noun, and one with a single string is likely to be designated by a plurale tantum noun. This is hardly in harmony with the world-view of the believers. But data are no bar to belief.

Brave new words

Words are all around us. And there are a lot of them out there! The Oxford English Dictionary contains full entries for over 170,000 words in current use and over 47,000 obsolete words. Yet, surprisingly, the Economist newspaper reports that most adult native speakers only have a vocabulary of between 20,000 and 35,000 words. Defining precisely what we mean by a ‘word’ is no mean feat, of course, but even so there is a huge chasm between these two figures.

So, if speakers of English typically know between 12 and 20% of the words recorded in the OED, one might understandably assume that there really wouldn’t be any need to go about creating new ones. Yet barely a day goes by when we don’t encounter a new word in some form or another, whether that be a word that is eventually fully adopted into the language, an ‘incorrect’ word, or even a one-time use word created on the spur of the moment, perhaps for comic effect.

But when we hear a new word for the first time, how are we supposed to know what it means?

Well, this partly depends on how the new word was formed. If the new word is a ‘blend’, then the meaning of the new word might be easily recoverable from its component parts, particularly if aided by context. For instance, the meaning of hangry (angry or frustrated due to hunger) would be quite transparent in ‘We ordered our food over an hour ago. What’s going on? I’m beginning to feel really hangry’, even if you’d never come across the word before. (NB. Given the findings in a hot-off-the-press article from less than a month ago, however, it would appear that the concept of hangriness is a little more involved than the component words might suggest!)

A hangry cat

In a similar vein, when a work colleague, who often takes the same train to work as me, suggested that we should trainstorm ideas during our commute, both the activity and the location were neatly conveyed in a single word that I immediately understood, despite the fact that I’d never heard it before and may never hear it again.

Likewise, if the word you’re hearing for the first time follows the general rules of the language, then it is usually a straightforward task to understand what is really meant. This scenario certainly applies when interpreting child language, which often follows language-internal rules even where they should be overridden by irregular forms (e.g. I goed to the shop and buyed a toy). This was illustrated fairly recently by my three-year-old daughter who, after lining up all her soft toy animals on the edge of her bed, proudly announced that she was the petshopper and asked if I would like to buy a pet.

But new words may also ‘break the rules’ as it were, and still be easy for us to interpret, perhaps by analogy with another similar word. At some point in time, in the not too distant past, what I presume must have been a well-paid marketing team came up with the notion of sun-blushed tomatoes. It’s a wonderful word which conveys a sense of sweetness from having been sat in the sun for a while, but a juiciness from not having been dried out in the same way as sun-dried tomatoes (compare the two images below – I know which ones I would prefer!). However, the verb to blush is intransitive, which means it shouldn’t be allowed to take an object. We can say ‘the sun dried the tomatoes’, but we can’t say ‘the sun blushed the tomatoes’ (and perhaps this is why the term ‘sunblush’ is also quite common nowadays). But by analogy to things that have been sun-dried or, more poetically, sun-kissed, it just works.

Shrivelled sun-dried tomatoes vs. juicy sun-blushed tomatoes

And if you’re Nigella, of course, you might take this process one step further and come up with your own recipe for moonblush tomatoes. These are tomatoes that have been cooked overnight (hence the reference to the moon) in the residual heat of a cooling oven (NB. there are no known cases of anyone having successfully used this method of cooking tomatoes prior to sundown). Google the term ‘moonblush’ and you’ll get 174,000 hits, a vast number of which will reference Nigella Lawson in some way, showing just how unique the word is!

Yet another category of new words are those which, on the surface, appear to follow some rule of word formation in the language, but actually leave you scratching your head when you encounter them for the first time, wondering what they mean. This scenario is often symptomatic of the word having been purposefully coined by someone, say for marketing purposes, who didn’t foresee the potential confusion.

Postcrete = fence post concrete

On a recent trip to a DIY store, I spotted big bags of postcrete. Since I wasn’t there to buy said product, I could have just ignored it, but as a linguist I am, unfortunately, subject to the occupational hazard of being unable to go about my daily life without questioning such things. I realised it had something to do with concrete, for obvious reasons – well, I suppose it could have been somehow related to Crete – and so began thinking to myself ‘I wonder what is used before that?’ I’d assumed the post part of the word was being used as a prefix indicating ‘after in time or in order’. Only later did I learn it was a special fast-setting concrete for bedding in fence posts!

Thinking of detoxing? Be prepared!

Similar confusion ensued when a colleague saw an advertisement which said “why detox, when you can pretox?” Presumably by analogy with detox, itself a relatively new word meaning the removal of toxins from one’s body, it did at first glance seem like the advert was recommending the opposite, i.e. to add toxins to one’s body. Using pretox as a verb probably contributed to the confusion, since words beginning with pre in English are almost invariably verbs meaning do x prior to something else (e.g. precook, preboard, prebook).

Finally, there will always be new words that we have never heard before and whose meaning we are unable to deduce from our existing knowledge of the language. I experienced this just two days ago when the word peng was mentioned in a TV commercial. Fortunately, in this digital age, those of us who are more chronologically gifted than secondary school pupils have the Urban Dictionary on hand to help out.

So, while new words may arise for all manner of reasons and in all manner of contexts, perhaps the most remarkable thing about them is our (almost) unfailing capacity to understand them despite never having heard them uttered before.

How to count to 1296 in Ngkolmpu

In order to feed his family for the year, and prove himself a worthy man, a man living in southern New Guinea is expected to grow 1296 yams (Dioscorea sp.) each season. In Ngkolmpu, a language spoken by around 200 people who live in this region in a single village 15 km inside the Indonesian side of the border between West Papua and Papua New Guinea, there is a single word for this number: ntamnao.

To speakers of English, this seems like an arbitrarily specific number; yet to Ngkolmpu speakers it’s perfectly natural. Ngkolmpu, along with most of its related languages, has what is known as a senary numeral system, also known as a base-six system. In English, we use a decimal system, which is based on recursions of ten units, while senary systems are based around recursions of six. In Ngkolmpu, the words for one to six are naempr, yempoka, yuow, eser, tampui and traowo. Seven is naempr traowo naempr or ‘one six and one’; thirteen is yempoka traowo naempr or ‘two six and one’. You should be starting to see the pattern now. But what happens when you get to six groups of six, i.e. 6² or 36? Well, there is a specific word for that: ptae. In fact, in Ngkolmpu there are words for 6², 6³, 6⁴ and 6⁵. That’s all the way up to 7776! Related language Komnzo even has a word wi which is used for 6⁶ or 46,656! If you want to learn how to count to 7776 in Ngkolmpu, the entire system is presented in Table 1.

Value   Power of six   Ngkolmpu
1                      naempr
2                      yempoka
3                      yuow
4                      eser
5                      tampui
6       6¹             traowo
7                      naempr traowo naempr
8                      naempr traowo yempoka
13                     yempoka traowo naempr
36      6²             ptae
216     6³             tarumpao
1296    6⁴             ntamnao
7776    6⁵             ulamaeke

Table 1 – Senary numerals in Ngkolmpu
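
For readers who like to see a pattern spelled out, here is a minimal sketch in Python of how the numerals in Table 1 fit together. It rests on assumptions: the function names are invented for illustration, and the rule for combining words (multiplier, then power word, then the remainder) is generalised from the three attested compounds for 7, 8 and 13. Forms the table does not list, such as 12 or 37, are guesses produced by that rule, not attested Ngkolmpu.

```python
# Words taken from Table 1. Everything else here is an illustrative assumption.
UNITS = {1: "naempr", 2: "yempoka", 3: "yuow", 4: "eser", 5: "tampui"}
POWER_WORDS = {1: "traowo", 2: "ptae", 3: "tarumpao", 4: "ntamnao", 5: "ulamaeke"}


def senary_digits(n):
    """Return the base-six digits of a non-negative integer, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n:
        n, remainder = divmod(n, 6)
        digits.append(remainder)
    return digits[::-1]


def ngkolmpu_gloss(n):
    """Build a numeral phrase for n from the Table 1 words.

    Assumption: each non-zero base-six digit is expressed as
    'multiplier power-word', as in the attested 'yempoka traowo naempr' (13).
    Exact powers of six use the bare power word, as Table 1 shows.
    """
    if not 1 <= n < 6 ** 6:
        raise ValueError("Table 1 only supplies words for values below 6**6")
    digits = senary_digits(n)
    top = len(digits) - 1
    # Exact powers of six (6, 36, 216, ...) are listed as single words.
    if top >= 1 and digits[0] == 1 and not any(digits[1:]):
        return POWER_WORDS[top]
    parts = []
    for position, digit in enumerate(digits):
        if digit == 0:
            continue
        power = top - position
        parts.append(UNITS[digit])
        if power > 0:
            parts.append(POWER_WORDS[power])
    return " ".join(parts)


for value in (7, 8, 13, 36, 1296):
    print(value, senary_digits(value), ngkolmpu_gloss(value))
```

The five values printed at the end are all ones that Table 1 attests, so the output can be checked against it directly. Anything else the function returns should be treated as a hypothesis to check with a speaker or a grammar of the language.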

While we are used to decimal counting systems in English, lots of languages around the world use different systems. What is remarkable is that these senary systems are essentially unique to the southern New Guinea region. As far as we know, the only languages which use base-six are found in this region. Ndom, a language completely unrelated to Ngkolmpu and spoken on Yos Sudarso Island around 250 km away, has a sort of light base-six system. Ndom displays unique words for the numbers one to six, but no words for higher terms and no way to construct them from lower numerals; this is what is known as a ‘restricted numeral system.’ As far as we know, the complex base-six system that we see in Ngkolmpu and its relatives is an entirely unique development. This then raises a crucial question: How and why did such a system emerge?

Pic 1 – Yams and plantains for distribution after a feast

This is a hard question to answer. The leading theory on this is based on the primary use of the counting systems: yam tallying. In the communities of southern New Guinea, the various species of Dioscorea, aka yam, are extremely important for every part of life. They are the primary food staple and, as we said before, the general consensus is that it takes a ntamnao of yams to feed a family for a year. Good yam gardeners count their yams to ensure they have enough food for the year but just as importantly for the bragging rights that accompany being a good gardener. Additionally, yams serve many ceremonial roles: for instance, a wedding feast can’t be held without a ntamnao of yams, which are meticulously counted, brought to the bride’s village and counted again with all parties present. Smaller feasts might require a tarumpao (216), which are counted and distributed to participants as in Pic 1. The significance of counting yams in these cultures has been hypothesised as the motivation for the development of this counting system, something we don’t really see anywhere else in the world. The next question is why base six and not some other number? Well, the main yams consumed in this region are teardrop shaped with a round end and a narrow end. When placed into small piles, these naturally fall into neat piles of 6 (Pic 2). This provides a motivation for a specifically base-six system and supports the claim that the numeral system emerged through the practice of tallying yams.

Pic 2 – 6 yams in a pile

The Ngkolmpu system only has numerals up to 7776 but hypothetically could be used to count to any number. Numeral systems of this type are known as ‘unrestricted numeral systems.’ We take this for granted in English but in smaller communities these are typically not that common. For example, Marind, a culturally dominant language spoken by around 9000 people in the same region as Ngkolmpu, has words for one and two only. Counting is done by counting fingers and toes without any productive means for extending beyond that. Similar are the body part tallies of New Guinea such as the Oksapmin body part tally, where one can count up to 27 by listing names for places along the fingers, hands, arms and head (Pic 4). This is very different to the Ngkolmpu system as we see in Table 1.

Pic 4 – Oksapmin body tally system

It was previously thought that unrestricted numeral systems could only develop in cultures which had sufficient organisational bureaucracy to warrant such a system. What the southern New Guinea situation shows is that the agrarian practices of yam cultivation under certain conditions also allow for the development of advanced counting systems. So, it looks like if people want to count something enough, they can develop the systems to do so, which is remarkable.

The next time you have to count up something in multiples of six, spare a thought for the Ngkolmpu and their wonderful counting system.