Morphological Redundancy – Why say something twice when once will do?

In Batsbi (a language spoken in the Caucasus in North-East Georgia), if you want to say ‘she is ripping the dress’ you might say something like yoxyoyanw k’ab. In this word, each instance of ‘y’ indicates that it is indeed just one dress that she is ripping.

Linguists call this phenomenon multiple exponence, where a single meaning is indicated within a word more than once, for no apparent reason. This, when you think about it, is pretty weird. Typically we think of languages as incremental in nature: intuitively, we assume that when we add something to a word or a sentence we are adding meaning to that word or sentence. But in multiple exponence this clearly can’t be the case. The dress in the Batsbi example is no more singular than any other singular object in the world, so why have three ‘y’s’ rather than just the one we would expect?

In other words, why say something twice when once will do? The short answer is we don’t know (yet!) – sorry to disappoint! But what I can answer is a slightly different question: what does it actually mean to say something twice?

Multiple exponence is not the only way you might say something twice within a word. There is another phenomenon known as overlapping exponence, where the same meaning is indicated by multiple markers in a word (as with multiple exponence), but each marker is also doing some other job. For example, in Filomeno Mata Totonac (a language from Mexico) you say ‘you are coming’ using the word tanpaati. This word has two suffixes, paa and ti, both of which mean ‘you’ (second person). However, paa also indicates that the event is progressive (like the English –ing), while the other suffix, ti, indicates that the subject is singular rather than plural. So speakers of this language mention twice that it’s you who is coming, but we couldn’t remove either of the suffixes from the word without affecting the meaning, as both of them also tell us something else about what’s going on.

In Wipi, a language spoken in the Fly River Delta on the south coast of Papua New Guinea, if you want to say that you are building two houses you would use the word arangen, which literally means ‘I build two’. This word is rather interesting since you need both the prefix, a, and the suffix, en, to know that this is indeed only two houses as opposed to some other number of houses. Yet neither of these affixes actually means ‘two’. Instead, the suffix en is ambiguous between one or two; we might say it means less than three. The prefix a, in contrast, is used when you are building two or more houses; in other words, it means more than one. Thus, if you are building more than one house but also less than three, there is only one interpretation: you are building two houses. This is called distributed exponence. It’s remarkable that speakers of Wipi say how many houses they are building twice, but in order to know the exact number of houses, you need to listen both times!
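One way to picture the Wipi reasoning is as an intersection of sets: each affix narrows down the possible numbers, and only together do they pin down ‘two’. The sketch below is purely illustrative. The names PREFIX_A, SUFFIX_EN and interpret are invented here, and capping the numbers at five is an arbitrary assumption made just to keep the sets finite.

```python
# Candidate numbers of houses; stopping at 5 is an arbitrary assumption.
NUMBERS = set(range(1, 6))

# Hypothetical encodings of the two affixes as described in the text.
PREFIX_A = {n for n in NUMBERS if n > 1}   # prefix a: "more than one"
SUFFIX_EN = {n for n in NUMBERS if n < 3}  # suffix en: "less than three"


def interpret(*markers):
    """Return the numbers compatible with every marker that was heard."""
    result = set(NUMBERS)
    for marker in markers:
        result &= marker  # each marker rules out some numbers
    return result


print(sorted(interpret(PREFIX_A)))             # [2, 3, 4, 5]
print(sorted(interpret(SUFFIX_EN)))            # [1, 2]
print(sorted(interpret(PREFIX_A, SUFFIX_EN)))  # [2]
```

Neither set on its own contains only 2, which is why you need to listen both times.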

The Fly River Delta

It’s amazing really, when you look closely at a simple question like what does it mean to say something twice?, that there is such complexity and diversity in the answer. Beyond what we saw, there are all sorts of in-between cases and the multiple types can interact. As such, teasing them apart can be a real challenge. When I say something twice, it might be that each time gives you more information in subtly different ways. It is untying this kind of subtle diversity which hopefully gives us some hint as to why speakers and languages would ever do such a thing to begin with.

Sense and polarity, or why meaning can drive language change

Generally a sentence can be negative or positive depending on what one actually wants to express. Thus if I’m asked whether I think that John’s new hobby – say climbing – is a good idea, I can say It’s not a good idea; conversely, if I do think it is a good idea, I can remove the negation not to make the sentence positive and say It’s a good idea. Both sentences are perfectly acceptable in this context.

From such an example, we might therefore conclude that any sentence can be made positive by removing the relevant negative word – most often not – from the sentence. But if that is the case, why is the non-negative response I like it one bit unacceptable, or at least odd, when its negative counterpart I don’t like it one bit is perfectly acceptable and natural?

This contrast has to do with the expression one bit: notice that if it is removed, then both negative and positive responses are perfectly fine: I could respond I don’t like it or, if I do like it, I (do) like it.

It seems that there is something special about the phrase one bit: it wants to be in a negative sentence. But why? It turns out that this question is a very big puzzle, not only for English grammar but for the grammar of most (all?) languages. For instance, in French, the expression bouger/lever le petit doigt ‘lift a finger’ must appear in a negative sentence. Thus if I know that John wanted to help with your house move and I ask you how it went, you could say Il n’a pas levé le petit doigt (lit. ‘He didn’t lift the small finger’) if he didn’t help at all, but you could not say Il a levé le petit doigt (lit. ‘He lifted the small finger’) even if he did help to some extent.

Expressions like lever le petit doigt ‘lift a finger’, one bit, care/give a damn, own a red cent are said to be polarity sensitive: they only really make sense if used in negative sentences. But this in itself is not the most interesting property.

What is much more interesting is why they have this property. There is a lot of research on this question in theoretical linguistics. The proposals are quite technical but they all start from the observation that most expressions that need to be in a negative context to be acceptable are expressions of minimal degrees and measures. For instance, a finger or le petit doigt ‘the small finger’ is the smallest body part one can lift to do something, a drop (in the expression I didn’t drink a drop of vodka yesterday) is the smallest observable quantity of vodka, etc.

Regine Eckardt, who has worked on this topic, formulates the following intuition: ‘speakers know that in the context of drinking, an event of drinking a drop can never occur on its own – even though a lot of drops usually will be consumed after a drinking of some larger quantity.’ (Eckardt 2006, p. 158). However, so the intuition goes, the occurrence of this expression in a negative sentence is acceptable because it denies the existence of events that consist of just drinking one drop.

What this means is that if Mary drank a small glass of vodka yesterday, although it is technically true to say She drank a drop of vodka (since the glass contains many drops) it would not be very informative, certainly not as informative as saying the equally true She drank a glass of vodka.

However, imagine now that Mary didn’t drink any alcohol at all yesterday. In this context, I would be telling the truth if I said either one of the following sentences: Mary didn’t drink a glass of vodka or Mary didn’t drink a drop of vodka. But now it is much more informative to say the latter. To see this, consider the following: saying Mary didn’t drink a glass of vodka could describe a situation in which Mary didn’t drink a glass of vodka yesterday but she still drank some vodka, maybe just a spoonful. If, however, I say Mary didn’t drink a drop of vodka, then this can only describe a situation where Mary didn’t drink a glass or even a little bit of vodka. In other words, saying Mary didn’t drink a drop of vodka yesterday is more informative than saying Mary didn’t drink a glass of vodka yesterday because the former sentence describes a very precise situation whereas the latter is a lot less specific as to what it describes (i.e. it could be uttered in a situation in which Mary drank a spoonful of vodka or maybe a cocktail that contains 2ml of vodka, etc.).
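The informativeness argument can be made concrete with a toy model in which a sentence counts as more informative the fewer situations it is compatible with. This is only a sketch under invented assumptions: the list of situations, the sizes DROP_ML and GLASS_ML, and the helper compatible are all hypothetical, chosen to mirror the spoonful and 2ml-cocktail examples above.

```python
# Hypothetical amounts of vodka (in ml) Mary might have drunk yesterday.
SITUATIONS = [0, 2, 15, 40, 80]
DROP_ML = 0.05  # assumed size of a single drop
GLASS_ML = 40   # assumed size of a small glass


def compatible(sentence):
    """Return the set of situations in which the sentence is true."""
    return {ml for ml in SITUATIONS if sentence(ml)}


drank_drop = compatible(lambda ml: ml >= DROP_ML)
drank_glass = compatible(lambda ml: ml >= GLASS_ML)
no_drop = compatible(lambda ml: ml < DROP_ML)
no_glass = compatible(lambda ml: ml < GLASS_ML)

# Positive sentences: "a glass" is the more specific (informative) claim.
print(sorted(drank_glass), sorted(drank_drop))  # [40, 80] [2, 15, 40, 80]

# Negative sentences: the ranking flips, and "a drop" is now more specific.
print(sorted(no_drop), sorted(no_glass))  # [0] [0, 2, 15]
```

In this toy model, negation reverses which expression narrows things down most, which is the pattern the drop/glass examples describe.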

By using expressions of minimal degrees/measures in negative environments, the sentences become a lot more informative. This, it seems, is part of the reason why languages like English have changed such that these words are now only usable in negative sentences.

The headache-bringer-oner(er) of the English agentive suffix

The task of the light-turner-offer-onerer

Recently, a friend jokingly mentioned that he was thinking of hiring a light-turner-offer-onerer so that he wouldn’t have to get off the sofa to operate the light switch. In doing so, he made use of the extremely productive agentive suffix -er (also -or), which we use in English to derive a noun from a verb, to express the person or thing that carries out the action of the verb. The interpretation of this suffix is particularly transparent, even when used in completely novel ways, as in the recent article in The Economist newspaper cleverly titled The Baby Crisperer, drawing an analogy with The Horse Whisperer, while making reference to the gene-editing technology CRISPR-Cas9.

The butcher, the bak-er and the candlestick mak-er

But the striking thing about the opening example is the multiple occurrences of the agentive suffix. Most of the time in English the agentive suffix is simply added to the end of a word, regardless of whether the word in question has a single element (e.g. baker) or is a compound word (e.g. candlestick maker). But in the humorous example of the light switch operator, we are faced with a phrasal verb (or rather two phrasal verbs, turn off and turn on with the second instance of turn elided) and, in this case, the agentive suffix is added to each element of the phrasal verb. Omitting any of them (with the exception of the final -er, but we’ll come to that later) feels instinctively wrong (e.g. light-turner-offer-on, light-turner-off-onerer, light-turn-offer-oner, light-turner-off-on, light-turn-off-oner etc.).

So, just what’s going on here? Well, the issue lies in the fact that English phrasal verbs consist of a verb (which by itself has a different meaning) followed by a preposition or adverb, and it is precisely this ordering that appears to trip speakers up. In English, suffixes (by definition) come at the end of a word, but when a word has various elements to it, such as a compound word, there are multiple places that could potentially host a suffix. Since the meaningful element of many English compounds comes at the end (e.g. a houseboat is a type of boat that people live in, while a boathouse is a type of house for boats), it usually goes without saying that the suffix attaches to the final word, but if that ordering is upset in any way we tend to see different forms competing with each other (e.g. mothers-in-law vs. mother-in-laws, directors-general vs. director-generals).

A boathouse (left) and a houseboat (right)

Drawing a parallel with inflectional suffixes, which only affect the verb in a phrasal verb (e.g. wash up > he washes up > he washed up, pass by > she passed by > she’s passing by), we might expect the same to be true when it comes to the agentive suffix -er. Indeed, this is precisely what we see with established forms like passer-by (recorded in the OED as early as 1568). The historical form knocker-up (recorded in the OED from 1861), which referred to a person who would rouse workers by knocking on their window, also followed this pattern; it’s worth noting, however, that the form knocker-upper also exists, as seen in this BBC article about the profession, but it’s unclear whether this is a recent innovation or not. (NB. With the demise of this profession, readers can be excused for interpreting the term knocker-up(per) as a man with a predisposition for getting women pregnant.)

A knocker-up(per) at work

Other terms derived with the -er suffix, however, do not adhere to the pattern of marking only the verb element of a phrasal verb. For instance, we often talk of a property in need of renovation as a fixer-upper. Although we do encounter the forms fixer-up and fix-upper, fixer-upper is by far the most widely used term (recorded in the OED from 1948, and with 41 million Google hits, as opposed to fewer than 180 thousand hits for either fixer-up or fix-upper); no doubt the US reality TV show about home renovations, Fixer Upper, has helped popularise this term, in the US at least.

In many cases, a form which marks both elements of a phrasal verb co-exists with a form which marks only the first element of the phrasal verb, with the former appearing to be a much more recent development. Below are some examples of this (with dates showing the earliest recorded occurrences in the OED):

washer-up (1907)    washer-upper (1961)
picker-up (1611)    picker-upper (1913)
looker-up (1867)    looker-upper (1934)
opter-out (1968)    opter-outer (not recorded)

The form opter-outer was not found in the OED, but is sometimes encountered (a Google search results in around 100 hits), such as in this Telegraph article about opting out of a pension. The opposite term, opter-inner, results in a mere 2 hits, however surprising that might seem following last year’s barrage of GDPR opt-in-related emails that we were all subjected to. (Perhaps this reflects the fact that, in the pre-GDPR world, we tended to opt out of things, rather than the reverse?) One of those hits is this short web article, where the writer is bemoaning the number of spam emails she receives; in it, she not only uses the forms opted-in and opter-inner – the former illustrating the fact that inflectional suffixes generally only attach to the verbal element of the phrasal verb – but also uses opt-in as a noun, stating that “not all opt-ins are created equal”, where the inflectional suffix is instead on the preposition.

But what’s even more interesting than the -er suffix appearing on both elements of a phrasal verb is that some speakers take this process one step further: once every element has been marked with the -er suffix, it’s as if the word as a whole then needs marking with the suffix again, leading to variants like washer-upperer, doubling up on the suffix on the final element. Based on Google searches, the form with the double suffix is surprisingly less common than I (as a speaker of British English) ever thought it was – washer-upperer returns a mere 244 hits on a Google search, while washer-upper returns 47,500 and washer-up returns 110,000 – although it’s entirely possible that in spoken language forms like this are much more frequent, and that Google searches, which only reflect what people are prepared to commit to writing, are skewing the results. In any case, common or otherwise, such forms exist. OK, so no doubt some forms with a double -erer suffix are produced for humorous effect, as our opening example of the light-turner-offer-onerer was, but might there be an explanation for why speakers produce these forms in the first place?

One possible explanation is that speakers add the final -er by analogy with agentive nouns formed from verbs that themselves end in -er and which thereby end in the same -erer sequence, such as gatherer, plasterer, murderer. If this is the case, we might hypothesise that the first -er on the particle serves to make the phrasal verb ‘feel’ more verb-like (from the perspective of the suffix), giving the second -er, which performs the agentive function, something that it is happy to attach to. Could this possibly explain why Vermont Mountain Real Estate have listed a property on their books as being “a good place to fix upper,” perhaps mistakenly interpreting the -er suffix on the adverb as somehow forming a verb (maybe even a back formation from “fix upperer”)? (A much less interesting explanation, of course, is that this is just a typo.)

This house is a good place to fix upper!

The locus of the plural marker -s in agentive nouns of this sort lends some weight to this idea. In forms that mark only the first element of the phrasal verb, such as passer-by and washer-up, the plural marker almost always attaches to the first element together with the agentive suffix, just as we would expect with inflectional suffixes (recall he washes up, she passed by), so we talk of the passers-by or the washers-up, but are less comfortable with the washer-ups (although it should come as no surprise by now that both forms are found).

But if both elements of the phrasal verb take the agentive suffix, the plural marker attaches to the rightmost of the two (or more) suffixes. We can no longer say the washers-upper, but have to say the washer-uppers. When both elements take the agentive suffix, speakers appear to reanalyse the word as a single unit which no longer permits suffixes to occur internally (i.e. on a non-final element). And once it’s been reanalysed as a single unit, it almost seems right to then want to attach the -er suffix to the unit as a whole.

So while some may argue that this doubling up of the suffix is done intentionally, as a sort of metalinguistic joke, there are reasons to believe this isn’t always the case and that sometimes such forms (albeit markedly colloquial in nature) are produced because they just feel right and/or are following a rule in a speaker’s internal grammar.

Anyway, thinking about all this has brought on a headache, so I’m off to make myself an automatic day-maker-betterer(er)!

A fun bit of marketing, using the agentive suffix
Adventures in Historical Linguistics

While linguistics does not cut the same kind of glamorous profile in fiction as, say, international espionage or organized crime, it does pop up now and again. Even historical linguistics. Having stumbled across a couple of older examples recently (thus, historical fictional historical linguistics), I commend them to our readers as an alternative to the cheap thrills that might otherwise tempt them.

Leon Groc’s Deux mille ans sous la mer (‘2000 years under the sea’), from 1924, starts out with our heroes supervising the construction of a tunnel under the English Channel. They discover a mysterious inscription on a rock face. Fortunately, one of the party is a philologist, and identifies it as Chaldean (i.e. a form of Aramaic)! And a particularly archaic variety at that. This impresses the rest of the party at least as much as the content of the inscription itself: Impious invaders, you shall not go any further. However, a subsequent mining accident forces them to break through the rock, where they discover a cavern inhabited by a race of pale blind people, descendants of Chaldeans (or to be more precise, speakers of Chaldean) who had sought refuge in that cavern from some long-forgotten disaster, only to discover they couldn’t find a way out. The learned philologist applies his practical knowledge of Chaldean in communicating with them. I won’t spoil the fun for those of you planning to read it; but it does not go well.

James De Mille’s A Strange Manuscript Found in a Copper Cylinder from 1888 features members of a British expedition surveying the South Pacific becoming stranded in an unknown country with – once again – some cave dwellers, who call themselves Kosekin and speak a Semitic language. In the usual fashion of such stories in this period, there is a narrative within a narrative, in this case the manuscript directly relating the adventure, and the commentary of the members of the yacht party who discovered it. While the core narrator (named More) merely recognizes some affinity to Arabic, one of the members of the yacht party just so happens – once again – to have a philological background, which, after a lengthy digression on the comparative method and Grimm’s law, leads him to conclude that the underground race speaks a language descended from Hebrew:

I can give you word after word that More has mentioned which corresponds to a kindred Hebrew word in accordance with ‘Grimm’s Law.’ For instance, Kosekin ‘Op,’ Hebrew ‘Oph;’ Kosekin ‘Athon,’ Hebrew ‘Adon;’ Kosekin ‘Salon,’ Hebrew ‘Shalom.’ They are more like Hebrew than Arabic, just as Anglo-Saxon words are more like Latin or Greek than Sanscrit.

Further proof of the power of historical linguistics in a tight situation comes from E. Charles Vivian’s City of Wonder (1923). Again in the South Pacific, a group of adventurers is attacked by a strange woman (speaking, of course, a strange language) in charge of a monkey army. Taking stock after having slaughtered the attackers, the narrator asks one of his companions:

“What is the language she used?” I asked.

“The nearest I can tell you, so far, is that it’s a sort of bastard Persian,” he answered. “It’s a dialect built on a Sanskrit foundation—in my youth I studied Sanskrit, for it’s the key to every Aryan language or dialect in the East, and I always meant to come East. I must stuff you two.”

“Stuff us?” Bent asked.

“Fill you up with words that will be useful—it’s astonishing what you can do in a language if you know three or four hundred words in common use. If you hear it and have to make yourself understood in it, the construction of sentences very soon comes to you. That is, if the language is built on an Aryan foundation, as this is.”

It’s that easy! You just need to learn the method.

Back underground, Howard De Vere’s A Trip to the Center of the Earth, first published in New York Boys’ Weekly in 1878, is a story I haven’t been able to track down yet, but from the description in E.F. Bleiler’s Science Fiction: The Early Years, it promises to be one of the high points in early dime novel treatments of historical linguistics. A pair of boys exploring Kentucky’s Mammoth Cave come across an underground world where

pallid underground people speak English of a sort, in which inflections have disappeared and certain alterations have taken place.

What could those certain alterations be? As an added bonus, the story is of culinary interest, as the next sentence of Bleiler’s description goes:

Geophagists, they live on a nourishing clay, access to which is sometimes barred by gigantic spiders of extraordinary venomosity.

Alongside lost race fantasies, futuristic science fiction is another obvious vehicle for literary forays into historical linguistics. Régis Messac’s Quinzinzinzili from 1935 is a particularly interesting variant, being – as far as I know – the only serious fictional treatment of contact linguistics. (Admittedly I haven’t looked elsewhere.) Set in the period after a fictional World War II (which everybody in this interwar period seemed to be expecting anyway), its narrator is trapped in a post-apocalyptic world alone with a particularly annoying handful of pre-teens. (And thus probably the most gruesome post-apocalyptic story ever written.) They are largely French speakers, but there are Portuguese speakers and English speakers among them as well. They develop a sort of pidginized French, colored by spontaneous sound changes such as the nasalization of all vowels, along with curious semantic shifts. The title Quinzinzinzili reflects all this, being their rendition of the second clause in the Lord’s Prayer in Latin (qui es in cœlis ‘who art in Heaven’), used as a name for their inchoate deity. I won’t say any more because I think everybody should read it. Way better than Lord of the Flies, which it preceded and superficially resembles. (And which has no noteworthy linguistic content.)

And if anybody knows a good source for back issues of New York Boys’ Weekly, our lines are open.

A Rainbow of Shared Diversity: Culture and Language in the South Pacific

When we think of life in the South Pacific we often imagine relaxing in the shade of a coconut palm listening to the soothing sound of Israel Kamakawiwoʻole’s ‘Over the Rainbow’ (the official song of this blog post and mandatory listening!). But the South Pacific is in fact culturally diverse, and linguistically too, with around 600 languages in the Oceanic family spread across Micronesia, Melanesia and Polynesia.

The original migration of the Oceanic-speaking people started around 1600 BC from the north east of New Guinea and they went on to colonise the uninhabited islands of the Pacific Ocean, with New Zealand being the last country to be inhabited by Polynesian seafarers as late as 1285 CE. The vast distances have created huge cultural differences amongst contemporary Oceanic peoples, yet they all speak languages that stem from Proto-Oceanic – the ancestral language of all of Oceania. For example, the Polynesians are famed for their ability to cross vast swathes of ocean by using star charts made out of sticks, whereas the Melanesians were not great seafarers. However, all Oceanic peoples share similar horticultural practices of cultivating yam and taro root crops, which form the basis of an Oceanic diet.

The enormous cultural diversity amongst the Oceanic speaking people has led to widespread variation in the languages spoken in the South Pacific. In particular we can see the cultural influence on the various languages in how they encode possessive relationships in the language. In the most basic way, an Oceanic language makes a difference in the way it treats alienable and inalienable possessions. We’re not talking UFOs here! Inalienable possessions are those that have an inherent connection with the person to whom they belong – such as body parts or members of the family. Alienable possessions are items that can easily be transferred from one owner to another, such as food, baskets, or other household items.

In Port Sandwich, a language spoken in Vanuatu, possessions that are considered inalienable often have a suffix that encodes the possessor (my, your, his/her) directly attached to the possessed noun:

(1)    naru-ngg
       son-my
       ‘my son’

Whereas when speaking about sandwiches (and all other alienable possessions) in Port Sandwich, encoding is indirect. The possessor suffix is not able to attach directly to the possessed noun, but instead must attach to a separate marker of possession:

(2)    sanwis      isa-ngg
       sandwich    POSS.MARKER-my
       ‘my sandwich’

Sandwiches aside, in many Oceanic languages this indirect construction that is used for alienable possessions has expanded to include various different semantic types of possession. Languages have separate possessive markers, often called classifiers. Many languages have a three-way split, such as in the language Wuvulu (spoken in the Western Islands off the north coast of Papua New Guinea), for possessions that are eaten, drunk or everything else:

(3a)   ana-u      niu
       FOOD-my    fish
       ‘my fish (to eat)’

(3b)   numa-mu       upu
       DRINK-your    coconut
       ‘your coconut (to drink)’

(3c)   ape-mu          ponata
       GENERAL-your    dog
       ‘your dog (as a pet)’

Some languages make even more semantic distinctions between alienable items. These classifiers often encode culturally important semantic distinctions. Vera’a, spoken in northern Vanuatu, has eight different possessive classifiers: food, drinks, canoes, houses, beds and mats, prized possessions, long-term possessions, and one for everything else. The Micronesian languages have the largest inventory of classifiers in Oceanic. The Chuuk language has developed thirty-five distinct classifiers (yes, thirty-five!), several of which are used to categorise different types of edible possessions. For example, there is a classifier for cooked food, one for raw food, one for leftover food, and even one that is used with food taken on a journey – great for classifying take-away food!

The yéméti classifier in the Chuuk language for food for a journey is great for take-away pizza, whereas the nikita classifier could be used the day after when you want to eat the leftover pizza – if there is any!

In other languages, speakers are able to create new classifiers on an ad-hoc basis when they need to. This mechanism is particularly prevalent in the languages of Micronesia and New Caledonia. Nêlêmwa, spoken in New Caledonia, can create new classifiers by repeating the possessed noun and adding a suffix to show the possessor. For example, mwa ‘house’ (4a) can have the possessor suffix attached (4b), but if a speaker adds an adjective then the possessed noun must be repeated, and the directly possessed noun functions as a classifier (4c). In this way a speaker of Nêlêmwa can create new classifiers whenever the need arises.

(4a)   mwa
       house
       ‘house’

(4b)   mwa-n
       house-his
       ‘his house’

(4c)   mwa-n        mwa      doo
       house-his    house    earth
       ‘his earth-house’

Though cultural diversity plays a role in the formation of classifiers that are unique to particular languages in the Pacific, there is also a commonality among classifiers: languages that are located far apart often have classifiers that encode similar semantics. This suggests that, though culturally diverse, the Oceanic peoples share some important aspects of culture. For example, many of the Micronesian languages have developed classifiers for beds, mats and pillows. But the language of Vera’a spoken in Northern Vanuatu (over 2500 kilometers away) has also developed a classifier for sleep-related possessions. Similarly, classifiers for domesticated animals have developed in the languages of Micronesia, in Mussau and Seimat (both spoken on the offshore islands of Papua New Guinea), and in Nêlêmwa and Iaai, spoken in New Caledonia. The words used for these classifiers can’t be traced back to a single historical root, which means that these are sporadic innovations in these languages and point to the shared cultural life of the Oceanic peoples.

Just as speakers of different languages can name varying numbers of colours in a rainbow, with Israel Kamakawiwoʻole’s mother tongue Hawaiian distinguishing six colours in contrast to English’s seven, speakers of Oceanic languages differ in the number of ways of categorising their possessions.

A whole nother story

Words do some truly inventive things when they change, and change they always do. Some switch their sounds around, like when hros became hors, nowadays spelt with an extra e as horse. Some lose their sense of having an internal composition, like when wāl-hros ‘whale-horse’ became walrus. Some cave in to peer pressure and change their looks to conform with others, including one of my favourite cases in English, when under the influence of similarly-meaning words probably, possibly, plausibly which all end in -bly, we get supposably, which is how in some varieties of modern English you can say ‘supposedly’. One of the truly odd things that words do, though, is to start stealing sounds from their neighbours.

A famous case in English is an apron, which used to be a napron, until the n got snaffled by the a. It goes the other way too. A newt was originally an ewt. Of course, in Middle English when this n-thievery was underway, there were a few more words complicit in the heist, for example my napron also became mine apron, and your napron became yourn apron, since at that stage in English, words like my/mine, your/yourn worked like a/an. So, ever wondered why the nickname for Edward is Ned? As in mine Ed, ourn Ed? Got it? Speaking of which, nickname was originally ekename and was also involved in a swindling of n from the previous word (the eke-, which is related to eke in ‘eke out a living’, meant an addition or supplement, so mine ekename was my additional name).

It’s not only in English that words have indulged in this shifty business. In late Latin, the word originally borrowed from Greek apotheca would have been l’aboteca, which you may recognise today as Italian la bottega, Spanish la bodega or French and English boutique. In Danish, the plural pronoun meaning ‘you’ is I, related to English ye, but in closely related Swedish it’s ni with an extra n. Where did it get it? Theft. The corresponding plural verbs used to end in -en, like haven i ‘have you?’, and you can see what happened next. In fact, the same game played out a thousand years earlier with singular ‘you’ in several West Germanic languages, except this time it was the verb that kept a piece of the pronoun, when phrases like habēs thū ‘have you?’ became habēst thū, which you might recognise as English havest thou.

How does all this shifting of sounds between words come about? To get an idea, try saying quickly: ‘an apron, a napron, an apron’, and you’ll already have a sense of how this is possible. Unlike on the printed page, words in spoken language stream forth in a smooth and almost seamless flow, and the human brain performs some impressively deft reverse-engineering to slice that stream back up into words. In fact, picking out the individual words in speech is one of the first monumental intellectual tasks we embark on as infants, even before we start learning what the words mean. Recent research suggests that we may even begin this process from within the womb, where we get pre-season access to language courtesy of the muffled rhythms of speech that seep in to us from outside.

Now, you may well wonder how anyone, let alone an infant, can slice up a speech stream into individual words without knowing any of the meanings. Good question. It would appear that the brain operates like a finely tuned statistical inference machine, storing and calculating the relative frequencies at which sounds follow one another, and from this it can begin to pinpoint where the word boundaries are located, since at those boundaries, it is much less predictable what sounds will come next. The trick, then, is that word boundaries are zones of unpredictability, irrespective of their meanings. Of course, we might ask next, why is it that the sounds are so predictable inside the words? One of the reasons for that has to do with what linguists term ‘phonology’: the fascinating way in which sound sequences themselves are intricately structured and highly non-random within the words of human languages, but I’m afraid that for now, that’s a whole nother story.
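To make the idea of ‘zones of unpredictability’ a little more concrete, here is a minimal sketch in Python. It is only an illustration under some loud assumptions: the three toy ‘words’ are invented, the stream is already chopped into syllables, and a boundary is placed wherever the estimated probability of the next syllable drops below an arbitrary threshold of 0.75. Nobody is claiming that the brain runs this code.

```python
from collections import Counter

# Hypothetical three-syllable "words", invented purely for this illustration.
TOY_WORDS = [["ti", "bu", "do"], ["pa", "go", "la"], ["ke", "mi", "su"]]
# A fixed order of word indices, so that every word is followed by varying neighbours.
ORDER = [0, 1, 2, 0, 2, 1, 0, 1, 2, 1, 0, 2]


def build_stream(words, order):
    """Concatenate the words into one unbroken stream of syllables."""
    return [syllable for i in order for syllable in words[i]]


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold=0.75):
    """Cut the stream wherever the next syllable is poorly predicted."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        if tp[(prev, nxt)] < threshold:  # low predictability: assume a word boundary
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words


stream = build_stream(TOY_WORDS, ORDER)
segmented = segment(stream)
print(segmented)
```

Inside each toy word the next syllable is fully predictable, whereas across a word boundary it is not, so the cuts fall exactly where the words end, without the code knowing anything about meaning. Real speech is far messier, and a fixed threshold is a simplification chosen for brevity.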

Double trouble treble

You’ll get in trouble if you drink a tripel, the strong pale ale brewed by the most hipster of monks, the Trappists.

The Lowlands are the Hoxton of Europe

Tripels have three times the strength (around 8-10% ABV) of the standard table beer historically consumed by the monks themselves. This enkel or ‘single’ beer was traditionally not available outside the cloisters, while the dubbel (a double strength dark brown beer made with caramelized beet sugar) was sold to provide income for the monastery. Although the term enkel is no longer in common beer parlance (it is on the cusp of a comeback), dubbel and tripel have held their ground. It is generally thought that the tripel takes its name from its threefold strength, but it is also sometimes claimed that it is because it has three times the malt of a regular brew. A quadrupel is VERY strong.

As we have seen already in this blog when counting sheep in Slovenian and yams in Ngkolumbu, means for the expression of quantities and multiplication are often linguistically fascinating. Not least the doublet treble and triple, which originate from the same etymological source.

The Latin word triplus ‘threefold, triple’ first entered English via Old French treble. Not satisfied with claiming the space previously occupied by the Old English adjective þrifeald ‘threefold’, it turned up again by the 15th century as the adjective triple.

This triad of modifiers (threefold, treble and triple) exemplifies some of the pathways by which lexical synonymy can come about. The first word was formed through a compounding process (i.e. the numeral three forming a new word with the multiplicative form –fold), the second entered the language through direct borrowing, and the third through a second wave of borrowing (either from Old French triple or Latin triplus).

We don’t just find words competing to express the same meaning, but also parts of words. The –fold element of threefold, tenfold and manifold, and the –plus of triplus, are argued to have developed from the same Proto-Indo-European root *pel ‘to fold’. To complicate things even further, the now obsolete treblefold was attested between the 14th and 16th centuries. Words, it seems, like to fight for the same space, and can sometimes be incestuous.

Since entering English over 500 years ago, triple and treble have staked out different paths, but retained similar meanings in at least some of their manifestations, as explored by Catherine Soanes on the OxfordWords blog. In terms of frequency, triple is the stronger twin (or is it a triplet? quadruplet?), ending up triumphant with around 6 times more occurrences in the Oxford English Corpus.

But treble has some resilience. Although the official Scrabble board has double and triple word scores, treble word scores are occasionally referred to on the net (albeit erroneously, or in a devil-may-care way), such as in Charlie Brooker’s article on how to cheat at Scrabble. I even found a ‘threefold word score’ on a Scrabble knock-off site. Lawyers to the ready!

This demonstrates that these adjectives really are semantically interchangeable for the most part, even though their distributions are not identical.

The take home? While not every monastery sells the same tripel, they will all get you drunk.

Werewolves

Hallowe’en will soon be upon us, so it is only right we turn our attention to monsters. Consider the werewolf. It’s a wolf, sort of, as the name indicates, but what’s a were? The usual assumption is that it’s a leftover of an older word meaning ‘man’ that fell completely out of fashion by the 14th century. As a result we have what looks like a compound word, except that one of the parts doesn’t have any meaning on its own. Perhaps not, but that hasn’t stopped people from squeezing some value out of it nonetheless: if a werewolf is a person who turns into a wolf — or at any rate, part person, part wolf — then a were-bear is a mixture of person and bear, and so on down to were-turtles.

Actually, people don’t seem to be that literal-minded when it comes to word meanings, if the various were-creatures in circulation are any evidence. The monster from “Wallace & Gromit: The Curse of the Were-Rabbit” is not half-human, half-rabbit, but more just kind of a monster rabbit, with a thicker pelt. (Visually calqued, I suspect, from the not-particularly wolf-like wolfman of the wolfman movies featuring Lon Chaney Jr.)

And were-fleas, to the extent that they exist, appear to be carriers of lycanthropism rather than human/insect conglomerates. None of this is yet reflected in the Oxford English Dictionary’s entry on were– (you need a subscription for that but it’s free if you have a UK public library card!). Give it a few decades more maybe.

Strangely, words for werewolf in other languages share a propensity for being compounds made up of ‘wolf’ plus some other completely opaque element. The first part of Czech vlkodlak is vlk, which means ‘wolf’, but dlak on its own is not an independent word. (Not in Czech at any rate, but in the related language Slovenian the equivalent word volkodlak is clearly made up of volk ‘wolf’ and dlaka, which means ‘hair’ or ‘fur’.) And the French werewolf, loup-garou, has the word for ‘wolf’ in it (loup), but garou is not an independent word (other than being an unrelated homonym meaning ‘flax-leaved daphne’). That part seems to have been our very own Germanic word werewolf borrowed at an early date (earliest attestation as garwall from the 12th century). Both of these have, like werewolf, given rise to further monstrous hybrids like Czech prasodlak, from prase ‘pig’, or the French cochon-garou.

In fact, Czech and French have gone one step further than English. Though I just wrote that dlak and garou were not words, that was being a bit pedantic. Neither of them is listed in the authoritative Academy dictionaries of Czech and French, but nonetheless they do seem to have split off from their host body, rather as happened (if we can be permitted to mix monster metaphors) to the hero of 1959’s “The Manster (a.k.a The Split)”.

For example, this Czech website tells us about vlkodlaci i jiní dlaci ‘werewolves and other were-creatures’ (dlaci is the plural of dlak), and in French the phrase courir le garou ‘run the garou’ used, at least, to be in circulation, meaning basically ‘go around at night being a werewolf’. That use in turn apparently spawned a verb garouter, meaning much the same thing. The curse lives on.

Optimal Categorisation: How do we categorise the world around us?

People love to categorise! We do this on a daily basis, consciously and subconsciously. When we are confronted with something new we try and figure out what it is by comparing it to something we already know. Say, for instance, I saw something flying through the air – I may think to myself that the object is a bird, or I may say it is a plane based on my previous experiences of birds and planes. Of course the object may turn out to be something completely new, perhaps even Superman!

Is it a bird? Is it a plane? No it’s Superman!

Our love of classification runs deep in scientific enquiry. Botanists and zoologists classify plants and animals into different taxonomies. Even the humble linguist loves to classify – is this new word a noun or a verb? What about the new word zoodle that was recently added to the Merriam-Webster dictionary? Is it a thing? Or an action? Can I zoodle something or is it something I can pick up and touch? Well apparently zoodle is a noun which means ‘a long, thin strip of zucchini that resembles a string or narrow ribbon of pasta’. To be honest, I love eating zoodles, though until now I never knew what they were called!

The way people classify entities around them has become encoded in the different languages we speak in many different ways. The most obvious example that springs to mind is when we learn a new language, like French or German, we are confronted with a grammatical gender system. French has two genders – Masculine and Feminine. But German has three – Masculine, Feminine and Neuter. Other languages can have many more gender distinctions. Fula, a language spoken in west and central Africa, has twenty different gender categories!

So what exactly are grammatical gender systems and how are they realised in different languages? Gender systems categorise nouns into different groups and tend to appear not on the noun itself, but on other elements in the phrase. In German, nouns are split into three different gender categories – masculine, feminine and neuter. The gender of a noun is shown by using different articles (the word ‘the’ or ‘a’) and sometimes by changing the ending of an adjective, but never on the noun itself. Thus the word for ‘the’ in German is either der, die or das depending on whether the noun in the phrase is masculine, feminine or neuter.

(1)        der       Mann
              the       man

(2)        die        Frau
              the       woman

(3)        das       Haus
              the       house

This is called ‘agreement’ as the adjectives and articles must agree with the gender of the noun. In a language with gender, each noun typically can only occur in one gender category.

Not every language has a grammatical gender system, but they are highly pervasive, with around 40% of all languages having such a system. English is quite a poor example when it comes to gender. There is no real gender agreement in English, with the exception of pronouns. We have to say: Bill walked into the grocer’s. He bought some apples. Here the pronoun he must agree with the gender of the noun that was previously mentioned. English uses he, she and it as the only markers of gender agreement.

Languages behave differently in how they allocate nouns to the different genders, which can be very baffling for language learners! Why in French is chair feminine, la chaise, but in German it is masculine, der Stuhl? How a language allocates nouns to its gender categories can seem somewhat arbitrary, with the exception of the words for women and men: these fall into the feminine and masculine genders, which are the only semantically obvious choices.

But wait! If you thought the English gender system was dull, think again! A couple of months ago my piano was being restored and when it was being moved back into the lounge the piano movers kept saying: “pull her a little bit more” and “turn her this way”. The movers used the female pronouns to describe the piano. In English, countries, pianos, ships and sometimes even cars use the feminine pronouns.

Grammatical gender isn’t the only way languages classify nouns. Some languages use words called classifiers to categorise nouns. Classifiers are similar to English measure terms, which categorise the noun in terms of its quantity, such as ‘sheet of paper’ vs. ‘pack of paper’ or ‘slice of bread’ vs. ‘loaf of bread’. Classifiers are found in languages all over the world and are able to categorise nouns depending on the shape, size, quantity or use of the referent, e.g. ‘animal kangaroo’ (alive) vs. ‘meat kangaroo’ (not alive). Classifier systems are very different to gender systems as nouns in a language with classifiers can appear with different classifiers depending on what property of the noun you wish to highlight. There are many different types of classifier systems, but to keep things short I am just going to talk about possessive classifiers, which are mainly found in the Oceanic languages, spoken in the South Pacific.

When an item is in your possession we use possessive pronouns in English to say who the item belongs to. For instance if I say ‘my coconut’ – the possessive pronoun is my. In many Oceanic languages a noun can occur with different forms for the word my depending on how the owner intends to use it. For instance the Paamese language, spoken in Vanuatu, has four possessive classifiers and I could use the ‘drinkable’ classifier if I was talking about my coconut that I was going to drink. I would use the ‘edible’ classifier if I was going to eat my coconut. I would use the classifier for ‘land’ if I was talking about the coconut growing in my garden. Finally, I could use the ‘manipulative’ classifier if I was going to use my coconut for some other purpose – perhaps to sit on!

(4)        ani                   mak
              coconut           my.drinkable
              ‘my coconut (that I will drink)’

(5)        ani                   ak
              coconut           my.edible
              ‘my coconut (that I will eat)’

Why do languages have different ways of categorising nouns? How do these systems develop and change over time? Are gender systems easier to learn than classifier systems? Are gender and classifiers completely different systems? Or is there more similarity to them than meets the eye? These are some of the big questions in linguistics and psychology. We are excited to start a new research project at the Surrey Morphology Group, called optimal categorisation: the origin and nature of gender from a psycholinguistic perspective, that seeks to answer these fundamental questions. Over the next three years we will talk more about these fascinating categorisation systems, explain our experimental research methods, introduce the languages and speakers under investigation, and share our findings via this blog. Just look out for the ‘Optimal Categorisation’ headings!

The cat’s mneow: animal noises and human language

As is well known, animals on the internet can have very impressive language skills: cats and dogs in particular are famous for their near-complete online mastery of English, and only highly trained professional linguists (including some of us here at SMG) are able to spot the subtle grammatical and orthographic clues that indicate non-human authorship behind some of the world’s favourite motivational statements.

Recent reports suggest that some of our fellow primates have also learnt to engage in complex discourse: again, the internet offers compelling evidence for this.

But sadly, out in the real world, animals capable of orating on philosophy are hard to come by (as far as we can tell). Instead, from a human point of view, cats, dogs, gorillas etc. just make various kinds of animal noises.

Why write about animals and their noises on a linguistics blog? Well, one good answer would be: the exact relationship between the vocalisations made by animals, on one hand, and the phenomenon of human spoken language, on the other, is a fascinating question, of interest within linguistics but far beyond it as well. So a different blog post could have turned now to discuss the semiotic notion of communication in the abstract; or perhaps the biological evolution of language in our species, complete with details about the FOXP2 gene and the descent of the larynx.

But in fact I am going to talk about something a lot less technical-sounding. This post is about what could be called the human versions of animal noises: that is, the noises that English and other languages use in order to talk about them, like meow and woof, baa and moo.

At this point you may be wondering whether there is much to be gained by sitting around and pondering words like moo. But what I have in mind here is this kind of thing:

These are good fun, but they also raise a question. If pigs and ducks are wandering around all over the world making pig and duck noises respectively, then how come we humans appear to have such different ideas about what they sound like? Oink cannot really be mistaken for nöff or knor, let alone buu. And the problem is bigger than that: even within a single language, English, frogs can go both croak and ribbit; dogs don’t just go woof, but they also yap and bark. These sound nothing like each other. What is going on? Are we trying to do impressions of animals, only to discover that we are not very good at it?

Before going any further I should deal with a couple of red herrings (to stick with the zoological theme). For one thing, languages may appear to disagree more than they really do, just because their speakers have settled on different spelling conventions: a French coin doesn’t really sound all that different from an English quack. And sometimes we may not all be talking about the same sound in the first place. Ribbit is a good depiction of the noise a frog makes if it happens to belong to a particular species found in Southern California – but thanks to the cultural influence of Hollywood, ribbit is familiar to English speakers worldwide, even though their own local frogs may sound a lot more croaky. Meanwhile, it is easy to picture the difference between the kind of dog that goes woof and the kind that goes yap.

But even when we discount this kind of thing, there are still plenty of disagreements remaining, and they pose a puzzle bound up with linguistics. A fundamental feature of human language, famously pointed out by Saussure, is that most words are arbitrary: they have nothing inherently in common with the things they refer to. For example, there is nothing actually green about the sound of the word green – English has just assigned that particular sound sequence to that meaning, and it’s no surprise to find that other languages haven’t chosen the same sounds to do the same job. But right now we are in the broad realm of onomatopoeia, where you might not expect to find arbitrariness like this. After all, unlike the concept of ‘green’, the concept of ‘quack’ is linked to a real noise that can be heard out there in the world: why would languages bother to disagree about it?

 

First off, it is worth noticing that not all words relating to animal noises work in the same way. Think of cock-a-doodle-doo and crow. Both of these are used in English of the distinctive sound made by a cockerel, and there is something imitative about them both. But there is a difference between them: the first is used to represent the sound itself, whereas the second is the word that English uses to talk about producing it. That is, as English sees it, the way a cock crows is by ‘saying’ cock-a-doodle-doo, and never vice versa. Similarly, the way that a dog barks is by ‘saying’ woof. The representations of the sounds, cock-a-doodle-doo and woof, are practically in quotation marks, as if capturing the animals’ direct speech.

This gives us something to run with. After all, think about the work that words like crow and bark have to do. As they are verbs, you need to be able to change them according to person (they bark but it barks), tense, and so on. So regardless of their special function of talking about noises, they still have to operate like any other verb, obeying the normal grammar rules of English. Since every language comes with its own grammatical requirements and preferences about how words can be structured and manipulated (that is, its own morphology), this can explain some kinds of disparity across languages. For example, what we onomatopoeically call a cuckoo is a kukushka in Russian, featuring a noun-forming element shka which makes the word easier to deal with grammatically – but also makes it sound very Russian. Maybe it is this kind of integration into each language that makes these words sound less true to life and more varied from one language to another?

This is a start, but it must be far from the whole story. Animal ‘quotes’ like woof and cock-a-doodle-doo don’t need to interact all that much with English grammar at all. Nonetheless, they are clearly the English versions of the noises we are talking about:

And as we’ve already seen, the same goes for quack and oink. So even when it looks like we might just be ‘doing impressions’ of non-linguistic sounds, every language has its own way of actually doing those impressions.

Reassuringly, at least we are not dealing with a situation of total chaos. Across languages, duck noises reliably contain an open a sound, while pig noises reliably don’t. And there is widespread agreement when it comes to some animals: cows always go moo, boo or similar, and sheep are always represented as producing something like meh or beh – this is so predictable that it has even been used as evidence for how certain letters were pronounced in Ancient Greek. So languages are not going out of their way to disagree with each other. But this just sharpens up the question. For obvious biological reasons, humans can never really make all the noises that animals can. But given that people the world over sometimes converge on a more or less uniform representation for a given noise, why doesn’t this always happen?

In their feline wisdom, the cats of the Czech Republic can give us a clue. Like sheep, cats sound pretty similar in languages across the globe, and in Europe they are especially consistent. In English, they go meow; in German, it is miau; in Russian, myau; and so on. But in Czech, they go mňau (= approximately mnyau), with a mysterious n-sound inside. The reason is that at some point in the history of Czech, a change in pronunciation affected every word containing a sequence my, so that it came out as mny instead. Effectively, for Czech speakers from then on, the option of saying myau like everyone else was simply off the table, because the language no longer allowed it – no matter what their cats sounded like.

What does this example illustrate? First of all – as well as a morphology, each language has a phonology (sound structure), which constrains its speakers tightly: no language lets people use all the sounds they are physically able to make, and even the available sounds are only allowed to join up in certain combinations. So each language has to come up with a way of dealing with non-linguistic noises which will suit its own idea of what counts as a legitimate syllable. Moo is one thing, but it’s harder to find a language that allows syllables resembling the noise a pig makes… so each language compromises in its own way, resulting in nöff, knor, oink etc., none of which capture the full sonic experience of the real thing.

And second – things like oink, woof and mňau really must be words in the full sense. They aren’t just a kind of quotation, or an imitation performed off the cuff; instead they belong in a speaker’s mental dictionary of their own language. That is why, in general, they have to abide by the same phonological rules as any other word. And that also explains where the arbitrariness comes in: as with any word, language learners just notice that that is the way their own community expresses a shared concept, and from then on there is no point in reinventing the wheel. You don’t need to try hard to get a duck’s quack exactly right in order to talk about it – as long as other people know what you mean, the word has done its job.

So what speakers might lose in accuracy this way, they make up for in efficiency, by picking a predetermined word that they know fellow speakers will recognise. Only when you really want to draw attention to a sound is it worth coming up with a new representation of it and ignoring the existing consensus. To create something truly striking, perhaps you need to be a visionary like James Joyce, who wrote the following line of ‘dialogue’ for a cat in Ulysses, giving short shrift to English phonology in the process:

–Mrkgnao!