Happy Christmas/Nowell/Yule/etc.

It might have escaped your notice, but Christmas is coming! While the exact date of the birth of Jesus Christ is subject to some debate, the overall consensus of the early church settled upon sometime in winter as the time to hold the feast. However, Christian denominations still disagree on the exact date of the celebration; in the West Christians celebrate on the (Gregorian) 25th December, but some Orthodox churches (notably the Russians) retain the Julian Calendar, so their liturgical 25th December is in fact the 7th January in the Gregorian calendar. Meanwhile, the Armenians have long observed the 6th January as their Christmas, whether that be the Gregorian 6th January as in Armenia or the Julian 6th January (Gregorian 19th) as the Armenians of Jerusalem still do. With such a level of disagreement about when to celebrate, it is no surprise that there is even more disagreement about what the feast should even be called.
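The calendar arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `julian_to_gregorian` is made up for this example, and the fixed 13-day gap between the two calendars is an assumption that holds only for dates between March 1900 and February 2100.

```python
from datetime import date, timedelta

# Assumption: the Julian calendar currently runs 13 days behind the Gregorian.
# This fixed offset is only valid from March 1900 to February 2100.
JULIAN_OFFSET = timedelta(days=13)


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Return the Gregorian date on which a given Julian-calendar date falls."""
    # The Julian day and month numbers are placed on a Gregorian date object
    # and shifted forward by the current offset between the two calendars.
    return date(year, month, day) + JULIAN_OFFSET


# Julian 25th December (e.g. the Russian Orthodox Christmas)
print(julian_to_gregorian(2023, 12, 25))  # 2024-01-07
# Julian 6th January (the Armenians of Jerusalem)
print(julian_to_gregorian(2024, 1, 6))    # 2024-01-19
```

The two printed dates match the 7th and 19th January mentioned above.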

English is simple enough; since Old English the term Christes mæsse ‘Christ’s mass’, referring to the actual religious service held on the day, has come to refer to the whole festive period. A similar process happened to give Dutch Kerstmis. On the other hand, English also has the term Nowell (think of the carol ‘The First Nowell’), which is borrowed from the French term Noël.

This French term is one of an array of terms in Romance languages derived from Latin nātālis, meaning ‘birthday [of Christ]’. This also provides us with Portuguese Natal, Catalan Nadal and Italian Natale. Spanish meanwhile uses Navidad, which comes from a related Latin term nātīvitās. Another Latin term nātālīcia ‘birthday feast’ was an early borrowing into the Celtic languages, giving for example Welsh Nadolig and Gaelic Nollaig, while Albanian borrowed a Latin phrase Christī nātāle, which with Albanian’s complicated history ended up as Kërshëndella. Greek Khristougenna also means ‘Christ’s birth’; Church Slavonic likewise boasted a form Rozhdestvo (fully Rozhdestvo Khristovo), again meaning ‘birth [of Christ]’, which is now the Russian form as well, while Polish combines a derivative of that same root with the Slavic Bog ‘God’ to give Boże Narodzenie ‘birth of God’.

A different Latin term, calendae, originally meaning ‘Kalend (the first day of the month)’, which also gives us English calendar, also ended up being borrowed into Proto-Slavic as *kolęda, with many Slavic languages using this as an alternative term for ‘Christmas’ (somewhat akin to referring to Christmas as Yule in English, on which more later); in Bulgarian Koleda is the primary term, and Lithuanian also refers to the festival as Kalėdos, borrowing from the Slavic (Latvian Ziemassvētki simply means ‘winter holidays’). Furthermore, a Polish kolęda and a Russian kolyadka are both Christmas carols, as is a colindă in Romanian.

A third term that English has is Yule. This is a Germanic term, likely referring to festivities in general, but now the word for Christmas in modern Scandinavian languages, e.g. Danish/Swedish/Norwegian Jul and Icelandic Jól. This word also spread to other languages in Scandinavia, hence Finnish Joulu (if you visit the actual Lapland, ‘Father Christmas’ will be referred to as Joulupukki ‘Christmas Buck’), alongside the more archaic juhla, a general word for ‘party, celebration’, while (Northern) Sámi has Juovllat from the same source. German Weihnachten, meanwhile, originally meant ‘Holy Nights’; Czech Vánoce/Slovak Vianoce are borrowed from the German, replacing Germanic Nacht with Slavic noc for ‘night’.

Perhaps the most interesting pair of terms for Christmas are Romanian Crăciun and Hungarian Karácsony. These terms are likely related, but beyond that the etymology is disputed. Romanians tend to claim that it is Latin in origin, typically creātiōnem ‘creation’, though there are other possibilities such as calātiōnem ‘calling’ and incarnatiōnem ‘incarnation’ (this latter one is perhaps the best semantic match; theologically Christmas is the feast of the incarnation ‘making flesh’ of God). This is plausible, but Slavicists propose a different etymology. Specifically, they contend that this form is a borrowing from a Slavic root *korčiti meaning something like ‘step forth’, from which is derived a form Kračun meaning ‘winter solstice’ (the metaphor being the sun ‘stepping forth’ after the shortest day of the year) in certain Bulgarian varieties and in Slovak, and with an attestation as Koročun in some Novgorodian manuscripts. I personally would lean towards the former, since the Slavic forms could easily be loans from the Romanian (we noted already the borrowing of calendae from Latin) and the semantic match seems something of a stretch to derive ‘winter solstice’, but we will likely never be sure.

However you celebrate (if you do), we at SMG wish you Joyeux Noël, Frohe Weihnachten, Feliz Navidad, God Jul, s Rozhdestvom, and of course Merry Christmas.

Merry Christmas in a variety of languages
Happy holiday(s)!
Who are the Germans?

You may be familiar with the fact that the Germans refer to themselves as Deutsch and their country as Deutschland, and we find this term also in most other Germanic languages, such as Dutch Duits or Swedish Tysk, as well as Italian Tedesco. However, there are many other names in other parts of Europe. The French and Spaniards call them Allemand/Alemán, as do the Welsh with Almaenaidd; the various Slavic languages share a different term again, seen in e.g. Polish Niemiec or Russian Nemets. In the Baltic the Lithuanians and Latvians have their own terms not seen anywhere else (Vokietis and Vācietis respectively), while in Finland and Estonia they call them Saksi. We could also add some assorted forms from smaller languages, such as Miksas from Old Prussian, an extinct sister language to Lithuanian and Latvian.

An aerial shot of the meeting of the Rhine and Mosel rivers at Koblenz
The Deutsches Eck, or ‘German corner’, in Koblenz

Now, it is not unusual for inhabitants of a country to refer to themselves and their country with a different form from that used by outsiders (when was the last time you called China Zhongguo or India Bharat?). What is particularly notable about the German case, however, is the diversity even among its immediate neighbours. Contrast e.g. France, where everyone uses some form of derivative of Latin Francia (after the Germanic tribe the Franks), though the Greeks still call it Gallia after the Roman province of Gaul. Similarly, most call Spain some form derived from Hispania and Italy one from Italia. So, this diversity in names for the Germans requires some explanation.

Whence this plethora of terms? A consideration of history leads us to our answer. Recall that the modern country of Germany is a relatively recent creation, only being officially united in 1871 by Otto von Bismarck. While there was a political entity that occupied the area in the form of the Holy Roman Empire, it was only a relatively loose collection of small states, and prior to that the area was inhabited by a number of distinct Germanic-speaking peoples.

As a result, some of these names derive from the individual groups or tribes which lived in part of the area: so in the Western Romance and Brittonic Celtic languages the name of the Alemanni tribe was applied to the Germans as a whole. The same process occurred in the northeast with the Baltic Finns and the Saxons: not only were the Saxons the nearest group, but also, due to a combination of the Hanseatic League controlling trade through the Baltic and the anti-pagan crusading of the Teutonic Knights (another Deutsch-relative, see below), many Saxons came to settle in the Eastern Baltic, with some of their descendants still living in Estonia and Latvia today. Some small varieties show different groups again: some of the smaller Germanic varieties use a form derived from Prussian, after the state which ended up uniting the German peoples.

English takes a slightly different approach, deriving the term Germans from the Latin name of the region: Germania. This term included two Roman provinces covering much of modern-day Belgium, Switzerland, parts of eastern France and the Rhineland in modern Germany, as well as applying to the larger swathe of barbarian territories further east. Interestingly, several languages use this term to refer to Germany the country despite using a different term to refer to the Germans: Italian and Russian are the most notable examples.

We find a different source again with the Slavic Nemets terms. There is again some dispute in origin, but the general consensus is that it derives from a Slavic root *němъ meaning ‘mute’, itself of contested origin. The meaning likely was not ‘mute’ necessarily, but rather simply denoted that these groups were not Slavic-speaking. This puts it in a similar group to the word ‘barbarian’ in fact, which derives from a Greek word meaning ‘those who go bar-bar/talk incomprehensibly’. Similar origins to do with ‘talking’ are likely behind the Baltic Vok-/Vāc-/Miks- forms as well.

Finally, what of German ‘Deutsch’? Well, as is the case with many endonyms it is a relatively simple and self-referential etymology. It ultimately derives from an Indo-European root *tewteh2 meaning simply ‘people’, which shows up also in e.g. Irish túath with the same meaning. This form may also be the source of Romance forms such as Spanish todo or French tout meaning ‘everyone/everything’. This root even survives in Slavic, in Russian giving the form čužoj, meaning ‘foreign, alien’. This ended up as Germanic *þeudō, which through an adjective formation *þiudiskaz meaning something like ‘of the people’ ultimately leads to the modern German form. This form also gives Latin Teutones, a likely Celtic or Germanic tribe which lived in the North German region and was encountered by the Romans early in their expansion northwards.

So, as with many other terms, such as the aubergine words which have been discussed here before, the differences between languages are reflective of a complex history. In this case the wide array of disparate terms of different etymologies reflects the complex history of the entity involved, specifically the absence of a country that even called itself ‘Germany’ until the modern era, as well as the extent to which different groups of ethnic Germans have moved about in Europe.

Of the saintly and sinister: words for the left-handed

A couple of years ago, when I was still living in North Yorkshire but shortly to be moving further south, I attended a function at one of the final services conducted by my mother (an ordained priest in the Church of England) in a rural parish church. Afterwards, over a goodly spread of finger-food (what in clerical circles is commonly referred to as a bun-fight), I was in conversation with one of my mother’s then parishioners, a local farmer, who was much interested in my studies in linguistics. He tasked me with sourcing the etymology of an unusual local term – cuddy-wifter, which I was told refers to a left-handed person. I’ll propose an etymology of this specific term later, but first let’s have a short discussion of terms for “left hand” across languages.

A close-up of the left hand of a bronze statue. The hand is a polished golden colour, in contrast to the rest of the statue which is a dark brown colour.
Rarely has the left hand been portrayed in such good light

Firstly, it is important to note that many languages do not really make use of such ego-centric terms as “left” or “right” much, if at all. In these languages instead speakers opt for a geocentric system, locating and orienting objects and themselves by their relationship to either the points of the compass or in some cases the landscape. This has been most famously documented in a number of Paman languages of Queensland, Australia. For example, a speaker of Guugu Yimidhirr might refer to their nagaalngurr “east side” or guwaalngurr “west side” rather than to their left or right. Aspects of this conception of space are also found widely in languages across the Pacific and beyond.

And where languages do have a term for “left”, they very frequently differ on what the term should be. Even only looking at the Indo-European family we find a multiplicity of terms: besides English left we find forms such as clì or ceàrr in Scottish Gaelic; izquierdo in Spanish; majtë in Albanian; levyj in Russian; chap in Persian; and bau in Sylheti. So many different words, and these languages are all related!

From whence then English left? We can find cognates in nearby parts of West Germanic, such as West Frisian lafter/lofter and Dutch lucht/luft, but none further afield meaning “left”. These forms derive ultimately from a term meaning “palsy, paralysis”, which might perhaps derive from a Proto-Indo-European verb *lewp- “peel, break” which would also, through another derivation, give the English verb lop. A similar pattern appears to hold for German links in the closely related German language: though it doesn’t appear directly related and is of somewhat unclear etymology, the Icelandic form linur meaning “weak, feeble” likely indicates a similar semantic development.

On the other end of the scale, terms for the left hand or left-handedness can often end up taking on other meanings. Notably, this has happened twice in English, borrowing from two different languages, Latin and French. In the case of the Latin term sinister, it now refers to someone or something that is seen as being shadowy and potentially malicious. In the case of the French term gauche (ultimately derived from the same root as English walk), it instead refers to a lack of fashionability and perhaps a degree of awkwardness.

We can therefore draw two main conclusions from the above. Firstly, the concepts “left” and “right” are not essential to how we as humans conceptualise ourselves and the wider world; many languages do without, and those which have words for them seem to be happy to swap out old forms for new ones (compare e.g. the near-universal agreement on a form deriving from something like *mātēr for “mother” in the various Indo-European languages mentioned above). Secondly, there is a clear tendency for the left hand to carry a negative connotation, either directly due to being the non-dominant hand for most people (as indicated by the various etymologies referring to physical weakness in Germanic) or through some more cultural taboos (as seen by the development of the Romance terms sinister and gauche in English).

What then of cuddy-wifter? Well, the exact origin is not given specifically in any source I could find, but a bit of digging uncovers the following. The “wifter” part is easier to pin down, as at least according to the Oxford English Dictionary (the pre-eminent source on English etymology) “wift” comes up sometime in the 16th century as a verb meaning something like “to turn aside” or “to drift”, which, going by the pattern set by the above data, seems a reasonable source (at least in part) for a term for “left hand”.

The cuddy part is the more problematic element. Cuddy is an affectionate form of Cuthbert, the pre-eminent saint of the North-East of England, and crops up in a number of terms from the region, notably cuddy ducks to refer to the eider ducks said to have been particularly beloved by the saint, as well as the ponies used in the many coal mines of the area, which were referred to simply as cuddies. However, the connection with the saint seems suspect from the off, as there doesn’t appear to be any kind of source which would attest to the saint being left-handed. Certainly, the Venerable Bede, that great chronicler of Anglo-Saxon England, makes no mention of anything of the kind, in contrast to his willingness to comment on much else of the saint’s physical characteristics.

a single eider duck drake floating on the surface of the sea
Can you blame Cuthbert really?

There also exists a sense of “cuddy” meaning “a stupid fellow” which perhaps would tie into the negativity often associated with left-handedness. However, the OED only provides examples of this sense from the mid-nineteenth century, and it appears to derive from the “pony” usage (in a similar manner to the development of ass more broadly in English), which does somewhat problematise this as an etymology. In particular, the structure of cuddy-wifter this would require seems oddly formed, akin to something like ass-drifter, and the timespan of a century for this form to arise and spread south of the Tees, well outside the coalfield regions where the pit-pony was a fact of life, is probably a bit of a stretch.

Perhaps then we should look elsewhere for a source. Old English appears to provide no obvious sources, so perhaps we should look to Celtic for an origin. And a tantalising hint we find: Welsh chwith “left” or “wrong” and Irish and Scottish Gaelic ciotach “left-handed” or “clumsy” (there’s that negativity again), pointing to a Proto-Celtic *skittos, which to my mind looks like it might have a relationship at some point with English skew, though that is far from proven at this stage. I would propose, then, that this word was borrowed into a northern variety of English (perhaps from Cumbric, the extinct Celtic language of Cumbria) as something like *cwithy~*cuthy, and then later on it was folded into the cuddy form, with no actual direct connection with the saint at all.

So, while admittedly I haven’t been able to come up with a definite answer to that question I was set at that bun-fight a couple of years ago, I can at least tell a story rich in history and culture, revealing much both of the linguistic landscape of Britain and of our historic attitudes towards those who are left-handed.

Seasonal thoughts

As spring slowly but surely begins to announce itself with snowdrops, primroses and daffodils, we may ask how much variation there is in the concept of the seasons from one language to another. As Encyclopedia Britannica informs us, “the seasons—winter, spring, summer, and autumn—are commonly regarded in the Northern Hemisphere as beginning respectively on the winter solstice, December 21 or 22; on the vernal equinox, March 20 or 21; on the summer solstice, June 21 or 22; and on the autumnal equinox, September 22 or 23. In the Southern Hemisphere, summer and winter are reversed, as are spring and fall”.

Many languages spoken in Eurasia conform to this division into four seasons. But what other options are there? Leaving aside jokes about places where a single season lasts all year round (Russia: white winter and green winter; Quebec: beginning of winter, end of winter, beginning of next winter; New York: almost summer, summer, still summer, Christmas…), there are languages which really do distinguish between two seasons only: the dry season and the rainy season. Indonesian is like this, having musim hujan ‘rainy season’ and musim kemarau ‘dry season’. In Mandinka (a Mande language spoken in Guinea, northern Guinea-Bissau, Senegal, and the Gambia) the seasons are sàmaa ‘rainy season’ and tìlikandi ‘dry season’. In Wolof (a Niger-Congo language spoken in Senegal, the Gambia, Mali and other countries) the seasons are nawɛt ‘rainy season’ and nɔɔr ‘dry season’. Rainy seasons stretch roughly from June to October, while dry seasons take up the rest of the year. Two-season languages are generally spoken close to the equator.

Three-season languages also exist. In Ancient Egypt the year was divided into three seasons: Inundation, when the Nile overflowed the agricultural land; Going Forth, the time of planting when the Nile returned to its bed; and Deficiency, the time of low water and harvest. In some varieties of Turkish there are three seasons only: kış ‘winter’, bahar ‘spring’ and yaz ‘summer’, although other speakers use a four season system: kış ‘winter’, ilkbahar ‘spring’, yaz ‘summer’ and sonbahar ‘autumn’, where bahar can also be used to designate an unspecified intermediate season.

Finally, there are languages which have more than four seasons. For example, in Hindi (an Indo-European language spoken in northern India), six seasons are distinguished: vasant ritu ‘spring season’ (March-April), greeshm ritu ‘summer season’ (May-June), varsha ritu ‘rainy season’ (July-August), sharad ritu ‘autumn season’ (September to mid-November), hemant ritu ‘pre-winter season’ (November-December) and sheet ritu ‘winter season’ (January-February).

In Polish, besides wiosna ‘spring’, lato ‘summer’, jesień ‘autumn’ and zima ‘winter’ there are the words przedwiośnie (‘before spring’) and przedzimie (‘before winter’). Interestingly, some Polish speakers say that the latter word is now obsolete while the former is used widely. In Russian, there is a word предзимье (predzim’ye) ‘before winter’, but no other words to designate such ‘in-between’ seasons.

But having a different number of seasons from the ‘standard’ is not the only possible way for languages to stand out. Have a look at this linguistic puzzle: it was originally composed (in Russian) by Irina Chesnokova for use at the Moscow Linguistics Olympiad, and it recently appeared in a collection of the best Olympiad puzzles (Традиционная Олимпиада по лингвистике. 49 лучших задач. [The Traditional Linguistics Olympiad. 49 best problems], Moscow, 2020).

It is all about how the Manx language refers to various kinds of time period.

Problem. Manx is a language belonging to the Celtic branch of Indo-European, spoken by about 1800 people on the Isle of Man. Consider these phrases in Manx and their unordered English translations:

1. Jerrey Geuree        A. June
2. mean oie             B. January
3. Toshiaght Souree     C. midnight
4. oie gyn cadley       D. February
5. Jerrey Souree        E. July
6. cadley geuree        F. winter sleep (hibernation)
7. Toshiaght Arree      G. May
8. Mean Souree          H. sleepless night

 

  1. Match the Manx phrase (1-8) with the corresponding English translation (A-H)
  2. Translate into English: Mean Fouyir, gyn jerrey
  3. Translate into Manx: April, October

As is rightly emphasized on the website of the International Linguistics Olympiad (which, incidentally, is going to be held on the Isle of Man this year), “no prior knowledge of linguistics or languages is required: even the hardest problems require only your logical ability, patient work, and willingness to think around corners”. For those who want to try and solve the problem for themselves, I will give the solution below, underneath a picture of a hellebore, the first flower to open in our garden at the end of winter:

 

Solution to the problem.

In the middle column I give the literal translations, and in the right column are the actual equivalents of the names of the months:

Jerrey Geuree       end of winter                 January
mean oie            middle of the night
Toshiaght Souree    beginning of summer           May
oie gyn cadley      sleepless night
Jerrey Souree       end of summer                 July
cadley geuree       winter sleep (hibernation)
Toshiaght Arree     beginning of spring           February
Mean Souree         middle of summer              June
Mean Fouyir         middle of autumn              September
gyn jerrey          endless
Jerrey Arree        end of spring                 April
Jerrey Fouyir       end of autumn                 October

 
As we can see, the crucial point in solving the problem is to realise that the seasons in Manx do not match up with the seasons in English: in Manx, January counts as the end of winter, not the middle, September is the middle of autumn, not the beginning, and so on.
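For readers who like to see the logic spelled out, the pattern behind the solution can be sketched in Python. The function name `manx_month` and the table layout are made up for this illustration. The start months for autumn (August) and winter (November) are not given directly in the puzzle data; they are inferred from Mean Fouyir being September and Jerrey Geuree being January.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Zero-based index of the first month of each Manx season.
# Arree (spring) and Souree (summer) come straight from the solution table;
# Fouyir (autumn) and Geuree (winter) are inferred from the same pattern.
SEASON_START = {"Arree": 1, "Souree": 4, "Fouyir": 7, "Geuree": 10}

# Toshiaght = beginning, Mean = middle, Jerrey = end of the season.
POSITION_OFFSET = {"Toshiaght": 0, "Mean": 1, "Jerrey": 2}


def manx_month(phrase: str) -> str:
    """Map a Manx 'position + season' phrase to its English month name."""
    position, season = phrase.split()
    start = SEASON_START[season.capitalize()]
    offset = POSITION_OFFSET[position.capitalize()]
    # Wrap around the year so that the end of winter lands on January.
    return MONTHS[(start + offset) % 12]


print(manx_month("Jerrey Geuree"))  # January
print(manx_month("Mean Fouyir"))    # September
print(manx_month("Jerrey Arree"))   # April
```

Each season covers three months, so a position word and a season word together pick out exactly one month.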

The headache-bringer-oner(er) of the English agentive suffix

The task of the light-turner-offer-onerer

Recently, a friend jokingly mentioned that he was thinking of hiring a light-turner-offer-onerer so that he wouldn’t have to get off the sofa to operate the light switch. In doing so, he made use of the extremely productive agentive suffix -er (also -or), which we use in English to derive a noun from a verb, to express the person or thing that carries out the action of the verb. The interpretation of this suffix is particularly transparent, even when used in completely novel ways, as in the recent article in The Economist newspaper cleverly titled The Baby Crisperer, drawing an analogy with The Horse Whisperer, while making reference to the gene-editing technology CRISPR-Cas9.

The butcher, the bak-er and the candlestick mak-er

But the striking thing about the opening example is the multiple occurrences of the agentive suffix. Most of the time in English the agentive suffix is simply added to the end of a word, regardless of whether the word in question has a single element (e.g. baker) or is a compound word (e.g. candlestick maker). But in the humorous example of the light switch operator, we are faced with a phrasal verb (or rather two phrasal verbs, turn off and turn on with the second instance of turn elided) and, in this case, the agentive suffix is added to each element of the phrasal verb. Omitting any of them (with the exception of the final -er, but we’ll come to that later) feels instinctively wrong (e.g. light-turner-offer-on, light-turner-off-onerer, light-turn-offer-oner, light-turner-off-on, light-turn-off-oner etc.).

So, just what’s going on here? Well, the issue lies in the fact that English phrasal verbs consist of a verb (which by itself has a different meaning) followed by a preposition or adverb, and it is precisely this ordering that appears to trip speakers up. In English, suffixes (by definition) come at the end of a word, but when a word has various elements to it, such as a compound word, there are multiple places that could potentially host a suffix. Since the meaningful element of many English compounds comes at the end (e.g. a houseboat is a type of boat that people live in, while a boathouse is a type of house for boats), it usually goes without saying that the suffix attaches to the final word, but if that ordering is upset in any way we tend to see different forms competing with each other (e.g. mothers-in-law vs. mother-in-laws, directors-general vs. director-generals).

A boathouse (left) and a houseboat (right)

Drawing a parallel with inflectional suffixes, which only affect the verb in a phrasal verb (e.g. wash up > he washes up > he washed up, pass by > she passed by > she’s passing by), we might expect the same to be true when it comes to the agentive suffix -er. Indeed, this is precisely what we see with established forms like passer-by (recorded in the OED as early as 1568). The historical form knocker-up (recorded in the OED from 1861), which referred to a person who would rouse workers by knocking on their window, also followed this pattern; it’s worth noting, however, that the form knocker-upper also exists, as seen in this BBC article about the profession, but it’s unclear whether this is a recent innovation or not. (NB. With the demise of this profession, readers can be excused for interpreting the term knocker-up(per) as a man with a predisposition for getting women pregnant.)

A knocker-up(per) at work

Other terms derived with the -er suffix, however, do not adhere to the pattern of marking only the verb element of a phrasal verb. For instance, we often talk of a property in need of renovation as a fixer-upper. Although we do encounter the forms fixer-up and fix-upper, fixer-upper is by far the most widely used term (recorded in the OED from 1948, and with 41 million Google hits, as opposed to fewer than 180 thousand hits for either fixer-up or fix-upper); no doubt the US reality TV show about home renovations, Fixer Upper, has helped popularise this term, in the US at least.

In many cases, a form which marks both elements of a phrasal verb co-exists with a form which marks only the first element of the phrasal verb, with the former appearing to be a much more recent development. Below are some examples of this (with dates showing the earliest recorded occurrences in the OED):

washer-up (1907)       washer-upper (1961)
picker-up (1611)         picker-upper (1913)
looker-up (1867)         looker-upper (1934)
opter-out (1968)         opter-outer (not recorded)

The form opter-outer was not found in the OED, but is sometimes encountered (a Google search results in around 100 hits), such as in this Telegraph article about opting out of a pension. The opposite term, opter-inner, results in a mere 2 hits, however surprising that might seem following last year’s barrage of GDPR opt-in-related emails that we were all subjected to. (Perhaps this reflects the fact that, in the pre-GDPR world, we tended to opt out of things, rather than the reverse?) One of those hits is this short web article, where the writer is bemoaning the amount of spam emails she receives; in it, she not only uses the forms opted-in and opter-inner – the former illustrating the fact that inflectional suffixes generally only attach to the verbal element of the phrasal verb – but also uses opt-in as a noun, stating that “not all opt-ins are created equal”, where the inflectional suffix is instead on the preposition.

But what’s even more interesting than the -er suffix appearing on both elements of a phrasal verb is that some speakers take this process one step further: once every element has been marked with the -er suffix, it’s as if the word as a whole then needs marking with the suffix again, leading to variants like washer-upperer, doubling up on the suffix on the final element. Based on Google searches, the form with the double suffix is surprisingly less common than I (as a speaker of British English) ever thought it was – washer-upperer returns a mere 244 hits on a Google search, while washer-upper returns 47,500 and washer-up returns 110,000 – although it’s entirely possible that in spoken language forms like this are much more frequent, and that Google searches of what people are prepared to commit to writing are skewing the results. In any case, common or otherwise, such forms exist. OK, so no doubt some forms with a double -erer suffix are produced for humorous effect, as our opening example of the light-turner-offer-onerer was, but might there be an explanation for why speakers produce these forms in the first place?

One possible explanation is that speakers add the final -er by analogy with agentive nouns formed from verbs that themselves end in -er and which thereby end in the same -erer sequence, such as gatherer, plasterer, murderer. If this is the case, we might hypothesise that the first -er on the particle serves to make the phrasal verb ‘feel’ more verb-like (from the perspective of the suffix), giving the second -er which performs the agentive function something that it is happy to attach to. Could this possibly explain why Vermont Mountain Real Estate have listed a property on their books as being “a good place to fix upper,” perhaps mistakenly interpreting the -er suffix on the adverb as somehow forming a verb (maybe even a back formation from “fix upperer”)? (A much less interesting explanation, of course, is that this is just a typo.)

This house is a good place to fix upper!

The locus of the plural marker -s in agentive nouns of this sort lends some weight to this idea. In forms that mark only the first element of the phrasal verb, such as passer-by and washer-up, the plural marker almost always attaches to the first element together with the agentive suffix, just as we would expect with inflectional suffixes (recall he washes up, she passed by), so we talk of the passers-by or the washers-up, but are less comfortable with the washer-ups (although it should come as no surprise by now that both forms are found).

But if both elements of the phrasal verb take the agentive suffix, the plural marker attaches to the rightmost of the two (or more) suffixes. We can no longer say the washers-upper, but have to say the washer-uppers. When both elements take the agentive suffix, speakers appear to reanalyse the word as a single unit which no longer permits suffixes to occur internally (i.e. on a non-final element). And once it’s been reanalysed as a single unit, it almost seems right to then want to attach the -er suffix to the unit as a whole.
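To make the pattern concrete, here is a toy sketch in Python of where the agentive and plural suffixes land. It is an illustration of the description above, not a model of anyone's actual grammar. The function name `agentive_forms` is invented for this example, the verb is suffixed naively by adding -er, and the particle forms are simply listed as they are spelled in this post (note oner rather than onner).

```python
# Agentive forms of particles, spelled as they appear in this post.
PARTICLE_AGENTIVE = {
    "up": "upper",
    "out": "outer",
    "in": "inner",
    "off": "offer",
    "on": "oner",
}


def agentive_forms(verb: str, particle: str) -> dict:
    """Build the competing agentive nouns for a phrasal verb such as wash up."""
    agent = verb + "er"  # naive spelling rule: works for wash, fix, opt, pick, look
    double = f"{agent}-{PARTICLE_AGENTIVE[particle]}"
    return {
        # Only the verb is marked, so the plural stays on the verb.
        "single": f"{agent}-{particle}",
        "single_plural": f"{agent}s-{particle}",
        # Both elements are marked, so the word behaves as one unit
        # and the plural moves to the right edge.
        "double": double,
        "double_plural": double + "s",
        # The reanalysed unit then takes -er once more.
        "triple": double + "er",
    }


for label, form in agentive_forms("wash", "up").items():
    print(label, form)
```

Run on wash up, this produces washer-up, washers-up, washer-upper, washer-uppers and washer-upperer, the forms discussed above.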

So while some may argue that this doubling up of the suffix is done intentionally, as a sort of metalinguistic joke, there are reasons to believe this isn’t always the case and that sometimes such forms (albeit markedly colloquial in nature) are produced because they just feel right and/or are following a rule in a speaker’s internal grammar.

Anyway, thinking about all this has brought on a headache, so I’m off to make myself an automatic day-maker-betterer(er)!

A fun bit of marketing, using the agentive suffix
Linguistic problem? Call in a violin


Like brain surgeons, breakfast cooks and other professionals, linguists fall into two groups: believers and sceptics. Take the fact that wheat is singular in English and oats is plural. Believers are confident that there is a thoroughly good reason for differences like this, based on meaning. Sceptics aren’t easily convinced, and they talk shiftily about rules that once obtained but are since lost, partial regularities, conflicting motivations and simple exceptions. And things can get surprisingly heated, as in the linguistic skirmishes of the late 1980s and early 1990s, which centred on the discussion precisely of wheat and oats. (The feelings and the porridge have cooled sufficiently for it to be safe to mention these contentious nouns again.)

Many oats = much porridge

We talk about one or more scalpels or spatulas (these are count nouns), but we don’t usually count health, wealth or porridge (these are mass nouns). Mass nouns in English are typically singular, as indeed wheat is. So nouns like oats are unusual in being plural, and having no contrasting singular. They are known in the trade as pluralia tantum ‘plural only’. (In contrast, there are languages like Manam where all mass nouns are plural – they treat them all like oats.)

It’s not just mass nouns. We also find that there are nouns which we would expect to be ordinary count nouns which are actually pluralia tantum nouns in English. Examples include scissors, binoculars, trousers, slacks … The believers, who believe there must be a good reason for these nouns to behave in this way, argue as follows: It’s as we’d expect. These are all nouns whose referents have symmetrical parts (usually two, hence they are often called bipartites). Case proven.

But wait: bicycle has two significant parts, emphasised by its form in bi- (rather like binoculars). Why isn’t it subject to the generalization? Why isn’t it like binoculars? And while we’re on it, how about bigraph, shirt, duo and Bactrian camel? They all have two significant parts but are normal count nouns, just like letter, skirt, quartet and elephant.

Even so (say the believers) it’s not just English. French has les ciseaux (plural) ‘the scissors’, Russian has nožnicy (plural) ‘scissors’. These are pluralia tantum nouns – that can’t be a coincidence. And yet, sceptically speaking, French has le pantalon ‘the trousers’ and Russian has binokl ‘binoculars’, and both are regular count nouns with singular and plural.

There are indeed various “usual suspects”, which regularly show up as pluralia tantum nouns in different languages, with sufficient frequency to persuade the believers and yet with more than enough no-shows to leave the sceptics unconvinced.

To resolve the issue once and for all (!), we need:

  1. A new item (not one from the “usual suspects” list)
  2. which can have one significant part or more than one (so that we can evaluate the force of the semantic regularity)
  3. with two different terms, one plurale tantum and one not
  4. and comparable forms in different related languages

And then we shall have a clear prediction: more than one significant part >> plurale tantum noun, one significant part >> ordinary count noun. We could resolve the dispute. But where could we hope to find such a creature, outside the laboratory? Here a drum roll would be particularly apposite, for it is time for the entry of the Slavonic violins.

In the Balkans, the Slavs have a traditional instrument called the gusle, pictured below. You can hear someone playing it here. (This isn’t to be confused with the East Slavonic gusli, which is quite different, like a psaltery or small harp).

Serbian Gusle

Now the key (sorry) thing for us, is that the gusle in Serbia typically has one string (see the picture). Or rather have one string, since it’s a plurale tantum noun. Got that – so far, gusle, a plurale tantum noun, a traditional violin with one string. Similarly in Slovenian. But a normal singular in Macedonian and Bulgarian. There are different forms in dialects, but the message so far is one string, may be a plurale tantum noun or not.

But then of course there are all those romantic Slavonic symphonies. With classic violins, with four strings. What do they call those? Well, Slovenian, Macedonian and Serbo-Croat all have violina, and it’s a regular noun with singular and plural. Not looking too good for the believers here.

At this point, to be sure we’re conducting the research properly, it would be good to be certain that we’re talking about a classic violin, and just one (not a whole bank of them in a symphony orchestra). Well here a Nobel prize-winner comes to our aid. Ivo Andrić won the literature prize in 1961. He is famous for The Bridge on the Drina. But for us, we need the scene in the book in which two people are practising a Schubert sonatina. That’s one (classical) violin and one piano. Given the popularity of the novel, it’s been translated into most of the Slavonic languages, sometimes more than once. Moreover, to help things along here, there’s a handy resource, the ParaSol site, which allows us to search the parallel translations (that’s von Waldenfels, Ruprecht and Meyer, Roland (2006-): ParaSol, a Corpus of Slavic and Other Languages. Available at parasol.unibe.ch. Bern, Regensburg). As expected we find violina in Slovenian, Serbo-Croat and in Macedonian. Bulgarian is unique in having cigulka, but again it’s a regular noun with singular and plural. In the East Slavonic languages (Russian, Belarusian and Ukrainian) it is skripka (skrypka in Belarusian). A regular noun with singular and plural. But now in Polish we find the same root, skrzypce, but this is a plurale tantum noun. And yes, they all have four strings.

What about the keen concert-goers who speak Czech and Slovak? Well, they use housle and husle respectively. You can see, I think, where those terms come from, now applied to the classic violin, and yes, they are both pluralia tantum.

Didn’t Andrić mention gusle too? He did indeed, and gave it an important part (sorry) in his story. For the languages into which it is translated as an outside rather than local instrument it stays as a plurale tantum noun.

It gets better. The West Slavonic languages Upper and Lower Sorbian aren’t yet in the ParaSol corpus for this text, so we need to refer to dictionary sources. Stone (2002) gives three terms for ‘violin’ in Upper Sorbian: wiolina (a regular noun) and two pluralia tantum nouns husle and fidle. And it gets even better – the traditional Sorbian violin has three strings (see it here).

In a word, then, there are terms based on different roots, and they can be used of different instruments. But an instrument with four symmetrical parts is likely to be designated by a normal count noun, and one with a single string is likely to be designated by a plurale tantum noun. This is hardly in harmony with the world-view of the believers. But data are no bar to belief.

Brave new words


Words are all around us. And there are a lot of them out there! The Oxford English Dictionary contains full entries for over 170,000 words in current use and over 47,000 obsolete words. Yet, surprisingly, the Economist newspaper reports that most adult native speakers only have a vocabulary of between 20,000 and 35,000 words. Defining precisely what we mean by a ‘word’ is no mean feat, of course, but even so there is a huge chasm between these two figures.

So, if speakers of English typically know between 12 and 20% of the words recorded in the OED, one might understandably assume that there really wouldn’t be any need to go about creating new ones. Yet barely a day goes by when we don’t encounter a new word in some form or another, whether that be a word that is eventually fully adopted into the language, an ‘incorrect’ word, or even a one-time use word created on the spur of the moment, perhaps for comic effect.

But when we hear a new word for the first time, how are we supposed to know what it means?

Well, this partly depends on how the new word was formed. If the new word is a ‘blend’, then the meaning of the new word might be easily recoverable from its component parts, particularly if aided by context. For instance, the meaning of hangry (angry or frustrated due to hunger) would be quite transparent in ‘We ordered our food over an hour ago. What’s going on? I’m beginning to feel really hangry’, even if you’d never come across the word before. (NB. Given the findings in a hot-off-the-press article from less than a month ago, however, it would appear that the concept of hangriness is a little more involved than the component words might suggest!)

A hangry cat

In a similar vein, when a work colleague, who often takes the same train to work as me, suggested that we should trainstorm ideas during our commute, both the activity and the location were neatly conveyed in a single word that I immediately understood, despite the fact that I’d never heard it before and may never hear it again.

Likewise, if the word you’re hearing for the first time follows the general rules of the language, then it is usually a straightforward task to understand what is really meant. This scenario certainly applies when interpreting child language, which often follows language-internal rules even where they should be overridden by irregular forms (e.g. I goed to the shop and buyed a toy). This was illustrated fairly recently by my three-year-old daughter who, after lining up all her soft toy animals on the edge of her bed, proudly announced that she was the petshopper and asked if I would like to buy a pet.

But new words may also ‘break the rules’ as it were, and still be easy for us to interpret, perhaps by analogy with another similar word. At some point in time, in the not too distant past, what I presume must have been a well-paid marketing team came up with the notion of sun-blushed tomatoes. It’s a wonderful word which conveys a sense of sweetness from having been sat in the sun for a while, but a juiciness from not having been dried out in the same way as sun-dried tomatoes (compare the two images below – I know which ones I would prefer!). However, the verb to blush is intransitive, which means it shouldn’t be allowed to take an object. We can say ‘the sun dried the tomatoes’, but we can’t say ‘the sun blushed the tomatoes’ (and perhaps this is why the term ‘sunblush’ is also quite common nowadays). But by analogy to things that have been sun-dried or, more poetically, sun-kissed, it just works.

Shrivelled sun-dried tomatoes vs. juicy sun-blushed tomatoes

And if you’re Nigella, of course, you might take this process one step further and come up with your own recipe for moonblush tomatoes. These are tomatoes that have been cooked overnight (hence the reference to the moon) in the residual heat of a cooling oven (NB. there are no known cases of anyone having successfully used this method of cooking tomatoes prior to sundown). Google the term ‘moonblush’ and you’ll get 174,000 hits, a vast number of which will reference Nigella Lawson in some way, showing just how unique the word is!

Yet another category of new words are those which, on the surface, appear to follow some rule of word formation in the language, but actually leave you scratching your head when you encounter them for the first time, wondering what they mean. This scenario is often symptomatic of the word having been purposefully coined by someone, say for marketing purposes, who didn’t foresee the potential confusion.

Postcrete = fence post concrete

On a recent trip to a DIY store, I spotted big bags of postcrete. Since I wasn’t there to buy said product, I could have just ignored it, but as a linguist I am, unfortunately, subject to the occupational hazard of being unable to go about my daily life without questioning such things. I realised it had something to do with concrete, for obvious reasons – well, I suppose it could have been somehow related to Crete – and so began thinking to myself ‘I wonder what is used before that?’ I’d assumed the post part of the word was being used as a prefix indicating ‘after in time or in order’. Only later did I learn it was a special fast-setting concrete for bedding in fence posts!

Thinking of detoxing? Be prepared!

Similar confusion ensued when a colleague saw an advertisement which said “why detox, when you can pretox?” Presumably by analogy with detox, itself a relatively new word meaning the removal of toxins from one’s body, it did at first glance seem like the advert was recommending the opposite, i.e. to add toxins to one’s body. Using pretox as a verb probably contributed to the confusion, since words beginning with pre in English are almost invariably verbs meaning do x prior to something else (e.g. precook, preboard, prebook).

Finally, there will always be new words that we have never heard before and whose meaning we are unable to deduce from our existing knowledge of the language. I experienced this just two days ago when the word peng was mentioned in a TV commercial. Fortunately, in this digital age, those of us who are more chronologically gifted than secondary school pupils have the Urban Dictionary on hand to help out.

So, while new words may arise for all manner of reasons and in all manner of contexts, perhaps the most remarkable thing about them is our (almost) unfailing capacity to understand them despite never having heard them uttered before.

How to count to 1296 in Ngkolmpu


In order to feed his family for the year, and prove himself a worthy man, a man living in southern New Guinea is expected to grow 1296 yams (Dioscorea sp.) each season. In Ngkolmpu, a language spoken by around 200 people who live in this region in a single village 15 km inside the Indonesian side of the border between West Papua and Papua New Guinea, there is a single word for this number: ntamnao.

To speakers of English, this seems like an arbitrarily specific number; yet to Ngkolmpu speakers it’s perfectly natural. Ngkolmpu, along with most of its related languages, has what is known as a senary numeral system, also known as a base-six system. In English, we use a decimal system, which is based on recursions of ten units, while senary systems are based around recursions of six. In Ngkolmpu, the words for one to six are naempr, yempoka, yuow, eser, tampui and traowo. Seven is naempr traowo naempr or ‘one six and one’; thirteen is yempoka traowo naempr or ‘two six and one’. You should be starting to see the pattern now. But what happens when you get to six groups of six, i.e. 6² or 36? Well, there is a specific word for that: ptae. In fact, in Ngkolmpu there are words for 6², 6³, 6⁴ and 6⁵. That’s all the way up to 7776! The related language Komnzo even has a word, wi, which is used for 6⁶ or 46,656! If you want to learn how to count to 7776 in Ngkolmpu, the entire system is presented in Table 1.

Number   Power of six   Ngkolmpu
1                       naempr
2                       yempoka
3                       yuow
4                       eser
5                       tampui
6        6¹             traowo
7                       naempr traowo naempr
8                       naempr traowo yempoka
13                      yempoka traowo naempr
36       6²             ptae
216      6³             tarumpao
1296     6⁴             ntamnao
7776     6⁵             ulamaeke

Table 1 – Senary numerals in Ngkolmpu
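
For readers who like to see the arithmetic spelled out, here is a rough Python sketch of the pattern in Table 1. The number words come from the table; the function names (senary_digits, small_numeral) are made up for illustration. The sketch only spells numbers from 1 to 35, and it assumes that exact multiples of six such as 12 simply drop the remainder, which the table does not actually show.

```python
# Words taken from Table 1; everything else is an illustrative assumption.
UNITS = {1: "naempr", 2: "yempoka", 3: "yuow", 4: "eser", 5: "tampui"}
POWERS = {1: "traowo", 2: "ptae", 3: "tarumpao", 4: "ntamnao", 5: "ulamaeke"}


def senary_digits(n: int) -> list[int]:
    """Return the base-six digits of n, most significant digit first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n:
        n, remainder = divmod(n, 6)
        digits.append(remainder)
    return digits[::-1]


def small_numeral(n: int) -> str:
    """Spell 1..35 using the 'X traowo Y' pattern seen for 7, 8 and 13."""
    if not 1 <= n <= 35:
        raise ValueError("this sketch only covers 1..35")
    sixes, rest = divmod(n, 6)
    if sixes == 0:
        return UNITS[rest]
    if n == 6:
        return POWERS[1]  # the table gives bare 'traowo' for six
    parts = [UNITS[sixes], POWERS[1]]
    if rest:
        parts.append(UNITS[rest])
    # Assumption: exact multiples like 12 come out as 'yempoka traowo'.
    return " ".join(parts)


if __name__ == "__main__":
    for n in (7, 8, 13):
        print(n, small_numeral(n))
    for n in (36, 216, 1296, 7776):
        print(n, senary_digits(n))
```

Running senary_digits on 36, 216, 1296 and 7776 gives a 1 followed only by zeros each time, which is the base-six equivalent of our 10, 100, 1000 and so on. That is why each of these numbers can get a single word of its own.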

While we are used to decimal counting systems in English, lots of languages around the world use different systems. What is remarkable is that these senary systems are essentially unique to the southern New Guinea region. As far as we know, the only languages which use base-six are found in this region. Ndom, a language completely unrelated to Ngkolmpu and spoken on Yos Sudarso Island around 250 km away, has a sort of light base-six system. Ndom displays unique words for the numbers one to six, but no words for higher terms and no way to construct them from lower numerals; this is what is known as a ‘restricted numeral system.’ As far as we know, the complex base-six system that we see in Ngkolmpu and its relatives is an entirely unique development. This then raises a crucial question: How and why did such a system emerge?

Pic 1 – Yams and plantains for distribution after a feast

This is a hard question to answer. The leading theory on this is based on the primary use of the counting systems: yam tallying. In the communities of southern New Guinea, the various species of Dioscorea, aka yams, are extremely important for every part of life. They are the primary food staple and, as we said before, the general consensus is that it takes a ntamnao of yams to feed a family for a year. Good yam gardeners count their yams to ensure they have enough food for the year, but just as importantly for the bragging rights that accompany being a good gardener. Additionally, yams serve many ceremonial roles; for instance, a wedding feast can’t be held without a ntamnao of yams, which are meticulously counted, brought to the bride’s village and counted again with all parties present. Smaller feasts might require a tarumpao (216), which are counted and distributed to participants as in Picture 1. The significance of counting yams in these cultures has been hypothesised as the motivation for the development of this counting system, something we don’t really see anywhere else in the world. The next question is why base six and not some other number? Well, the main yams consumed in this region are teardrop shaped, with a round end and a narrow end. When placed into small piles, these naturally fall into neat piles of 6 (Picture 2). This provides a motivation for a specifically base-six system and supports the claim that the numeral system emerged through the practice of tallying yams.

Pic 2 – 6 yams in a pile

The Ngkolmpu system only has numerals up to 7776 but hypothetically could be used to count to any number. Numeral systems of this type are known as ‘unrestricted numeral systems.’ We take this for granted in English, but in smaller communities these are typically not that common. For example, Marind, a culturally dominant language spoken by around 9000 people in the same region as Ngkolmpu, has words for one and two only. Counting is done by counting fingers and toes, without any productive means for extending beyond that. Similar are the body part tallies of New Guinea, such as the Oksapmin body part tally, where one can count up to 27 by listing names for places along the fingers, hands, arms and head (Picture 4). This is very different to the Ngkolmpu system as we see in Table 1.

Pic 4 – Oksapmin body tally system

It was previously thought that unrestricted numeral systems could only develop in cultures which had sufficient organisational bureaucracy to warrant such a system. What the southern New Guinea situation shows is that the agrarian practices of yam cultivation, under certain conditions, also allow for the development of advanced counting systems. So it looks like if people want to count something enough, they can develop the systems to do so, which is remarkable.

The next time you have to count something up in multiples of six, spare a thought for the Ngkolmpu and their wonderful counting system.