Making cuts in the wrong places

When you want to look up a word, how do you go about it? The dictionary is organised by the first letter of the word, so that is what you consider first. And when you want to compare languages, what is the first thing to catch your eye? Again, the first sound. Thus, when looking at a set of words like English fish, father, full, Latin piscis, pater, plenus and Scottish Gaelic iasg, athair, làn, the fact that f- in English corresponds to p- in Latin and zero in Scottish Gaelic springs immediately to our attention, reading as we do from left to right.

Thus, we might presume that the beginning of a word is somehow especially stable, and that sounds which appear at the beginning of a word are a good first indicator of etymology. However, the beginning of a word is in fact not so immutable as you might suppose. Famously, Celtic languages have initial consonant mutations, which alter the initial consonant of a word in regular ways depending on grammatical context. So in Welsh, while ‘Wales’ is Cymru, ‘Welcome to Wales’ is Croeso i Gymru, ‘in Wales’ is yng Nghymru and ‘England and Wales’ is Lloegr a Chymru. This is interesting enough, but not the only way that the start of a word may be altered in languages. Indeed, we don’t even have to leave English to find examples of a different phenomenon that can take place in the history of an individual word.

Let us take a word like adder (the snake specifically, not someone that does addition!). We can look for cognates in closely-related languages, but we are immediately presented with a problem: German Natter, Frisian njirre and Icelandic naðra all seem like they should be related (all being words for ‘snake’), but what’s with this n- at the beginning of the word? Things only get more confusing when we notice words like Latin natrix ‘watersnake’, Welsh neidr or Scottish Gaelic nathair, all again showing an n-. Finally, when we look at Old English we find that the word there is næddre! What’s going on? We know that in general English n- doesn’t do anything particularly strange and it certainly doesn’t just disappear from the beginnings of words, as evidenced by numerous forms like name, night, nest, new, and nine which have had an n- since Proto-Indo-European!

The answer lies in a phenomenon that linguists call ‘rebracketing’. This is a fairly straightforward notion; linguists already make use of brackets to show the internal structure of phrases, thus any change in the structure of the phrase is notated by a change in the arrangement of the brackets. (It will be noted that some authors, including the Oxford English Dictionary, use the term metanalysis instead, but the meaning is the same.)

In the case of adder, the confusion comes from the indefinite article, which in English is a before words beginning with a consonant and an before words beginning with a vowel. Thus, if a word begins with an n-, this can find itself being rebracketed onto the indefinite article: thus [a [nadder]] becomes [a-n [adder]]. And this isn’t the only word where this has happened in English either: thus [a [napron]] (from French napperon) became [a-n [apron]]. On the flipside, the opposite is also found, where the -n from the indefinite article finds itself attached to the front of a word that originally began with a vowel, e.g. [an [ewt]] → [a [n-ewt]] or [an [ekename]] → [a [n-ickname]].
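The two directions of rebracketing described above can be sketched in a few lines of Python. This is only a toy illustration working on modern spellings: the function name rebracket_article and the vowel test are assumptions made for the example, not a model of how the change actually spread among speakers.

```python
VOWELS = set("aeiou")


def rebracket_article(article: str, noun: str) -> tuple[str, str]:
    """Shift an n across the boundary between indefinite article and noun.

    'a' + n-initial noun     -> 'an' + noun without its n  (a nadder -> an adder)
    'an' + vowel-initial noun -> 'a' + n + noun             (an ewt -> a newt)
    Any other combination is returned unchanged.
    """
    if article == "a" and len(noun) > 1 and noun[0] == "n" and noun[1] in VOWELS:
        return "an", noun[1:]
    if article == "an" and noun and noun[0] in VOWELS:
        return "a", "n" + noun
    return article, noun


if __name__ == "__main__":
    for article, noun in [("a", "nadder"), ("a", "napron"), ("an", "ewt")]:
        new_article, new_noun = rebracket_article(article, noun)
        # Written without a space, the two bracketings are the same string,
        # which is why a listener cannot tell them apart.
        assert article + noun == new_article + new_noun
        print(f"[{article} [{noun}]] -> [{new_article} [{new_noun}]]")
```

The assertion inside the loop is the point of the sketch: the sequence of sounds stays the same, and only the position of the word boundary moves.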

A newt crawling over moss.
An ewt!

Some of these forms have since become the predominant forms of their respective words, but such is not always the case. For example, uncle derives from a French word oncle, ultimately from Latin avunculus. However, those who are familiar with their Shakespeare will remember the Fool in King Lear, who refers to the title character as ‘nuncle’. Here the reanalysis, rather than from the indefinite article, seems to have been on the basis of possessive pronouns mine and thine, which are particularly frequently used with kin terms: thus [mine [uncle]] becomes [my [nuncle]]. Yet, unlike with the other examples, this has not stuck around, perhaps because the other possessive pronouns (his, her, our, your, their) would not have motivated this reanalysis; thus the original uncle stuck around and was able to reassert itself.

Nor is English alone in exhibiting these kinds of change. In the adder~nadder case, the same reanalysis has also taken place in Dutch and Low German, also spelt adder in both cases. Similarly, Arabic nāranj was borrowed into Spanish as naranja, but this underwent rebracketing when it was borrowed into Italian as arancia, and it was from there that the word spread to the rest of Europe, including English orange.

French provides us with an especially interesting example of layered reanalyses in a single word. In Old French, unicorne was reanalysed as beginning with the indefinite article (which is in a sense not incorrect: the literal meaning of the word is ‘one-horn’ and ‘one’ is the source of the French indefinite article, as well as indefinite articles in general cross-linguistically). This left a form icorne, which would contract with the definite article, giving l’icorne ‘the unicorn’. However, at some point, this contracted form with the article came to be reanalysed as the base of the noun itself, with the result that licorne is now simply the French for ‘unicorn’, leading to constructions such as la licorne ‘the unicorn’ where a historical definite article appears ‘doubled up’!

Some of the most complex cases of rebracketing can be found in Scottish Gaelic. Here we have a number of potential sources of rebracketing, both because the definite article changes depending on the following noun and because of the interaction of the definite article and the mutation system.

Firstly, with vowel-initial masculine nouns the definite article prefixes a t-, e.g. eun ‘bird’ but an t-eun ‘the bird’. Unsurprisingly, based on the examples we have seen above, this prefixed t- has in many cases become attached to the noun. Interestingly, this is particularly common in loanwords from Old Norse, such as talla ‘hall’ from hǫll, tòb ‘small bay’ from hóp (òb is also common) and tolm ‘small islet’ from holmr, as well as other loans such as taigeis ‘haggis’ and tobha ‘hoe’ from English.

In a similar vein, one of the components of consonant mutation in Scottish Gaelic is that an f sound disappears (though it is still written as fh). As a result, a large number of words that began with vowels in Old Irish have acquired an f- in Scottish Gaelic, e.g. áinne ‘ring’, uar ‘cold’ and íaru ‘squirrel’ have become fáinne, fuar and feòrag respectively, as if an áinne uar ‘the cold ring’ was really an fháinne fhuar. Many of the words have undergone the same kinds of changes in Irish and Manx, though not all languages agree on which (e.g. Irish also has fáinne and fuar, but iora for ‘squirrel’).

And, as in English, words that begin with n- can find this consonant being rebracketed as part of the article an. However, once this n- has been rebracketed, the now vowel-initial word can undergo the same kinds of mutation-based reshaping as an originally vowel-initial word. Perhaps the most extreme example of this is ‘nettle’, which was nenaid in Old Irish, but in Scottish Gaelic can be (depending on who you ask) any of neanntag, eanntag (with the n- rebracketed away), feanntag (with the f- appended by lenition reversal) and deanntag (where the d- is apparently a hypercorrective reversal of a process of nasalisation in the Northwestern dialects)!

A bed of nettles
neanntag, eanntag, feanntag or deanntag?

So, when searching around for a word in a dictionary or an old text, be cautious; simply looking for the first consonant to give you a clue might be misleading when taken out of context. Furthermore, instances like these make clear that language is primarily a spoken phenomenon and the kinds of changes that we see reflect that: while in a written text the difference between a newt and an ewt is obvious, in spoken language the question of where one word ends and the next begins is not so straightforward as a casual glance at a dictionary might suggest. Perhaps this should then make us ponder further how much written language is a direct reflection of spoken language versus being at least partially arbitrary choices made by the writers.

Royal Rules on Rich Rectors

Last month, the Guildford Shakespeare Company put on a production of Richard II, a fascinating tale of political strife and the perils of having a leader lacking in competence when the country is in crisis. Sound familiar? In any case, this got me thinking about the name Richard and its many etymological links.

First, the name Richard itself. It’s borrowed from French, but it didn’t start there. In fact it is one of a number of French words that were borrowed from Germanic, deriving from Frankish *Rīkahard, meaning ‘hard/brave king’. This also gives modern German Richard and through the travels of the Goths and Vandals also made its way into Spanish as Ricardo and Italian as Riccardo. The first part of this name, the *rīk- ‘ruler’ part, in other derivations also gives words like German Reich and Dutch rijk, both meaning ‘empire’ or ‘kingdom’, which in English is also found as the ‘domain, kingdom’ suffix -ry, as in Jewry ‘the Kingdom of the Jews’. A different derivation again gives us English rich, something you’d rather expect a king to be. As a component of names it is ubiquitous in Germanic, such as in Old English Godric ‘God(ly) king’, Wulfric ‘Wolf-king’ and Theodric ‘King of the people’. This last one turns up in German as Dietrich and, again courtesy of the Franks, through French Thierry comes into English as Terry (see also my previous post on the Germans for more on this Theod-).

But it is not only Germanic languages that have this root. Indeed, some form of it crops up across the Indo-European language family, usually meaning something like ‘king’ or ‘ruler’. In Celtic (from which Germanic likely borrowed the rīk- words) we find e.g. rí in Irish and rhi in Welsh, both meaning ‘king’. In Gaulish, rulers such as Vercingetorix and Ambiorix had an earlier form -rix as part of their names, and in a reduced form we find the same in the Welsh surname Tudor, originally meaning ‘ruler of the people’ and thus cognate with Theodric/Dietrich/Terry.

In Latin too we find rēx, again meaning ‘king’ or ‘ruler’. This form survives as such in many modern Romance languages, for example Spanish rey and French roi. We also get two separate adjectives in English: regal from Latin and royal from French. Further afield, we find this word cropping up as far away as India, in the form of Sanskrit rāja, once again a ‘king’ word, as well as rāṣṭrá, a ‘kingdom’.

All of these forms can be traced back to a form in Proto-Indo-European (the reconstructed ancestor of all of these languages), which we represent as *h3rḗǵs. In the terminology of Indo-European studies this is an ‘athematic root noun’, meaning a short root without additional derivational suffixes onto which inflectional endings such as the nominative singular *-s are suffixed directly, rather than having an additional ‘theme vowel’ *-o in between. As with many such forms in Proto-Indo-European, when we isolate the root itself, *h3reg-, which probably meant something like ‘stretch out the arm, direct’, we can find even more related derivations.

Adding a thematic vowel *-e/o- we get a verb which shows up in Latin as regō ‘rule, govern, direct’, along with an array of derived nouns which we have in English. We have the agent noun rector, the instrument noun rule (from a French reflex of Latin rēgula) and the abstract noun regimen. Additionally, we have prefixed verbs such as dīrigō, ērigō and corrigō, which through their respective supine forms dīrēctum, ērēctum and corrēctum give us English ‘direct’, ‘erect’ and ‘correct’ respectively.

Germanic, meanwhile, provides us with a different set of reflexes of this verb. While we have already seen the rich set relating to wealth and kingship, the ‘straighten’ meaning of *h3reg- results in other interesting links. We have the (originally separate) verb and noun rake, a device for making straight lines, and the former participle right, originally meaning ‘straightened, directed’. Then we have reckon, perhaps a natural extension of the metaphor of lining things up in order to count them. Finally, from a causative ‘make straighten up’ we have reach (as if ‘straightening out one’s arm’).

This here is the greatest joy of etymology for me; by untangling these webs of relationships, we can show how so much of our vocabulary results from variations upon a common root. It reminds us of the continual creativity involved in using language and, by extension, the creativity of language users, i.e., humans.

Remember, remember

A lot of the work that linguists do involves taking a language as it is spoken at a particular time, finding generalizations about how it operates, and coming up with abstractions to make sense of them. In English, for example, we identify a category of ‘number’ (with possible values ‘singular’ and ‘plural’); and we do that because in many ways the relationship between cat and cats is the same as that between mouse and mice, man and men, and so on, meaning that it would be useful to treat all of these pairings as specific examples of a more general phenomenon. We can then make the further generalization that whatever this linguistic concept of ‘number’ really is, it is not only relevant to nouns but also to verbs, and to some other items too – because English speakers all know that this cat scratches whereas these cats scratch, and you can’t have any other combination like *these cat scratch.

A black cat wearing bat wings for Halloween
This bat scratches

Once you start looking, you discover layer upon layer of generalizations like these, and you need more and more abstractions in order to take care of them all. This all gives rise to a view of language as a kind of machine built out of abstract principles, all coexisting at the same time inside a speaker’s head. On that basis, we can ask questions like: are there any principles that all languages use? Does having pattern X always go along with having pattern Y? Are there any generalizations that you can easily come up with, but that turn out not to be found anywhere? What does all this tell us about human psychology?

But that is not the only approach to language we could take. While we can point to a general principle of English to explain what is wrong with these cat, there is no similar principle explaining why we refer to the meowing, purring, scratching creature as a cat in the first place. The word cat has nothing feline about it, and the fact that we use that sequence of sounds – rather than e.g. tac – is not based on some higher-level truth that applies for all English speakers right now: instead, the ‘explanation’ is rooted in the fact that this is the word we happened to inherit from earlier generations of speakers.

Portrait photo of General Burnside, featuring his famous sideburns
General Ambrose Burnside (1824-1881)

So studying the etymology of individual words serves as a good reminder that as well as an abstract, principled system residing in human minds, every language is also a contingent historical artefact, shaped by the peoples and cultures of the past.1 Nothing makes this more obvious than the continued existence of ordinary vocabulary items that commemorate individuals from centuries gone by – often without modern-day speakers even knowing it. In English, sandwiches are named after the Earl of Sandwich, wellingtons are named after the Duke of Wellington, and cardigans are named after the Earl of Cardigan; and the parallelism here says something about the locus of cultural influence in Georgian and Victorian Britain. More cryptically, sideburns owe their name to a General Burnside of the US Army, justly famed for his facial hair; algorithms celebrate the Persian mathematician al-Khwarizmi; and Duns Scotus, although a towering figure of medieval philosophy, now lives on in the word dunce popularized by his academic opponents.2

But which historical figure has had the greatest success of all in getting his name woven into the fabric of modern English? I reckon that, against all the odds, it could well be this Guy.

A close up of the face of Guy Fawkes, labelled Guido Fawkes, from a depiction of several conspirators together

While all English speakers are familiar with the word guy as an informal word corresponding to man, probably not that many know that it can be traced back to a historical figure from 400 years ago who, in a modern context, would be called a religious terrorist. Guy Fawkes was one of the conspirators in the ‘Gunpowder Plot’ of November 1605: with the aim of installing a Catholic monarchy, they planned to assassinate England’s Protestant king, James I, by blowing up Parliament with him inside. Fawkes was not one of the leaders of the conspiracy, but he was the one caught red-handed with the gunpowder; as a result, one cultural legacy of the plot’s failure is the celebration every 5th November (principally in the UK) of Guy Fawkes Night, which commonly involves letting off fireworks and setting a bonfire on which a crude effigy of Fawkes was traditionally burnt.

But how did the name of one specific Guy, for a while the most detested man in the English-speaking world, end up becoming a ubiquitous informal term applying to any man? The crucial factor is the effigy. It is unsurprising that this came to be called a Guy, ‘in honour’ of the man himself; but by the 19th century, the word was also being used to refer to actual men who dressed badly enough to earn the same label, in the way one might jokingly liken someone to a scarecrow (one British woman writing home from Madras in 1836 commented: ‘The gentlemen are all ‘rigged Tropical’,… grisly Guys some of them turn out!’). It is not a big step from there to using guy as a humorous and, eventually, just a colloquial word for men in general.3

Procession of a Guy (1864)

And of course the story does not stop there. While a guy is still almost always a man, for many speakers the plural guys can now refer to people in general, especially as a term of address. The idea that a word with such unambiguously masculine origins could ever be treated as gender-neutral has been something of a talking point in recent years, as in this article from The Atlantic about the rights and wrongs of greeting women with a friendly ‘hey guys’; but the fact that it is debated at all shows that it is happening. In fact, there is good reason to think that in some varieties of English, you-guys is being adopted as a plural form of the personal pronoun you: one piece of evidence is the existence of special possessive forms like your-guys’s, a distinctively plural version of your.

It is interesting to notice that the rise of non-standard you-guys, not unlike y’all and youse, goes some way towards ‘fixing’ an anomaly within modern English as a system: almost all nouns, and all other personal pronouns, have distinct singular and plural forms, whereas the standard language currently has the same form you doing double duty as both singular and plural. Any one of these plural versions of you might eventually win out, further strengthening the (already pretty reliable) generalization that English singulars and plurals are formally distinct. This just goes to show that the two ways of looking at language – as a synchronic system, and as a historical object – need to complement each other if we really want to understand what is going on. At the same time, it is fun to think of linguists of the distant future researching the poorly attested Ancient English language of the twenty-second century, and wondering where the mysterious personal pronoun yugaiz came from. Would anyone who didn’t know the facts dare to suggest that the second syllable of this gender-neutral plural pronoun came from the given name of a singular male criminal, executed many centuries before?

  1. For example, cat itself seems to be traceable back to an ancient language of North Africa, reflecting the fact that cats were household animals among the Egyptians for millennia before they became popular mousers in Europe.
  2. Of course, it is no accident that all of these examples feature men. Relatively few women in history have had the opportunity to turn into items of English vocabulary; in fact, fictional female characters – largely from classical mythology – have had much greater success, giving us e.g. calypso, rhea and Europe.
  3. A similar thing also happened to the word joker in the 19th century, though it didn’t get as far as guy: that suggests that sentences containing guy would once have had the same ring to them as Who’s this joker?; and then some joker turns up and says…
Is twote the past of tweet?

Have you ever encountered the form twote as a past tense of the verb to tweet? It is something of a meme on Twitter, and a live example of analogy (and its mysteries). However surprising the form may sound if you have never encountered it, it has been the prescribed one for a long time:

Ten years later, the question popped up among a linguisty Twitter crowd, where a poll again elected twote as the correct form:

It is clear that this unusual form replacing tweeted is some sort of joke, but why specifically twote? I saw here and there a reference to the verb to yeet, a slang verb very popular on the internet and meaning more or less “to throw”. Rather than a regular form yeeted, the past for to yeet is often taken to be yote. The choice of an irregular form is probably meant to produce a comedic effect.

This, precisely, is analogical production: creating a new form (twote) by extending a contrast seen in other words (yeet/yote). Analogy is a central topic in my research. I have been trying to answer questions such as: How do we decide what form to use? How difficult is it to guess? How does this contribute to language change?

But first, have you answered the poll?

What is the past tense of “to tweet”?

To investigate further why we would say twote rather than tweeted, I took out my PhD software (Qumin). Based on 6064 examples of English verbs1, I asked Qumin to produce and rank possible past forms of tweet2. To do so, it read through examples to construct analogical rules (I call them patterns), then evaluated the probability of each rule among the words which sound like tweet.

Qumin found four options3: tweeted (/twiːtɪd/), by analogy with 32 similar words, such as greet/greeted; twet (/twɛt/), by analogy with words like meet/met; tweet (/twiːt/), by analogy with words like beat/beat; and finally twote (/twəˑʊt/), by analogy with yeet. Figure 1 provides their ranking (in ascending order) according to Qumin, with the associated probabilities.

Twote 0.028 < tweet 0.056 < twet 0.056 < tweeted 0.86
Figure 1. Qumin’s ranking of the probability for potential past forms of to tweet
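To make the idea of ranking candidates by analogy more concrete, here is a minimal sketch in Python. It is not Qumin and does not reproduce its algorithm: it works on spelling rather than sounds, the handful of neighbour verbs is hand-picked for illustration (not taken from the CELEX data), and the names pattern and rank_past_forms are invented for this example. Each neighbour "votes" for the past form its own present/past alternation would produce for tweet, and the votes are normalised into proportions, so the numbers will not match Figure 1.

```python
from collections import Counter
from os.path import commonprefix

# Hypothetical, hand-picked neighbours of "tweet" (illustration only).
NEIGHBOURS = [
    ("greet", "greeted"), ("seat", "seated"), ("heat", "heated"),
    ("treat", "treated"), ("cheat", "cheated"),
    ("meet", "met"), ("beat", "beat"), ("yeet", "yote"), ("eat", "ate"),
]


def pattern(present: str, past: str) -> tuple[str, str]:
    """Return the (present ending, past ending) left after the shared stem."""
    stem = commonprefix([present, past])
    return present[len(stem):], past[len(stem):]


def rank_past_forms(verb: str, lexicon: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Rank candidate past forms of `verb` by the share of neighbours supporting them."""
    votes = Counter()
    for present, past in lexicon:
        old, new = pattern(present, past)
        if verb.endswith(old):  # the alternation is applicable to this verb
            votes[verb[:len(verb) - len(old)] + new] += 1
    total = sum(votes.values())
    return sorted(((form, n / total) for form, n in votes.items()),
                  key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for form, share in rank_past_forms("tweet", NEIGHBOURS):
        print(f"{form}: {share:.3f}")
```

With this toy lexicon the regular tweeted collects most of the votes, while twet, tweet and twote are each supported by a single neighbour; eat/ate contributes nothing because its alternation cannot apply to tweet.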

As we can see, Qumin finds twote to be the least likely solution. This is a reasonable position overall (indeed, tweeted is the regular form), so why would both the official Twitter account and many Twitter users (including several linguists) prefer twote to tweeted?

But Qumin has no idea what is cool, a factor which makes yeet/yote (already a slang word, used on the internet) a particularly appealing choice. Moreover, Qumin has no access to semantic similarity, which could also play a role. Verbs that have similar meanings can be preferred as support for the analogy. In the current case, both speak/spoke and write/wrote have similar pasts to twote, which might help make it sound acceptable. Some speakers seem to be aware of these factors, as seen in the tweet above.

What about usage?

Are most speakers aware of the variant twote and using it? Before concluding that the model is mistaken, we need to observe what speakers actually use. Indeed, only usage truly determines “what is the past of tweet”. For this, I turn to (automatically) sifting through Twitter data.

Speakers must choose between tweeted or twote: what a dilemma!

A few problems: first, the form “tweet” is also a noun, and identical to the present tense of the verb. Second, “twet” is attested (sometimes as “twett”), but mostly as a synonym for the noun “tweet” (often in a playful “lolcat” style), or as a present verbal form, with a few exceptions, usually of a meta nature (see tweets below). I couldn’t find a way to automatically distinguish these from past forms while also managing within the Twitter API limits. Thus, I left out both from the search entirely. This leaves only our two main contestants.

 

I extracted as many recent tweets containing tweeted or twote as Twitter would let me — around 300 000 tweets twotten between the 26th of August and the 3rd of September. 186777 tweets remained after refining the search4. Of these, less than 5000 contain twote:

There were more than 180000 occurrences of tweeted and less than 5000 of twote in the past few days.
Counts of tweets containing either of two possible pasts for the verb “to tweet” in the past few days on Twitter (mentions excluded).

As you can see, the tweeted bar completely dwarfs the other one. However amusing and fitting twote may be, and despite @Twitter’s prescription (but conforming with Qumin’s prediction), the regular past form is by far the most used, even on the platform itself, which lends itself to playful and impactful statements. This easily closes this particular English Past Tense Debate. If only it were always this simple!
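The exclusion rules described in footnote 4 can be approximated with a few regular expressions. The sketch below is an assumption-laden reconstruction, not the code actually used for this post: the function names, the list of alternate forms, the reading of "with a hashtag" as the form itself being hashtagged, and the treatment of "an adverb" as any single intervening word are all guesses made for illustration. The example tweets in the usage section are invented.

```python
import re
from collections import Counter

FORMS = ("tweeted", "twote")
ALTERNATES = ("tweeted", "twote", "twet")  # assumed list of competing forms
QUOTES = "\"'“”‘’"
AUX = r"(?:has|have|had|is|are|was|were)"


def is_usage(text: str, form: str) -> bool:
    """True if `text` uses `form` as a past tense rather than mentioning it."""
    lowered = text.lower()
    if not re.search(rf"\b{form}\b", lowered):
        return False
    if re.search(rf"[{QUOTES}]{form}[{QUOTES}]", lowered):  # wrapped in quotations
        return False
    if re.search(rf"#{form}\b", lowered):  # used as a hashtag (assumed reading)
        return False
    if any(re.search(rf"\b{other}\b", lowered) for other in ALTERNATES if other != form):
        return False  # co-occurs with an alternate form
    if "past tense" in lowered:  # explicit discussion of the past tense
        return False
    # Likely past participle: auxiliary, optionally one intervening word, then the form.
    if re.search(rf"\b{AUX}\s+(?:\w+\s+)?{form}\b", lowered):
        return False
    return True


def count_usages(tweets: list[str]) -> Counter:
    """Count, for each form, the tweets in which it survives all the filters."""
    counts = Counter()
    for text in tweets:
        for form in FORMS:
            if is_usage(text, form):
                counts[form] += 1
    return counts


if __name__ == "__main__":
    sample = [  # invented examples
        "I twote about it yesterday",
        'Is "twote" a word?',
        "She has already tweeted that",
        "He tweeted a photo",
    ]
    print(count_usages(sample))
```

Treating any single word after an auxiliary as an adverb will over-exclude some genuine past tenses, which seems an acceptable trade-off when the goal is only to keep likely participles out of the tweeted count.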

  1. The English verb data I used includes only the present and past tenses, and is derived from the CELEX2 dataset, as used in my PhD dissertation and manually supplemented by the forms for “yeet”. The CELEX2 dataset is commercial, and I cannot distribute it.
  2. The code I used for this blog post is available here, but not the dataset itself. Note that for scientific reasons I won’t discuss here, this software works on sounds, not orthography.
  3. One last possibility has been ignored by this polite software, a form which follows the pattern of sit/sat. I see it used from time to time for its comic effect, but it does not seem at all frequent enough to be a real contestant (and I do not recommend searching this keyword on Twitter).
  4. Since there has been a lot of discussion on the correct form, I exclude all clear cases of mentions. I count as mentions any occurrences wrapped in quotations, co-occurring with alternate forms, mentioning past tense, or with a hashtag. Moreover, with the forms in –ed, it is likely that the past participle would be identical, but for twote, the past participle could well be twotten. To reduce the bias due to the presence of more past participles in the usage of tweeted, I also exclude all contexts where the word is preceded by the auxiliary forms has, have, had, is, are, was, were, possibly separated by an adverb.
Christmas Gifts

Recently, a friend of mine received an email saying that because of their hard work in difficult circumstances this year, he and his colleagues would all be “gifted” a few extra days off over Christmas. And the other day I saw someone else wondering on Facebook: ‘when did the word “given” cease to exist, and why is everything “gifted” now?’ So with the festive season fast approaching, it seems like a good time to ask: is there really something funny going on with the word gift?

Once you gift it a bit of thought, I don’t think I am gifting anything away by pointing out that the verb to give is still very much with us. But the rise of a rival verb to gift, in some contexts where you’d expect to give, has been receiving attention for a while now: in recent years it has been discussed on National Public Radio in the US (The Season of Gifting) and in The Atlantic magazine (‘Gift’ is Not a Verb). Whether or not it bothers you personally, you may well have noticed the trend. The existence of gift as a noun is just a mundane fact of life, but apparently the corresponding verb gets people talking.

Gifted children

Now, nobody would be surprised to learn that English changes over time, or even that it has pairs of words that mean more or less the same thing… how much difference is there between liberty and freedom, or between little and small? And in fact, synonyms have an important role to play in language change. If we look back and notice that one expression has been replaced by another – a historical change in the vocabulary, as when the Shakespearian anon gave way to at once – then there must have been an intervening period when they were both around with pretty much the same meaning, and people had a choice of which one to use.

Does that mean that we do now find ourselves in the very early stages of a long historical process which will eventually result in to gift replacing to give altogether? If that’s the case, in a few generations’ time people will be saying things like ‘Never gift up!’ or ‘Could you gift me a hand?’.

Frankly, my dear, I don’t gift a damn

But whatever happens in the future, that clearly isn’t the situation now. So if English often provides multiple ways of saying the same thing, why have people taken the coexistence of to give and to gift as something to get worked up about – and can linguistics shed any light on what is going on here?

One thing that makes this specific pairing stand out is that the two words are just so similar. Gift is obviously connected with give in the first place: that makes it easy to wonder why anyone would bother to avoid the obvious word, only to pick an almost identical one. Another factor (as the title of The Atlantic article makes clear) is the idea that gift is really a noun, and so people shouldn’t go around using it as a verb.

But if we take a broader view, it turns out that what is happening with to gift is not out of the ordinary. Instead, it fits neatly with some things that linguists have already noticed about English and about language change more generally. For one thing, English is very good at ‘using nouns as verbs’ – which is why we can hammer (verb) with a hammer (noun), fish (verb) for fish (noun), and so on. So a verb gift, meaning ‘give as a gift’, goes well with what the language already does. What often happens is that when a new verb of this kind starts to take off, not all speakers are happy about it, but after a while it gains acceptance. For example, the twentieth century saw complaints about verbs-from-nouns such as to host, to access or to showcase, but they grate less on people nowadays.

You could even try hammering with a fish!

Ultimately, the ability to create words like this is just an ‘accidental’ fact about English, which also has various other ways of making verbs from nouns – for example, turning X into ‘X-ify’ (person-ify, object-ify) or ‘be-X’ (be-friend, be-witch). The bigger question may be: as we already have the verb give, why would anyone bother to make a verb gift in the first place, and why would it ever catch on? It might seem that by definition, a gift is something you give, so inventing a term meaning ‘give as a gift’ is pointless.

But that is not how things really are. Gifts are given, but that doesn’t mean that everything that can be given counts as a gift: a traffic warden might give you a parking ticket and in return you might give him a piece of your mind, but the noun gift doesn’t cover either of those things. Among other restrictions on its use, it is generally associated with positive feelings: if you give something as a gift, it is usually something tangible that you expect to be warmly received, and that carries over into the verb to gift itself.

This subtle difference between to give and to gift explains why for the moment it is impossible to gift someone a sidelong glance, or lots of extra work to do. But apparently it is becoming possible to gift an employee some time off, even though that is not a physical present that can be handed over and unwrapped. Evidently, the writer just felt like using a verb that sounded a bit more interesting and positive than to give, and the ‘warmly received’ part of the meaning was enough to outweigh the lack of any tangible object involved.

This is an example of something that happens all the time in language change. Naturally, while a word is still restricted in its use, it is more noticeable and interesting than a word you hear regularly. As a result, sometimes people decide to go for the less common word even where it doesn’t quite belong, to achieve some kind of extra effect… but over time, this process makes the word sound less and less special, until it eventually becomes the new normal. We don’t even need to look far to find this happening precisely to the word ‘gift’ in other languages: French donner ‘give’ is based on don ‘gift’, and it has totally wiped out the normal verb for give that ‘should’ have been inherited from Latin.

So if speakers and writers of English continue to chip away at the restrictions on gift as a verb, maybe one day it really will replace give altogether. Of course, that idea sounds totally outlandish at the moment – but then, I’m sure the ancient Romans would have thought much the same thing. You never know what will happen next: language change truly is the gift that keeps on giving!

Siôn Corn: The bloke who comes down the chimney

It’s December, which means you’ve probably been bombarded with ‘Christmas cheer’ since the beginning of November. Bah humbug I say! And if you’re from down under, I feel really sorry for you having to celebrate twice a year – once in July and then again in December! You may think of me as a bit of a Scrooge spoiling all your fun but…

Speaking of Scrooge, that’s a great instance of personification, how a characteristic of a person gets attached to their name. The name is then used to refer to that characteristic. It happens a lot, just look at the recent phenomenon concerning poor Karen. Something similar happens when common and frequent names get hijacked into standing for the average Joe.

Moving on to Joe, that’s one of the many names in English used for the everyman, as in Joe Bloggs or Joe Public. Similarly, John and Jane appear in John Doe and Jane Doe, terms for an unknown person, used especially in the USA for unidentified cadavers. And in the UK, John Bull is the personification of the nation.

John Bull: the personification of the UK

And let’s not forget Jack, itself a nickname for John. Jack is found in many phrases relating to the everyman, especially in reference to someone of historically low status (hence Jack in a pack of cards being lower than the King or Queen) or in phrases about rural employment, as in lumberjack, or the Australian Jackaroo (or Jillaroo!) for someone learning to work on a sheep or cattle farm. Jack has also been extended to objects that are generally handy and helpful – such as the car jack and the jackhammer.

This brings us to the title of our post, Siôn Corn, which is the name of Santa Claus in Wales and can be translated as ‘John Stack’ (as in corn simnai ‘chimney stack’) or ‘Chimney Pot John’. Siôn is the Welsh equivalent of the everyman, and is used to mean ‘the guy’, ‘the bloke’, etc.

Siôn Corn and his Welsh dragon.

The name Siôn is used in many different phrases and is the personification of many personal characteristics.

  • Siôn Barrug ‘Jack Frost’
  • Siôn yr offis ‘personification of laziness’
  • Siôn Chwarae Teg ‘personification of fair play’
  • Siôn o’r wlad ‘itinerant worker’
  • Siôn Cwsg ‘sleepiness, or the sandman’
  • Siôn Ben Tarw ‘John Bull’
  • Look up Siôn in the dictionary of the Welsh language for many more interesting examples
    As for the use of Siôn Corn denoting the personification of yuletide, the earliest reference comes from the Welsh scholar, poet and songwriter J. G. Davies, in his 1923 children’s songbook Cerddi Huw Puw:

    The history of Sion Corn is unknown to me any further back than my father’s dialogues with him in the seventies. He was a benevolent spook, living up the chimney in comfortable apartments. He had some mysterious interest in getting children off to bed early, and a more rational habit of making presents at Christmas, as a Welsh Santa Claus. I do not know whether my father found him in Edern, his mother’s home, or invented him. Anyhow, Sion Corn has done untruthful and amiable service for two generations.

    So it seems that, before Siôn Corn took on the persona of Father Christmas, he had another job, helping to get children to bed, much like Siôn Cwsg ‘the sandman’. Though, of all the meanings that Siôn connotes, I like Siôn llygad y geiniog ‘miser’ the best. Basically, Siôn can be both Father Christmas and Scrooge at the same time – Siôn really is a Siôn pob crefft ‘a Jack of all trades’.

    What slips of the tongue can tell us about language

    “The grouchy knight cuddled the rowdy seer’s adorable puppy while devouring lasagne”

    This is probably a sentence you’ve never heard – or produced – before. Yet this experience is not novel – every day, you make utterances you’ve never heard, and understand new ones.

    Producing such utterances is not a trivial matter. To do this we have to generate them – that is, decide on the concept to be expressed, encode that into words and structures, then into the sounds that make up our words before sending instructions to our articulatory apparatus to produce the utterance. All within fractions of a second.

    Yet sometimes we make mistakes, and produce things we didn’t intend to:

    Error (the mistake we make)           Target (what we had intended to say)
    heft lemisphere                       left hemisphere
    fleaky squoor                         squeaky floor
    a leading list                        a reading list
    gave the goy                          gave the boy
    stough competition                    stiff/tough competition
    she sliced the knife with a salami    she sliced the salami with a knife
    a hole full of floors                 a floor full of holes

     

    We usually notice these errors when we make them and correct ourselves. But rather than being merely slips of the tongue, they are a goldmine of information, as they demonstrate breakdowns at various points in the speech production process.

    Some of these errors are lexical selection errors – we select the wrong lexical concept or lemma for the message we’re trying to convey. That is, we pick the wrong word from our mental dictionary. This can be simply the wrong concept, as in ‘he’s carrying a bag of cherries’ instead of ‘grapes’. Sometimes, we combine words together in blends: ‘the competition is getting a little stough’ instead of stiff or tough. Other times, we exchange words within a sentence, as in ‘she sliced the knife with a salami’, rather than ‘she sliced the salami with a knife’.

    We can also make phonological errors, that is, errors in the sound structure of our words:

    Exchanges
        heft lemisphere       left hemisphere
        fleaky squoor         squeaky floor
        cheek and ch[ɔː]se    chalk and cheese
    Additions
        enjoyding it          enjoying it
    Deletions
        cumsily               clumsily
    Anticipations
        leading list          reading list
    Perseverations
        gave the goy          gave the boy

     

    We can look at large data sets, or corpora, to see what kinds of errors are commonly made. We find that these errors are still well-formed in terms of their sound structure, or phonology. Between 60% and 90% of errors (depending on the corpus you look at) involve a single sound or segment, and these errors are sensitive to syllable structure. That is, we might swap segments from the same part of the syllable, as in exchanges:

    face spood < space food

    Or we might combine the beginning of one syllable and the end of another:

    grool < great + cool
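
To make the role of syllable structure concrete, here is a minimal Python sketch of the two patterns above. It is only an illustration under some loud assumptions: it works on spelling rather than sounds, it treats every letter before the first vowel letter as the ‘onset’, and the function names (split_onset, exchange_onsets, blend) are invented for this example.

```python
VOWEL_LETTERS = set("aeiou")


def split_onset(word):
    """Split a word into (onset, rest), using letters as a rough proxy for sounds.

    Assumption: the onset is everything before the first vowel letter.
    """
    for i, letter in enumerate(word):
        if letter.lower() in VOWEL_LETTERS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter found: treat the whole word as onset


def exchange_onsets(first, second):
    """Swap the onsets of two words, as in an exchange error."""
    onset1, rest1 = split_onset(first)
    onset2, rest2 = split_onset(second)
    return onset2 + rest1, onset1 + rest2


def blend(first, second):
    """Join the onset of one word to the remainder of another, as in a blend."""
    onset1, _ = split_onset(first)
    _, rest2 = split_onset(second)
    return onset1 + rest2


print(exchange_onsets("space", "food"))        # ('face', 'spood')
print(exchange_onsets("left", "hemisphere"))   # ('heft', 'lemisphere')
print(blend("great", "cool"))                  # grool
print(blend("stiff", "tough"))                 # stough
```

Because the sketch works on letters rather than sounds, it will go wrong wherever English spelling and pronunciation part company.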

    We also like to swap sounds that are similar to each other, so

    paid mossible < made possible

    is more likely than

    two sen pet < two pen set

    There are exceptions to these generalisations of course – but they are rare.

    Speech errors give us an insight into normal speech production processes. The fact that sound errors occur at all tells us that speech production is a generative process – it is not that we just reproduce fully formed stored sentences, but rather we create each utterance afresh each time. In order to mix or swap two elements, both must be activated at the same point of the production process.

    Furthermore, the range of speech across which errors can occur implies that the span of processing is greater than a single word. You might be familiar with spoonerisms, popularised by Dr William Archibald Spooner:

  • You were caught fighting a liar in the quad < You were caught lighting a fire in the quad
  • You have hissed my mystery lectures < You have missed my history lectures
  • You have tasted the whole worm < You have wasted the whole term
    We must plan more than a word ahead for errors like these to happen.

    There is a much wider array of questions we can ask about speech production than can be answered by speech errors, but certainly they are an entertaining place to start.

    Eggcorns and mondegreens: a feast of misunderstandings

    Have you ever felt that you needed to nip something in the butt, or had the misfortune to witness a damp squid? And what can Jimi Hendrix, Bon Jovi and Freddie Mercury tell us about language change?

    Well, if you know Hendrix’s classic “Purple Haze”, you surely remember the moment where he interrupts his train of thought with the unexpected request, ‘Scuse me while I kiss this guy. Or perhaps you recall “Livin’ on a Prayer”, where we hear that apparently It doesn’t make a difference if we’re naked or not. And who can forget the revelation, in “Bohemian Rhapsody”, that Beelzebub has a devil for a sideboard?

    Wise words from Celine Dion

    If you do remember these lyrics fondly, you are not alone – lots of people are familiar with these exact lines. There is just one problem, of course: none of those songs really say those things. Instead, the lyrics involved are ‘Scuse me while I kiss the sky; It doesn’t make a difference if we make it or not; and Beelzebub has a devil put aside for me. And yet thousands of English speakers the world over have had the experience of listening to “Purple Haze” and the others – and of misunderstanding the words, entirely independently, in exactly the same way.

    Mishearings of this kind are common enough that they have been given a name of their own, mondegreens – a word invented by the American writer Sylvia Wright, who as a child heard a poem containing the following lines:

    For they hae slain the Earl o’ Moray
    And laid him on the green

    and assumed that it listed not one but two victims – the unfortunate Earl himself, and “Lady Mondegreen”, a plausible character who happens not to feature in the real poem.

    Why does this kind of thing happen? One reason has to do with the nature of spoken language. On the page, English sentences come pre-packaged into words, each of which is made up of distinct, easily-identified letters which look pretty much the same every time. But pronounced out loud, they are not like that! Instead, a continuous, mushy stream of noise makes its way into our ears, and it is up to our brains to work out what speech sounds are actually in there, where one word ends and the next one begins (think the-sky versus this-guy), and so on. Obviously this process is not exactly helped when there are rock guitars competing for your attention too.

    Obama’s elf….. don’t wanna be… Obama’s elf… any more…

    But another reason is that we are never ‘just listening’ passively. Instead, behind the scenes, our minds are busy trying to relate what we’re hearing to our existing knowledge – not only our linguistic knowledge, but our general knowledge about the world. For example, the common-sense knowledge that people tend to kiss other people, rather than intangible abstractions like the sky. This is obviously very useful most of the time, but in the “Purple Haze” case it leads us astray, because the more implausible meaning is the one that Jimi Hendrix intended.

    What has this all got to do with language change? Well, the crucial point is that what I’ve just said – interpreting sounds is complicated, and to navigate the process we engage our common sense as well as our knowledge of the language – applies just as well to normal conversation as it does to song lyrics. We don’t always hear things perfectly, and even if we do, we have to square the things we’ve just heard with the things we already knew, which provide a guide for our interpretation but may sometimes take us in the wrong direction.

    So if you hear someone referring to a really disappointing experience as a damp squib, but are not familiar with squib (an old-fashioned word for a firework), what is to stop you thinking that what you really heard was damp squid? A squid is, after all, a very damp creature, and not always something that people are hugely fond of. Similarly, the expression to nip in the bud makes sense if you latch on to the gardening metaphor it is based on – but if you don’t, well, nipping an undesirable thing in the butt does sound like a very effective way of getting rid of it. So, people who think the expressions really are damp squid and nip in the butt have made a mistake along the lines of “kiss this guy”; the difference is that here they may end up using the new versions in their own speech, and thus pass them on to other speakers. And the process doesn’t have to involve whole expressions: individual words are susceptible to it too, for example midriff becoming mid-rift or utmost becoming up-most.

    It’s beautiful, but undeniably damp

    Misinterpreted words and expressions like these, which have some kind of new internal logic of their own, are known as eggcorns. This is because egg-corn is exactly how some English speakers have reinterpreted the word acorn, on the basis that acorns are indeed egg-shaped seeds. And the development of a new eggcorn may not involve any mishearing at all, just reinterpretation of one word as another one that sounds exactly the same. Are you expected to toe the line or to tow the line? Are people given free rein or free reign? In each case the two expressions sound identical, and each brings with it some kind of coherent mental image. For the moment, toe the line and free rein are still considered to be the ‘correct’ versions of these idioms, but perhaps in the future that will no longer be the case.

    As words and expressions are reinterpreted over time, the language changes little by little: in speech and in writing, people pass on their reinterpretations to one another, in a way which may eventually pass right through the language. The underlying factors producing eggcorns are the same as those producing mondegreens. But unlike the lyrics of “Purple Haze”, words and idioms don’t generally have a fixed author and don’t belong to anybody, meaning that if everyone started calling acorns eggcorns, then that just would be the correct word for them: the previous, now meaningless term acorn would be no more than a historical curiosity, and English as a whole would be very slightly different from how it is now.

    So this is how we get from Jimi Hendrix to language change – via mondegreens and eggcorns. Have you spotted any eggcorns in the wild? And how likely do you think they are to catch on and become the new normal?

    How to break an impasse

    Have Brexit negotiations met an impasse (where the first vowel sounds like the vowel in ‘him’), or an impasse where the first vowel is like the vowel in the French word bain /bɛ̃/? Or is it something in between?

    If it is the former, congratulations! This borrowing from French has been successfully integrated into your native phonology, whilst simultaneously making a nod to its orthography.

    If you opt to French-it-up then you have recognised that this word is not an Anglo-Saxon one, and that it should be flagged as such by keeping the pronunciation classic. Or you are French.

    If you are somewhere between these two extremes, you are in good company. This highly topical word has no fewer than 12 British variants listed in the OED, reflecting various solutions to integrating the nasalized French vowel /ɛ̃/ and stress pattern into English.

    Choosing which pronunciation to use for impasse is both a linguistic and social minefield, with every utterance revealing something about your education and social networks. No pressure then.

    Recent news reports are providing a very rich corpus of data on the pronunciation of this specific word, with many variants being used within the same news report by different speakers, and perhaps even the same speaker.

    For those yet to commit, choosing which to pick may be bewildering. So how do we avoid this impasse? Perhaps unsurprisingly, one tactic speakers use is to avoid using a word they aren’t confident pronouncing altogether. It might be safer to stick to deadlock.

    Watch BBC Political Editor Laura Kuenssberg translate deadlock into German, Spanish and French.

    Ultimately, our cousins across the pond may have some influence in resolving this issue in the long term. The OED lists only two variants for U.S. English, with variation based on stress, not vowel quality, and U.S. variants of words (e.g. schedule, U.S. /skɛdjuːl/ vs U.K. /ˈʃedʒ.uːl/) are widely adopted in the speech of the UK public. But this will not necessarily be the case and the multiple UK variants may continue for some time.

    The impasse goes to show that languages tend to tolerate a whole lot of diversity, even when the world of politics doesn’t.

    A whole nother story

    Words do some truly inventive things when they change, and change they always do. Some switch their sounds around, like when hros became hors, nowadays spelt with an extra e as horse. Some lose their sense of having an internal composition, like when wāl-hros ‘whale-horse’ became walrus. Some cave in to peer pressure and change their looks to conform with others, including one of my favourite cases in English, when under the influence of similarly-meaning words probably, possibly, plausibly which all end in -bly, we get supposably, which is how in some varieties of modern English you can say ‘supposedly’. One of the truly odd things that words do, though, is to start stealing sounds from their neighbours.

    A famous case in English is an apron, which used to be a napron, until the n got snaffled by the a. It goes the other way too. A newt was originally an ewt. Of course, in Middle English when this n-thievery was underway, there were a few more words complicit in the heist, for example my napron also became mine apron, and your napron became yourn apron, since at that stage in English, words like my/mine, your/yourn worked like a/an. So, ever wondered why the nickname for Edward is Ned? As in mine Ed, ourn Ed? Got it? Speaking of which, nickname was originally ekename and was also involved in a swindling of n from the previous word (the eke-, which is related to eke in ‘eke out a living’, meant an addition or supplement, so mine ekename was my additional name).
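
The ambiguity behind this n-swapping can be shown with a tiny Python sketch. It is a toy, not a historical model: it works on modern spelling, it only knows the two articles a and an, and the function name rebracketings is made up for the illustration. The point is simply that once the space is gone, each string can be cut in two ways.

```python
def rebracketings(stream, articles=("a", "an")):
    """Return every (article, noun) split of an unspaced article + noun string."""
    return [
        (article, stream[len(article):])
        for article in articles
        if stream.startswith(article) and len(stream) > len(article)
    ]


for phrase in ("a napron", "an ewt", "an ekename"):
    stream = phrase.replace(" ", "")  # speech has no spaces between words
    print(stream, rebracketings(stream))

# anapron [('a', 'napron'), ('an', 'apron')]
# anewt [('a', 'newt'), ('an', 'ewt')]
# anekename [('a', 'nekename'), ('an', 'ekename')]
```

Nothing in the string itself says which cut is the right one, which is why either version can win out over time.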

    It’s not only in English that words have indulged in this shifty business. In late Latin, the word originally borrowed from Greek apotheca would have been l’aboteca, which you may recognise today as Italian la bottega, Spanish la bodega or French and English boutique. In Danish, the plural pronoun meaning ‘you’ is I, related to English ye, but in closely related Swedish it’s ni with an extra n. Where did it get it? Theft. The corresponding plural verbs used to end in -en, like haven i ‘have you?’, and you can see what happened next. In fact, the same game played out a thousand years earlier with singular ‘you’ in several West Germanic languages, except this time it was the verb that kept a piece of the pronoun, when phrases like habēs thū ‘have you?’ became habēst thū, which you might recognise as English havest thou.

    How does all this shifting of sounds between words come about? To get an idea, try saying quickly: ‘an apron, a napron, an apron’, and you’ll already have a sense of how this is possible. Unlike on the printed page, words in spoken language stream forth in a smooth and almost seamless flow, and the human brain performs some impressively deft reverse-engineering to slice that stream back up into words. In fact, picking out the individual words in speech is one of the first monumental intellectual tasks we embark on as infants, even before we start learning what the words mean. Recent research suggests that we may even begin this process from within the womb, where we get pre-season access to language courtesy of the muffled rhythms of speech that seep in to us from outside.

    Now, you may well wonder how anyone, let alone an infant, can slice up a speech stream into individual words without knowing any of the meanings. Good question. It would appear that the brain operates like a finely tuned statistical inference machine, storing and calculating the relative frequencies at which sounds follow one another, and from this it can begin to pinpoint where the word boundaries are located, since at those boundaries, it is much less predictable what sounds will come next. The trick, then, is that word boundaries are zones of unpredictability, irrespective of their meanings. Of course, we might ask next, why is it that the sounds are so predictable inside the words? One of the reasons for that has to do with what linguists term ‘phonology’: the fascinating way in which sound sequences themselves are intricately structured and highly non-random within the words of human languages, but I’m afraid that for now, that’s a whole nother story.
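
The statistical idea can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the four nonsense words, the division of the stream into two-letter ‘syllables’, and the simple rule of cutting wherever the transitional probability (how often one syllable is followed by another) drops below a threshold. It is a sketch of the principle, not a model of what infants actually do.

```python
from collections import Counter


def transitional_probabilities(units):
    """Map each adjacent pair (a, b) to P(b follows a) in the stream."""
    pair_counts = Counter(zip(units, units[1:]))
    first_counts = Counter(units[:-1])  # how often each unit has a successor
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(units, threshold=0.9):
    """Cut the stream wherever the next unit is poorly predicted by the last."""
    tp = transitional_probabilities(units)
    words, current = [], [units[0]]
    for a, b in zip(units, units[1:]):
        if tp[(a, b)] < threshold:  # unpredictable transition: likely boundary
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical nonsense words, strung together with no pauses or spaces.
order = ["tupiro", "golabu", "bidaku", "padoti",
         "tupiro", "bidaku", "golabu", "padoti",
         "golabu", "tupiro", "padoti", "bidaku"]
stream = "".join(order)
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

print(segment(syllables) == order)  # True
```

Inside each made-up word, every syllable is always followed by the same next syllable, so the probability is 1. At the end of a word, several different syllables can follow, so the probability falls and the sketch places a cut there, without knowing what any of the words mean.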