“Are potatoes ‘badder’ than usual in the UK atm?” This was the question posed by a Reddit user last week. Despite the scare quotes, this use of the word ‘badder’ was met with general mockery (as well as some genuine attempts to answer the question: the wet weather has caused poor growing conditions for root veg this year, if you were wondering). Yet the intended meaning is completely clear to English speakers – arguably more so than if it had been phrased ‘are potatoes worse than usual?’.
In fact, ‘badder’ has seen a big increase in use since the mid-20th century (although it’s been around for a long time, and was even used by Chaucer). Google Books offers numerous titles from recent years such as Bigger and Badder: A Billionaire Romance (2016), How to Be a Badder Bitch (2018) and The Bad Guys Even Badder Box (2019). What these titles have in common is that ‘bad’ is used with a special meaning as part of a set phrase. ‘Bad guy’ evokes a stock villain from a story, not just any old guy who happens to be bad. A ‘bad bitch’ is a tough, empowered woman. Thus a ‘badder guy’ is even more villainous, and a ‘badder bitch’ is even cooler and tougher. ‘Can you imagine a worse bitch than Helen?’ makes it crystal clear that the speaker doesn’t like Helen, but ‘can you imagine a badder bitch than Helen?’ implies admiration instead.
These are examples of what linguists call lexicalisation. ‘Bad guy’ and ‘bad bitch’ have become set phrases whose meaning is more than just the sum of their individual parts. In other words, the meaning of ‘bad guy’ is not just the meaning of ‘bad’ + ‘guy’, and would have to be listed as a separate entry in a dictionary. In a sense, it behaves as a single word (which is also betrayed by its special stress pattern: a bad GUY, with stress on guy, is a guy who happens to be bad, while a BAD guy, with stress on bad, is a villain). This is even clearer for ‘bad bitch’: both bitch and bad have strongly negative connotations, but bad bitch is positive.
The coinage of ‘badder bitch’ reveals a change that has already happened under the surface. Two words have become one phrase, with its own unpredictable meaning. Bad and worse are forms of the same lexeme, in the sense that they’d be listed under the same dictionary entry: they are members of the same paradigm. ‘Badder bitch’ shows that changes in meaning can happen to individual forms in the paradigm rather than to lexemes; otherwise ‘worse bitch’ would automatically take on the meaning of ‘bad bitch’. Bad in the newly lexicalised bad bitch starts off its life without any comparative form, so you’ve got to make up something new if you want to use one.
Something similar can be seen in examples like straight or wrought, which started off life as past participles of the verbs stretch and work. Straight in a sentence like ‘I have straight the string’ was regularised to stretched, and wrought to worked. But the forms straight and wrought were left behind in usages like ‘the straight string’ or ‘the wrought iron’, revealing that they had become lexicalised as adjectives in their own right.
Back to badder potatoes. Bad in the context ‘a bad potato/apple/egg (etc.)’ has a special sense of ‘rotten’ that is somewhat lexicalised, hence ‘badder potatoes’ are more rotten, while ‘worse potatoes’ could be worse in any number of ways. Either that, or potatoes are becoming meaner and more villainous as a result of the miserable weather, which frankly I can relate to. Let’s remain vigilant, just in case.
As schoolteachers the English-speaking world over know well, the use of of instead of have after modal verbs like would, should and must is a very common feature in the writing of children (and many adults). Some take this as an omen of the demise of the English language, and would perhaps agree with Fowler’s colourful assertion in A Dictionary of Modern English Usage (1926) that “of shares with another word of the same length, as, the evil glory of being accessory to more crimes against grammar than any other” (though admittedly this use of of has been hanging around for a while without doing any apparent harm: this study finds one example as early as 1773, and another almost half a century later in a letter of the poet Keats).
According to the usual explanation, this is nothing more than a spelling mistake. Following ‘would’, ‘could’ etc., the verb have is usually pronounced in a reduced form as [əv], spelt would’ve, must’ve, and so on. It can even be reduced further to [ə], as in shoulda, woulda, coulda. This kind of phonetic reduction is a normal part of grammaticalisation, the process by which grammatical markers evolve out of full words. Given the famous unreliability of English spelling, and the fact that these reduced forms of have sound identical to reduced forms of the preposition of (as in a cuppa tea), writers can be forgiven for mistakenly inferring the following rule:
‘what you hear/say as [əv] or [ə], write as of’.
But if it’s just a spelling mistake, this use of ‘of’ is surprisingly common in respectable literature. The examples below (from this blog post documenting the phenomenon) are typical:
‘If I hadn’t of got my tubes tied, it could of been me, say I was ten years younger.’ (Margaret Atwood, The Handmaid’s Tale)
‘Couldn’t you of – oh, he was ignorant in his speech – couldn’t you of prevented it?’ (Hilary Mantel, Beyond Black)
Clearly neither these authors nor their editors make careless errors. They consciously use ‘of’ instead of ‘have’ in these examples for stylistic effect. This is typically found in dialogue to imply something about the speaker, be it positive (i.e. they’re authentic and unpretentious) or negative (they are illiterate or unsophisticated).
These examples look like ‘eye dialect’: the use of nonstandard spellings that correspond to a standard pronunciation, and so seem ‘dialecty’ to the eye but not the ear. This is often seen in news headlines, like the Sun newspaper’s famous proclamation “it’s the Sun wot won it!” announcing the surprise victory of the Conservatives in the 1992 general election. But what about sentences like the following from the British National Corpus?
“If we’d of accepted it would of meant we would have to of sold every stick of furniture because the rooms were not large enough”
The BNC is intended as a neutral record of the English language in the late 20th century, containing 100 million words of carefully transcribed and spellchecked text. As such, we expect it to have minimal errors, and there is certainly no reason it should contain eye dialect. As Geoffrey Sampson explains in this article:
“I had taken the of spelling to represent a simple orthographic confusion… I took this to imply that cases like could of should be corrected to could’ve; but two researchers with whom I discussed the issue on separate occasions felt that this was inappropriate – one, with a language-teaching background, protested vigorously that could of should be retained because, for the speakers, the word ‘really is’ of rather than have.”
In other words, some speakers have not just reinterpreted the rules of English spelling, but the rules of English grammar itself. As a result, they understand expressions like should’ve been and must’ve gone as instances of a construction containing the preposition of instead of the verb have:
Modal verb (e.g. must, would…) + of + past participle (e.g. had, been, driven…)
One way of testing this theory is to look at pronunciation. Of can receive a full pronunciation [ɒv] (with the same vowel as in hot) when it occurs at the end of a sentence, for example ‘what are you dreaming of?’. So if the word ‘really is’ of for some speakers, we ought to hear [ɒv] in utterances where of/have appears at the end, such as the sentence below. To my mind’s ear, this pronunciation sounds okay, and I think I even use it sometimes (although intuition isn’t always a reliable guide to your own speech).
I didn’t think I left the door open, but I must of.
The examples below from the Audio BNC, both from the same speaker, are transcribed as of but clearly pronounced as [ə] or [əv]. In the second example, of appears to be at the end of the utterance, where we might expect to hear [ɒv], although the amount of background noise makes it hard to tell for sure.
“Should of done it last night when it was empty then” (audio) (pronounced [ə], i.e. shoulda)
(phone rings) “Should of.” (audio) (pronounced [əv], i.e. should’ve)
When carefully interpreted, writing can also be a source of clues on how speakers make sense of their language. If writing have as of is just a linguistically meaningless spelling mistake, why do we never see spellings like pint’ve beer or a man’ve his word? (Though we do, occasionally, see sort’ve or kind’ve). This otherwise puzzling asymmetry is explained if the spelling of in should of etc. is supported by a genuine linguistic change, at least for some speakers. Furthermore, have only gets spelt of when it follows a modal verb, but never in sentences like the dogs have been fed, although the pronunciation [əv] is just as acceptable here as in the dogs must have been fed (and in both cases have can be written ‘ve).
If this nonstandard spelling reflects a real linguistic variant (as this paper argues), this is quite a departure from the usual role of a preposition like of, which is typically followed by a noun rather than a verb. The preposition to is a partial exception, because while it is followed by a noun in sentences like we went to the party, it can also be followed by a verb in sentences like we like to party. But with to, the verb must appear in its basic infinitive form (party) rather than the past participle (we must’ve partied too hard), making it a bit different from modal of, if such a thing exists.
Whether or not we’re convinced by the modal-of theory, it’s remarkable how often we make idiosyncratic analyses of the language we hear spoken around us. Sometimes these are corrected by exposure to the written language: I remember as a young child having my spelling corrected from storbry to strawberry, which led to a small epiphany for me, as that was the first time I realised the word had anything to do with either straw or berry. But many more examples slip under the radar. When these new analyses lead to permanent changes in spelling or pronunciation we sometimes call them folk etymology, as when the Spanish word cucaracha was misheard by English speakers as containing the words cock and roach, and became cockroach (you can read more about folk etymology in earlier posts by Briana and Matthew).
Meanwhile, if any readers can find clear evidence of modal of with the full pronunciation as [ɒv], please comment below! I’m quite sure I’ve heard it, but solid evidence has proven surprisingly elusive…
If an alien scientist were designing a communication system from scratch, they would probably decide on a single way of conveying grammatical information like whether an event happened in the past, present or future. But this is not the case in human languages, which is a major clue that they are the product of evolution, rather than design. Consider the way tense is expressed in English. To indicate that something happened in the past, we alter the form of the verb (it is cold today, but it was cold yesterday), but to express that something will happen in the future we add the word will. The same type of variation can also be seen across languages: French changes the form of the verb to express future tense (il fera froid demain, ‘it will be cold tomorrow’, vs il fait froid aujourd’hui, ‘it is cold today’).
The future construction using will is a relatively recent development. In the earliest English, there was no grammatical means of expressing future time: present and future sentences had identical verb forms, and any ambiguity was resolved by context. This is also how many modern languages operate. In Finnish huomenna on kylmää ‘it will be cold tomorrow’, the only clue that the sentence refers to a future state of affairs is the word huomenna ‘tomorrow’.
How, then, do languages acquire new grammatical categories like tense? Occasionally they get them from another language. Tok Pisin, a creole language spoken in Papua New Guinea, uses the word bin (from English been) to express past tense, and bai (from English by and by) to express future. More often, though, grammatical words evolve gradually out of native material. The Old English predecessor of will was the verb wyllan, ‘wish, want’, which could be followed by a noun as direct object (in sentences like I want money) as well as another verb (I want to sleep). While the original sense of the verb can still be seen in its German cousin (Ich will schwimmen means ‘I want to swim’, not ‘I will swim’), English will has lost it in all but a few set expressions like say what you will. From there it developed a somewhat altered sense of expressing that the subject intends to perform the action of the verb, or at least, that they do not object to doing so (giving us the modern sense of the adjective ‘willing’). And from there, it became a mere marker of future time: you can now say “I don’t want to do it, but I will anyway” without any contradiction.
This drift from lexical to grammatical meaning is known as grammaticalisation. As the meaning of a word gets reduced in this way, its form often gets reduced too. Words undergoing grammaticalisation tend to gradually get shorter and fuse with adjacent words, just as I will can be reduced to I’ll. A close parallel exists in the Greek verb thélō, which still survives in its original sense ‘want’, but has also developed into a reduced form, tha, which precedes the verb as a marker of future tense. Another future construction in English, going to, can be reduced to gonna only when it’s used as a future marker (you can say I’m gonna go to France, but not *I’m gonna France). This phonetic reduction and fusion can eventually lead to the kind of grammatical marking within words that we saw with French fera, which has arisen through the gradual fusion of earlier facere habet ‘it has to do’.
Words meaning ‘want’ or ‘wish’ are a common source of future tense markers cross-linguistically. This is no coincidence: if someone wants to perform an action, you can often be reasonably confident that the action will actually take place. For speakers of a language lacking an established convention for expressing future tense, using a word for ‘want’ is a clever way of exploiting this inference. Over the course of many repetitions, the construction eventually gets reinterpreted as a grammatical marker by children learning the language. For similar reasons, another common source of future tense markers is words expressing obligation on the part of the subject. We can see this in Basque, where behar ‘need’ has developed an additional use as a marker of the immediate future:
ikusi behar dut
see need aux
‘I need to see’/ ‘I am about to see’
This is also the origin of the English future with shall. This started life as Old English sceal, ‘owe (e.g. money)’. From there it developed a more general sense of obligation, best translated by should (itself originally the past tense of shall) or must, as in thou shalt not kill. Eventually, like will, it came to be used as a neutral way of indicating future time.
But how do we know whether to use will or shall, if both indicate future tense? According to a curious rule of prescriptive grammar, you should use shall in the first person (with ‘I’ or ‘we’), and will otherwise, unless you are being particularly emphatic, in which case the rule is reversed (which is why the fairy godmother tells Cinderella ‘you shall go to the ball!’). The dangers of deviating from this rule are illustrated by an old story in which a Frenchman, ignorant of the distinction between will and shall, proclaimed “I will drown; nobody shall save me!”. His English companions, misunderstanding his cry as a declaration of suicidal intent, offered no aid.
This rule was originally codified by the grammarian John Wallis in 1653, and repeated with increasing consensus by grammarians throughout the 18th and early 19th centuries. However, it doesn’t appear to reflect the way the words were actually used at any point in time. For a long time shall and will competed on fairly equal terms – shall substantially outnumbers will in Shakespeare, for example – but now shall has given way almost entirely to will, especially in American English, with the exception of deliberative questions like shall we dance? You can see below how will has gradually displaced shall over the last few centuries, mitigated only slightly by the effect of the prescriptive rule, which is perhaps responsible for the slight resurgence of shall in the 1st person from approximately 1830 to 1920:
Until the eventual victory of will in the late 18th century, these charts (from this study) actually show the reverse of what Wallis’s rule would predict: will is preferred in the 1st person and shall in the 2nd, while the two are more or less equally popular in the 3rd person. Perhaps this can be explained by the different origins of the two futures. At the time when will still retained an echo of its earlier meaning ‘want’, we might expect it to be more frequent with ‘I’, because the speaker is in the best position to know what he or she wants to do. Likewise, when shall still carried a shade of its original meaning ‘ought’, we might expect it to be most frequent with ‘you’, because a word expressing obligation is particularly useful for trying to influence the action of the person you are speaking to. Wallis’s rule may have been an attempt to be extra-polite: someone who is constantly giving orders and asserting their own will comes across as a bit strident at best. Hence the advice to use shall (which never had any connotations of ‘want’) in the first person, and will (without any implication of ‘ought’) in the second, to avoid any risk of being mistaken for such a character, unless you actually want to imply volition or obligation.
Guarantee and warranty: two words for the price of one
By and large, languages avoid having multiple words with the same meaning. This makes sense from the point of view of economy: why learn two words when one will do the job?
But occasionally there are exceptions, such as warranty and guarantee. This is one of several synonymous or near-synonymous pairs of words in English conforming to the same pattern – another example is guard and ward. The variants with gu- represent early borrowings from Germanic languages into the Romance languages descended from Latin. At the time these words were borrowed, the sound w had generally developed into v in Romance languages, but it survived after g, in the descendants of a few Latin words like lingua ‘tongue, language’. So when Romance speakers adapted Germanic words to the sounds of their own language, gu was the closest approximation they could find to Germanic w.
This is why French has some words like guerre ‘war’, where gu- corresponds to w- in English (this word may have been borrowed because the inherited Latin word for war, bellum, had become identical to the word for ‘beautiful’). Later, some of the words with gu- were borrowed back into English, which is why we have both borrowed guard and inherited ward. According to one estimate, 28.3% of the vocabulary of English has been borrowed from French (figures derived from actual texts rather than dictionaries come in even higher at around 40%), a debt that we have recently started repaying in earnest with loans like le shopping and le baby-sitting. This is all to the consternation of the Académie française, which aims to protect the French language from such barbarisms, as evidenced by the dire, ne pas dire (‘say, don’t say’) section of the académie’s website advising Francophones to use homegrown terms like contre-vérité instead of anglicisms like fake news.
In fact, warranty and guarantee reflect not one but two different waves of borrowing: the first from Norman French, which still retained the w- sound (likely through the influence of the Scandinavian languages spoken by the original Viking invaders of Normandy), and the second from later central French, in which the word began with gu-. Multiple layers of borrowing can also be seen in words like castle, from Latin castellum via Norman French, and chateau, borrowed from later French, in which Latin c- had developed a different pronunciation.
Incidentally, Norman French survives to this day not only in Normandy but also in the Channel Islands of Guernsey, Jersey and Sark. The Anglo-Norman dialect of the island of Alderney died out during World War II, when most of the island’s population was evacuated to the British mainland, although efforts are underway to bring it back.
Of all the world’s languages, English is the most widely learnt by adults. Although Mandarin Chinese has the highest number of native speakers overall, owing to the huge size of China’s population, second-language speakers of English outnumber those of Mandarin by more than three to one.
Considering that the majority of English speakers learn the language in adulthood, when our brains have lost much of their early plasticity, it’s just as well that some aspects of English grammar are pretty simple compared to other languages. Take for example the way we express the plural. With only a small number of exceptions, we make plurals by adding a suffix –s to the singular. The pronunciation differs depending on the last sound of the word it attaches to – compare the ‘z’ sound at the end of dogs to the ‘s’ sound at the end of cats, and the ‘iz’ at the end of horses – but it varies in a consistently predictable way, which makes it easy to guess the plural of an English noun, even if you’ve never heard it before.
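To make the ‘consistently predictable’ part concrete, here is a minimal sketch of the rule in Python. It uses spelling as a rough stand-in for a word’s final sound (real phonology cares about sounds, not letters), so the particular endings it checks are illustrative assumptions rather than a full analysis:

```python
def plural_suffix_sound(noun: str) -> str:
    """Rough guess at how the regular -s plural sounds after this noun.

    Spelling is used as a crude proxy for the final sound, so this is only
    a sketch of the rule described above, not a full phonological model.
    """
    w = noun.lower()

    # 1. Sibilant-type endings (s, z, sh, ch, soft c/g) take an extra vowel: 'iz'
    if w.endswith(("s", "z", "x", "sh", "ch", "se", "ce", "ze", "ge")):
        return "iz"   # horses, buses, churches, judges

    # 2. Other voiceless consonants take plain 's'
    if w.rstrip("e").endswith(("p", "t", "k", "f", "th")):
        return "s"    # cats, books, cliffs

    # 3. Everything else (voiced consonants and vowels) takes 'z'
    return "z"        # dogs, days, cows


for noun in ("dog", "cat", "horse"):
    print(noun + "s:", plural_suffix_sound(noun))
# dogs: z   cats: s   horses: iz
```

The point is not the code itself but how little of it there is: a single short rule covers virtually every English noun, which is exactly what makes the plural so manageable for adult learners.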
That’s not the case in every language. Learners of Greek, for example, have to remember about seven common ways of making plurals. Sometimes knowing the final sounds of a noun and its gender make it possible to predict the plural, but other times learners simply have to memorise what kind of plural a noun has: for example pateras ‘father’ and loukoumas ‘doughnut’ both have masculine gender and singulars ending in –as, but in Standard Greek their plurals are pateres and loukoumathes respectively.
This is similar to how English used to work. Old English had three very common plural suffixes, -as, -an and –a, as well as a number of less common types of plural (some of these survive marginally in a few high-frequency words, including vowel alternations like tooth~teeth and zero-plurals like deer). The modern –s plural descends from the suffix –as, which originally was used only for a certain group of masculine nouns like stān, ‘stone’ (English lost gender in nouns, too, but that’s a subject for another blog post).
How did the -s plural overtake these competitors to become so overwhelmingly predominant in English? Partly it was because of changes to the sounds of Old English as it evolved into Middle English. Unstressed vowels in the last syllables of words, which included most of the suffixes which expressed the gender, number and case of nouns, coalesced into a single indistinct vowel known as ‘schwa’ (written <ə>, and pronounced like the ‘uh’ sound at the beginning of annoying). Moreover, final –m came to be pronounced identically to –n. This caused confusion between singulars and plurals: for example, Old English guman ‘to a man’ and gumum ‘to men’ both came to be pronounced as gumən in Middle English. It also caused confusion between two of the most common noun classes, the Old English an-plurals and the a-plurals. As a result they merged into a single class, with -e in the singular and -en in the plural.
This left Middle English with two main types of plural, one with –en and one with –(e)s. Although a couple of the former type remain to this day (oxen and children), the suffix –es was gradually generalised until it applied to almost all nouns, starting in the North of England and gradually moving South.
A similar kind of mass generalisation of a single strategy for expressing a grammatical distinction is often seen in the final stages of language death, as a community of speakers shifts from a minority language to a majority language as its mother tongue. Nancy Dorian has spent almost 50 years documenting the dying East Sutherland dialect of Scottish Gaelic as it is supplanted by English in three remote fishing villages in the Scottish Highlands. In one study the Gaelic speakers were divided into fluent speakers and ‘semi-speakers’, who used English as their first language and Gaelic as a second language. Dorian found that the semi-speakers tended to overgeneralise the plural suffix –an, applying it to words for which fluent speakers would have used one of some ten other inherited strategies for expressing plural number, such as changing the final consonant of the word (e.g. phũ:nth ‘pound’, phũnčh ‘pounds’), or altering its vowel (e.g. makh ‘son’, mikh ‘sons’).
But why should the last throes of a dying language bear any resemblance to the evolution of a thriving language like English? A possible link lies in second-language acquisition by adults. At the same time as these changes were taking place, English was undergoing intense contact with Scandinavian settlers who spoke Old Norse, and during this period it shows many signs of Old Norse influence. In addition to many very common words like take and skirt (which originally had a meaning identical to its native English cognate shirt), English borrowed several grammatical features of Scandinavian languages, such as the suffix –s seen in third person singular present verbs like ‘she blogs’ (the inherited suffix ended in –th, as in ‘she bloggeth’), and the pronouns they, their and them, which replaced earlier hīe, heora and heom. Like the extension of the plural in –s, these innovations appeared earliest in Northern dialects of English, where settlements of Old Norse speakers were concentrated, and gradually percolated South during the 11th to 15th centuries.
It’s possible that English grammar was simplified in some respects as a consequence of what the linguist Peter Trudgill has memorably called “the lousy language-learning abilities of the human adult”. Research on second-language acquisition confirms what many of us might suspect from everyday experience, that adult learners struggle with inflection (the expression of grammatical categories like ‘plural’ within words) and prefer overgeneralising a few rules rather than learning many different ways of doing the same thing. In this respect, Old Norse speakers in Medieval England would have found themselves in a similar situation to semi-speakers of East Sutherland Gaelic – when confronted with a number of different ways of expressing plural number, it is hard to remember for each noun which kind of plural it has, but simple to apply a single rule for all nouns. After all, much of the complexity of languages is unnecessary for communication: we can still understand children when they make mistakes like foots or bringed.
In a 2016 Twitter poll asking do you feel comfortable using gift as a verb? (i.e. “I gifted that sweater to you”), 66% of respondents reported that they found this use ‘icky’. This phenomenon is known by linguists as ‘conversion’ or ‘zero-derivation’, because it involves taking a particular class of word, such as a verb, noun or adjective, and deriving another type of word from it without doing anything to its form. This stands in contrast to common-or-garden ‘derivation’, where you convert the class of a word by changing its form somehow. For example, the verb sense becomes a noun sensation, which becomes an adjective sensational, which comes full circle to another verb sensationalise, all by the accumulation of suffixes. (The OED even lists a further derivation sensationalisation – but this sort of style has its own equally vociferous critics, showing you can’t win when it comes to linguistic taste).
In English, verbs, nouns and adjectives all tend to look much the same, which makes it possible to zero-derive by stealth. It wasn’t always this way. Take for example the verb stone, a 12th-century example of the noun-as-verb phenomenon, derived from the noun stone (in its Middle English form stōn). Back then, the infinitive wasn’t simply stōn, but stōnen – the suffix –en was obligatory for all infinitives, and makes it clear that the word is no longer being used as a noun. Or compare the noun fight with the identical verb. In Old English the basic form of the noun was feoht, but there was no corresponding verb form feoht: instead, it was feohteð ‘he fights’, fihtest ‘you fight’, fuhton ‘we/you (pl)/they fought’, feaht ‘I/he/she fought’, or one of many other forms, depending on various grammatical properties such as subject (who is doing the fighting?), number (how many people are fighting?) and mood (is the fighting real, hypothetical, or an instruction?). As Old English evolved into its modern form, most of these inflectional suffixes were lost, encouraging a rise in the number of zero-derivations entering the language.
The laissez-faire attitude of English can be clearly recognised when comparing how languages deal with new words such as the recently coined verb ‘to google’. Some languages do what Middle English did with stōnen, merely adapting the company’s name to express the grammatical categories which are important in the language (e.g. German du googlest, ‘you google’, ich habe gegoogelt ‘I googled’), while other languages add extra pieces of word to explicitly flag up the conversion, e.g. Greek γκουγκλίζω or γκουγκλάρω (pronounced ‘googlizo’/‘googlaro’), where the final syllable -o indicates that the subject of the verb is ‘I’, but the -iz- or -ar- preceding it can’t be attributed any meaning beyond ‘I’m a verb!’.
In English, meanwhile, pretty much anything goes: in addition to verbs which have become nouns, we have numerous nouns becoming verbs (e.g. father, storm), adjectives becoming verbs (round, smooth), adjectives becoming nouns (intellectual), and liberal rules governing compounds, which let us treat nouns as if they were adjectives (stone wall). English has such a devil-may-care attitude to conversion that even whole phrases can become nouns or adjectives: basically, it’s a free-for-all.
This has been going on in English for a very long time, so why do examples like gifting make people feel ‘icky’? Partly it’s because we associate coinages like impact, action and workshop with corporate jargon – although some of these are actually of considerable age (impact was a verb before it became a noun, and it started out life even earlier as an adjective), their use boomed in the decades following the Second World War, as management increasingly came to be seen as a scientific discipline. Another objection is that we already have words for verbs like to gift, namely give, which makes gift feel like an overelaborate solution to a non-problem, the linguistic equivalent of Bic’s infamous ‘for her’ range of pens.
Nevertheless, zero-derivation can come in handy when a word has acquired a different or narrower meaning than the word it originally derived from. Gift originally referred to an action or instance of giving, in addition to the thing being given, but it now almost exclusively refers to something given for free in a spirit of goodwill. You can give someone a black eye, hepatitis, or the creeps, but it would be the height of irony to call these things gifts. Correspondingly, to gift has a more specific meaning than to give, and is much more concise than to give as a gift, just like texting someone is more concise than sending a text message to them, and friending someone is more concise than adding them as a friend on Facebook.
Wh- words like which, whom and why get a lot of knickers in a twist, as attested by this Oatmeal comic on when to use who vs whom, or the age-old debate about the correct use of which vs that (on which see this blog post by Geoffrey Pullum). But in Old English the wh- words formed a complete and regular system which would have been easy to get the hang of. They were used strictly as interrogative pronouns – words that we use for asking questions like who ate all the pies? – rather than relative pronouns, which give extra information about an item in the sentence (Jane, who ate all the pies, is a prolific blogger) or narrow down the reference of a noun (women who eat pies are prolific bloggers). They developed their modern relative use in Middle English, via reinterpretation of indirect questions – in other words, sentences like she asked who ate all the pies, containing the question who ate all the pies?, served as the template for new sentences like she knew who ate all the pies, where who functions as a relative.
Originally, the new relative pronoun whom (in its Middle English form hwām) functioned as the dative case form of who, used when the person in question is the indirect object of a verb or after prepositions like for. For direct objects, the accusative form hwone was used instead. So to early Middle English ears, the man for whom I baked a pie would be fine, while the man whom I baked in a pie would be objectionable (on grammatical as well as ethical grounds). Because nouns also had distinct nominative, dative and accusative forms, the wh- words would have posed no special difficulty for speakers. But as English lost distinct case forms for nouns, the pronoun system was also simplified, and the originally dative forms started to replace accusative forms, just as who is now replacing whom. This created a two-way opposition between subject and non-subject which is best preserved in our system of personal pronouns: we say he/she/they baked a pie, but I baked him/her/them (in) a pie.
Thus hwone went the way of hine, the old accusative form of he. Without the support of a fully functioning case system in the nouns, other case forms of pronouns were reinterpreted. Genitive pronouns like my and his were transformed into possessive adjectives (his pie is equivalent to the pie of him, but you can no longer say things like I thought his to mean ‘I thought of him’). The wh- words also used to have an instrumental case form, hwȳ, meaning ‘by/through what?’, which became the autonomous word why.
Although him and them are still going strong, whom has been experiencing a steady decline. Defenders of ‘whom’ will tell you that the rule for deciding whether to use who or whom is exactly the same as that for he and him, but outside the most formal English, whom is now mainly confined to fixed phrases like ‘to whom it may concern’. For many speakers, though, it has swapped its syntactic function for a sociolinguistic one by becoming merely a ‘posh’ variant of who: in the words of James Harding, creator of the ‘Whom’ Appreciation Society, “those who abandon ‘whom’ too soon will regret it when they next find themselves in need of sounding like a butler.”
The death of the dual, or how to count sheep in Slovenian
One reason why translation is so difficult – and why computer translations are sometimes unreliable – is that languages are more than just different lists of names for the same universal inventory of concepts. There is rarely a perfect one-to-one equivalence between expressions in different languages: the French word mouton corresponds sometimes to English sheep, and at other times to the animal’s meat, where English uses a separate word lamb or mutton.
This was one of the great insights of Ferdinand de Saussure, arguably the father of modern linguistics. It applies not only in the domain of lexical semantics (word meaning), but also to the categories which languages organise their grammars around. In English, we systematically use a different form of nouns and verbs depending on whether we are referring to a single entity or multiple entities. The way we express this distinction varies: sometimes we make the plural by adding a suffix to the singular (as with hands, oxen), sometimes we change the vowel (foot/feet) and occasionally we don’t mark the distinction on a noun at all, as with sheep (despite the best efforts of this change.org petition to change the singular to ‘shoop’). Still, we can often tell whether someone is talking about one sheep or several by the form of the agreeing verb: compare ‘the sheep are chasing a ball’ to ‘the sheep is chasing a ball’.
Some languages make more fine-grained number distinctions. The English word sheep could be translated as ovca, ovci or ovce in Slovenian, depending on whether you’re talking about one, two, or three or more animals, respectively. Linguists call this extra category between singular and plural the dual. The difference between dual and plural doesn’t show up just in nouns, but also in adjectives and verbs which agree with nouns. So to translate the sentence ‘the beautiful sheep are chasing a ball’, you need to ascertain whether there are two or more sheep, not just to translate sheep, but also beautiful and chase.
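Purely as an illustration of that extra category, here is a small sketch of the sheep-counting choice in Python, using only the three forms mentioned above. It is deliberately simplified: real Slovenian nouns also vary by case (so larger counts behave differently in practice), and, as just noted, adjectives and verbs have to agree in number too.

```python
def slovenian_sheep(count: int) -> str:
    """Pick singular, dual or plural 'sheep', using only the forms cited above."""
    if count == 1:
        return "ovca"   # singular: exactly one
    if count == 2:
        return "ovci"   # dual: exactly two
    return "ovce"       # plural: three or more


for n in (1, 2, 3, 4):
    print(n, slovenian_sheep(n))
# 1 ovca, 2 ovci, 3 ovce, 4 ovce
```

Translating ‘the beautiful sheep are chasing a ball’ would require the same three-way decision for the adjective and the verb as well, which is where the real work for a translator (human or machine) begins.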
According to some, having a dual number makes Slovenian especially suited for lovers (could this explain the Slovenian tourist board’s decision to title their latest campaign I feel sLOVEnia?). But putting such speculations aside, it’s hard to see what the point of a dual could be. We rarely need to specify whether we are talking about two or more than two entities, and on the rare occasions we do need to make this information explicit, we can easily do so by using the numeral two.
This might be part of the reason why many languages, including English, have lost the dual number. Both English and Slovenian ultimately inherited their dual from Proto-Indo-European, the ancestor of many of the languages of Europe and India. Proto-Indo-European made a distinction between dual and plural number in its nouns, adjectives, pronouns, and verbs, but most of the modern languages descended from it have abandoned this three-way system in favour of a simpler opposition between singular and plural. Today, the dual survives only in two Indo-European languages, Slovenian and Sorbian, both from the Slavic subfamily.
In English the loss of the dual was a slow process, taking place over thousands of years. By the time the predecessor of English had split off from the other Germanic languages, the plural had replaced the dual everywhere except in the first- and second-person pronouns we and you, and the verbs which agreed with them. By the time of the earliest written English texts, the dual forms of verbs had been lost altogether, but the language still retained distinct pronouns for ‘we two’ and ‘you two’. By the 15th century, these too were replaced by the plural forms, bringing about the dual’s final demise.
Grammatical categories do not always disappear without a trace – in some languages the dual has left clues of its earlier existence, even though no functional distinction between dual and plural remains. Like English, German lost its dual, but in some Southern German dialects the dual pronoun enk (cognate with Old English inc, ‘to you two’) has survived instead of the old plural form. In modern dialects of Arabic, plural forms of nouns have generally replaced duals, except in a few words mostly referring to things that usually exist in pairs, like idēn ‘hands’, where the old dual form has survived as the new plural instead. Other languages show vestiges of the dual only in certain syntactic environments. For example, Scottish Gaelic has preserved old dual forms of certain nouns only after the numeral ‘two’: compare aon chas ‘one foot’, dà chois ‘two feet’, trì casan ‘three feet’, casan ‘feet’.
Although duals seem to be on the way out in Indo-European languages, it isn’t hard to find healthy examples in other language families (despite what the Slovenian tourist board might say). Some languages have even more complicated number systems: Larike, one of the languages spoken in Indonesia, has a trial in addition to a dual, which is used for talking about exactly three items. And Lihir, one of the many languages of Papua New Guinea, has a paucal number in addition to both dual and trial, which refers to more than three but not many items. This system of five number categories (singular/dual/trial/paucal/plural) is one of the largest so far discovered. Meanwhile, on the other end of the spectrum are languages which don’t make any number distinction in nouns, like English sheep.