A picture is worth a thousand words: Choosing images for psycholinguistic research

Linguists need to come up with different ways of testing our theories of how particular languages in the world function. We generally rely on two main methods of data collection – linguistic elicitation and corpus collection. With linguistic elicitation, a linguist asks a speaker of a language: ‘How do you say “Monty Python is really funny” in your language?’ But can we be sure that what the speaker said is naturalistic and not just a word-for-word translation?

Linguists need naturalistic data and can also record stories and conversations to build up a representative sample of a language (a corpus). This however takes a lot of time, effort and dedication on the part of both the linguist and the community of speakers of a language. It might even be that – after years of toil – the particular construction that a linguist wants to look at is under-represented with a dearth of examples in the corpus.

Thankfully, there is a happy medium! We can combine cognitive psychological techniques and targeted linguistic elicitation to create scenarios where speakers produce naturalistic responses. Of course, this technique brings with it another set of problems entirely.

Psycholinguistic experiments need to be carefully designed and can’t be made up on the fly in response to something a speaker of a language says to you; this is drastically different to standard linguistic elicitation where one can continually come up with new sentences to check, while in the middle of working with a speaker of a language.

In our current research on optimal categorisation we aim to find out how different nouns are assigned to different classifiers in a group of six related Oceanic languages spoken in Vanuatu and New Caledonia. Each language has a different inventory size of classifying particles — from two to 23 — which are used in possessive constructions, and categorise the possession in terms of its use or functionality.

Here are a few examples from the Iaai language, spoken in New Caledonia, which has the largest inventory of classifiers in our sample of languages:

(1a)	a-n			wââ	(b)	hanii-ny		wââ
        FOOD.CLASSIFIER-his	fish 		CATCH.CLASSIFIER-his	fish
        ‘his fish (to eat)’		        ‘his fish (which he caught)’
(2a)	a-n			koko	(b)	noo-n			koko
	FOOD.CLASSIFIER-his	yam		PLANT.CLASSIFIER-his	yam
	‘his yam (to eat)’			‘his yam plant’

We want to see whether or not a particular noun that refers to a particular entity can occur with different classifiers, as with the words for ‘fish’ and ‘yam’ in Iaai above. Also, how does a language with 23 classifiers function differently from a language with just two or three classifiers?

One way in which we can discover how the classifiers function in each language is to use a card sorting experiment. These experiments present speakers with entities in the form of pictures. Speakers are asked to sort them into different groups, first in a “free sort” where they can create groups on any basis they feel is relevant and important, and second, in a “structured sort” where they are asked to group entities according to which classifier they would use in a possessive construction. By doing this with lots of participants we can see how usage varies between individual speakers of one language and across languages, and get a clear sense of whether, and how, a language’s classifier system influences the way that speakers think about and process different entities.
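One simple way to summarise card sort results is to count, for every pair of cards, how many participants placed them in the same group. The sketch below is only an illustration of that idea and not the analysis used in our project; the function name `cooccurrence` and the three participants' sorts are invented for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(sorts):
    """Count how many participants put each pair of cards in the same group.

    `sorts` has one entry per participant; each entry is a list of groups,
    and each group is a list of card labels.
    """
    counts = Counter()
    for groups in sorts:
        for group in groups:
            # Sort the labels so each pair is always stored in the same order.
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return counts


# Hypothetical structured-sort data from three participants (invented).
sorts = [
    [["fish", "yam"], ["canoe", "house"]],
    [["fish", "yam", "canoe"], ["house"]],
    [["fish"], ["yam"], ["canoe", "house"]],
]

counts = cooccurrence(sorts)
for (a, b), n in sorted(counts.items()):
    print(f"{a} + {b}: grouped together by {n} of {len(sorts)} participants")
```

Pairs that most participants group together point to a shared category, while pairs with middling counts show where speakers disagree.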

Once we have decided which nouns to test in a card sort experiment, we have to find or make pictures that represent the entities these nouns refer to. Sadly I don’t have the artistic skills of Michelangelo and won’t be painting any masterpieces for the experiment!

Choosing what type of image to use is trickier than it sounds, as we are presented with an array of options.

First, should we use simple line drawings of the entities? The Noun Project has over 2 million small black and white line drawings. With such a choice of images we can find what we need. Here are some images of yams that I found on the site that we could use for our experiment.

These are great, and I know they are yams because I searched for images of yams on the website. But if I present these images to speakers I want them to tell me what they are. If the images aren’t instantly recognisable then participants may use different nouns to describe what they are seeing – is it a yam? A sweet potato? Manioc? Or some other entity? To tell you the truth, the third picture is actually a sweet potato! But it looks very similar to the first picture of a yam. Another problem is that these images can be quite abstract – and we can’t be sure that these symbolic representations of entities will be shared across different cultural and linguistic groups.

What about black and white pictures? These are cheaper to print and easier to standardise. But we do not see the world in black and white, and presenting entities as black and white pictures may make it harder to identify them, especially when the lightness of the background and the object of focus are similar. We need to be sure that the images we choose are easy to identify, or else we can end up with problems of misidentification.

Another possibility is to remove the background of the image. By doing this we can eliminate distractions and help the participant focus on the object in the image. However, the background is often key. Background information gives context that can influence how the speaker of a language perceives the entity in the image.

For instance, speakers may classify a fish that has been caught differently to a fish that is alive and swimming in the sea. The edible classifier is more likely with the former scenario, and a general classifier with the latter. But if we were to remove the background from both of these photos they would look strikingly similar! This leads us onto a very important question – what classifier would speakers of these languages use for a parrot if it was alive or dead?

So now we have decided to present images in colour and keep the background. But we must make sure that the background varies across different images. We don’t want participants to sort the entities into groups based on a colour or shape in the background or some other extraneous visual cue that may appear in several pictures!
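A quick way to guard against this is to note the main background features of each picture and flag any feature that turns up in more than one. The sketch below assumes we have tagged each image by hand; the image names, the tags and the function `shared_backgrounds` are all hypothetical.

```python
from collections import defaultdict


def shared_backgrounds(images, max_share=1):
    """Return background tags that appear in more than `max_share` images."""
    by_tag = defaultdict(list)
    for name, tags in images.items():
        # Use a set so a tag listed twice for one image is counted once.
        for tag in set(tags):
            by_tag[tag].append(name)
    return {
        tag: sorted(names)
        for tag, names in by_tag.items()
        if len(names) > max_share
    }


# Hypothetical hand-tagged stimuli (invented for illustration).
images = {
    "fish_caught": ["beach", "basket"],
    "fish_swimming": ["sea"],
    "yam_plant": ["garden"],
    "canoe": ["beach"],
}

# Any tag listed here is a cue participants could sort by instead of the entity.
print(shared_backgrounds(images))
```

With these invented tags the check flags the beach, which appears behind both the caught fish and the canoe, so one of those two pictures would need replacing.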

For every psycholinguistic experiment that uses images there are multiple decisions that need to be made to figure out what type of image is required. The images we have chosen are specifically tailored to the nature of the languages we are studying to ensure that they are culturally relevant and thus identifiable.

For us, the pictures need to be realistic and represent the world around us. Sadly, we can’t take artistic licence with kangaroos and trampoline acts, as fun as that would be!

 

Drinkable houses, edible canoes and Trojan horses

Michael Lotito, a French entertainer known as Monsieur Mangetout, became famous for his penchant for devouring objects that most would consider inedible, from bicycles and televisions to the most bizarre of all: a Cessna 150 light aircraft.

Though Monsieur Mangetout hailed from France, one might have thought that he was from the archipelago of Vanuatu. This small island nation is not only famous for being the most linguistically dense country in the world – with over 130 languages for a population of just a quarter of a million – but is also renowned for its intriguing possessive classifiers, which turn up in sentences when you talk about the things that you own, much like the possessive pronouns in English – my, your, hers etc. But in the Oceanic languages of Vanuatu these classifiers also tell us about how you will use the item that you own.

It took Michael Lotito two years to eat the Cessna 150!

The most common distinction these classifiers make is between three types of possessions: ones that can be drunk, ones that can be eaten, and a residual type, marked by a general classifier that is used when the more specific classifiers for eating and drinking aren’t needed. So, for example, if you speak Paamese you can make a distinction between a coconut that you will drink, ani mak ‘my drinkable coconut’; one that you will eat the flesh of, ani ak ‘my edible coconut’; or one that you intend to sell, ani onak ‘my coconut for an unspecified use’.
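The logic of the three-way choice can be written out as a small lookup: the intended use picks the classifier, and anything that is neither eating nor drinking falls through to the general one. This is a toy sketch that uses only the three first-person Paamese forms quoted above; the function name and the use labels are invented, and it is not a description of Paamese grammar.

```python
# First-person ('my') classifier forms taken from the Paamese examples above.
SPECIFIC_CLASSIFIERS = {"drink": "mak", "eat": "ak"}
GENERAL_CLASSIFIER = "onak"


def my_possession(noun, intended_use):
    """Build a 'my NOUN' phrase, choosing the classifier by intended use."""
    # Uses other than eating or drinking fall back to the general classifier.
    classifier = SPECIFIC_CLASSIFIERS.get(intended_use, GENERAL_CLASSIFIER)
    return f"{noun} {classifier}"


for use in ("drink", "eat", "sell"):
    print(use, "->", my_possession("ani", use))
```

The same noun, ani ‘coconut’, comes out with three different classifiers because the classifier tracks what the owner plans to do with it, not the noun itself.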

But, more intriguingly, several languages of Central Vanuatu, spoken on the islands of Pentecost, Ambrym, Paama and Epi, use the food and drink classifiers for some rather strange items that one might not consider to be edible or drinkable — though of course Michael Lotito might beg to differ. The drink classifier in the language of North Ambrym covers a rather broad range of entities, including the obvious drinks such as water, tea, coffee and juice:

(1)	ma-n			we	/	ti	/	jus
	DRINK.CLASSIFIER-his	water		tea		juice
	‘his water/tea/juice’

But the classifier is also used with items that can’t be drunk, like the words bwelaye ‘cup’ or bwela ōl ‘coconut shell (used as a cup)’, but also im ‘house’, hul ‘mat’ and bulubul ‘hole’. And in the Sa language spoken on Pentecost island, the food classifier can also be used with the word bulbul ‘canoe’!

The languages of Central Vanuatu where houses can be drunk, except for Raga which likes to be different.

How do you drink a house? How do you eat a canoe? While Michael Lotito might well be able to eat canoes and drink houses, the people who speak these languages certainly do not! So what explanation can be given as to why and how these non-drinkable and non-edible entities are included within the semantic domain of drinks and food?

The words meaning cups and containers of liquids are included with the drink classifiers in some of these languages due to a process of semantic extension. This is when the semantic coverage of a classifier is extended to include entities that are frequently associated with the core meaning of that classifier. This type of semantic extension is known as metonymy, where the word for a container can be used instead of the word for what it contains – e.g. in English we can use the word ‘dish’ to refer not only to a plate, but also to its contents. It is not such a large cognitive step to associate drinks with cups, and that is why containers of liquids are now included in the drink classifier’s semantic domain. However, it is quite a large cognitive leap to think that houses are associated with drinks and canoes with food.

To explain how houses came to be classified along with drinks, and canoes with food, we have to look into the history of the languages and how they have changed through time. This is of course quite a difficult endeavour, considering that these languages have no literary traditions and are only now starting to be written down. We cannot consult old texts to see what the languages were like several hundred years ago, as these don’t exist. Though limited records exist for a few languages going back to the mid-1800s, we mainly have to rely on comparing how related languages in the area differ and try to figure out how they got to be different.

Let’s start by looking at the language of Apma, spoken on Pentecost. The word for house, imwa, doesn’t occur with the drink classifier, but instead occurs in a different possessive construction where the owner is marked directly on the word for house, instead of on a classifier:

(2)	imwa=n		atsi
	house=his	person
	‘a person’s house’

This type of construction, called direct possession, normally occurs with possessions closely associated with the possessor, including body parts and kinship terms, but sometimes includes more intimate personal possessions as well. Now if we look at Apma’s neighbouring language, Ske, spoken to the south, the noun for house occurs with the drink classifier:

(3)	im	mwa=n			azó
	house	DRINK.CLASSIFIER=his	person
	‘a person’s house’

As you can see the word for house in Ske, which historically for the languages of Pentecost would have been imwa just like it is in Apma, has been split, where the first part im now means ‘house’, and speakers recognise the second part of imwa, namely mwa, as identical in form to the drink classifier. Speakers have now reanalysed the second part of the word for house as being the drink classifier, and now accept houses as being classified along with drinkable entities. A similar mechanism has occurred across several other languages of Central Vanuatu, and this is why houses are classified along with drinks.

Just what is a drinkable house anyway?

In most languages of Vanuatu this change didn’t occur and houses are either directly possessed or occur with the residual general classifier. But in a few other languages, the word for house developed into a distinct classifier that is different from the drink classifier. In the languages of Southern Vanuatu the word for house iimwa has now turned into a classifier for locations and places, and is distinct from the classifier for drinks — nɨmwɨ.

Now what about the edible canoes that I mentioned earlier? This strange occurrence happens in the language of Sa, also spoken on Pentecost island:

(4a)	a-k			anian		(b)	a-k			bulbul
	FOOD.CLASSIFIER-my	food               	FOOD.CLASSIFIER-my	canoe
	‘my food’					‘my canoe’

Historically, the word for canoe was waga in Proto-Oceanic, and the word bulbul was used for a specific type of canoe. Sometimes linguists get lucky and there can be historical documents that help show us the way. Miss Hardacre, a missionary living in northern Pentecost in the early part of the twentieth century, made a small dictionary of the Raga language. In this dictionary she recorded the generic-specific word pairing waga bulbul, ‘canoe type/raft’. Now in Sa, the original word for canoe, waga, underwent several sound changes until it ended up looking like the food classifier, where only the medial vowel /a/ was left! The new word for canoe was bulbul, whereas the old generic term, waga, merged into the food classifier. In other languages of the area, such as Raljago, spoken on Ambrym, a separate classifier for canoes and boats emerged, distinct from the food classifier. Thus, the food classifier is a, but the canoe classifier is ai.

Sometimes when a merger takes place, the noun that merges into a classifier acts as a Trojan horse. Look back at the language of North Ambrym, where the drink classifier can occur with nouns denoting houses, parts of houses, and mats. The word for house that originally merged into the drink classifier acts as a locus for semantic extension, opening a back door for other nouns that are semantically similar (those in the domain of houses) to enter into the drink classifier as well.

I think Michael Lotito would have felt at home speaking one of the Oceanic languages of Vanuatu. He might even have said of his Cessna 150:

(5)	a-k			Cessna 150
	FOOD.CLASSIFIER-my	Cessna 150
	‘my edible Cessna 150’

Many thanks to Andrew Gray who runs the languages of Pentecost Island website and is my co-conspirator in turning this post into a journal article!

A Rainbow of Shared Diversity: Culture and Language in the South Pacific

When we think of life in the South Pacific we often imagine relaxing in the shade of a coconut palm listening to the soothing sound of Israel Kamakawiwoʻole’s ‘Over the Rainbow’ (the official song of this blog post and mandatory listening!). But the South Pacific is in fact culturally diverse, and linguistically too, with around 600 languages in the Oceanic family spread across Micronesia, Melanesia and Polynesia.

The original migration of the Oceanic-speaking people started around 1600 BCE from the north-east of New Guinea, and they went on to colonise the uninhabited islands of the Pacific Ocean, with New Zealand being the last country to be settled by Polynesian seafarers, as late as 1285 CE. The vast distances have created huge cultural differences amongst contemporary Oceanic peoples, yet they all speak languages that stem from Proto-Oceanic – the ancestral language of all of Oceania. For example, the Polynesians are famed for their ability to cross vast swathes of ocean by using star charts made out of sticks, whereas the Melanesians were not great seafarers. However, all Oceanic peoples share similar horticultural practices of cultivating yam and taro root crops, which form the basis of an Oceanic diet.

The enormous cultural diversity amongst the Oceanic speaking people has led to widespread variation in the languages spoken in the South Pacific. In particular we can see the cultural influence on the various languages in how they encode possessive relationships in the language. In the most basic way, an Oceanic language makes a difference in the way it treats alienable and inalienable possessions. We’re not talking UFOs here! Inalienable possessions are those that have an inherent connection with the person to whom they belong – such as body parts or members of the family. Alienable possessions are items that can easily be transferred from one owner to another, such as food, baskets, or other household items.

In Port Sandwich, a language spoken in Vanuatu, possessions that are considered inalienable often have a suffix that encodes the possessor (my, your, his/her) directly attached to the possessed noun:

(1)    naru-ngg
       son-my
       ‘my son’

Whereas when speaking about sandwiches (and all other alienable possessions) in Port Sandwich, encoding is indirect. The possessor suffix is not able to attach directly to the possessed noun, but instead must attach to a separate marker of possession:

(2)    sanwis      isa-ngg
       sandwich    POSS.MARKER-my
       ‘my sandwich’

Sandwiches aside, in many Oceanic languages this indirect construction that is used for alienable possessions has expanded to include various different semantic types of possession. Languages have separate possessive markers, often called classifiers. Many languages have a three-way split, such as in the language Wuvulu (spoken in the Western Islands off the north coast of Papua New Guinea), for possessions that are eaten, drunk or everything else:

(3a)   ana-u niu             (b)  numa-mu upu                 (c)  ape-mu ponata
       FOOD-my fish               DRINK-your coconut               GENERAL-your dog
       ‘my fish (to eat)’         ‘your coconut (to drink)’        ‘your dog (as a pet)’

Some languages make even more semantic distinctions between alienable items. These classifiers often encode culturally important semantic distinctions. Vera’a, spoken in northern Vanuatu, has eight different possessive classifiers: food, drinks, canoes, houses, beds and mats, prized possessions, long-term possessions, and one for everything else. The Micronesian languages have the largest inventory of classifiers in Oceanic. The Chuuk language has developed thirty-five distinct classifiers, yes, thirty-five! Several of these are used to categorise different types of edible possessions. For example, there is a classifier for cooked food, one for raw food, one for leftover food, and even one that is used with food taken on a journey – great for classifying take-away food!

The yéméti classifier in the Chuuk language for food for a journey is great for take-away pizza, whereas the nikita classifier could be used the day after when you want to eat the leftover pizza – if there is any!

In other languages, speakers are able to create new classifiers on an ad hoc basis when they need to. This mechanism is particularly prevalent in the languages of Micronesia and New Caledonia. Nêlêmwa, spoken in New Caledonia, can create new classifiers by repeating the possessed noun and adding a suffix to show the possessor. For example, mwa ‘house’ (4a) can have the possessor suffix attached (4b), but if a speaker adds an adjective then the possessed noun must be repeated and the directly possessed noun functions as a classifier (4c). In this way a speaker of Nêlêmwa can create new classifiers whenever the need arises.

(4a)   mwa          (b)  mwa-n          (c)  mwa-n mwa doo
       house             house-his           house-his house earth
       ‘house’           ‘his house’         ‘his earth-house’

Though cultural diversity plays a role in the formation of classifiers that are unique to particular languages in the Pacific, there is a commonality among classifiers: languages that are located far apart often have classifiers that encode similar semantics. This means that, though the Oceanic peoples are culturally diverse, some important cultural aspects are shared across them. For example, many of the Micronesian languages have developed classifiers for beds, mats and pillows. But the language of Vera’a, spoken in northern Vanuatu (over 2,500 kilometres away), has also developed a classifier for sleep-related possessions. Similarly, classifiers for domesticated animals have developed in the languages of Micronesia, in Mussau and Seimat (both spoken on the offshore islands of Papua New Guinea), and in Nêlêmwa and Iaai, spoken in New Caledonia. The words used for these classifiers can’t be traced back to a single historical root, which means that these are sporadic innovations in these languages and point to the shared cultural life of the Oceanic peoples.

Just as speakers of different languages can name varying numbers of colours in a rainbow, with Israel Kamakawiwoʻole’s mother tongue Hawaiian distinguishing six colours in contrast to English’s seven, speakers of Oceanic languages differ in the number of ways of categorising their possessions.