Duality of Patterning: Some musings on one design feature of language

A few weeks ago there was a two-part programme on the BBC entitled Talk to the Animals, presented by Lucy Cooke. As you might imagine, it was about ‘cracking the animal code’ – finding out what animals are communicating with each other and how they are doing so. It was a great programme and got me thinking again about the differences between humans and other animals in terms of the way we communicate.

Our most stand-out method of communication is, of course, language. And our language is used to communicate just about anything and everything we can think of. Whilst animal communication typically concerns food, danger and mating, human communication goes way beyond these things. The more interesting question for me, however, is not so much what we are communicating, but how we are communicating it. How do we package the information we wish to convey and how do we structure it? How is language designed such that it allows us to do these things in the first place?

This question is huge and, surprise, surprise, unanswered. Therefore, I’m simply going to muse on one of the most significant design features of language that has been identified – duality of patterning.

Every human language has a system by which meaningless sounds are combined to make meaningful units (these can be thought of as words), and every human language has a separate system which combines these meaningful units into phrases and sentences (the same applies to Sign Languages). This means that a language can have a reasonably small number of meaningless elements from which it can generate a very large number of distinct words. Furthermore, this very large number of distinct words can be combined to form an even larger number of distinct sentences (in fact, an infinite number of sentences). The capacity of human language to take discrete elements from one level and combine them to make discrete units at another level is what Charles F. Hockett called duality of patterning (Hockett 1960).

It is an immensely efficient way of doing things. Imagine what language would be like if this were not the case. To be meaningful at all, the elements of language would have to be meaningful in and of themselves. Since there would be no way of combining them, we could only express as many things as we have words for. The shapes of these words would be chaotic as well since there would be no way of combining smaller meaningless elements into words.
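
To put some rough numbers on this efficiency, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the six-element inventory is a made-up toy set rather than the sound system of any real language, and the length limits are arbitrary. It also ignores the fact that real languages rule out many sound sequences and many word orders, so the counts are upper bounds only.

```python
from itertools import product

# Hypothetical toy inventory of meaningless elements (not a real phoneme set).
PHONEMES = ["p", "t", "k", "a", "i", "u"]


def count_forms(inventory_size: int, max_length: int) -> int:
    """Count the distinct strings of length 1..max_length over an inventory."""
    return sum(inventory_size ** n for n in range(1, max_length + 1))


def enumerate_forms(inventory, length):
    """List every string made of exactly `length` elements of the inventory."""
    return ["".join(combo) for combo in product(inventory, repeat=length)]


# No combination at all: each element is its own signal.
holistic_signals = len(PHONEMES)

# First level of patterning: elements combine into candidate word forms
# of up to four elements (an arbitrary cut-off).
word_forms = count_forms(len(PHONEMES), max_length=4)

# Second level of patterning: word forms combine into sequences
# of up to three words (again an arbitrary cut-off).
word_sequences = count_forms(word_forms, max_length=3)

print("signals without combination:", holistic_signals)
print("candidate word forms:", word_forms)
print("candidate word sequences:", word_sequences)
print("some two-element forms:", enumerate_forms(PHONEMES, 2)[:5])
```

Even with these tiny limits, the count grows enormously at each level. Remove the cut-off on sequence length and the set of possible sentences is unbounded, which is the sense in which the number of sentences is infinite.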

A number of authors have suggested that a system for combining meaningless elements into meaningful words does exist in other animals, e.g. humpback whales and chaffinches (see Hurford 2007), but a system for combining meaningful words into phrases and sentences appears to be much rarer and possibly unique to humans. Why should humans have two combinatorial systems at their disposal? Or could it be that the two systems are fundamentally the same but appear different purely because of the nature of the elements they manipulate? This suggests that studying the similarities and differences between phonology and syntax will shed light on the underpinnings of our combinatorial abilities (see Nevins (2010) who argues that the operation Agree is found in both syntax and phonology). Comparing these with the abilities of other animals may then shed light on the evolution of language itself.


Hockett, C. F. (1960). The Origin of Speech. Scientific American, 203(3), 89–96.

Hurford, J. R. (2007). The Origins of Meaning: Language in the Light of Evolution. Oxford: Oxford University Press.

Nevins, A. (2010). Locality in Vowel Harmony. Cambridge, MA: MIT Press.

Are you a Belieber?

As I sat on the edge of my sofa on Saturday night watching Doctor Who and trying to acclimatise myself to a slightly softened version of Malcolm Tucker as the new identity of everyone’s favourite Time Lord, I wondered whether I could call myself a Whovian. A Whovian, as you may know, is someone who self-identifies as part of the Doctor Who fanbase. It is one of the seemingly endless set of terms that have been created to describe one’s particular fandom affiliation.

Nicknames for groups of fans have been around for a long time. For example, the name Whovian was first used in the 1980s, when fans created a fan club newsletter called the Whovian Times. For years we have heard football fans identifying themselves as part of the Toon Army or as Gooners (Newcastle United fans and Arsenal fans, respectively). However, there seems to have been a recent explosion of fan nicknames in a host of areas: music (Beliebers, Directioners, Swifties), TV (Sherlockians, Gleeks), books (Ringers, Tributes, Twihards), films (Trekkies) and even celebrities (Cumberbitches, Pine Nuts). I want to consider some issues in this post: Why do we feel the need to create these nicknames? Why do some fan groups have nicknames whilst others do not? What makes a good fandom nickname and how are these created?

Firstly, why are these nicknames coined? I think there are four core reasons:

  1. To identify an ingroup. With the rise of the internet, people are exposed nowadays to media from across the world. Young people growing up with this access to global content may be rejecting typical labels such as nationality, religion or political affiliation in favour of associating themselves with personal interests. It is notable that the suffixes often used to create these fangroup names appear to come from those used for nationality or local identity (such as –[i]an used in American and Argentinian or –er used in Londoner and Westerner). By choosing one’s own label and ingroup, you are aligning yourself with a particular community that shares similar values. The names of these fan groups can act as a shibboleth. Although it may be easy to discern that a Belieber is a Justin Bieber fan, would you necessarily know that Smilers are Miley Cyrus fans? These names can create exclusivity where only those ‘in the know’ can be part of the group.
  2. To identify a community. Early fangroup names seem to stem from films or TV shows that held conventions. The names Warsies (Star Wars fans) and Trekkies/Trekkers (Star Trek fans) formed before the age of the online forum. The fact that people actually met in person and created a community around these brands is what helped create their monikers. Now that there are online forums and blogs for almost anything one can imagine, it has allowed communities to form in the virtual world. From these communities, fan group names have been coined. It is arguable that to be deserving of the fan group nickname, one must engage with these communities either online or in person. I might be a huge fan of Doctor Who, but having never visited any fan sites or attended any conventions, I probably could not consider myself a Whovian. So strong are the communities for some of these groups that there are dating websites entirely based around one’s particular fan community (for example, www.whovianlove.com).
  3. To create layers of fandom. A personal admission: I have three One Direction songs on iTunes and I know some of the band members’ names. You could perhaps say I am a One Direction fan. You would probably not say I am a Directioner. A Directioner is more than just someone who has some One Direction music on their iPod. It is someone who has memorised all the lyrics, knows where Harry Styles was born, has queued up for hours to buy tickets to their shows and so on. Nicknames for fan groups provide the superlative on a scale of commitment. It is possible to imagine someone saying: “She might listen to Justin Bieber, but she’s not a Belieber like me.”
  4. To defy haters. It seems that early fan group nicknames (such as Belieber or Directioner) were a means of unifying fans and standing up against those people who criticised the objects of fans’ affections. Perhaps it is the case that the more divisive the thing in question, the more likely it is to have a fan group nickname.

This brings me on to another question – why do some fan groups have nicknames and some do not? Some brands are enormously popular and yet do not have a fan group nickname. For example, Oprah Winfrey is arguably the most powerful woman in America. She has immense influence, is allegedly worth $2.9 billion and has over 25 million followers on Twitter. However, her many millions of fans do not have a nickname. I think this is due to two of the reasons mentioned above. As nicknames may stem from brands being divisive, there must be a feeling that the brand needs defending. Oprah is not criticised enough for her fans to rally together under one name. Secondly, to create an ingroup, a brand must be in some way exclusive. Oprah is too ubiquitous and popular to really be the source of an ingroup and therefore a fan name. Other huge fanbases that do not have a clear nickname include fans of Game of Thrones (or more generally the book series A Song of Ice and Fire) and fans of Harry Potter (notably some people call this group Pottheads but this started as a derogatory term and does not unite the fanbase). In these cases I suggest the final reason is at play again. With their phenomenal popularity, one cannot affiliate oneself with these brands as an ingroup due to the sheer size of the fanbase. However, one may choose to affiliate with certain characters or groups within the brands. For example, fans may side with the Lannisters or the Starks in the Game of Thrones fandom and Gryffindor or Slytherin in Harry Potter. Indeed, some of these sub-sections do have fan group nicknames. For example, the group of Harry Potter fans who wish that Hermione had chosen Harry instead of Ron call themselves Harmonians!

So, how are these nicknames formed? One way nicknames appear is that the artists select them themselves. This is not a new occurrence, with George Harrison calling the superfans of The Beatles, who gathered outside the Apple Corps building, Apple Scruffs. In 2009 Lady Gaga dubbed her fans her Little Monsters (after her album Fame Monster) and Ke$ha called her fans Animals (after her album Animal). Often, however, the communities develop the nicknames for themselves. Sometimes they have a selection of names that they ask the celebrity to pick from (for example, Ed Sheeran picked Sheerios from some fan-suggested possibilities). Sometimes the fan group names are selected and the artist in question does not necessarily approve of the choice (for example, Benedict Cumberbatch would prefer his fans called themselves Cumberbabes or the Cumber Collective, but they have dubbed themselves his Cumberbitches). When fans do select nicknames, it may be that a number abound for a while until one wins out (as with Ringers from a number of other possibilities for Lord of the Rings fans, such as LOTRians). In the case of fans of the Hunger Games series, they rather democratically had an online vote to choose their fan name.

So then, how does one create a good fan group nickname? The easiest way is to take the name of the object of your affection and add a suffix. The most popular appear to be –ers, –ies and –ians. Notably fanbases seem to steer away from the suffix –phile (the suffix that means ‘to have a fondness for’) perhaps due to unfavourable connotations from use of this suffix in unsavoury words such as paedophile and necrophile. A second option is to create amusing portmanteaus, such as Gleeks (from Glee + geek), Twihards (from Twilight + try-hard), Bey Hive (from Beyonce + Bee Hive), Fanilow (from fan + Barry Manilow) and, finally, for men who like a retro kids TV show, Brony (from brother + My Little Pony). As mentioned earlier, the fan group nickname can act as a shibboleth and therefore some groups may choose something that is slightly more obscure so that only ‘real fans’ will understand the meaning. For example, the Hunger Games’ fans chose Tributes as their nickname (a term used in the books to describe a certain heroic group to which most of the main characters belong). Similarly, Miley Cyrus fans are called Smilers, originating from the fact that Miley was nicknamed smiley when she was a child. Bruce Springsteen fans call themselves Bruce Tramps due to one of his song titles and Katy Perry fans named themselves KatyCats due to their idol’s love of cats.

Due to inherent narcissism I could not help but consider what my fans would call themselves if I ever gained celebrity. I think that Rowena would only have to drop its first syllable to become a passable fan group name. Therefore, I can only hope that I maintain obscurity to save any group from ever having to declare themselves Weeners.

Québec, Language, and Identity

At the end of last month, I attended the 36th Annual meeting of the Cognitive Science Society in my hometown of Québec City, Canada. I was working as a student volunteer and found that several of my colleagues, from all corners of the world, were surprised that many residents of Québec City spoke little or no English. This has had me thinking about Québec’s linguistic situation and the fact that it remains strikingly misunderstood outside the province. What follows is my attempt to draw a brief portrait of Québec’s linguistic state of play. Let me begin by saying that I am perfectly aware that these remarks are tinted by my own personal experience growing up in Québec and would require a more careful examination than the one I can provide here. Nevertheless, I think they are important for discussing the Québécois case.

Until the 1960s, the francophone majority in Québec was relatively poorly educated and worked mainly on the production lines, whilst the anglophone minority was more likely to occupy professional and managerial roles. English was dominant in Québec both economically and socially, and Québecers found themselves with little leverage. This situation is denounced in Michèle Lalonde’s now-famous 1968 poem Speak White. In 1960, La Révolution Tranquille (The Quiet Revolution) laid the groundwork for the social and economic emancipation of Québecers. This pivotal period of change in Québec’s history was marked by the secularisation of the state and the development of state institutions (e.g. public healthcare and education). These social advancements provided a fertile ground for the rise of a new French-speaking middle class that could now aspire to hold higher positions in society. As this new class rose, establishing French as Québec’s official language became a priority. In 1977, La Charte de la langue française (Loi 101) (The Charter of the French language (Bill 101)) defined French as the official language of the province: “the French language, the distinctive language of a people that is in the majority French-speaking, is the instrument by which that people has articulated its identity” (Preamble, Charter of the French Language). To this day, Bill 101 remains central to linguistic legislation in Québec, from education policy to signposting.

It is clear that Québecers’ attitude towards the question of language is a complicated one, perhaps unsurprisingly given the obsequiousness they were long expected to adopt in such matters. Discussions of language often seem to awaken delicate sensibilities relating to identity, culture, and politics. But what is the other side of the story? What is the current linguistic situation in Québec? A recent report published last May by Québec’s University of Public Administration raises important issues regarding the teaching of English as a second language (ESL) in the province. The report concludes that despite efforts from Québec’s government to improve ESL teaching, much remains to be done to increase the number of teaching hours pupils receive and to ensure that improvements generalise to all regions of the province, not just the city centres. Evidence cited by the authors of the report suggests that 1200 hours of study are required to reach a basic level of proficiency in a second language. However, the report states that pupils in Québec currently receive on average only 800 hours of ESL teaching during primary and secondary school education. The report highlights the crucial importance of bilingualism as a means for innovation and economic prosperity in the province as well as for the competitiveness of its workforce in a global economy.

There was a time when the francophone majority in Québec seemed terminally threatened, and much progress has been made to endow the province with the fundamental linguistic legislation and rights to firmly establish itself as a French-speaking province within an English-speaking country; a nation within a nation if you will. It is not necessary to go very far back in Québec’s history to understand why we have, as a nation, fiercely defended our language to combat oppression. However, I believe there are real risks attached to equating language with identity, not least of which is the impending risk of depriving new generations of an adequate level of proficiency in English for fear of acculturation. The danger of preventing new generations from becoming more competitive and active in an ever-expanding world is lurking. If there is anything that imperils Québec, it is the risk of wasting another generation on sterile debates. Whilst Québecers need to learn about and protect their linguistic heritage, they need not fear English as the threat it once was but embrace it as the instrument that could give them the upper hand.


Éditeur officiel du Québec (August 2014). Charter of the French Language. Retrieved from http://www2.publicationsduquebec.gouv.qc.ca/dynamicSearch/telecharge.php?type=2&file=/C_11/C11_A.html

Centre de recherche et d’expertise en evaluation (CREXE) (May 2014). Recherche évaluative sur l’intervention gouvernementale en matière d’enseignement de l’anglais, langue seconde, au Québec. Retrieved from http://crexe.enap.ca/cerberus/files/nouvelles/documents/AnglaisIntensif_ENAP_Rapport3.pdf

Lalonde, Michèle (1974). Speak White. Montréal: L’Hexagone.



Prepositions and national identity

Last week I attended the 5th Sociolinguistics Summer School in Dublin, Ireland. Being, as it were, directed primarily at early-career researchers, the talks offered a good overview of what young sociolinguists (that’s linguists interested in the relationships between social and linguistic variation) are up to these days. There was a pretty impressive number of papers and posters on new media – Twitter seems to be a fairly fashionable research topic among our lot – and, being in Ireland, the summer school had attracted quite a few talks on minority languages and dialects such as Irish Gaelic, Catalan and, yes, you guessed it, Australian Aboriginal English. Among the many interesting talks, one in particular has had me thinking this week: the final one, presented by Anne Marie Devlin of University College Cork.


Her talk was entitled “Prepositions on the battlefront: ‘В’ and ‘На’ as indices of socio-political identity in the current conflict between Ukraine and Russia” and, as the title indicates, it focused on the socio-political role of language in Ukraine. According to Anne Marie, the current socio-political conflict is now shaping the ways language is being used in Ukraine, most notably resulting in Russian being given preference in different social spheres, including the sociolinguistic landscape. In this way, Ukrainian-language signs are being removed and replaced with signs in Russian. More subtly, though, her talk demonstrated that small cues like the use of prepositions can be just as powerful a tool for signalling socio-political opinion. Speakers of Russian have access to two different prepositions collocating with the word “Ukraine”: “v”, which roughly corresponds to the word “in” in English, and “na”, which means something akin to English “on”. The “in” preposition is used to refer to nation states, whereas the “on” preposition is used with counties or islands, that is, parts of a larger nation state. In this way, through the consistent use of one of these prepositions, a Russian speaker can signal her attitude to Ukraine’s national status. And indeed, after combing through a number of newspapers, letters and online forums, Anne Marie concluded that preposition use in both Ukrainian and Russian media strongly correlates with the political opinions expressed. Writers in favour of an independent Ukraine would almost exclusively use the “in” preposition, and vice versa.

This got me thinking. As a Dane, I’ve noticed, but never really given much thought to, a similar sociolinguistic situation at home, which hinges on the political relationship between Denmark and Greenland. For you non-Danes out there, let me explain. Because of its colonial history, Greenland is an autonomous country within the Danish realm. This is a strange in-between state of affairs – it has home rule, but it’s still economically, and to some degree politically, dependent on Denmark. Now, Danes have a similar set of prepositions to the Russo-Ukrainian ones. So the question is: how do we refer to Greenland? My own intuition is to use the “on” preposition, but the “in” variant doesn’t sound too bad either. On the other hand, my stepfather, who is very close friends with a Greenlandic couple, consistently uses the “in” variant. I also recall having a discussion with an Icelandic colleague on a similar matter. Iceland received its independence from Denmark in 1918 and became a republic in 1944. However, lots of Danes still use the “on” preposition when referring to the country, to the irritation of (it would seem) a number of linguistically savvy Icelanders. Protesting this trend, my colleague argued that he associated the Danish “på Island” (“on Iceland”) with derogatory views of the country. In other words, preposition use seems to be able to trigger similar sociolinguistic effects in Danish, even within the current calm Scandinavian political climate. And just as interestingly, the status of these prepositions as sociolinguistic markers seems to have completely escaped the attention of Danes. This is why I like early-career conference presentations – they can be real eye-openers!

To end these musings, let me bring them closer to home. As a non-native speaker of English, I’m not as sensitive to linguistic differences in this language as I am in Danish. So, English speakers, this is where I ask for your opinions. Are prepositions used in similar ways in English? Do you use “in” or “on” with the Solomon Islands? Jamaica? The Channel Islands? The Hebrides?

I don’t know about you, but I’ll be paying closer attention to preposition use in media coverage from now on.


On the (in)completeness of language and thought

The relationship between language and thought is a fascinating field for investigation because, however effortlessly we seem to think and speak, a closer look reveals that the interaction between these is not as simple as we may take it to be. In a previous post I tried to show that, despite the fact that language and thought are tightly intertwined, they do not overlap; some evidence for this comes from considering ways in which we conceive and communicate thoughts without using words, just as dance partners communicate their next move using body signals alone. In this post I talk about the relationship between language and thought from the point of view of my own research topic: how are we able to convey complete thoughts using sentences that are incomplete from a syntactic point of view? Since I don’t have a full answer to this problem (yet!), I will talk more generally about the mismatch between language and thought as far as the aspect of completeness/incompleteness of each is concerned; a mismatch which does not get in the way of efficient communication.

Roughly speaking, when we talk about syntactically complete sentences, we mean the ones that involve at least one predicate/verb, e.g. ‘I’m sitting in the sun’, ‘John is British’, ‘Anna was late last night’, etc. Such simple complete sentences have traditionally been the primary unit of analysis for linguistic theory. However, the main difference between such sentences and the ones we use in actual conversations is that the latter never occur in isolation, but rather in a context. They normally occur in sequences of sentences, and are placed in a certain context, consisting of a time, place, topic of conversation, a person we are addressing, etc (for more about the notion of context see Finkbeiner et al. 2012). This way, each sentence that occurs in a conversation can build on previous ones, as well as on information that is already given in the context. Thus, as speakers, we do not have to make explicit every single aspect of the meaning we want to communicate because we can trust that some information is already present in the context (and, as such, known to our interlocutors).

There are many ways in which the interaction of utterances with context (linguistic and extra-linguistic) can save us from having to be explicit about every single aspect of meaning we want to convey. For example, if I have already been talking to a friend about my housemate, I can then afford to say ‘She is going to Paris tomorrow’, without having to explicitly define the female ‘she’ refers to. Similarly, if I utter ‘I’m ready’, it must be clear from the context what it is I am ready for, otherwise my sentence would not be meaningful (see Bach 1994, 2001). In general, each sentence we use is, roughly speaking, supposed to express a thought that we want to share with our interlocutors. But due to the interaction of sentences with context it is possible for communication to be achieved, even if there’s no one-to-one correspondence between the sentences we use and the units of thought we want to communicate.

Let’s look at some examples of complete and incomplete language and thought. To do this, we’ll need to use a unit of measurement for each. For language this unit will be the sentence; for thought it will be the proposition (‘proposition’ is, roughly speaking, a term used by philosophers of language to talk about ‘units’ of thought. You can read more about it here). In our everyday conversations, we can see all possible combinations of completeness/incompleteness of units of language and units of thought: we use complete sentences to convey complete thoughts, incomplete sentences to convey complete thoughts, incomplete sentences to convey incomplete thoughts, and so on. These combinations are shown in the table below, which evaluates the utterance in the 3rd column with regard to the [+/- complete] feature. By ‘language’ I refer to the sentence explicitly pronounced, and by ‘thought’ I refer to the message conveyed by that sentence alone.

     Language     Thought      Utterance

  1. [+complete]  [+complete]  ‘Germany won the 2014 World Cup’
  2. [+complete]  [-complete]  ‘Everybody went to the beach yesterday’
  3. [-complete]  [+complete]  [context: doorbell rings] ‘Probably the pizza guy’
  4. [-complete]  [-complete]  ‘The essay was on a complicated topic, but I found it interesting so…’


Cases 1 and 2 are linguistically complete because they involve at least one verb each (won, went). Case 1 also conveys a complete and determinate thought. Case 2, however, does not convey a complete thought because some additional piece of information is required to make ‘everybody’ meaningful, i.e., ‘everybody’ needs to be restricted to the specific group of people the speaker is talking about, because the sentence cannot really mean that everybody in the world went to the beach yesterday. This additional piece of information, e.g. ‘Everybody [from our group of friends]/[in my family]’ etc., need not be explicitly uttered because it is normally recoverable when the utterance is placed in context. But it is, strictly speaking, not contained in the thought conveyed by the sentence alone, hence the [-complete] feature in the ‘thought’ column.

Moving on to case 3: although it is syntactically incomplete, given that it does not contain any verbs, we can say that it conveys a complete and determinate thought, because it can only mean something along the lines of ‘[The person at the door is] probably the pizza guy’. [The person at the door is] is recoverable on the basis of the contextual information that the doorbell is ringing, the world-knowledge that pizzas are often delivered at doors, etc.

Case 4 is also linguistically incomplete, despite containing two verbs, because it is explicitly left open-ended (i.e. in English, sentences are not meant to end in a connective such as ‘so’). However, it seems less clear whether it conveys a complete thought or not. At first glance, it seems that case 4 does not convey a complete proposition in the way that case 1 or case 3 does, because it wouldn’t be as straightforward to add the completion in brackets. At the same time, deciding that case 4 does not convey a complete proposition at all might not be fair either, because there is an intuitive sense in which 4 is meaningful (and, if something has meaning, then it arguably conveys certain thoughts/propositions). Thus, an intermediate solution would be to say that 4 conveys the complete proposition ‘The essay was on a complicated topic, but I found it interesting’, i.e. the thought that is conveyed by the part that comes before the open-endedness, and that, in addition to this, the open-ended part conveys a much more vague (and arguably incomplete) aspect of meaning along the lines of ‘You can easily infer from what I said that there were pros and cons with regard to the essay topic’, ‘The fact that I found it interesting made the complicated topic easier for me’, or even ‘I leave the conclusion of what I said up to you, because I’m ambivalent with regard to the essay topic’ etc. In a way, open-endedness expresses not a proposition but an attitude towards a proposition (for more on this see here).

Abstracting away from the details, however, what we are left with is four cases which, if used in an appropriate context, would be perfectly interpretable by any average speaker of English; moreover, no average speaker of English would judge them as ungrammatical or infelicitous. The fact that these sentences will eventually lead to successful communication basically means that each of them will ultimately convey a complete thought, regardless of how complete or not the components of language or thought involved were initially. This is possible because language is in constant interaction with its context of use which is responsible for completing the incompleteness of either language or thought, and which allows for sentences and propositions to be meaningful, even if incomplete. Given how complex the interaction between language and thought is, isn’t it fascinating how effortlessly we perform the complicated task of communication?


  • Bach, K. 1994. ‘Semantic slack: What is said and more’. In: S. L. Tsohatzidis (ed.). Foundations of speech act theory. Philosophical and Linguistic perspectives. London and New York: Routledge. 267-291.
  • Bach, K. 2001. ‘You don’t say?’ Synthese 128. 15-44.
  • Finkbeiner, Rita, Jörg Meibauer and Petra B. Schumacher. 2012. What is context? Linguistic approaches and challenges. Amsterdam: John Benjamins.

Eleni Savva

The myth of the myth of language complexity

It’s common enough for people to think about languages in terms of relative complexity. I often hear people claim that a language—not infrequently their own language, or a language which they are learning—is particularly complex and difficult to learn due to its large vocabulary, morphological irregularities, or tricky pronunciation. It does seem intuitively obvious that some languages must just be more complex than others. Yet one of the first propositions that many undergrads are exposed to when they begin to study linguistics is that this is actually a myth.

A key tenet of formal linguistics and sociolinguistics for much of the 20th century was that of equicomplexity. This is the idea that all languages are equally effective and powerful means of communication, and, by somewhat shaky extension, that all languages are equally complex. Equicomplexity arose not really from any data-driven research, but from ideological discussions around prescriptivism and descriptivism. You’ll remember from an earlier post on this blog (http://www.icge.co.uk/languagesciencesblog/?p=25) that prescriptivism is the belief that there is a ‘correct’ way to speak, and that to speak in other ways is somehow deficient, while descriptivism is an attitude of open interest towards the ways in which language is used without attaching any value judgements to them. Linguistics, and particularly sociolinguistics, holds descriptivism as a core component of its approach, yet throughout much of history prescriptivism has been the mainstream viewpoint.

The battle against prescriptivism, in many ways still largely unsuccessful, has perhaps necessitated holding simple, powerful ideological positions. Faced with educators who believe that the varieties spoken by their non-white or working class pupils are intrinsically inferior to the standard (calling them ‘illogical’, ‘crude’, ‘rough’, ‘ugly’ or just ‘incorrect’), there seems to be little space to have a sophisticated conversation about the nature of complexity and expressive power. Such views are clearly proxies for racism and classism and serve to perpetuate the grievous structural inequalities that typify western societies. They are best battled with clear maxims, cleanly expressed: All languages are equally powerful tools of communication. All languages are equally deserving of respect. There is no such thing as a simple language.

So, it’s obvious that equicomplexity took its place in the canon of linguistic assumptions for good reason. However, in recent years and not without controversy, scholars have begun to unpick it. Few linguists would argue with the fundamental ideological position underlying the statement that ‘all [natively learned] languages are equally powerful means of communication’, but many have begun to question the leap to the idea that all languages must therefore be equally complex.

It’s clear that in any particular area of grammar, languages can be more or less complex. So, English, with two distinct surface forms of each regular noun, is obviously simpler in this respect than Finnish, with perhaps 26. Mandarin, which distinguishes between 19 and 26 different consonants (depending on how you count it), is clearly more complicated in this respect than New Zealand Māori with 10 consonants but less complicated than Adyghe, with over 50. Given this, to maintain that all languages are equally complex overall, one must assume that when one area of grammar gets more complicated, others get simpler to compensate. This has been the implicit assumption underlying equicomplexity for several decades.

The problem is, it turns out that this just isn’t true. If it were true, then whatever our measure of complexity is (and that’s a whole nother blog post), we should find that in a big sample of languages there is a negative correlation between complexity in one area of grammar and complexity in another. Yet in reality, studies like Maddieson (2006; 2007) and Shosted (2006) show, if anything, a weak positive correlation between complexity in different areas of grammar: languages with more complicated phonology are more, not less, likely to have complicated morphology.
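To make the prediction concrete, here is a minimal sketch of the kind of check involved. The scores below are invented for six imaginary languages purely for illustration (they are not taken from Maddieson or Shosted, and the 1 to 6 scale is an assumption); the point is only that compensation predicts a negative coefficient, whereas the studies cited above report something weakly positive.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented complexity scores for six imaginary languages (NOT real data).
phonology = [1, 2, 3, 4, 5, 6]

# What compensation would predict: morphology gets simpler
# as phonology gets more complex.
morphology_if_tradeoff = [6, 5, 4, 3, 2, 1]

# A toy pattern with no trade-off: the two scores drift
# loosely in the same direction.
morphology_no_tradeoff = [4, 2, 5, 1, 3, 6]

r_tradeoff = pearson(phonology, morphology_if_tradeoff)
r_no_tradeoff = pearson(phonology, morphology_no_tradeoff)

print(f"with trade-off:    r = {r_tradeoff:+.2f}")
print(f"without trade-off: r = {r_no_tradeoff:+.2f}")
```

A real study would of course need a defensible complexity measure for each area of grammar and a large, balanced sample of languages; the toy lists here only show the sign of the coefficient that each scenario produces.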

So where does that leave equicomplexity? Well, if we accept these findings then we pretty much have to abandon the idea that all languages are equally complex. It was never backed up by evidence in the first place, and these findings seem to represent some pretty conclusive counter-evidence. It doesn’t, of course, mean that we should abandon the claims that all natively-learned languages are equally powerful means of communication and that all languages are equally deserving of respect. These remain important ideological positions. However, if we can reject canonical equicomplexity, lots of exciting new avenues of research open up to us: Why are some languages more complex than others? How much of language complexity is built into the innate language faculty, and how much is cultural elaboration? What social conditions cause languages to become simpler and what cause them to become more complex? It’s in this latter area that my own research is focused.

A pertinent addendum to all of this has to do with the nature and experience of complexity. When, as I mentioned at the beginning, I hear people talking about how complicated different languages are, they’re almost always interested in the point of view of adult learners. They’re interested in whether they will have to put in more or less effort to learn another language, and in how much effort non-native speakers of their own language have had to make.

The reality is that this ‘ease of learning’ is only partially related to ‘complexity’ in the abstract. The biggest factor which will make another language easy or difficult to learn is not complexity but how closely related it is to your own native language(s) and any other languages you speak. Native speakers of English will find Norwegian or French comparatively easy to learn, as (for different reasons) they each share a great deal of vocabulary and structural similarities with English; native speakers of Cantonese may not. Native speakers of languages which do not distinguish tones (e.g. most, though not all, European languages) may find particular difficulty in learning languages which do (most languages of sub-Saharan Africa, the Chinese languages and related languages, as well as many others).

Having taken this into account, then, yes, morphological and phonological complexity will tend to make for a harder learning process. There is simply a lot more verbal morphology to memorise for a student of Spanish than for a student of Mandarin, and this will take time. Similarly a learner of Hawai’ian won’t have to spend very much energy at all on learning the different consonants they need to be able to pronounce compared with a learner of Halkomelem or another Salishan language, and a student of Danish must learn to distinguish far more vowel qualities than a student of Standard Arabic.

At the end, we have a rather mixed picture. Clearly, in descriptive, neutral terms, some languages are much more complex than others. From a practical point of view for most users of language, though, this has little real relevance. Their experience of language complexity will mostly come down to their own language backgrounds—and even where it doesn’t, it will always be possible to identify particularly complex structures and features of some sort in any language.

Maddieson, Ian. 2006. Correlating phonological complexity: Data and validation. Linguistic Typology 10. 106–123. doi:10.1515/LINGTY.2006.004.
Maddieson, Ian. 2007. Issues of phonological complexity: Statistical analysis of the relationship between syllable structures, segment inventories and tone contrasts. In M.-J. Solé, P. Beddor & M. Ohala (eds.), Experimental Approaches to Phonology, 93–103. Oxford: Oxford University Press.
Shosted, Ryan K. 2006. Correlating complexity: A typological approach. Linguistic Typology 10. 1–40. doi:10.1515/LINGTY.2006.001.

LanguageS in China

Attention: This article contains Chinese text. Without proper rendering support, you may see question marks, boxes, or other symbols instead of Chinese characters.

It all began with a question while I was in a cab from the Cambridge railway station to my college. The driver, after asking where I come from and what my field of study is, asked me a quite simple yet difficult question that kept me busy for the rest of my trip: “so, how many languages are there in China?”

Most people I have met, even Chinese people themselves, do not have a clear idea about the linguistic situation and diversity in China. After all, there is a language named after the country, the so-called “Chinese language”, which is also the lingua franca in China. This description, however, is far from accurate with regards to the real situation of languages spoken in China – China is not a monolingual country, although it is monolingual in some areas. The definition of Chinese language is more complicated than you can imagine, even though everyone knows that the national language of China is called “Standard Chinese”.

In this post, I focus on several myths about the languages in China, and show that neither “Chinese language” nor “languages in China” is a simple concept.

How many languages are there in China?
There are 298 languages in total currently spoken by native people in China; some are national and regional lingua francas with millions or even hundreds of millions of speakers, while others are used by only a few thousand people in small counties (Lewis, Simons and Fennig, 2014). This number does not include those languages spoken by immigrants, such as English, Arabic or Yoruba; however, it does include some languages that are spoken by ethnic minorities in China which are official languages of other countries, such as Russian, Uzbek and Korean. (There are ethnic minorities of Russian, Uzbek and Korean origins in China whose native languages are recognised among the languages of China.)

Do all the languages in China use Chinese characters?
This is definitely not the case; or, to be more precise, the Chinese language is the only language in China that uses Chinese characters nowadays. Most of the commonly used languages in China have their own written forms, like Tibetan, Mongolian and Uyghur (which uses an Arabic-based script); some languages like Zhuang once used Chinese characters for documentation, but these have gradually been replaced by Latin characters.

Is there an official language of China?
China does not have a confirmed “official language” – I have double-checked the Constitution but there is not a single article with regard to the issue of the official language of the country. However, China does have a standard language: according to Article 2 of the Law of the People’s Republic of China on the Standard Spoken and Written Chinese Language (2000), the spoken form of the standard Chinese language is Putonghua and the written form should be in Standardised Chinese Characters.

In actual use, however, the language policy is more flexible; especially in the areas where ethnic minorities reside, languages other than Standard Chinese are used in both informal and institutional contexts. A good example comes from the Renminbi, the currency of China: if we carefully examine a banknote, we will find that it is more similar to the Swiss Franc than to the Pound Sterling – it is multilingual. A number of languages appear on the note: Chinese (in the form of pinyin), Mongolian, Tibetan, Uyghur and Zhuang. Apart from Chinese, the other four languages are important minority languages in China, and some of them have obtained institutional status in the provinces where they are mostly spoken; for instance, Tibetan is an official language in Tibet, parts of Qinghai and some areas in Gansu.


So what is “Chinese language”?
The term “Chinese language”, or Hanyu (汉语), is a loosely defined concept. In linguistics, the name refers to a group of linguistic varieties that come from one single ancient origin; the vocabulary and sentential structure of these varieties are generally the same. In general, these linguistic varieties can be classified into seven large subgroups: Mandarin, Wu, Yue (Cantonese), Min, Gan, Xiang, Kejia (Hakka). Here is a family tree of the Chinese languages proposed by You (2000), showing the history and development of these different subgroups.


Due to geographical factors, some varieties of the Chinese language have been isolated from others, and this isolation has led to changes in the way these varieties sound; for example, a native speaker of Shaoxing Chinese may find episodes of TV series in Wenzhou Chinese difficult to follow if she watches them without subtitles, although the distance between the two cities is only a bit more than 300 km (which is a rather short distance by Chinese standards). This phenomenon is quite common in Southern China, and is called “different pronunciations within five kilometers”.

In traditional linguistic research on Chinese language, these subgroups are labelled “dialects of Chinese language”. I prefer to avoid the term “dialect” because it is not the case that all these linguistic varieties are mutually intelligible, which is the criterion that some Western sociolinguists might use to define “dialects” of the same language.

So you mean we can’t contrast “Chinese” with “Cantonese”?
Yes, this is indeed the case. Cantonese is a member of the Chinese language group, so it is a branch of the Chinese language; it does not make sense to say “I can speak Chinese and Cantonese” – to Chinese people this sounds equivalent to “I can speak English and London English”. However, we can still contrast “Mandarin” and “Cantonese”, or “Standard Chinese” and “Cantonese”, because these terms refer to different varieties of the Chinese language.

But what is Mandarin Chinese? Is there any difference between Mandarin and Putonghua?
Mandarin is a subgroup of the Chinese language that is widely spoken in Northern and South-western China; in Chinese, we call it Guanhua (官话), which means “the (Chinese) language spoken by officials”. Varieties of Mandarin do not have a unified pronunciation, but usually native speakers of different varieties of Mandarin can roughly understand each other.

The spoken form of contemporary standard Chinese is Putonghua, whose phonological system is based on Northern Mandarin, and, more specifically, on the varieties spoken in and around Beijing. A simple way to describe the relationship between Mandarin and Putonghua is that Putonghua is a member of the Mandarin group of languages, while Mandarin is a member of the group of Chinese languages. Nowadays, Putonghua is the most representative form of the Chinese language, and when we talk about “learning to speak Chinese”, we always refer to Putonghua.
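For readers who like to see nesting spelled out, the relationships described above can be sketched as a tiny tree. This is only an illustration under my own simplifying assumptions: the `PARENT` table and the `contrastable` rule (two labels can be contrasted when neither contains the other) are hypothetical devices for this post, not a standard classification tool, and real classifications are far messier.

```python
# Each variety points to the group that contains it.
# The groupings follow the seven subgroups listed earlier in this post.
PARENT = {
    "Putonghua": "Mandarin",
    "Mandarin": "Chinese",
    "Wu": "Chinese",
    "Cantonese": "Chinese",  # Yue
    "Min": "Chinese",
    "Gan": "Chinese",
    "Xiang": "Chinese",
    "Hakka": "Chinese",      # Kejia
}

def ancestors(label):
    """Return every group that contains `label`, from nearest to furthest."""
    chain = []
    while label in PARENT:
        label = PARENT[label]
        chain.append(label)
    return chain

def contrastable(a, b):
    """Simplified rule: two labels contrast only if neither contains the other."""
    return a != b and a not in ancestors(b) and b not in ancestors(a)

print(ancestors("Putonghua"))                  # ['Mandarin', 'Chinese']
print(contrastable("Chinese", "Cantonese"))    # False
print(contrastable("Mandarin", "Cantonese"))   # True
print(contrastable("Putonghua", "Cantonese"))  # True
```

On this toy rule, “I can speak Chinese and Cantonese” fails for the same reason as “I can speak English and London English”: one label already includes the other.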

This was only a sample of the questions that I have been asked to answer over the years, being both a linguistics student and Chinese. I could go on about the languages in China for hours, but I’m afraid I should stop here due to space and time limitations. If you are interested in learning more about the development and categorisation of varieties of the Chinese language, I sincerely recommend Jerry Norman’s Chinese – it is a wonderful introduction to this ancient and beautiful language which will be interesting even for speakers of ‘Chinese languages’ themselves.


Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig (eds.). (2014). Ethnologue: Languages of the World, Seventeenth edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com.

Norman, J. (1988). Chinese. Cambridge: Cambridge University Press.

The Law of the People’s Republic of China on the Standard Spoken and Written Chinese Language. 2000. The People’s Republic of China. 

You, R. (2000). Chinese Dialectology. Shanghai: Shanghai Education Publishing.


Reconstruction in relative clauses

Me again, with more stuff about relative clauses! In my defence, I have been working on reconstruction in relative clauses quite a bit recently, so this represents one way of desaturating my brain. That is not to imply that it is a tedious topic – far from it. Reconstruction effects in relative clauses give us a fascinating clue about how these constructions are built and how our interpretive faculties ‘read’ such structures. I have tried to avoid technicalities and jargon as much as possible, and to keep this blog entry a reasonable length whilst also getting to the core of some very deep questions in current syntactic theory. So, let’s get started.

We’ll start by considering the following data (if two elements have the same subscript, it means that the two elements refer to the same individual; if the subscripts are different, the elements refer to different individuals. The * means that the sentence is ungrammatical).

(1)        a.         Sam_x likes the picture of himself_x.

           b.         *Sam_x likes the picture of him_x.

           c.         Sam_x thinks that Rosie likes the picture of him_x.

In (1a), himself must refer to Sam. In (1b), him must not refer to Sam but must refer to some other singular male individual (some speakers find (1b) acceptable (Reinhart & Reuland 1993), but I and most other people I have asked do not). (1c) is ambiguous: him can either refer to Sam (as shown by the subscripts) or to some other singular male individual. The pattern in (1) is traditionally captured by the Binding Conditions (Conditions A and B to be more precise) (Chomsky, 1981). The Binding Conditions are quite technical so I won’t go into them here. What is important is the pattern in (1).

What happens if we relativise picture of X, i.e. modify picture of X with a relative clause?

(2)        a.         The picture of himself_x that Sam_x likes is quite flattering.

           b.         ?/*The picture of him_x that Sam_x likes is quite flattering.

           c.         The picture of him_x that Sam_x thinks that Rosie likes is quite flattering.

As we can see, the pattern in (2) is exactly the same as in (1). This suggests that we are interpreting the head of the relative clause, i.e. picture of himself, in the object position of like, since then (2) can be interpreted in the same way as (1). This in turn suggests that the head of the relative clause originated inside the relative clause and was moved to the position in which it is pronounced. However, when it comes to interpreting (rather than pronouncing) the structure, we ‘reconstruct’ the movement and interpret the head of the relative clause in its original position (see Bianchi, 1999; Kayne, 1994; Schachter, 1973; Vergnaud, 1974). For example, (2a) is interpreted as (3), where the second copy of picture of himself, the one directly after likes, is the one being interpreted. Note that this second copy is not pronounced.

(3)        The picture of himself_x that Sam_x likes (the) picture of himself_x is quite flattering.

The the of the interpreted copy is in brackets because technically the determiner the does not reconstruct with the head of the relative clause picture of himself (see Bianchi, 2000; Cinque, 2013; Kayne, 1994; and Williamson, 1987, on the so-called indefiniteness effect on the copy internal to the relative clause). Reconstruction thus captures the similarities between (1) and (2) in a straightforward way.

In (2), the head of the relative clause served as the subject of the main clause. What happens when it serves as the direct object of the main clause?

(4)        a.         *Mrs. Cotton_y hates the picture of himself_x that Sam_x likes.

           b.         ?/*Mrs. Cotton_y hates the picture of him_x that Sam_x likes.

           c.         Mrs. Cotton_y hates the picture of him_x that Sam_x thinks that Rosie likes.

If the head of the relative is picture of him, the pattern is the same as in (1) and (2), which suggests that reconstruction has taken place. However, (4a) is ungrammatical for all the speakers that I have asked (this result is of great significance given what is usually said in the literature). This result is unexpected, especially if reconstruction is available in (4b) and (4c). If reconstruction were available, picture of himself should be able to reconstruct to the direct object position of likes inside the relative clause, where it could co-refer with Sam, just like in (3). However, the only interpretation available in (4a) is the ungrammatical one where himself is trying to co-refer with Mrs. Cotton, suggesting that reconstruction is impossible.

The difference between (4a) and (2a) lies in whether there is an element in the main clause that himself could get its reference from. In (2a), there is no such element, so picture of himself is forced to reconstruct so that himself gets a reference. In (4a), there is an element, albeit an unsuitable one. This suggests that the Binding Condition which allows himself to get its reference from another element applies blindly/automatically: himself gets bound to Mrs. Cotton automatically, which prevents reconstruction occurring. Later on, when it is time to interpret the binding relation, we discover that we were wrong to have bound himself to Mrs. Cotton, but by this time it is too late to perform reconstruction. This suggests that interpretation of syntactic structure only happens after all syntactic operations have finished. If it didn’t, we might expect that we could repair the mistake in (4a) by reconstruction. However, this is not what we find.

The same effect is also found in other constructions. Based on Browning (1987: 162-165), Brody (1995: 92) shows that (5) is acceptable, suggesting that picture of himself has reconstructed to the direct object position of buy (the example is slightly adapted).

(5)        This picture of himself_x is easy to make John_x buy.

However, reconstruction is blocked if there is a potential element that himself could get its reference from, even if it turns out later to be unsuitable (Brody, 1995: 92).

(6)        *Mary_y expected those pictures of himself_x to be easy to make John_x buy.

We have only scratched the surface of reconstruction in relative clauses here (there are more reconstruction effects and more subtleties that I have been working on but which would take too long to lay out here). What we have concluded is that reconstruction is generally available in relative clauses (at least in English). This tells us that relative clauses are constructed with a copy of the head of the relative clause inside the relative clause itself. The problem is how to choose which copies to interpret. It seems that there are structural conditions which force certain copies to be interpreted, i.e. the choice is not completely free. Explaining what these conditions are can thus provide a fascinating clue about how the human mind works (and how it doesn’t).

If you’re keen to find out more, Sportiche (2006) gives a good overview of reconstruction effects and Fox (2000) develops a nice account of how interpretation interacts with syntactic structure.


Bianchi, V. (1999). Consequences of Antisymmetry: Headed Relative Clauses. Berlin/New York: Mouton de Gruyter.

Bianchi, V. (2000). The raising analysis of relative clauses: a reply to Borsley. Linguistic Inquiry, 31(1), 123–140.

Brody, M. (1995). Lexico-Logical Form: A Radically Minimalist Theory. Cambridge, MA: MIT Press.

Browning, M. (1987). Null Operator Constructions. PhD dissertation, MIT.

Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.

Cinque, G. (2013). Typological Studies: Word Order and Relative Clauses. New York/London: Routledge.

Fox, D. (2000). Economy and Semantic Interpretation. Cambridge, MA: MIT Press.

Kayne, R. S. (1994). The Antisymmetry of Syntax. Cambridge, MA: MIT Press.

Schachter, P. (1973). Focus and relativization. Language, 49(1), 19–46.

Sportiche, D. (2006). Reconstruction, Binding, and Scope. In M. Everaert & H. van Riemsdijk (Eds.), The Blackwell Companion to Syntax. Volume IV (pp. 35–93). Oxford: Blackwell.

Vergnaud, J.-R. (1974). French relative clauses. Doctoral dissertation, MIT.

Williamson, J. S. (1987). An Indefiniteness Restriction for Relative Clauses in Lakhota. In E. J. Reuland & A. G. B. ter Meulen (Eds.), The Representation of (In)definiteness (pp. 168–190). Cambridge, MA: MIT Press.


Guiding problems and problematic guides: English language usage

Last week I dropped into an exciting event happening here in Cambridge, the ‘English Usage (Guides) Symposium’, organised by some folk over in Holland who are boldly Bridging the Unbridgeable. I must admit it had nothing to do with my PhD. However, being a linguist, and therefore avowed descriptivist, but also a copyeditor’s daughter with the bad habit of tutting at every cheeky hyphen with aspirations of being an en-rule, I couldn’t resist.

They had brought together authors, linguists, linguists-cum-authors, usage guide writers, usage guide revisers, journalists, a syntactician, and even Grammar Girl herself for two days of exchange and moderately warm debate.

The most interesting questions that were floated throughout the symposium were those getting behind the issues. Why are there usage problems? Where do they come from? And, what do ‘we’ do about them?

But, first off, what are these usage ‘problems’ that shelves of usage guides have been written to sort out? Any feature of a language – construction, word, phrase – which is thought by some speakers to not adhere to convention, or, worse, to be downright incorrect, ‘not the proper way of saying it’. There are split opinions over split infinitives. People are, like, unsure over the use of ‘like’ as a discourse marker. Dangling prepositions are something which people get het up about. Between you and I (or should it be me?), ‘literally’ as an intensifier is literally making steam come out of some folk’s ears. And so on. You get the idea. If you want to find some more examples, try Fowler’s Modern English Usage, Sir Ernest Gowers’ Plain Words, or perhaps the letters page of your chosen newspaper.

Where do these usage problems come from? One thing common to most comments about some problematic feature is the perception that it is a new(fangled) development. But this is almost always not the case. As David Crystal pointed out in his delightful talk on metalinguistic mentions in Punch magazine 1841–1901, the first mention of the dreaded split infinitive was in 1898 – and this is perhaps surprisingly late. Quotative ‘like’ (‘and she was like, “what’s bugging you?”’) is thought to have spread from California into British English in the 1980s¹.

What such comments do rightly hit upon, though, is that usage problems often arise with language change, perhaps as new words and grammatical constructions arise in the spoken language, but different, older conventions are adhered to in written communication. Together with this goes sociolinguistic variation – the coexistence of forms with similar meanings or functions in different sociolects, which leads to competition between them and some sort of value judgement. And, as was pointed out in the symposium by Pam Peters and Geoffrey Pullum, usage guides themselves, or rather their authors, sometimes practically invent usage problems because of very personal opinion or the sheer need to comment on linguistic features and create an elitist ‘eloquent English’². So there’s a self-fulfilling aspect to this too.

But once a linguistic feature becomes a ‘usage problem’, why does it gain a market share in the linguistic interest not just of ‘pedants’, but of the public in general? Why is it that everyone, it seems, if perhaps not entirely sure about when to use ‘who’ and when ‘whom’, has some sense that there is something they should know about this and expresses attitudes about different usages (and the users)? That was what one of the ‘Bridging the Unbridgeable’ researchers, Viktorija Kostadinova, was asking. Two main views emerged during the symposium. For some, like Grammar Girl Mignon Fogarty, what was clear was that speakers like to have quick, simple answers, black-and-white, right-and-wrong, for functional reassurance (e.g. ‘if I say that in a job interview, will I be disadvantaged?’). I suspect there may be some appeal of feeling superior as well (inwardly tutting at those who apparently don’t know better). For Geoffrey Pullum, on the other hand, our obsession with correct usage comes instead from a grammatical masochism – we want to punish ourselves by finding out all the rules of our language, and how we’re doing things wrong. Why, he asked, are we happy to consult usage authorities older than our grandparents when we wouldn’t dream of consulting a medical or physics textbook from the 1920s? It must be a pleasure in our mistakes. I’m not so sure about that. One idea that did strike me though, from Robin Straaijer, was the observation that, as we spend a lot of time investing in learning our (written) language – 13 school years here – we are then loath to discover that that’s not how you say it any more.

So what do ‘we’ do about them? By ‘we’ here, I mean mostly linguists, who are often called upon, whether by friends or a newspaper journalist, to comment on controversial linguistic features. Three options were presented last week. Firstly, as the Bridging the Unbridgeable team are setting out to do, we can descriptively investigate the sociolinguistics of this phenomenon – what are users’ attitudes to which usage problems? Secondly, we can help nudge usage guides from the Arts (personal opinions about best usage) to the Social Sciences (based on actual examples of usage) by providing historical linguistic information about the emergence, or decline, of conventions (put forward by Pam Peters). Thirdly, we could try to save speakers from their apparent ‘grammatical masochism’ by pointing to linguistic features for which, even in a ‘standard’ variety, there appears to be no clear majority view on convention. Whichever route is taken, it seems that dialogue in this area is important as it is through usage guides and usage problems that many people begin to be curious about language.

1 Macaulay, R. (2001) You’re like ‘why not?’ The quotative expressions of Glasgow adolescents. Journal of Sociolinguistics 5/1, 3–21
2 Apparently Wilson Follett in his Modern American Usage introduced the problems of using ‘hopefully’ in “I’ll hopefully make it tomorrow” and the possessive antecedent to the pronoun in “Fowler’s usage guide is widely consulted. He certainly has influenced written English style in the twentieth century”.

Conspiring indexicals

A while ago, I found myself on a plane sitting close to a distinguished and well-dressed man in his fifties. He bought an Italian newspaper and showed me the picture on the front page of the newly formed Monti cabinet, following Berlusconi’s resignation in 2011. “Isn’t it curious,” he asked me in a sarcastic tone, “that they picked exactly this day to form the new cabinet?” As I clearly did not get his allusion, he continued: “And isn’t it curious that they nominated exactly thirteen new ministers?” For some reason, my puzzled face aroused his talkativeness. How could I not know how to decipher the messages they want to send through masonic numerology? For the remaining three hours of our flight, the man decided to charitably remedy my evident ignorance by talking uninterruptedly about any imaginable sort of conspiracy theory: “They want to leave us in a state of ignorance to control us better”, he said. “They are mostly aliens called reptilians and they occupy the positions of power all around the world”. “They know how to cure cancer but they wouldn’t let us know”. “They can control the weather through the chemtrails produced by airplanes”. “They make babies have vaccinations because they know they cause autism”. I was extremely fascinated, I must confess. Not only by the inexhaustible imagination of this guy and of his sources, but also – from a linguist’s perspective – by the very words he was using to make his points.

Conspiracy theories have always been extremely popular[1], and they are especially widespread in the internet era. Scientific explanations and rigorous fact checking can be boring, time-consuming and ultimately disappointing and/or unexciting. If you surf the sea of conspiracy websites and forums on the internet, you will easily notice that most conspiracist discourse follows schematic stylistic strategies and is characterized by a similar sensationalist and allusive rhetoric. But one thing that fascinated me about the talk of my bizarre travel companion, and that I later noticed in most conspiracy texts on the internet, is the widespread “empty” usage of the third-person plural pronoun “they”. Popular blog names include “Truth they are hiding” and “Stuff they don’t want you to know”. In the 1997 movie “Conspiracy Theory”, the conspiracy-theory-obsessed character Jerry Fletcher (Mel Gibson) complains: “I’m only paranoid because they want me dead.”

If you are, predictably, now wondering who they are, I suggest that you look at this instructive flowchart. Here, I will only be interested in the word “they” itself. Semantically, one can distinguish three uses of third-person pronouns in English. In example (1), “they” acts as a variable bound by the quantifier “few linguists”. In (2), “they” is anaphoric to “John and Mary” in the preceding sentence. Finally, in (3), “they” is said to have a deictic or indexical use, i.e. it directly picks out from the extra-linguistic context the group of people that the speaker makes salient:

  1. Few linguists believe that they will become rich.
  2. John and Mary are two linguists. They are homeless.
  3. (pointing to a group of indigent people) They are my linguist friends!

The interesting use of “they” in the conspiracy talk is of this third, indexical sort, i.e. cases in which there is no linguistic antecedent fixing the referent of the pronoun.

Indexicals are the paradigm of context-sensitive expressions: my utterance of the sentence “I am Italian” is different from your utterance of the very same sentence. In a sense, the two utterances share the same meaning, but they obviously have different contents and, probably, different truth-values. This straightforward intuition is at the core of David Kaplan’s theory of indexicals (Kaplan 1989), the mainstream theory of indexicals in semantics and philosophy of language. Incidentally, indexicals are also a major battlefield in semantics and pragmatics, with theorists questioning the very existence of a natural semantic class of indexical expressions or arguing about exactly which expressions fall into this category. The more conservative faction (e.g. Cappelen and Lepore 2005) strives to show that we should restrict ourselves to Kaplan’s very limited list of indexicals (including what he called “pure indexicals”, like “I”, and “true demonstratives”, like “that”). Others, more liberal, are very willing to expand this list to the point of encompassing context-sensitive predicates like “red” (Rothschild and Segal 2009) or vague words like “heap” (Soames 2002). Still others think that indexicals are a bit like the stars: there are many more than we can see with the naked eye! According to these authors, a myriad of hidden indexical-like variables are attached to most words and are ultimately responsible for all effects of extra-linguistic context on semantic content (Stanley 2000 and Stanley and Szabó 2000).

Kaplan distinguishes between the character, the stable linguistic meaning of words and more complex constructions, and the content they express at each context. The character of an indexical is a rule that guides the search for the relevant referent(s) in the context. Kaplan formulates this rule-based analysis in functional terms: thus, the character of “I” is a function which, at any context, picks out something like “the speaker” or “the agent” of that context and returns the relevant person as its content[2]. Not all such rules are so straightforward, though. For example, the character of “here” is said to be a function that picks out the location of the context, but the extent of such a location can be extremely indeterminate (in different contexts, “I was born here” can mean that I was born at this exact spot/in this country/in this world). Nor are all indexicals so strictly constrained by linguistic conventions: perhaps “I” always picks out the speaker of the context without requiring other information[3]. By contrast, the linguistic meaning of a pronoun like “she” (in the indexical use) only specifies a few grammatical features (gender, number and animacy), leaving the identification of the relevant referent to the speaker’s intentions (i.e. to whomever the speaker has in mind with her use of the pronoun).
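For readers who like to see functions written down, the character/content distinction can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the `Context` class, its field names and the three `character_of_*` functions are hypothetical names of my own, not Kaplan’s formalism, and content is flattened to a plain referent (ignoring the complication mentioned in footnote 2).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Context:
    """A toy context of utterance (all field names are illustrative assumptions)."""
    speaker: str
    location: str
    # Whom the speaker has in mind when using a pronoun, if anyone.
    intended_referent: Optional[str] = None


# A character maps a context to a content (simplified here to a referent).
Character = Callable[[Context], Optional[str]]


def character_of_i(ctx: Context) -> Optional[str]:
    # "I" is fully settled by convention: it returns the speaker of the context.
    return ctx.speaker


def character_of_here(ctx: Context) -> Optional[str]:
    # "here" returns the location of the context; how wide that location is
    # (this spot, this country, ...) is left open in this sketch.
    return ctx.location


def character_of_she(ctx: Context) -> Optional[str]:
    # Indexical "she" leaves the referent to the speaker's intention;
    # with no intention in the context, there is nothing to return.
    return ctx.intended_referent


ctx_a = Context(speaker="Anna", location="Rome", intended_referent="Maria")
ctx_b = Context(speaker="Ben", location="Leeds")

print(character_of_i(ctx_a))    # Anna
print(character_of_i(ctx_b))    # Ben
print(character_of_she(ctx_b))  # None
```

The same function (the character of “I”) yields different contents at the two contexts, which is the sense in which my “I am Italian” and yours share a meaning but not a content. The sketch also makes the contrast with “she” visible: its result depends entirely on a speaker intention that the context may or may not supply.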

Our original “they” is even more unconstrained: it does determine the number of the referent but leaves other features completely unspecified and dependent upon the speaker’s referential intention. Be that as it may, the standard picture has it that indexicals are tools of direct reference: intentions and pragmatic reasoning may be more or less crucial for fixing the referent of an indexical, but the content of an indexical is nonetheless the object to which it refers in the context. What about our conspiracist “they”, then? How should we reconcile the directly referential nature of indexicals with the referential indeterminacy and, possibly, emptiness of the conspiracy “they”? Well, I can think of three possible solutions. First, we could deny that the use of the pronoun here is really indexical: perhaps the seemingly indexical “they” is here actually a disguised (and very general!) description, as when Obama utters (4):

  4. The Founders invested me with sole responsibility for appointing Supreme Court justices. (Nunberg 1993)

Here the indexical “me” is used descriptively to mean “The president of the United States” rather than the individual Barack Obama.

Secondly, we may say that the conspiracy theorists’ speech is completely meaningless, in the sense that a sentence of the form “They don’t want us to know such and such” does not express anything at all, or at most expresses just a vague content, because the use of the indexical is accompanied by only an imprecise referential intention (Nunberg 1993).

Or perhaps the conspiracist “they” is truly indexical and does actually refer… to the indexicals themselves! After all, semanticists know that indexicals are such ugly beasts to cope with: nobody would be surprised to find that they are nothing but conspiring indexicals.


[1] This enduring success of conspiracy theories in European history is fascinatingly described and exploited in two novels by Umberto Eco, Foucault’s Pendulum (1988) and The Prague Cemetery (2010).

[2] Actually, Kaplan’s content is itself a function (from a circumstance of evaluation to an extension). I will gloss over this point (and over many others!) for the sake of simplicity.

[3] But see Predelli 2005 for apparent counterexamples to this generalisation.


  • Cappelen, Herman & Lepore, Ernest (2005). Insensitive Semantics: A Defense of Semantic Minimalism and Speech Act Pluralism. Oxford: Blackwell.
  • Kaplan, David (1989). “Demonstratives”. In Almog, J., Perry, J. and Wettstein, H. (eds.) (1989). Themes from Kaplan. Oxford: Oxford University Press, pp. 481–563.
  • Nunberg, Geoffrey (1993). “Indexicality and deixis”. Linguistics and Philosophy 16 (1):1–43.
  • Predelli, Stefano (2005). Contexts. Meaning, Truth, and the Use of Language. Oxford: Oxford University Press.
  • Rothschild, Daniel & Segal, Gabriel (2009). “Indexical Predicates”. Mind and Language 24 (4):467–93.
  • Soames, Scott (2002). “Precis of Understanding Truth and replies”. Philosophy and Phenomenological Research 65 (1):429–452.
  • Stanley, Jason (2000). “Context and logical form”. Linguistics and Philosophy 23 (4):391–434.
  • Stanley, Jason & Szabó, Zoltán G. (2000). “On Quantifier Domain Restriction”. Mind and Language 15 (2&3):219–61.