Five languages from Spain you never knew existed


Spain, known as the land of sol, siestas and sangría, is less well known for the diversity of its linguistic heritage. Though most people could probably identify Basque and Catalan as languages spoken in Spain, the Iberian Peninsula is home to a number of minority languages and dialects you’ve probably never heard of.

1. Galician
Given that it has 2.4 million native speakers and is Spain’s third most spoken language, it’s surprising how many people have not heard of Galician. That may be because Galicia’s most famous residents – including current Spanish PM Mariano Rajoy, the 20th century writer Ramón del Valle-Inclán, and Spain’s late dictator Francisco Franco – are known for being Spanish, not Galician, speakers. Galician and its neighbour across the border, Portuguese, were originally one and the same language, Galician-Portuguese, a highly prestigious medieval language famous for its lyric poetry. Although Galician then became a low prestige language for many centuries, today it is co-official with Spanish in Galicia and has its own publicly funded television channel.

2. Aragonese
Aragon, the land of Henry VIII’s first wife and the 18th century painter Francisco de Goya, is also home to the luenga aragonesa, or Aragonese language, which descends from the now extinct medieval language Navarro-Aragonese from North-East Spain. Aragonese has a core of native speakers in Aragon’s remote Pyrenean villages, but is understood by many more people in the surrounding areas, and is mutually intelligible with neighbouring languages such as Castilian Spanish and Astur-Leonese. Aragonese is protected by local laws, and has its own language academy, but, like many of Spain’s minority languages, is still considered endangered by UNESCO.

3. Judeo-Spanish
Judeo-Spanish, also known as Ladino, is a language from Spain that hasn’t been spoken in Spain since 1492, when the Jewish population was expelled from the country by the Spanish monarchs. Though since 2015 their descendants have been able to apply for dual Spanish citizenship, Judeo-Spanish is now mostly spoken in Israel, Turkey and Greece. Because the last time it was used in Spain was over 500 years ago, Judeo-Spanish is a linguistic time capsule, and sounds more similar to Medieval Spanish than modern Spanish. It’s also the only Spanish language to be written in Hebrew script.

Bilingual Spanish-Leonese roadsign (Photo: Iván Martínez Lobo)

4. Leonese
One half of the Astur-Leonese language branch, Leonese descends from the everyday Latin spoken in the geographic area that would become the medieval Kingdom of León. The kingdom’s capital, also called León, or Llión in Leonese, was founded as a military camp and settled by the Roman Seventh Legion. León/Llión, which means ‘lion’ in today’s language, actually comes from the Latin name of the capital’s founders, legio septima gemina, meaning ‘the twin seventh legion’. Despite a flourishing medieval literature, history has not been kind to the llingua llionesa, which is now a UNESCO endangered language that, unlike its other half Asturian, has no official status in Spain.

5. Aranese
Less well known than its sister dialects of Gascon and Occitan in France, Aranese is spoken in the Valley of Aran (Val d’Aran), one of only two areas of Spain on the Northern side of the Pyrenees. Even though it only has around 3,000 native speakers and is used in a small geographical area, Aranese has co-official status in its home region, Catalonia, and is taught as a compulsory subject in schools in the Val d’Aran. You can even pick up some Aranese yourself thanks to the University of Barcelona’s multilingual conversation guide.

This blog post was previously published at the Huffington Post.

Fiction favourites for die-hard linguists

With summer supposedly at its height (calling all semanticists: is the fact that it is July enough to meet the definition of ‘summer’, even if rain, hail, wind, and multiple layers of clothing are featured?) it is time to put down that book on allophony in Gujarati, the processing of possessives in Chhattisgarhi, or whatever linguistic page-turner you’re dipping into. Quite often though, a dedicated linguist cannot stop making language-related observations even when switching off from full-blown research-mode. As a consequence, linguistics literature is generously sprinkled with references to novels, the authors of which were at the time of writing gloriously unaware of their work turning into scientific data. I bring to you the top three works of fiction for linguists.

First up we have Lewis Carroll’s classic Alice’s Adventures in Wonderland – an inspiration for film makers, fantasy lovers, and keen readers of all ages, but also a pragmatist’s wet dream. As Alice plunges down the rabbit hole, she ends up in a world defying the rules of physics and also in something of a nonsensical linguistic wonderland. The Mad Hatter’s tea party is a celebration of not only unbirthdays but also of linguistic rule bending and pragmatic acrobatics. Take the exchange between the March Hare and Alice:

‘Take some more tea,’ the March Hare said to Alice, very earnestly.
‘I’ve had nothing yet,’ Alice replied in an offended tone, ‘so I can’t take more.’
‘You mean you can’t take less,’ said the Hatter: ‘it’s very easy to take more than nothing.’

The reason for Alice’s confusion is that we tend to take ‘more’ to imply that something has already happened; however, as the Hatter points out, technically more is more even if it’s just more than nothing. Pragmatics brain pain, anyone?

Having a very non-pragmatic tea party


If you aren’t feeling the strain of mad tea party communication, perhaps you would enjoy some even more strenuous intellectual effort in the form of James Joyce’s Finnegans Wake. The epic stream of consciousness is written in a mixture of real English words, neologisms, and portmanteau, or blended, words. As such, it’s one for morphologists and phonologists: by putting bits and pieces of the right sound combinations together and throwing in some actual endings, you can get something apparently nonsensical, yet very much English-like.

“Loud, heap miseries upon us yet entwine our arts with laughters low!”

Joyce’s tour de force really is a literary gem, but somewhat on the challenging side of summer reads. As my friend Wikipedia kindly puts it, “Finnegans Wake remains largely unread by the general public.”

Very much the opposite is the case with – *drumroll* *gasp* – The Lord of the Rings. Okay, I know many of us got into the trilogy by staring into the dreamy film-version eyes of Legolas, drooling at the hunky figure of Boromir, or (despite female characters being few and far between, my inner feminist would like to point out) admiring the airy wardrobe of the mysterious Galadriel. But as die-hard fans will know (and they will probably roll their eyes at me pointing out the obvious), Tolkien was something of a language geek, and this shows throughout the linguistic landscape of Middle-earth.

“Really enjoying the Sean Bea… sorry, linguistic aspects.” Credit: Jason Parrish.

One of the author’s passions was Finnish, as he wanted to read the Finnish national epic Kalevala in the original. I don’t know if Tolkien ever managed to conquer Kalevala without the aid of translations and dictionaries, but his Elvish language of Quenya was certainly inspired by the sounds of Finnish: Mindon Eldalieva (‘Lofty Tower of Elvish-people’) and Oron Oiolosse (‘Ever Snow-white Peak’) resemble Finnish just as Finnegans Wake resembles English. There’s also a healthy dose of Old English squeezed into proper names: Saruman derives from the root searu- (‘treachery’ or ‘cunning’), while Mordor is rather morbidly based on morthor (‘murder’).

And on that cheerful note, I wish you happy linguisticky reading!

In a manner of speaking

Way way back many blog posts ago, I wondered why some pragmaticians have been so obsessed with eating cookies. Well, not exactly, but they have spent a lot of time investigating utterances like:

Ben ate some of the cookies.

Some pupils failed the exam.


On a standard view, these utterances literally mean something like “Ben ate some and possibly all of the cookies” and “Some and possibly all pupils failed the exam”, but in the right context, the hearer infers the speaker’s intended meaning that “Ben ate some but not all of the cookies” and “Some but not all pupils failed the exam”. These implicated meanings are known as scalar implicatures (‘implicature’ being a technical term coined by Paul Grice for non-deductive implications beyond the literal meaning of what a speaker says, based on assumed principles of co-operative conversation). That’s because the key word in the utterance, here ‘some’, belongs on a scale with some alternative word that the speaker could have said but didn’t (like ‘all’).

And we can think of other examples like:

The coffee is warm
+> but not hot.

The concert was good
+> but not excellent.
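
For the programmers among you, the reasoning behind these examples can be sketched in a few lines of Python. This is only a toy illustration, not a model anyone has proposed: the `SCALES` list and the `scalar_implicatures` function are names I have made up, the scales contain only the word pairs from the examples above, and the matching is on whole words with no attention to context (which, as we know, matters a great deal in real conversation).

```python
# Hypothetical mini-lexicon of scales, each ordered from weaker to stronger.
# Only the pairs discussed in the text are included.
SCALES = [
    ("some", "all"),
    ("warm", "hot"),
    ("good", "excellent"),
]


def scalar_implicatures(utterance):
    """Return the 'but not <stronger term>' inferences an utterance might invite."""
    # Crude tokenisation: split on spaces, drop punctuation, ignore case.
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    inferences = []
    for scale in SCALES:
        for position, term in enumerate(scale):
            if term in words:
                # Every term higher on the scale is an alternative the
                # speaker could have used but did not.
                for stronger in scale[position + 1:]:
                    inferences.append(f"but not {stronger}")
    return inferences


for sentence in [
    "Ben ate some of the cookies.",
    "The coffee is warm.",
    "The concert was excellent.",
]:
    print(sentence, "+>", scalar_implicatures(sentence))
```

The last sentence yields no inference at all, because ‘excellent’ already sits at the top of its scale: there is no stronger alternative that the speaker passed over.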

There are loads of reasons why pragmaticians (and especially the experimental sort) have concentrated on the ‘some but not all’ implicature: it’s easy to depict visually; it’s pretty consistent across contexts (or without much context – good for controlled but not very natural experiments); it’s easy to make nice balanced stimuli by just changing one word; and so on. However, what we’re now learning is that ‘some’ is perhaps not so representative of scalar implicatures after all.¹ And if we can’t even generalise from ‘some’ to scalar implicatures, what about quantity implicatures (of which scalars are a subtype) or other kinds of implicature, such as manner and relevance?

Given the apparent dearth of research on manner implicatures, I decided to do some investigating myself. Now, manner implicatures arise when speakers use some marked form to convey a marked meaning: an unconventional phrase to express that what they’re describing is not a stereotypical instance. Grice’s own example was:

‘Miss Singer produced a series of sounds corresponding closely to the score of an aria from Rigoletto.’

The idea is: why did the speaker go to such lengths, when they could have just said ‘sang’? It’s because the singing was in some way not stereotypical – probably downright awful!²

Here are some other potential examples:

Ben constructed a pile of bricks and mortar.
+> Ben built a wall, if you can call it a wall.
(Otherwise the speaker would have said “built a wall”)

Mary caused the car to stop.
+> Mary stopped the car in some unusual way (e.g., pulling the handbrake, driving into a tree…)
(Contrast with “stopped the car”)

Terry put the duvet and pillows on top of the bed.
+> Terry made the bed, but messily.
(With the alternative “made the bed”)


Now, why have these inferences received so little attention? One possibility is that they’re not really a definable category on their own, but rather a motley bunch of quantity implicatures, conventions and other stuff (as, for example, Horn would have it³). However, the fact that what is important here is the form of the utterance rather than the content (the lexicon and syntax, not the semantics) means that they are in some ways distinct, and at least in principle worthy of research in their own right (as Levinson, 2000, thinks) – even if, in the end, we find out they’re not so interesting after all.

Another possibility is that they’re hard to investigate. This has certainly been my own experience. It’s hard to think up examples, and it’s almost impossible to search corpora for them, except for the most conventionalised of cases. They seem to be rare and somewhat precarious in real life conversation, and when you do try to test them out, people seem to have very varying degrees of sensitivity to them.

This could give us pause for thought and suggest that maybe they are not a distinct pragmatic phenomenon after all. However, perhaps it’s not surprising that they don’t lend themselves to the normal tools of experimental pragmatics, like acceptability judgement tasks and picture matching tasks, which tend to rely on participants’ intuitions about isolated utterances. Depending, as they do, on the speaker’s choice of words and grammar – on how she communicates her meaning, not just what she communicates – they may rely on a greater degree of knowledge of language and its conventional use, or at least on a greater degree of confidence in that knowledge. They are likely to be extremely variable depending on the linguistic context: is it formal? is there jargon? is the speaker a native speaker or a second language learner? does the speaker have their own unusual style? They are likely to be cued with intonation or hedges or discourse markers (“Well, he constructed a pile of bricks and mortar”). This means that in an unnatural experimental context (like choosing a picture that matches an utterance), participants may not be confident enough about any inference they do make, or may not make any manner inference at all without those extra cues and information about the speaker.

I’ve found some evidence that some adults are sensitive to some manner implicatures, but I’ve no show-stopping conclusions yet. So if you think you’ve made a manner inference recently, then do give me a shout!

1 Van Tiel, Bob, Emiel Van Miltenburg, Natalia Zevakhina & Bart Geurts. 2014. Scalar diversity. Journal of Semantics.

2 As my supervisor pointed out, the example actually works better as a case of manner without ‘closely’, otherwise you could just get a scalar implicature of ‘not exactly’.

3 Horn, Laurence. 2008. Implicature. In Gregory Ward & Laurence Horn (eds.), The Handbook of Pragmatics, vol. 26. John Wiley & Sons.

“One please”, and language attrition

My boyfriend, a huge fan of Japanese anime, once recommended an old anime series to me called Strawberry Marshmallow (ichigo mashimaro, 苺ましまろ), which is about the life of a group of Japanese schoolgirls. Although I watch anime now and then, its style is not my type, so I thought about giving up after finishing the first episode. But later, when I had reluctantly proceeded to the second episode, I suddenly decided to continue – not because the story was good, but because of one notable character and, well, the linguistic phenomenon behind her.

Ana Coppola, the character in question, moved to Japan with her family when she was six. Before that, she was born and raised in Cornwall, and English was the only language she was exposed to, so, by definition, she was a native speaker of British English. Once her family was living in Japan, she enrolled in a local primary school and began to take courses and speak to classmates in Japanese; her parents (who were definitely native speakers of British English by definition, but I know that the anime studio could only hire Japanese voice actors) also spoke to her in Japanese at home, maybe to encourage her to use her second language. At the start of the story, she has been living in Japan for five years, and she can speak Japanese fluently – even over-fluently, since she uses words and sentences that other girls of her age never use. At the same time, her English has become a mess: she needs to take the same English course as her classmates, and some of her friends even outperform her. For example, in the anime, when she is asked to introduce herself in English, she goes as far as to say “one please”, which is a word-for-word translation of ひとつよろしく (hitotsu yoroshiku, more or less like “it’s my pleasure to meet you this time”). It seems that, due to the years she has spent using Japanese as her dominant language at both home and school, Ana has finally lost her first language, and in the story she could no longer qualify as a ‘native speaker of English’, even if she always claims that she is from Cornwall.

Well, you say it.


Ana’s case is extreme. After all, it is just an anime show and the producers always want to pursue some dramatic effects, as we can see from the language Ana uses at home – how could a British migrant family suddenly begin to speak Japanese at home? Milder examples, however, are rather common in real life. We tend to believe that, once we acquire a language and can use it fluently, especially our mother tongue, we cannot forget it. However, if we do not use the language for a long time, we may come across certain difficulties when we pick it up again. In a word, one may feel ‘clumsy’ when using a language that one can manage but rarely uses – that is exactly the word used by Aneta Pavlenko when she first systematically investigated this phenomenon in 2003. That can happen to one’s first language if one lives in another linguistic environment for years, or, sometimes, to one’s second language if one has moved back home for years. I have received questions from people complaining about their inability to speak their first language ‘naturally’: some of them forget the intended words in their first language and are forced to switch to their second language (which is called code-switching, and I have discussed it here), while some others start to use structures that are available in the second language but not the first language. Even I myself experience these symptoms now and again. When the use of a language diminishes, the language is suppressed and gradually ‘worn out’. The phenomenon is called ‘language attrition’. Recent research in bilingualism focuses particularly on the attrition of the first language of immigrants, but there are also studies that investigate the mechanism and phenomenon of attrition of a second language in multilingual people.

Attrition can happen in different aspects of language, including vocabulary, sentence structure, and pronunciation. I have described the first two aspects in the previous paragraph: the difficulty of word selection and misuse of ‘false friends’ (words with different meanings that are pronounced similarly in two languages) are reflections of lexical attrition, and the blending of sentence structures in two languages can be seen as an instance of syntactic attrition. One possible reason for attrition, according to previous research, may be the higher cognitive load that multilingual people experience when they process language: compared to their monolingual peers, multilingual speakers manage more lexical items and more complicated structures in two or more languages, and they may need additional time and cognitive effort to select the lexical items or the syntactic elements that not only match their intentions but are also consistent with their current language. Therefore, they seem to be slowed down when processing the less used language.

While the first two aspects, vocabulary and sentence structure, are prominent in both L1 and L2 attrition, pronunciation appears less frequently in the research on L1 attrition. It seems that one can still preserve the pronunciation of one’s mother tongue even if one shows problems in sentence construction or lexical selection. The attrition of pronunciation in L2 is more interesting: the L2 learner might be able to pronounce their second language in a way that is closer to native speakers of that language when they are in the L2 environment, but they will gradually change their accent after returning to the L1 environment, developing a kind of foreign accent. I have observed this change in the Japanese singer Keito Okamoto: he studied in the UK between ages 9 and 13 and is able to speak English fluently still today, but his English accent is now a blend of British and Japanese, with the Japanese features having gradually become more obvious.

(Well, I bet his accent was definitely not like this when he was 12.)

The other day I had a discussion with a good friend at Edinburgh about the possible relation between first language attrition and second language acquisition (particularly adult L2 acquisition), since her topic is related to the former and I focus mainly on the latter. If we look at the most superficial level of the two phenomena, i.e. the performance of an L2 learner and an L1 attritor, we can see that they both receive influence from another language they already know: for the L2 learner, that is their first language, while for the L1 attritor that is their second language. Therefore, from a macro perspective, we can connect L1 attrition and L2 acquisition: both of them reflect the role of cross-linguistic influence in multilingual people’s language ability, and we can even make some predictions of L1 attrition based on the established results of the research on L2 acquisition.

However, L1 attrition and L2 acquisition are essentially different because their internal mechanisms are opposite. The process of L2 acquisition is ‘from nothing to something’, and it happens at the level of both competence and performance. Even the most advanced adult L2 learners still cannot possess the intuitions that are generally available to native speakers of that language. The process of L1 attrition, on the other hand, is ‘from everything to something’; although we can observe how attritors’ performance changes, their knowledge of the language, as well as their competence in using the language, does not change significantly. Studies have shown that L1 attritors can recover their performance if they stay in the L1 environment for a period of time, which means that they have preserved their native speaker intuitions, and what is influenced is only the performance.

One particular point about language attrition I believe is worth mentioning is that it does not receive the attention it deserves – by which I mean attention from ordinary people, since more and more applied linguists have begun to look into the phenomenon. As with code-switching, people without proper linguistic knowledge often show prejudice and bias when they hear about language attrition. They can hardly believe that one can sound ‘unnatural’ when speaking one’s first language after using one’s second language for decades, and sometimes unfriendly people may call these immigrants ‘traitors’ or ‘pretenders’, which I have occasionally observed. Knowing more about language attrition can help us not only better understand the human cognitive system and its ability to learn languages, but also reduce prejudice. If I have convinced you that ‘we can lose our language, even if it is our mother tongue’, congratulations! Now you know a bit more about how language works in our brain.


Special thanks to Wenjia Cai and Maki Kubota @ Edinburgh!

To get a professional overview of first language attrition: (Professor Schmid is one of the leading scholars in the area of first language attrition.)

For more details about L1 and L2 attrition:

De Bot, K., & Weltens, B. (1995). Foreign language attrition. Annual Review of Applied Linguistics, 15, 151-164.

Pavlenko, A. (2003) “I feel clumsy speaking Russian”; L2 influence on L1 in narratives of Russian L2 users of English. In Cook, V. (ed) Effects of the second language on the first. Clevedon, UK: Multilingual Matters, pp. 32-61.

Schmid, M. S., Köpke, B., Keijzer, M., & Weilemar, L. (eds) (2004). First Language Attrition: Interdisciplinary Perspectives on Methodological Issues. Amsterdam/Philadelphia: John Benjamins.

Breaking news: Guy who learned Japanese from girlfriend speaks like a girl

The life of a Japanese learner is not an easy one: you’re faced with not one, but three, non-Roman writing systems, an array of politeness forms, and freaky word order options. To top it all off, the language learning community swarms with warning examples of how to make a fool of yourself by not only making simple grammatical mistakes but also *Psycho tune* using the language of the wrong gender. The stories of ‘Guy who learned Japanese from girlfriend now speaks like a girl’ or ‘Girl shunned for using male language’ could make Daily Mail headlines were the publication more linguistically inclined.

... and speak accordingly! Image credit: Beth Granter.


Although reality isn’t quite as much of a minefield as a cheeky Google search for ‘why is learning Japanese so difficult?’ might suggest, gendered language is a very real phenomenon in Japanese. Gendered language is nothing grammatical in this case, and is separate from grammaticalised aspects such as gender-specific or neutral pronouns: if you use a form strongly associated with the opposite gender, your utterance won’t be deemed ungrammatical, just weird or out of place. Rather, it refers to gender roles and ideologies of what female and male speech should sound like; very broadly, female language tends to be more submissive and gentle, male language being more direct. Indicators of gender are scattered throughout the language, showing up in choices of words, interjections (things like oh, uhm), directives (i.e. commands, requests, and questions), pronunciation, and so on, but most prominently in the choice of sentence endings and pronouns referring to ‘I’ and ‘you’.

Take sentence endings first. Japanese is full of particles – a bit of a dustbin category for little word-like elements that don’t always mean much on their own – and many of these appear sentence-finally, expressing things like questioning or affirmation: think ‘this is nice, isn’t it’ type of things. One of the most clearly gendered expressions here is wa: as a sentence-final particle, it indicates the femininity of the speaker. A girl would typically say takai-wa (‘tall’), but the same utterance for a boy would be ridiculed as effeminate, the socially prescribed option being plain takai. On the opposite end of the scale is zo indicating new information and used exclusively in male speech. It is considered informal and even rude, mirroring the directness ideologically associated with male speech. A gender-neutral way of expressing a similar meaning is the particle yo.

Where things get slightly more puzzling for a Western learner is the proliferation of words referring to ‘I’ and ‘you’. Some of them relate to degrees of politeness and differences in social status, but many of them encode additional aspects of gender. Gender-neutral choices are exemplified by the Japanese class favourite watashi ‘I’. Typically feminine pronouns are again perceived as softer and gentler; these include atashi, atakushi, and uchi, while typically male pronouns feature boku and ore. As for referring to ‘you’, male forms tend to be more direct – kimi, omae, anta. Feminine counterparts encode a greater degree of politeness, so that a typical form of address comes in the form of the pronoun anata followed by the addressee’s name or title and a socially appropriate marker.

Boku? Watashi? Ore? Who am I? Image credit: myrealnameispete.


Forms differ, then, but whether there is a yo or a zo at the end of a sentence doesn’t say much in itself to a non-Japanese aficionado. Where things get interesting is when these funny little word forms are considered in the broader social context (cue gender studies students!).

Slightly archaic as it may sound with its submissive feminine and direct masculine forms, gendered language as it is conceived of today is in fact a relatively recent innovation. This also goes against the popular depiction of gendered language as an ancient tradition, a case of this-is-the-way-it-has-always-been. Although differences in male and female speech have been recorded earlier as well (and this is not surprising; even in languages like English where ‘gendered language’ is not made into a big deal for learners, speakers will think, perhaps unconsciously, of certain ways of speaking as typically feminine or masculine), gendered language proper kicked off after the start of the Meiji era (from the mid-19th century). Something of a celebrity among Japanese linguists, Orie Endo compared two literary works, Ukiyoburo from 1813 and Sanshiro from 1909, to show the timescale along which the modern gender differences emerged. In the earlier text, the differences in speech patterns reflect social status, but not gender, while in the later one gendered differences have clearly emerged.

Of course, particles, pronouns and the like don’t just turn into carriers of gendered meanings in a vacuum: as always in language change, there is a human component. At the start of the Meiji era, schoolgirls, as teenagers so often do, came under criticism from societally higher-up men for speaking ‘improperly’, in a ‘vulgar’ or ‘unpleasant’ way (déjà vu? My earlier post on be like, might, like, bear like a resemblance to this). But as sometimes happens to schoolgirls, they grow up and take on positions of role models. At the time, there was an ideal of ryoosai kenbo ‘good wife, wise mother’ hanging around that was supported by the government and featured in women’s magazines, written about by the very schoolgirls in the very language they had been criticised for. The form of language became associated with the ideal middle class and was therefore something to aspire to; and voilà, the parlance of vulgar schoolgirls had become the new vogue.

Babbling away in improper Japanese. Image credit: Danny Choo.


The establishment of the new feminine language, or onna kotoba, was further propelled by reactions to the rapid modernization and westernization processes that Japan was undergoing: the nation needed traditions to hold on to, and gendered language made Japanese conveniently unique compared to the incoming western influences.

That is not to say that after its establishment gendered language has become immune to change. Quite the contrary, recent developments see female and male speech losing their distinctness. Young women have been reported to have stopped using feminine speech in favour of more neutral or even masculine language, with teenage girls taking over traditionally male pronouns such as boku and ore. Some male forms are taking on a function of female empowerment: kimi ‘you’, usually used by men to address close female friends, is now also used by women to talk down to men. The linguistic changes can, again, be tied to cultural shifts: more women than ever before are now delaying marriage and pursuing careers, and a speech form intended to convey submissiveness does not fit well with this emancipation of sorts. Interestingly, self-defining male speakers are not taking on features of female speech, and this would seem to be so engrained into the gendered mindset that it does not happen even in soliloquy, or speaking alone. So, while women happily use masculine forms even when blabbering alone, men don’t use feminine forms in the same way. This, some would argue, reflects the greater value associated with the masculine gender image in social hierarchy.

It's (linguistic) emancipation time! Image credit: DonkeyHotey.


But with that, I’m treading into non-linguistic waters. So, students of gender studies, rejoice – if you took in anything of the above, you are sorted for research topics.

Students of Japanese, on the other hand, relax – gendered language is becoming less and less of an issue for your learning process, and your gender-mismatched speech is unlikely to make a headline.


I said this was a hot topic, and the internet in particular is full of thrilling reading. I’ve drawn inspiration, examples, and information from Tofugu, Oxford Dictionaries, The Japan Times, LinguaLift, and Japanese – a linguistic introduction by Yoko Hasegawa.


Syntactic Islands

Last week’s post on movement highlighted just how useful it can be to think of elements in a sentence being able to move to different positions.

One of the really interesting things about movement is that it seems to be unbounded. In other words, there are apparently no bounds to how far an element can move (I say seems and apparently because there is a lot of evidence to suggest that the situation is far more complex. However, I’ll ignore those details here). We can see this unboundedness in so-called wh-movement (it’s called wh-movement because the moving element undergoing this type of movement typically begins with the letters wh– in English, e.g. who, what, where etc.). In (1b), the wh-phrase what is interpreted as the direct object of the verb see. Since direct objects in English normally follow the verb, as in (1a), what is also thought to originate in this position (I’ll indicate this original position with what in strikethrough, indicating that it is not pronounced).

(1) a. You saw something

b. What did you see w̶h̶a̶t̶?

The interesting thing is that what can appear arbitrarily far away from its original position.

(2) a. What did you see w̶h̶a̶t̶?

b. What did he say that you saw w̶h̶a̶t̶?

c. What did she think that he said that you saw w̶h̶a̶t̶?

d. What did they believe that she thought that he said that you saw w̶h̶a̶t̶?

e. …
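The pattern in (2) is regular enough to generate mechanically. Below is a minimal Python sketch of that idea; the function name, the little table of embedding clauses, and the use of ___ to mark the unpronounced original position are all my own assumptions for illustration, not part of any syntactic theory.

```python
# Toy generator for questions like those in (2): each extra embedding clause
# pushes "what" one clause further from the gap where it is interpreted.
# The table of (subject, bare verb, past tense) is made up for illustration
# and can be extended to go as deep as you like.
EMBEDDERS = [
    ("he", "say", "said"),
    ("she", "think", "thought"),
    ("they", "believe", "believed"),
]


def long_distance_question(depth, gap="___"):
    """Build a wh-question whose gap sits under `depth` embedding clauses."""
    if not 0 <= depth <= len(EMBEDDERS):
        raise ValueError(f"depth must be between 0 and {len(EMBEDDERS)}")
    if depth == 0:
        return f"What did you see {gap}?"
    layers = EMBEDDERS[:depth]
    # The outermost clause hosts "did", so its verb appears in the bare form.
    subject, bare, _ = layers[-1]
    parts = [f"What did {subject} {bare}"]
    # The clauses in between are ordinary past-tense that-clauses.
    for subject, _, past in reversed(layers[:-1]):
        parts.append(f"that {subject} {past}")
    parts.append(f"that you saw {gap}?")
    return " ".join(parts)


for depth in range(len(EMBEDDERS) + 1):
    print(long_distance_question(depth))
```

Nothing in the sketch ever forces the gap and what to stay close together, which is exactly the sense in which the movement looks unbounded.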

However, the story is much more complicated and interesting. In his 1967 PhD thesis, John Robert ‘Haj’ Ross identified various syntactic ‘islands’. Syntacticians generally take ‘islands’ to be units of structure that elements cannot escape or move from.

We saw in (2) that a wh-phrase can apparently move as far away from its original position as it wants. But now consider the following sentences:

(3) a. I met the man who saw a ghost.

b. I visited the house that you saw a ghost in.

The examples in (3) contain relative clauses (surprise, surprise! See my other posts) – who saw a ghost is a relative clause modifying the noun man in (3a), and that you saw a ghost in is a relative clause modifying the noun house in (3b). In (2), we attempted to move a wh-phrase which originated as the direct object of the verb see. As we saw, the result was a well-formed English sentence. So let’s try to do the same thing with the examples in (3).

(4) a. *What did I meet the man who saw w̶h̶a̶t̶?

b. *What did I visit the house that you saw w̶h̶a̶t̶ in?

The examples in (4) are crashingly bad English sentences (hence the *)! In fact, if I’d put these sentences at the beginning of this post, you’d probably be wondering what on earth I was trying to say. But what’s wrong with them? What’s the difference between the examples in (2) and the examples in (4)?

As Ross observed, the problem with (4) is the relative clause. The relative clause seems to be an island, i.e. wh-phrases cannot escape from it.

There are other types of island besides relative clauses. Consider the example in (5), which involves two conjoined (or co-ordinated) direct objects.

(5) a. You saw a ghost and a monster.

b. *What did you see w̶h̶a̶t̶ and a monster?

c. *What did you see a ghost and w̶h̶a̶t̶?

As (5b) and (5c) show, we cannot move out of co-ordinate structures (Ross called this the Co-ordinate Structure Constraint).

Relative clauses and co-ordinate structures seem to be very strong islands, i.e. if we attempt to move an element out of such islands, the result is very bad (given how much I’ve worked on relative clauses, I’m in two minds about whether I’m stuck on them because they are strong in the sense of an island paradise which you never want to leave, or in the sense of Alcatraz!).

Other structures seem to be weaker islands, i.e. we can move elements out of them, but the result is not quite fully acceptable (this is marked with a ? at the beginning of the example). An example of a weak island can be seen in (6b) (compare it to (6a), which does not contain an island).

(6) a. What do you think that I saw w̶h̶a̶t̶?

b. ?What do you wonder whether I saw w̶h̶a̶t̶?

The island effect seems to come from the fact that we are trying to move an element out of a subordinate clause beginning with whether. Similar effects are found with subordinate clauses beginning with how, where, who(m), what. They are thus called wh-islands because these islands are introduced by elements typically beginning with wh– in English.

(7) a. You asked how I fixed the car.

b. ?What did you ask how I fixed w̶h̶a̶t̶?
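For readers who like their classifications executable, here is a rough Python sketch of the strong/weak distinction drawn so far. It assumes we describe each example by hand as a list of the structures that the wh-phrase has to cross on its way up; the label names (that_clause, relative_clause, coordination, wh_clause) are hypothetical ones I have invented for this purpose.

```python
# Toy bookkeeping for the islands discussed above. The labels are invented
# for illustration; the path for each example is supplied by hand.
STRONG_ISLANDS = {"relative_clause", "coordination"}
WEAK_ISLANDS = {"wh_clause"}


def judge_extraction(path):
    """Return the judgement mark for moving a wh-phrase across `path`.

    `path` lists the structures between the original position and the
    front of the sentence. "*" is crashingly bad, "?" is degraded and
    "" is fine.
    """
    if any(label in STRONG_ISLANDS for label in path):
        return "*"
    if any(label in WEAK_ISLANDS for label in path):
        return "?"
    return ""


examples = {
    "(2b)": ["that_clause"],
    "(4a)": ["relative_clause"],
    "(5b)": ["coordination"],
    "(6b)": ["wh_clause"],
}
for name, path in examples.items():
    print(name, repr(judge_extraction(path)))
```

Note that this only records which structures are islands. It says nothing about why they are islands, which is the harder question.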

Although it has been nearly 50 years since Ross first identified his ‘islands’ (and there are many more that I have not mentioned), they continue to pose problems for syntactic theory. A major step was to identify the islands in the first place. This shows how important it is to consider not only what languages can do, but also what they can’t (there’s also the massive question about how we intuitively know that sentences such as those in (4) and (5b,c) are bad). The next step was to understand what makes an island an island (and whether all islands are in fact alike). We can list them and classify them as strong or weak, but ideally we’d want to know why these structures are islands and not others. Attempts have been made (notably by Chomsky (1973), see also the recent overview of the issues by Boeckx (2012)) but the problem still remains.


Boeckx, C. (2012). Syntactic Islands. Cambridge: Cambridge University Press.

Chomsky, N. (1973). Conditions on Transformations. In Anderson, S., & Kiparsky, P. (eds.) A Festschrift for Morris Halle (pp. 232-286). New York: Holt, Rinehart & Winston.

Ross, J.R. (1967). Constraints on Variables in Syntax. Doctoral dissertation, MIT.


Moving things moving around

Image: Copyright © Bob Harvey, licensed for reuse under a Creative Commons Licence.

It has recently come to my attention that – although we’ve made numerous references to the issue – we don’t seem to have had a proper post on this blog devoted to one of the most important and central* ideas of modern syntactic theory: movement.

Take, for example, the sentence Are you a cat? Now, normally in English verbs come after subjects, e.g. in the statement You are a cat. A good way of looking at the differences between the statement and the question is to say that, in the latter, the verb has moved from its usual position after the subject to a position at the start of the sentence. Syntacticians like to represent sentences using so-called “tree diagrams” (we needn’t go into the reasons for this here) and the one for Are you a cat? looks something like this:


The arrow here indicates the movement and I’ve “struck out” the lower copy of are to show that it’s not pronounced.
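If trees aren't your thing, the same idea can be sketched on a flat list of words. The following Python snippet is only a toy under my own assumptions: the function name is invented, the position of the verb is supplied by hand, and the unpronounced lower copy is shown in angle brackets rather than struck out.

```python
def front_verb(statement, verb_index=1):
    """Move the verb to the front of a statement, leaving a silent copy behind.

    Returns two strings: the full structure, with the lower copy shown in
    angle brackets, and the string that is actually pronounced.
    Toy assumption: the first word can safely be lower-cased.
    """
    words = statement.rstrip(".").split()
    verb = words[verb_index]
    words[0] = words[0].lower()
    # Full structure: the verb appears twice, once high and once low.
    with_copy = words[:verb_index] + [f"<{verb}>"] + words[verb_index + 1:]
    # Pronounced string: the lower copy is simply skipped.
    pronounced = words[:verb_index] + words[verb_index + 1:]
    front = verb.capitalize()
    return (
        " ".join([front] + with_copy) + "?",
        " ".join([front] + pronounced) + "?",
    )


structure, spoken = front_verb("You are a cat.")
print(structure)  # Are you <are> a cat?
print(spoken)     # Are you a cat?
```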

Can things other than verbs move? Of course they can. Compare The cat sipped the milk with The milk was sipped. In both cases, the milk is semantically the “object” of the verb sipped – the same thing happens to it in both sentences – but in the second (“passive”) sentence it appears in the subject position! A nice way of accounting for this is to say that it, too, moves to a higher position in the sentence. (Obviously we still have to account for things like the appearance of was, what’s happened to the cat etc., but those are somewhat separate issues.)


Another place we see movement is in sentences like What film shall we watch? Semantically, what film is again the object here, and we know objects ordinarily follow verbs in English, so again we can say that it’s moved from the end of the sentence to the start.

Potentially a big advantage of accounting for these things by movement is that it allows us to unify our explanations of what’s going on in all these different types of sentences: we can say they are all instances of a single phenomenon, movement, rather than having to come up with separate explanations for each case.

There’s been a huge amount of work on the theory of movement and it’s proved very profitable; it seems to tell us a great deal about language. An interesting thing that has come out of this work is that there seem to be restrictions on movement: you can’t, in practice, just move anything anywhere. For example, it appears that across languages movement always or almost always goes up the syntactic tree, not down it. In English, this means something can move leftwards in a sentence (as in the examples given in this post) but not rightwards – so you never get sentences like You are a cat are where the verb moves to the other side of the object.


(A challenge for the reader: can you come up with any apparent counterexamples to this claim that we don’t get movement down the tree / to the right in English?)

I personally think the idea of movement is one of the best insights to come out of linguistic theory; it’s truly impressive how much it can tell us about language. You only need to read some of our other posts on this blog – try looking under the “syntax” tag – to see just what a wide range of data it’s helpful in explaining.

* Caveat: many approaches to syntax do reject the idea of movement. This seems wrong-headed to me, as they still have to come up with some way of accounting for the facts discussed in this post, and movement is arguably the most straightforward way of doing so.

The words and sentences not taken

Two roads diverged in a wood, and I—

I took the one less traveled by,

And that has made all the difference.

— Robert Frost, The Road Not Taken

Ask Chris: Last week I watched a TV interview with a famous Chinese novelist. I pretty much enjoy most, if not all, of his works, but I found that his speech in the interview was not as fascinating as his novels. That reminded me of my best friend, who is really good at telling us her stories but can never write good articles. I believe that we use the same language in speaking and writing – but how could it be? Is there any difference in the use of language when we speak and write? Is the difference only limited to Chinese?

(Note by Chris: This post discusses the development of the Chinese language, since the original question was asked by a Chinese netizen; I hope this will not put off other readers. If you find it difficult, please imagine that you are in the Middle Ages when the common written language was Latin.)


Chris answers: Of course we use the ‘same language’ when we speak and write, if we are talking about the general system of information coding. However, if a language has a long history of written records, or has been used on formal occasions, it will develop two sub-systems: the spoken system and the written system. This phenomenon is not limited to Chinese: English, Japanese, and other well-known and lesser-known languages all show it, so in any such language a good speaker may not be a good writer, and vice versa.

Although structuralism is not the current trend in the field of linguistics, it is still very useful when we analyse a language as a comprehensive system of symbols. When we define a language, we need to define all the possible symbols that can be used, and all the possible rules and principles for combining those symbols; these two form the entire system of a language. However, it is not the case that any element or rule of the system can be used anywhere: we prefer some elements and rules in spoken discourse and others in written discourse, and such preferences lead to two subsets in the system of symbols, which is where the differences between spoken and written language lie. In general, the elements of the two subsets are pretty much shared, such as the sound patterns (we call them ‘phonological rules’), the word-formation rules (we call them ‘morphological rules’), the default word order, a number of lexical items, and some pragmatic rules, but there are exceptions – as we will see.

Please allow me to take Chinese and English as examples. Looking at the history of Modern Chinese, some people have the impression, or what I would call misconception, that Modern Chinese is merely a spoken language because it originates from Vernacular Chinese (which is literally called ‘plain speech’ in Chinese). That is not true, though. When it was born, Vernacular Chinese stood in contrast to the formal written Classical Chinese, and there was a time when this variety was only used in spoken discourse. But as Vernacular Chinese developed and changed, it generated a written system and several literary traditions prior to the birth of Modern Chinese; one famous example is Dream of the Red Chamber in the Qing Dynasty. The words and sentences used in Dream of the Red Chamber were not exactly the same as those used in the daily spoken discourse at that time. Similarly, you will find that current works of Chinese literature make use of words and expressions that are rarely heard in our daily conversations.

Maybe you would like to argue that the differences exist in Chinese only due to its long history, and life will be simpler if we move to English. Sadly, I am going to tell you that this is not the case. Below is a sentence that I randomly selected from a paper in my hands. I believe it is totally different from the sort of conversation you might hear between me and my friends:

When many different networks are generated in a process of simulated evolution, certain types of modular architectures are selected as “highly fit” in that they are particularly efficient at solving a given learning task.     (Jaap M. J. Murre, Models of Monolingual and Bilingual Language Acquisition)

This sentence is quite different from our daily chitchat in several respects. The range of words is rich, the lexical items are formal (you can judge this from their length), and the content words (nouns, verbs, and adjectives) are dense. The sentence structure is more complicated, since it includes several relative clauses, and the modifiers are longer. These distinct features of written academic English mark the genre as separate from other genres, and that is exactly the reason that some international students with relatively good speaking skills are required to learn ‘how to write academic English’ after they enrol on a university-level course.

In a word, there are essential differences between the syntactic, semantic, and pragmatic aspects of the spoken sub-system represented by our daily conversations and online chatting, and those of the written sub-system, represented by works of literature, academic essays, manuals, and documents. Actually, these differences are the targets of some linguistic research areas, such as stylistics, discourse analysis, and sometimes also sociolinguistics.

Now let’s move back to your first question – why do people perform differently when they speak and write? The divergence between the spoken and written sub-systems creates a problem: if you use any spoken element when you write, or vice versa, your audience will feel awkward. If a lot of spoken elements are used in the written language, the audience may feel that the author has a shortage of vocabulary and the content of her writing is too shallow. On the other hand, when written elements appear frequently in a piece of spoken discourse, the audience may be easily bored by the long sentence structures and difficult wording. The feeling that ‘the speech and articles by the same person are very different’ may be due to the presentation manners of the person, or the mismatch between the sub-system and the context of discourse.

Why do we have such a feeling when we notice a mismatch? Since I am working on language acquisition and processing, my instinct is that the reason may lie in the mechanism of human language processing. When we process spoken language, we always do it linearly: a piece of speech is always continuous and most of the time we do not pause or backtrack – considering the history of human technology, this was totally impossible when language first appeared. The information in the spoken discourse is continuously pushed into our processing system, and in order to catch the following bits, we do not have enough time to reconsider the hidden message of a particular word or phrase. Moreover, we may even encounter difficulties when hearing a less frequently used word in spoken language. Therefore, when processing spoken discourse, we expect the message to be clear and easy to understand.

Reading is another story. The most prominent feature of reading is that we can control how long we attend to any particular point, and we can backtrack to previous information. That is because the written information is not ‘pushed’ at us. We may not notice it, but our eyes do not always move strictly forwards when we read. Eye-tracking studies have found that around 10% to 15% of eye movements in reading are backward (see reference), which means that we are going back to review information in previous constituents; such a movement is called a ‘regression.’ If the text is difficult because of rare lexical items, complex syntactic structures, or grammatical errors, people will stop moving their eyes and gaze at a particular point (which is called ‘fixation’), or they will perform more regressions. Below is a typical illustration of eye movement when people read, from Eye-Tracking While Reading – Kertz Lab – Brown University Sunset Wiki.


At the same time, it should be noted that some high-level semantic and pragmatic processing does not always occur simultaneously with receiving information. This is more obvious when we appreciate rhetorical elements or read literature. Such processing is called ‘non-spontaneous interpretation.’ Usually, it requires more cognitive effort and processing time, and sometimes readers are even required to re-evaluate the information they have received from the preceding text and simulate the intentions of the author. We can hardly perform such processing when we listen to spoken discourse because stopping processing at any point would mean missing the ongoing flow of information.

All these differences in processing will in turn influence the word choices, sentence structures, and information organisation we use in spoken and written discourse. Since the processing of spoken language is quick, plain, and linear, we make our speech short, direct, clear, and easy to identify, while the pauses and backtracking in the processing of written language allow us to add complicated sentence structures, rare words, and rhetorical devices. If we organise the information as it is in spoken discourse when we write an article, the amount of information in each sentence will decrease, and the article will seem too shallow; in contrast, if we speak in the way that we write, there will be too much information to process, and the lack of non-spontaneous interpretation will also influence our feelings towards the discourse.

That is more or less the full story, and I hope you enjoyed this piece of my writing and all the information included in it. Unfortunately, if you ask me your questions face to face, what I will do is to recite the whole text to you – yes, I am indeed the kind of person who is better at writing than speaking. I knew this when I was still in kindergarten. I knew.


For more information about language processing in general, please refer to the following articles, and you can always stop and backtrack:

Furlong, Anne. “The soul of wit: A relevance theoretic discussion.” Language and Literature 20.2 (2011): 136-150.

Rayner, Keith, and Charles Clifton. “Language Processing in Reading and Speech Perception Is Fast and Incremental: Implications for Event Related Potential Research.” Biological psychology 80.1 (2009): 4–9.

Oh don’t be such a snob

We all know the type. Enjoying a Netflix and chill session, you innocently comment on whatever you happen to be watching – “Geez, George Clooney’s manliness is so different than Orlando Bloom’s” – when your grammar snob of a friend’s eyes light up. “Different to”, they hiss viciously. Or you’re offering a cutting-edge analysis of the current political situation over lunch – “…according to Merkel, who I disagree with” – when your solution to the Brexit crisis is interrupted by “Ahem, you mean with whom I disagree.” Yes, they get everywhere: in the Ecuadorian capital Quito, radical grammar pedants have created a concept of ‘orthographic vandalism’, correcting the grammar of Quito’s graffiti.

A few weeks back, Mona Ghalabi launched into a rant against grammar snobs in the Guardian. They use an “elite and increasingly outdated form of English language”, believing that “language evolves but grammar doesn’t”, and are, quite simply, “patronizing, pretentious and just plain wrong.” “If I look around a room and say there are less people here than I expected”, Ghalabi says, “does it really need to be pointed out that because people can be counted, I should have said there are fewer people here?”

No, it absolutely does not. I want to join forces with Ghalabi and deliver the final blow to grammar snobs: I present to you three ways in which allegedly substandard speech is, in fact, not a disgrace but a linguist’s goldmine.


Beware, grammar snobbery ahead! Tim Lawrenz.

Now, sometimes speakers utter things that are quite frankly errors even by non-snobbish speakers’ standards; everyone has the occasional slip of a hunk of jeep instead of a heap of junk. As random as they may seem, these types of errors are in fact constrained by the phonological structure of the language in question, and as such can tell the observant linguist something about it. In English, sphinx in the moonlight becomes minx in the spoonlight and never features the expected sfoonlight (what kind of deep, poetic conversation these examples would occur in, I don’t know). The reason is simple: native English phonology does not allow syllables beginning with sf, except in loanwords. And so, apparently innocent slips of the tongue turn out to allow insight into the psychological structures and processes of generating speech.

Things get even more interesting when you throw in an extra language. Bilingual children – and adults, as a matter of fact – sometimes mix words and structures from their two (or more) languages. To the uninitiated this may seem like a terrible failure to gain competence in either language. However, code mixing is very much not random, and it can be shown that language-specific grammatical constraints are at play in mixes that look substandard at first sight. In child French, so-called weak pronouns (je, tu, il,…) can appear with finite verbs only (i.e. verbs that can function as the root of an independent clause: I like cake has a finite verb, I liking cake does not), while strong pronouns (moi, toi, lui) can appear with both finite and nonfinite verbs. French children might then utter sentences like Moi pousser (‘Me pushing’) with a strong pronoun and a nonfinite verb but never Je pousser with a weak pronoun. English, on the other hand, has only strong pronouns, so that English children can happily say things like I washing.

Baby speakers - cute and informative. What else could a linguist ask for? Mulan.


Curiously, when French-English bilinguals use pronouns and verbs from different languages in one sentence, these constraints are still obeyed. I pousse là (‘I am pushing there’) with an English pronoun and a finite French verb and They manger bonbon (‘They eating candy’) with an English pronoun and a nonfinite French verb are attested, as is Moi play thing with a French strong pronoun and an English nonfinite verb. But what French-English bilinguals never produce is precisely the case ruled out by French grammar: a French weak pronoun with an English nonfinite verb, as in Je find it.
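The constraint is simple enough to write down as a one-line rule. Here is a minimal Python sketch, assuming the finiteness of each verb is supplied by hand and using only the pronouns mentioned in this post; the function and variable names are my own inventions, not anything from the study.

```python
# Weak French pronouns as listed in this post (the list there is open-ended).
WEAK_FRENCH_PRONOUNS = {"je", "tu", "il"}


def is_licensed(pronoun, verb_is_finite):
    """Check a pronoun + verb combination against the child-French constraint.

    Weak French pronouns need a finite verb. Every other pronoun here
    (French strong pronouns and all English pronouns) is treated as strong
    and may combine with finite and nonfinite verbs alike, whichever
    language the verb comes from.
    """
    if pronoun.lower() in WEAK_FRENCH_PRONOUNS:
        return verb_is_finite
    return True


# (pronoun, is the verb finite?) for the combinations discussed above
combinations = [
    ("Moi", False),   # French strong pronoun, nonfinite French verb: attested
    ("I", True),      # English pronoun, finite French verb: attested
    ("They", False),  # English pronoun, nonfinite French verb: attested
    ("Je", False),    # French weak pronoun, nonfinite verb: never produced
]
for pronoun, finite in combinations:
    print(pronoun, finite, is_licensed(pronoun, finite))
```

The point of the sketch is that the rule never needs to ask which language the verb belongs to: the pronoun's own grammar does all the work, even in a mixed sentence.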

For the final case, I would actually like to offer a special thank you to prescriptive grammarians, or grammar snobs. Their writings can provide evidence of how people spoke in the past: the types of texts that are preserved from several centuries ago rarely reflect the language used by the illiterate part of the population, in many cases the vast majority, so that linguists have to rely on forms of indirect evidence. The first English grammar books appeared in the 18th century, pioneered by Robert Lowth’s A Short Introduction to the English Grammar and Lindley Murray’s equally imaginatively titled English Grammar. Both Lowth and Murray were notorious prescriptivists who held a firm belief that Latin and Greek are superior to English. Their natural conclusion was that because Latin and Greek happen to be relatively highly inflected languages, English should be, too. This mission of making English as worthy as the ancient languages is most famously encoded in the fight for whom instead of who in positions other than the subject.

While the ideological grounding is intriguing in its very bizarreness, for the modern linguist the relevant fact is that the allegedly correct use is pointed out in these grammars at all. If all speakers had been using whom, and thus living up to Ancient standards, there would have been no need to correct anything in the grammar. So, the early grammarians’ snobbish efforts tell later linguists that people were using who over whom already in the 18th century.

Break bad, break the rules, and linguists will thank you. Chapendra.


“We should spend more time listening to what others have to say and less focusing on the grammar they say it with”, Ghalabi appeals. As a linguist, I disagree. We should focus on the grammar people speak with – but not so much on the grammar people claim we should speak with.

(Come to think of it, Ghalabi does have a point even in her last statement: please don’t interrupt my Netflix and chill just to point out how my slips of the tongue might inform the world about my mental processes. Thank you.)


If you fancy reading more about bilingual grammars, the pronoun study and much more can be found here:

Paradis, J. and Genesee, F., 1996. Syntactic acquisition in bilingual children: Autonomous or interdependent? Studies in Second Language Acquisition, 18(1), pp.1-25.