Pop culture

Dothraki as a constructed language

A few weeks ago I gave a talk at Litterær salong (‘Literary Lounge’), hosted by the student organization Gengangere in Trondheim. The topic was Dothraki as a constructed language (or conlang) created specifically for the Game of Thrones series. The purpose of the presentation was to highlight some aspects of the Dothraki language that I thought were interesting and to reflect on how the language has been tailored to fit into the ASOIAF universe. This post is a slightly expanded summary of the main points of my talk, and it is still spoiler-free.


Khal Drogo and Daenerys.

How was Dothraki made?

If you’re still reading, you probably know that Dothraki is the language spoken by the Dothraki people in the ASOIAF universe. We know the Dothraki people through Daenerys’s storyline, where they’re presented as a people whose focus lies on riding horses and fighting. Theirs can easily be characterized as a hypermasculine community (remember when Dany had to eat a raw horse heart?). So how can the language reflect this culture?

The Dothraki language was created by linguist David Peterson, who won a conlang competition hosted by HBO in 2009 with the purpose of creating the Dothraki language. After winning the competition, Peterson went on to construct other languages for Game of Thrones, such as High Valyrian (and its descendants), Mag Nuk (giants’ language, only used for one scene) and Skroth (White Walkers’ language, never used). The language that arguably gained the most interest is High Valyrian, which now has approximately 850,000 learners on Duolingo (more than Norwegian!).


David Peterson (at the top).

The main reason for Peterson’s success is his ability to create Dothraki based on the information already available in the ASOIAF books. Peterson was able to make Dothraki fit into the universe as it already existed in the books, and he made the language fully fledged and functional for the series.

Features of Dothraki

The pronunciation and vocabulary of Dothraki had to be based on the books as closely as possible, and the new inventions in the language needed to make sense in the context of the ASOIAF universe. The interesting part is how Peterson resolved the issues this task raised, and what the final version of the language ended up looking like.

The pronunciation is based on words and names mentioned in the books, such as Drogo, Khaleesi (the Khal’s wife), Dothraki (riders), etc. The words and few sentences already appearing in the books make up the basis for the sound system of the language. Dothraki ended up with the following speech sounds (taken from the Wiki page on Dothraki phonology):


Most sounds in the Dothraki phonology are the same as in English. The affricate [t͡ʃ] is used in words like change, [d͡ʒ] is used in just, [θ] is the th-sound in think, etc. The sounds that are foreign to native speakers of English are [x], the guttural sound also found in languages like Arabic, German and Spanish, and [r]/[ɾ], the trill/tap r sound used in languages like Spanish. The vowel system is incredibly simple: the vowels included in Dothraki are some of the sounds most commonly found in languages in general.

It’s cool to think about the fact that the pronunciation of Dothraki is so normal! It’s exactly like a real language in terms of how it works, showing how good Peterson is at looking at language purely as a system.


A lot of the same thinking has gone into his development of the vocabulary of Dothraki. The vocabulary and idioms of the language heavily reflect the culture in which the language has developed.

Again, Peterson had to base the vocabulary largely on words that already exist in the books. An example of this is the scene in the books where Khal Drogo speaks the Common Tongue to Daenerys and uses the word chair instead of throne, implying that there is no separate word for throne in Dothraki. Because of this scene, Peterson did not include a separate word for throne in the Dothraki vocabulary. Of course, it makes sense that Dothraki wouldn’t have a separate word for throne, since the culture has no concept of a chair reserved for a person who holds (often inherited) power.

Another detail put into the language is that the word for book in Dothraki, namely timvir, is a loanword from the High Valyrian word tembyr. When a concept is introduced to a language community through another culture, it’s common to use the word already used in the other language community. In this sense it’s a perfect solution for the issue of what the word for book would be in a culture that doesn’t really use books.

Horse idioms


Another interesting part of the vocabulary and idioms in Dothraki is the fact that horses are a recurring theme. In an interview in the podcast Two Girls One Podcast, Peterson says that he was initially planning on having the Dothraki always distinguish between different kinds of horse: mare, stallion, breed, color of coat, etc. This didn’t work out, as the Dothraki lines written for the show would not always align with the types of horses that eventually ended up in the scenes.

There are plenty of idioms in Dothraki relating to horses and riding. One is anha dothrak adakhataan, used to mean ‘I am about to eat’ but literally translating to ‘I ride toward eating’. Another one is anha dothrak chek asshekh, which literally means ‘I ride well today’ but is taken to mean ‘I am well today’. Using riding as an image of how one is doing is not far-fetched at all, seeing as a lot of languages use going/walking for the same purpose. How’s it going initially meant How’s it walking (an image still used in Norwegian Korleis går det).

Dothraki as an ASOIAF language

The Dothraki language is a believable language for a universe that’s meant to be a realistic fantasy series. The choices made by Peterson show a clear understanding of how languages work as systems and how language communities interact in real life. It will be interesting to see what happens now that the show is almost over: will it continue to develop? Only time will tell.

I’m considering writing more about the grammar of Dothraki, but that will have to wait until another time. Until then, dothras chek (ride well)! If you speak Norwegian and would like to read more thoughts on conlangs like the LOTR Elven language Sindarin and the Star Trek language Klingon, you should have a look at my article in the student journal Riss, 02/18.


Topicalization in The Witcher 3 and in Scandinavian

I’ve recently started playing The Witcher 3: Wild Hunt (2015), a videogame set in a fantasy world where you’re paid to help people with whatever they’re having issues with. And their issues can vary a lot. One of the missions in the game involves the Pellar, a soothsayer who can speak to the dead. Among other things, you can help the Pellar and his supporters raise the dead at Forefathers’ Eve.


The Pellar.

Being a character who spends a lot of time either alone, praying or speaking to the dead, the Pellar has a peculiar way of speaking. Here are some quotes from the game:

  • Across the lake we must journey.
  • There, in the circle of stones, we shall meet.
  • Beyond all help, some will be.

To some extent he is using Yoda language in that he, at least in the first and third example, “fronts” the predicate. It doesn’t seem like the Pellar uses Yoda language as a standard, however. He often uses standard English SVO constructions as well, which made me start wondering why I was so perplexed by his way of speaking. What this reminds me of is topicalization in the Scandinavian languages.

Since the Mainland Scandinavian languages, like English, do not have overt Case marking for common nouns, they are strictly SVO, meaning that the standard sentence structure consists of the subject, followed by the verb and then the (potential) direct object or other arguments. Old Norse, with its overt Case marking, has constructions in which declarative clauses are verb-initial, meaning that a construction like ‘eat you food’ could mean ‘you eat food’ (see Haugen, 2001). Later on, the V2 rule, which states that the finite verb is the second element in a declarative clause, became fairly strict in all Scandinavian languages. Scandinavian has left behind verb-initial declarative clauses (as far as I know).

The fact that Mainland Scandinavian syntax tends to be so rigid is one of the reasons why it’s so interesting that it so easily allows for topicalization, arguably more so than English. Consider the following sentence:

(1)    Hunden   jagar   katten.          [Norwegian]
       dog.DEF  chases  cat.DEF

The construction in (1) is ambiguous in writing: it could mean either that the dog is chasing the cat or that the cat is chasing the dog. In spoken language topicalized elements are stressed, clarifying that the first constituent is not the subject as would normally be expected. In English, you could make a construction like The cat, the dog chases, but this is marginal, and since English lacks V2, it isn’t ambiguous. The topicalization of adverbials is much more regular in English. The Pellar’s Across the lake we must journey is an example of this.
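The ambiguity in (1) can be sketched in a few lines of Python. This is only a toy illustration and not a parser: the function names `v2_readings` and `english_readings` and the role labels are made up for this post, and the sketch only covers three-word clauses of the form noun phrase, finite verb, noun phrase.

```python
def v2_readings(words):
    """Possible readings of a three-word V2 main clause (NP, finite verb, NP).

    Toy assumption: without Case marking, V2 only fixes the position of the
    verb, so the first NP is either the subject (SVO) or a topicalized
    object (OVS).
    """
    first, verb, last = words
    return [
        {"subject": first, "verb": verb, "object": last, "order": "SVO"},
        {"subject": last, "verb": verb, "object": first, "order": "OVS"},
    ]


def english_readings(words):
    """English lacks V2, so NP, verb, NP can only be read as SVO."""
    first, verb, last = words
    return [{"subject": first, "verb": verb, "object": last, "order": "SVO"}]


norwegian = v2_readings(["hunden", "jagar", "katten"])
english = english_readings(["the dog", "chases", "the cat"])

for reading in norwegian:
    print(reading["order"], "->", reading["subject"], "chases", reading["object"])
```

The Norwegian string gets two readings and the English one only a single reading, which is the written-language ambiguity described above. Stress in speech would pick one of the two readings, something this sketch does not model.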

Topicalization often has the effect of putting focus on a certain constituent over the focus that the subject normally would have. What’s interesting about the Pellar’s use of topicalization, then, is that he does it regardless of focus. In Beyond all help, some will be, the point isn’t that some will be ‘beyond all help, among other things’. In the way that the Pellar is saying these sentences, they seem to have the same meaning as Some will be beyond all help in standard English. The Pellar is topicalizing without changing the focus, which makes it seem incredibly unnatural and makes us as players feel off about him.

Let me know if you have any thoughts about the Pellar, or topicalization, or both! See also my other post about constituents and linearity in Heptapod.


“Don’t learn these languages”: a criticism of Lingo by Gaston Dorren

Lingo is a popular science book about “European languages” by Gaston Dorren. He covers mainly Indo-European languages, Celtic among them, but also Uralic languages such as Sámi. I learned some cool fun facts from the book, since every chapter ends with loanwords English has taken in from that language and an ‘untranslatable word’ from the language. For instance, did you know that the word ‘robot’ comes from Czech and technically means ‘slave’? Don’t tell our artificially intelligent friends that.



I am also so delighted that a book about language has grown popular. Its translated version was on the shelves of all book stores in Norway when it was released. It may encourage people to start or continue learning a foreign language, and it may spark an interest in linguistics that people didn’t know they had. Its existence, and success, is a cause for celebration.

This is why I’m disappointed that there is little of substance in the book. Some chapters are only 2-3 pages long and rarely go past the first punchline. Sometimes a chapter sounds like a dramatized version of the first paragraph of a Wikipedia article. I appreciate the attempt to include as many languages as possible, and not just the standard combination of English, French, Italian, German, etc., but writing about more than 60 languages in roughly 290 pages is a bit unrealistic. In fact, it leaves an average of just under five pages per language.

When there is so little room for each language, the content within the chapters needs to be succinct and carry some message for the reader to bring with them. Each chapter should at least inspire the reader to open their browser and learn more. The book does not do this. With such short chapters, there is no room for nuance or more than one aspect of the language. Dorren picks one topic about the language and neglects the rest, sometimes only leaving the reader with some problematic implications. I will get into the worst ones here.


“Don’t even bother learning Faroese.”

The chapter on Faroese is three pages long and can be summarized as follows: no one speaks Faroese, it has Case, its spelling often differs from its pronunciation, and it will therefore be too hard to learn. Faroese is a cool language, from my perspective, because it’s kind of like Icelandic but also has influences from Danish due to contact. How does it differ from Icelandic, and why?

Dorren does not delve into this. He calls Faroese ‘the Romans north of Hadrian’s wall’ because it has Case, which is a bit strange: what about Icelandic? What about all the other European languages that have Case?

I think the most problematic aspect is that he concludes that Faroese is useless to learn: “Even if you should happen to find yourself on the islands, you could chat to the locals in the somewhat more useful Danish language.” (p. 259). Essentially this implies that learning any language that doesn’t have more than some tens of thousands of speakers is a fruitless endeavor. Why would someone claiming to encourage language-learning actively argue against the learning of Faroese? It makes no sense.

Another reason not to learn Faroese, according to Dorren, is that it’s too “difficult to learn”. His main reasons for this are that Faroese has Case and that its spelling is different from the way the language is pronounced. This is not an argument. My native languages don’t even have person agreement, but that didn’t stop me from learning English. It’s such an absurd statement to make, especially when the purpose of the book is to embrace languages.


“Esperanto is not similar enough to Germanic and Romance.”

This one is so annoying to me. What I expected before reading the chapter on Esperanto was the usual criticism that it is so European that it technically can’t be called a world language, since it is so much easier for speakers of Indo-European languages to learn than for others. This is a valid criticism. Dorren, however, takes a different turn, claiming that Esperanto is a failure because it’s too similar to Slavic, making it unintuitive for speakers of Germanic and Romance languages. In four pages he manages to complain a lot: that Esperanto has Case; that its lack of gender is illogical because phrases like la viro ‘the man’ will feel wrong to Romance speakers, and because knabino, meaning ‘girl’, “looks masculine”; and that the /x/ sound, like in the German Bach (and close to the sound in the German ich or Norwegian kjær), is foreign to Anglophones.

Criticizing Esperanto for not being a flat, boring, inflectionless language, Dorren shows a lack of understanding of what a bias this is toward Germanic and Romance. What he considers an easy grammar is not necessarily what Slavic speakers will find easy. Surely, saying la viro is better than having Gender in the language, a feature which carries little to no semantic content?

The argument that Esperanto was unsuccessful because it was unintuitive for some Europeans is also unconvincing. I would instead assume that its lack of a speech community was the issue, in addition to the rise of English in the same period. Saying that Esperanto’s demise came about because it was badly designed is a bit misleading.


Essentially, my problem lies in Dorren’s implicit argument that if there are any obstacles in learning a foreign language, you shouldn’t. Case is not a reason to put down your Russian textbook. No feature in a language is unnecessary or less valuable than another. Even if a language is spoken by only one person, it is not useless. Learn it. It will be worth it.


Constituency and linearity in Heptapod

The constructed language created for Arrival (2016) may be just a tiny bit more human-like than you would think.

I know what you’re thinking: “But Arrival was released in cinemas two years ago!” I didn’t have a linguistics blog two years ago, so here’s my contribution to one aspect of the linguistic details featured in the movie.

And if you’re one of those people thinking, “But Heptapod isn’t even a real language!” I know, but let’s imagine for a second that it is.

For those of you who haven’t yet watched Arrival, the plot can be summarized as follows: the linguist and polyglot Louise Banks is asked to help develop an interpretive method for communicating with the so-called Heptapods, a species of aliens who mysteriously came to Earth and don’t seem to be using any human-like means of communication.

The movie is great because it’s one of the only high-budget, award-winning movies whose protagonist is a linguistics professor. What this does is make the study of linguistics more accessible to the general public and inspire more young people to embark on linguistics studies. The movie definitely has its flaws, especially in assuming the Sapir-Whorf hypothesis, but that is a topic for a different post.

But let’s assume for a second that Sapir and Whorf are right, and that the language you speak entirely shapes your perception of the world. Speaking Heptapod will make you see all of time simultaneously. Your entire timeline is put out in front of you, and this is because you master a language in which all separate elements occur at the same time. How does this compare to natural languages?

This is where constituents come in. If you’re new to syntax, let me give you a general, minimalist intro à la Adger (2012). You can also listen to the Lingthusiasm podcast if you want more explanation here.

Each word makes up a constituent of its own. In a tree representation, this is what that would look like:


Constituency is marked in the syntax tree through phrases. If we add a determiner, we have a constituent at a higher level.

(2) the car

This type of representation of constituents accounts for the fact that some combinations of words may be moved around, while others can’t. This can be exemplified through wh-questions. Consider a context in which A asks B what they did yesterday, and what they did was to eat pizza. (4-a) shows an acceptable response, while (4-b) sounds off.

(3) A: What did you do yesterday?

(4) a) B: Eat pizza.

b) B: #Pizza.

The reason why (4-b) sounds weirder than (4-a) can be explained by what the wh-word what stands in for here, namely a verb phrase (VP). The role of B is then to fill in the gap left by the wh-word. This makes it sound weird when B fills the VP gap with a noun phrase (NP).
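For readers who think better in code, here is a minimal sketch of the same idea in Python. The tuple encoding of the tree and the function names (`constituents`, `words`, `fills_gap`) are my own assumptions for illustration, and the labels are simplified compared to a proper minimalist analysis.

```python
# A constituent is a tuple (label, child, child, ...); a leaf is (label, "word").
EAT_PIZZA = ("VP", ("V", "eat"), ("NP", ("N", "pizza")))


def constituents(tree):
    """Yield every constituent (subtree) of the tree, top-down."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, tuple):
            yield from constituents(child)


def words(tree):
    """Flatten a constituent back into its linear string of words."""
    out = []
    for child in tree[1:]:
        if isinstance(child, tuple):
            out.extend(words(child))
        else:
            out.append(child)
    return out


def fills_gap(answer, gap_label):
    """Toy rule: a fragment answer fits only if its category matches the gap."""
    return answer[0] == gap_label


# 'What did you do yesterday?' leaves a VP-sized gap.
vp = EAT_PIZZA      # (4-a) Eat pizza.
np = EAT_PIZZA[2]   # (4-b) #Pizza.

print(" ".join(words(vp)), "fits:", fills_gap(vp, "VP"))
print(" ".join(words(np)), "fits:", fills_gap(np, "VP"))
```

The linear string eat pizza says nothing about categories; it is the labels on the tree that separate the acceptable answer in (4-a) from the odd one in (4-b).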

Now, back to Arrival. The idea of constituents is important to consider because it puts emphasis on the fact that languages are not linear. Even though, when we talk, one word has to come before the other, that’s not necessarily how the structure is organized in our head. The way we move constituents tells us that syntactic structures are hierarchical rather than linear. So my question is: is Heptapod hierarchical? That is, does Heptapod have constituents?

Actually testing movement in the language is impossible, since it’s not actually a language. Some clues appear in the movie, though.


The written language of Heptapod was created by software designer Stephen Wolfram and his son, Christopher Wolfram. The program used by Louise in the movie to decode and create new sentences in the language divided each circle into twelve sections. Here’s some more information about the logograms, if you’re interested. It appears that each section may contain several subsections, too, at least in the code, but the way Louise uses the software in the movie makes it look like each section represents a word or concept.


While Louise is using the software, she picks out three symbols, reading out ‘give’, ‘technology’ and ‘now’. I think it’s fair to assume that each of the twelve sections of the circle represents a constituent. Whether ‘give’ in Heptapod is just one word, or whether that is just the English translation of a constituent including the verb to give with the imperative interpretation, does not make a difference in terms of whether it is a single constituent. The main difference between Heptapod and natural languages then seems to be that instead of having a hierarchical structure, Heptapod has the entire sentence produced or interpreted at once.
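To make the contrast concrete, here is a rough sketch in Python. It is my own toy model and not how the Wolframs' software represents logograms: the twelve-slot list, the bracketing of the English-like sentence and the `depth` function are all assumptions made for illustration.

```python
def depth(structure):
    """Nesting depth: a bare word (or empty slot) is 0,
    a container is 1 plus the depth of its deepest member."""
    if structure is None or isinstance(structure, str):
        return 0
    return 1 + max((depth(item) for item in structure), default=0)


SECTIONS = 12  # the circle is divided into twelve sections in the movie

# Heptapod-style: one flat circle of slots, three of them filled.
logogram = ["give", "technology", "now"] + [None] * (SECTIONS - 3)

# Human-style: constituents nested inside constituents,
# roughly [[give [the technology]] now].
english_like = [["give", ["the", "technology"]], "now"]

print("logogram depth:", depth(logogram))
print("English-like depth:", depth(english_like))
```

In this toy model the flat logogram never gets deeper than one level, no matter how many of its slots are filled, while the English-like structure gets deeper every time a constituent is embedded in another.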

As far as I can tell, there is no hierarchy within the constituents, either: only loose words or concepts attached to them. Here is one example of the constituent structure within the software programmed by the Wolframs:


If anyone has any idea what sort of hierarchy or internal structure these constituents could have, I would very much be interested in hearing your thoughts. But for now I think I’ve established that Heptapod does have constituents, though they are not organized hierarchically as in natural languages, which is what makes it so weird and (supposedly) changes speakers’ view of space and time. But what do I know.



If you enjoyed this post, you may be interested in reading my article about interesting syntactic and sociolinguistic aspects of Klingon, Dothraki and Sindarin which was published in the Norwegian students’ journal Riss this year. You can find out how to order a copy of the journal here. (The article is in Norwegian, though.)