Meillet on Calques in European Languages

In his 1911 article Différenciation et unification dans les langues (“Linguistic Differentiation and Unification”), the French linguist Antoine Meillet noted that even highly dissimilar European languages are lexically similar for reasons unrelated to their being Indo-European:

The tendency for languages to become linguistically unified when there is unity of civilization is so strong that a kind of unity can develop even between languages that are fundamentally distinct. The main common languages of present-day Europe are, in every respect, completely different, and their pronunciations and grammars are independent of each other. But the speakers of these languages are part of the same civilization, and their languages have a significant number of elements that they obviously share in common.
One example consists of words that are shared between them as a result of borrowing or inheritance from Proto-Indo-European. When creators of artificial languages looked at what words were shared between Italian, Spanish, French, English, German, and Russian, they found enough words present in four or five of these languages to build a vocabulary in which, because of the many borrowings in English and the fairly numerous Latin and Romance borrowings in German, Latin was the main source of words. By contrast, Russian did not contribute anything, since it is a language that has for a long time been outside the mainstream of European civilization.
Loan-translations are another source of similarities. The Greek συνείδησις, the Latin conscientia (in French, conscience), the German gewissen, the Polish sumienie, and the Russian совесть may at first glance appear to be a mishmash of dissimilar words. However, an analysis of the words reveals that they have the same structure; this is because they are calques of each other. One might say that this has, in a certain sense, led to a kind of unification of the languages of Europe. This unification started at the founding of Mediterranean civilization (and even before the arrival of the Hellenes in Greece), was manifested in the Hellenization of Latin, and has continued up until the present day.

La tendance à l’unité de langue là où il y a unité de civilisation est si forte qu’une certaine espèce d’unité tend à se réaliser, même à travers des idiomes profondément distincts et qui restent distincts. Les grandes langues communes de l’Europe actuelle forment à tous égards des systèmes absolument différents ; elles ont des prononciations et des grammaires strictement autonomes. Mais ces langues reposent toutes sur un même fonds de civilisation, et il est aisé de constater qu’elles présentent en grande quantité des éléments communs. D’abord, par emprunt des unes aux autres, ou par suite de leur unité d’origine indo-européenne, elles ont en commun beaucoup de mots ; quand, pour constituer des langues artificielles, on a dressé le bilan des mots communs à l’italien, à l’espagnol, au français, à l’anglais, à l’allemand et au russe, on a trouvé assez de termes communs à quatre ou cinq de ces langues pour constituer un vocabulaire où, par suite des emprunts innombrables de l’anglais et des emprunts assez nombreux de l’allemand au latin et aux langues néo-latines, le latin est l’élément essentiel, et où le russe, demeuré longtemps en dehors du grand courant de la civilisation européenne, ne fournit rien. En second lieu, les manières de parler ont été traduites d’une langue dans l’autre ; le grec suneidêsis, le latin conscientia (français conscience), l’allemand gewissen, le polonais sumienie, le russe sovêst sont autant de mots distincts au premier abord ; mais il suffit de les analyser pour apercevoir qu’ils se superposent exactement et présentent un même mode de formation, résultant de ce qu’ils ont été calqués les uns sur les autres. On peut donc dire qu’il s’est produit par-là, en un certain sens, une unification des langues européennes ; cette unification a commencé lorsque s’est fondée la civilisation méditerranéenne, avant même l’arrivée des Hellènes en Grèce ; elle s’est continuée par l’hellénisation du latin et n’a jamais cessé depuis.

This paragraph resonated with me, because I’ve found translating words morpheme-for-morpheme to be a useful language learning technique for guessing the meanings of words I’ve never learned before or for remembering the meanings once I look them up.

For example, when I was learning Russian, I found that the Russian word согласный meant consonant. This made sense, since its morphological division was clearly со-глас-ный (with-voice-adjective). But I was also able to relate it to languages I already knew by noticing that the morphemes were very similar to those in the Latin-derived word consonant (morphologically, with-sound-ing).

For the Russian words рукопись (manuscript) and безответственность (irresponsibility), I was able to use a similar method to guess their meanings even before looking them up. The word рукопись is morphologically hand-write, but by translating it into Latin morphemes, I managed to figure out that it was also equivalent to the word manuscript (morphologically, hand-write). As for безответственность, it can more or less be broken down into the morphemes without-response-ness-ity. This doesn’t necessarily make much sense on its own, but by translating the first morpheme as the Latin ir-, I was able to connect the word to the Latin-derived word irresponsibility.
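The morpheme-for-morpheme technique described above can be sketched in a few lines of Python. This is only a toy illustration: the gloss tables are hand-made assumptions rather than dictionary data, the split of рукопись into руко-пись is a simplification chosen to match the "hand-write" gloss, and the helper names `gloss` and `looks_like_calque` are hypothetical.

```python
# Toy gloss tables: every entry is an illustrative assumption, not dictionary data.
GLOSSES = {
    "ru": {"со": "with", "глас": "voice", "ный": "ADJ",
           "руко": "hand", "пись": "write"},
    "la": {"con": "with", "son": "sound", "ant": "ing",
           "manu": "hand", "script": "write"},
}

def gloss(segments, lang):
    """Return the morpheme-by-morpheme gloss of a pre-segmented word."""
    return "-".join(GLOSSES[lang][m] for m in segments)

def looks_like_calque(segments_a, lang_a, segments_b, lang_b):
    """Treat two words as calque candidates if their glosses are identical."""
    return gloss(segments_a, lang_a) == gloss(segments_b, lang_b)

print(gloss(["со", "глас", "ный"], "ru"))
print(looks_like_calque(["руко", "пись"], "ru", ["manu", "script"], "la"))
```

An exact string comparison is deliberately strict: it pairs рукопись with manuscript, but not согласный with consonant, whose glosses (with-voice-ADJ and with-sound-ing) are only similar. Spotting that kind of near-match is the part a learner still does by eye.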


Bathroom, Bedroom, and Living Room: Noun+Noun and Gerund+Noun Compounds

The room where you take a bath or shower is called the bathroom, the room where you go to bed is called the bedroom, and the room you live in most of the time is called the living room. But “bathroom” and “bedroom” both have two nouns smooshed together, while “living room” is not only written as two words, it also uses an -ing form, which is not used in either “bathroom” or “bedroom”. Why is this?

As it turns out, all three words are compound nouns, but the rules they follow for forming compounds are slightly different. One of the ways of forming a compound noun is to combine two nouns, and this is what happens in the case of “bathroom” and “bedroom”. But compound nouns can also be formed by using a gerund (which is one of the uses of an -ing form) followed by a noun, and this is the case in “living room”.

As for the space in “living room”, there is apparently no infallible rule for determining which compound words have spaces and which don’t. At the same time, I can’t think of any gerund+noun combinations off the top of my head in which the two components run into each other without a space or hyphen.

Syllepsis: “Get Lotion and Over It”

A few weeks ago, in a Reddit thread about the xkcd comic “Common Cold”, one redditor said (in a downvoted comment) that they did not want to wash their hands frequently, even if this meant occasionally catching a cold.

In response, another redditor replied: “Get lotion and also over it.”

This sentence twists English grammar for humorous effect, but why does it sound strange? The answer is that it ties two different functions of the verb “get” together in an unusual way.

Individually, the two functions of the verb are perfectly normal. In the verb phrase “get lotion”, “get” functions as a one-word transitive verb that has “lotion” as its object. “Get over” is also a common use of the verb, and a phrase like “Get over it” would be totally acceptable.

But the comment doesn’t just juxtapose the two meanings side-by-side, as in “Get lotion and get over it.” Instead, the comment deliberately mangles the syntax of “get over” by separating “get” from the preposition “over”. The reason this sounds humorous is that “get over” is a prepositional verb, which is a kind of multi-word verb consisting of a verb and a preposition that can’t be separated from each other. That’s why you can say “I got over it”, but not *“I got it over.”

Using a single instance of a verb with multiple objects requiring different meanings of the verb is known as syllepsis, and the Wikipedia article on syllepsis has a number of other examples of it.

The Social Skills of a Wet Mop

Several years ago, I came across a post on a forum where someone said they had (if I recall correctly) “the social skills of a wet mop”.

Though I had never encountered this phrase before reading that particular forum post, I immediately understood that it meant the speaker was an awkward person who was not very good at social interactions. But would a non-native speaker be able to understand the phrase without consulting a native speaker?

I recently decided to look up the phrase. The search engine results for “the social skills of a wet mop” and variants are not very numerous. “The social skills of a wet mop” only has two pages of Google results, and the similar phrase “the personality of a wet mop” has (as of this post) 5,950. Neither appears to exist in any dictionary, so a non-native speaker would have to figure it out from context.

On the other hand, the shorter noun phrase “wet mop” does exist in Urban Dictionary, which describes it as referring to “A mopey or pessimistic person who lacks personality”. Maybe the longer phrases are variants of a much more common expression, but if so, I don’t know what the original expression might be.

In any case, could “the X of a wet mop” be a new expression in the making, and how long has it been used? Searching “the personality of a wet mop” on Google Books gives very few results (mostly from the 21st century), but one of them is all the way from 1950. If “wet mop” has been used in this sense for decades, then why has it not made its way into any published dictionaries, and why is it apparently so uncommon?

French Spelling Reform in the 16th Century: The Phonetic Orthographies of Louis Meigret and Jacques Peletier du Mans

French spelling has a reputation for being relatively difficult because it is etymological and mainly based on the pronunciation of Old French. (In this way, it resembles English orthography, which reflects Middle English, rather than modern English, pronunciation.)

Because of this, it is perhaps unsurprising that, much as there have been a number of attempts to simplify English spelling, there have been some who have attempted to write French in a way that is closer to the pronunciation of spoken French.

One of these reformers was the grammarian Louis Meigret (c. 1510-1558), who developed an orthography intended to be more phonetic. He used it in, for example, his Defȩnſes de Louís Meigrȩt tovchant ſon Orthographie Françoȩze, contre lȩs calōnies de Glaumalis du Vezelet, ȩ de ſȩs adherans (Louis Meigret’s Defense of his French Orthography, against the Slander of Glaumalis du Vezelet and His Adherents), which starts like this (modernized spelling and English translation mine):

Come j’açheuoȩ de reuoȩr vn trȩtté qe j’ey dreſſé çet yuȩr touçhant la grammȩre Françoȩze, j’ey u çȩ’ derniers jours nouuȩlles d’vn trȩtté intitulé « de l’Antique eſcripture de la lāgue françoyſe & de ſa poeſie, cōtre l’Orthographe des Maigretiſtes ».

Comme j’achevais de revoir un traité que j’ai dressé cet hiver touchant la grammaire française, j’ai eu ces derniers jours nouvelles d’un traité intitulé « de l’Antique écriture de la langue française et de la poésie, contre l’Orthographe des Maigretistes ».

After reviewing a treatise I wrote this winter on French grammar, I have recently learned about a treatise entitled On the Ancient Manner of Writing the French Language and on Poetry: against the Orthography of the Maigretists.

Interestingly, while researching this post, I found an answer to a question on Stack Exchange asking why the standard French word for “orthography” is orthographe and not orthographie that cites this very passage in Meigret’s book and discusses the history of French spelling reform.

The answer also mentions Jacques Peletier du Mans (1517-1582 or 1583), who also developed an orthography for French. The Wikipedia article on him gives an example from his Dialoguɇ Dɇ l’ortografɇ e prononciation françoȩſɇ written in his orthography:

Madamɇ, lɇ grand dɇſir quɇ j’auoę̀ dɇ deſſe̱ruir (a toutɇ ma poßibilite) la gracɇ ſouuɇreinɇ dɇ feuɇ la Reinɇ votrɇ tre dɇbonnerɇ e tre rɇgretteɇ merɇ, m’auoè̱t induìt a lui vouloę̀r dedier un mien Dialoguɇ dɇ l’Ortografɇ e Prononciation Françoȩſɇ. Mȩ́s j’è etè priuè du bien, lɇquel j’etoe̱ tout pré̱t arɇcɇuoę̀r : c’ȩ́t dɇ cɇ bon e auantageus rakkeulh qu’ȩllɇ ſouloę̀t fe̱rɇ a toutɇs pȩrſonnɇs qui auoȩ́t lɇ keur a bonɇs choſɇs, e ſingulierɇmant aus lȩttrɇs.

Interestingly, both orthographies spell the diphthong now spelled <oi> and pronounced [wa] (as in vouloir “to want” [vulwaʁ]) as some variant of oȩ (as in vouloę̀r), which is indicative of the fact that it was pronounced [we].

But the word feuɇ puzzles me. Fève means “bean”, but I’m not sure how that would fit into the phrase la gracɇ ſouuɇreinɇ dɇ feuɇ la Reinɇ (“the sovereign grace of feuɇ the Queen”). Apparently, fève can refer to a bean or trinket “hidden in a king cake”, but I don’t know if this is related to how feuɇ is used in the passage.

I also found the word ſouloę̀t fairly opaque at first, but after some searching, I discovered the archaic verb souloir, which is probably the infinitive of ſouloę̀t and means “to be in the habit of”. It is apparently from the Latin verb soleō, which also resulted in the Spanish verb soler (also meaning “to be in the habit of”), which, unlike souloir, is still in common use.

“Ideal Critic”: Non-Native Syntax and Native Linguistic Competence

There is an Itchy Feet strip from a few years ago titled “Ideal Critic” about the experience of learning a language and trying to get feedback from native speakers:


The question is, why are the sentences of the language learner in the comic incorrect? Well, let’s see.

The dialogue in the first panel begins with the request:

*Please to be telling me when I am make the mistakes when I speak yours language!

First, let’s consider the main clause before the two instances of the subordinating conjunction “when”:

*Please to be telling me.

One of the problems with this phrase is that it is supposed to be a command but contains no imperative form. Only the word “please” could potentially be used as an imperative verb (as in Please the judges to win the award), but (a) this is clearly not the intended meaning, with the word instead being used in its adverbial sense to make a command more polite, and (b) imperative please would have to be followed by an object (Please the judges), whereas the please in “Please to be telling me” is instead followed by an infinitive and a present participle.

We can remove the infinitive particle to to make the clause somewhat more palatable:

*Please be telling me.

Now we have what appears to be an imperative construction, since we have the bare form be. But even though auxiliary be can be used with present participles to form the present progressive (as in I’m asking you), this clause still feels “off”. The reason for this is that, unlike dynamic verbs (which refer to actions), stative verbs (which refer to states of being) like the auxiliary “be” in the clause are not usually used in imperatives. As Geoffrey Pullum explains, “Progressive be (like be generally) and perfect have are stative, and hence relatively infrequent in imperatives.”1

In other words, we need to change the clause from the present progressive into the imperative by getting rid of the auxiliary and just using the bare form of the only other verb in the clause, “tell”. This way, we finally get the correct form of the clause:

Please tell me.

So far so good. Now, let’s try to add one of the subordinate clauses from the original sentence:

*Please tell me when I am make the mistakes.

In this subordinate clause, the words *I am make consist of a pronoun functioning as a subject, the first-person present simple form of be, and the bare verb make. Be can be used to form the progressive aspect or the passive voice, in which case it should be followed by some kind of participle (e.g. made or making), or it can be a copula, in which case it should be followed by a noun or adjective (e.g. I am hungry or I am a native speaker). Make is none of the above and so contributes to the ungrammaticality of the clause. We can fix this either by removing the auxiliary am or changing the form of make. For the sake of expediency, let’s do the former. In that case, we get the sentence:

(*?)Please tell me when I make the mistakes.

This sentence is more acceptable but does not really fit the context, since the definite article the suggests that the character is talking about already known information–in this case, types of mistakes previously talked about–when this information has not actually been mentioned in the comic before. We can remove it to get the grammatical sentence:

Please tell me when I make mistakes.

Now, let’s add the final subordinate clause:

*Please tell me when I make mistakes when I speak yours language.

Here, the clause is mostly accurate, with only one issue: the word yours. The reason why yours is unacceptable here has to do with the way a noun phrase is structured and the functions of the different English possessives, which can be determiners or pronouns. In a noun phrase, the noun can be preceded by a pre-determiner, a determiner, a post-determiner, or one or more adjectives, in that order, as in the phrase All the many old books. Yours does not belong to any of these pre-nominal categories, since it’s a possessive pronoun; possessive pronouns are words like mine, ours, and yours that function as nouns or pronouns (as in This book is yours and not *This book is your) rather than determiners, and so can’t be inserted into the determiner slot before the noun language. In order to fix the clause, we need to change yours to the equivalent possessive determiner, your, to get the sentence:

Please tell me when I make mistakes when I speak your language.

Now, let’s see how Mr. Language Learner does when he tries to self-correct. His first correction of the first clause is:

*Be telling me please

This is somewhat of an improvement, since it at least uses a bare form of be rather than the infinitive, but the position of please is rather odd. When used in imperatives, please is placed at the beginning of the clause. The next self-correction is:

*Tell the please to be

Unlike the other examples of ungrammatical use, this one is clearly an exaggeration for comic effect that it is unlikely any speaker of English (whether native or non-native) would produce in real life. The overall syntactic structure of the clause in the comic is broadly correct; when used to refer to a command, the verb tell is followed by an indirect object and an infinitive (as in Tell him to go). The problem is that please is not a noun and can’t be an object or preceded by the. It would be pretty hard to tell *a please anything.

To recap:

  • Imperatives should use bare forms of verbs (e.g. Be kind), not infinitives (e.g. *To be kind), even if they are preceded by please
  • Imperatives should generally not use auxiliary verbs like progressive be or perfect have
  • The adverb please should be used at the beginning of a clause in imperatives
  • Forms of the verb be should be followed by a past or present participle (like made or making), an adjective, or a noun phrase, not a bare infinitive
  • One of the uses of the definite article the is to refer to already known information
  • Nouns can be preceded by a pre-determiner (e.g. all), determiner (e.g. the), or post-determiner (e.g. many)
  • Possessives come in two forms, possessive determiners (like your) and possessive pronouns (like yours). Determiners can precede a noun in a noun phrase (your book), while possessive pronouns can’t (*yours book)
  • The verb tell, when used to refer to a verbal command, should be followed by an indirect object and an infinitive (I told him to run)
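The slot order for words before a noun (pre-determiner, determiner, post-determiner, adjectives) can be sketched as a small check in Python. This is a rough illustration under stated assumptions: the lexicon is a tiny hand-made sample, the function names are hypothetical, and the final word is simply assumed to be the noun without being checked.

```python
# Tiny hand-made lexicon; the slot labels follow the recap, the word lists are assumptions.
SLOT = {
    "all": 0, "both": 0,            # pre-determiners
    "the": 1, "your": 1, "my": 1,   # determiners, including possessive determiners
    "many": 2, "few": 2,            # post-determiners
    "old": 3, "new": 3,             # adjectives
}

def valid_prenominal(words):
    """Accept words only if each fills a known slot, the slots appear in order,
    and each of the three determiner slots is used at most once."""
    slots = []
    for word in words:
        if word not in SLOT:
            return False  # e.g. the possessive pronoun "yours" has no slot
        slots.append(SLOT[word])
    if slots != sorted(slots):
        return False
    return all(slots.count(s) <= 1 for s in (0, 1, 2))

def valid_noun_phrase(phrase):
    """Treat the last word as the noun (unchecked) and test what precedes it."""
    *modifiers, _noun = phrase.lower().split()
    return valid_prenominal(modifiers)

for phrase in ["all the many old books", "your language", "yours language"]:
    print(phrase, valid_noun_phrase(phrase))
```

The check rejects *yours language for the reason given in the recap: yours is a possessive pronoun, so it fills none of the slots before the noun, while your fills the determiner slot.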

Of course, it’s unlikely either of the two native speakers in the comic would be able to provide this kind of grammatical feedback. Itchy Feet previously touched upon the subject of native speakers having intuitive linguistic competence but lacking conscious knowledge of the rules that underlie it:


1. Huddleston, Rodney, and Geoffrey K. Pullum, The Cambridge Grammar of the English Language, p. 932.

Un grand linguiste danois – Vilhelm Thomsen (Vilhelm Thomsen – A Great Danish Linguist), by Antoine Meillet

The following is my translation of Antoine Meillet’s 1922 article Un grand linguiste danois – Vilhelm Thomsen (“Vilhelm Thomsen – A Great Danish Linguist”).

The degree to which a civilized nation is able to contribute to the development of science has nothing to do with the size of its population. There are small nations that are particularly skilled at certain sciences and, at certain points in history, can play a decisive role.

Denmark has a prominent place in linguistics.

Before the beginning of the 19th century, barely anyone had noticed the striking similarities between the main languages of Europe; those who had done so saw these similarities as an idle curiosity, and nobody had attempted to lay the groundwork for a science based on these similarities: the comparative grammar of Indo-European languages.

If we investigate the beginning of this new science, we encounter the name of a Dane. Before Bopp, the Dane Rask had clearly recognized that all the major languages of Europe derived from a single unattested common language that was never written down, but whose existence had to be assumed to explain the observed similarities, a discovery that Rask published. It was left to the German Bopp to truly establish the new science, with his work leading to its first major development; it was left to another German, Pott, to complete the discovery by identifying most of the aspects of Indo-European etymology. Rask could not take full advantage of his ideas because of the turbulence and brevity of his life, the lack of a protector like the one Bopp found in Wilhelm von Humboldt, and an inability to use Sanskrit, which was only beginning to be studied in Europe. But the clarity, sobriety, and precision of Rask’s views lead one to believe that if the comparative grammar of Indo-European languages had followed his direction and not that of Bopp, quite a few pointless structures would not have been constructed.

Into his precise and largely definitive work, Bopp unfortunately mixed an enormous number of hypotheses as to the “origins” of the grammatical forms of the ancient Indo-European languages. These speculations, unsupported by any evidence, about “primitive forms” undoubtedly contributed more to the initial success of comparative grammar than the solid part of his work. But, from 1870 to 1880, it was necessary to clean up the science.

Vilhelm Thomsen, who was born in Copenhagen in January 1842 and whose eightieth birthday is currently being celebrated, is one of the first “comparativists” to abandon imaginary theories on the “primitive” constitution of Proto-Indo-European and instead focus on a history of languages based on positive facts, rather than on haphazard hypotheses about “origins”.

While comparative grammar had up until that point been almost the exclusive domain of Germans, towards 1870 the new science began to be studied outside of Germany. The Frenchman Bréal and the Italian Ascoli pointed research in the direction of relatively modern facts and were more interested in history than in imaginary prehistory. A little later, young scholars like the Genevan Ferdinand de Saussure and the Russian Fortunatov provided the theory of Indo-European languages with rigorous formulas. Scholars from Germany still contributed to a significant extent to the development of comparative grammar, but they no longer had a monopoly.

In 1877, a Dane, K. Verner, published a discovery that altered the course of linguistic research. In a penetrating insight, he explained the surprising irregularity of pronunciation present in the history of old Germanic by comparing it with the accentuation in ancient Indian texts. In this way, he helped to definitively establish that pronunciation evolves according to fixed and constant laws and to banish whims and caprice from the history of languages in general and from etymology in particular.

The work of Vilhelm Thomsen is of an even more historical and fact-based nature. He has not only provided linguistics with new facts, but also steered it in a new direction whose excellence is continuously being shown by recent work.

Besides the Indo-European languages, there is another group of languages spoken in Eastern Europe and Siberia. The two most successful languages of this group are Hungarian and the Finnish of Finland, but there are many others–notably, Sámi in the far north of Europe and various varieties spoken in Russia in the Urals and the Volga basin. This group is known as Finno-Ugric.

Instead of focusing solely on Indo-European or solely on Finno-Ugric, V. Thomsen has studied the historical interactions between the two groups. Social groups that are less civilized1 “borrow”, as linguists say, many words from groups with a more advanced civilization; the Romans borrowed significantly from the Greeks, the Germans from the Romans, and so on. The development of the Finno-Ugric-speaking populations was hindered for a long time by an unfavorable climate, and so they borrowed many words from their Indo-European neighbors. Hungarian, for instance, is full of Slavic words. Long ago, when Finno-Ugric was still linguistically unified, the entire family borrowed Indian or Iranian words: the word for “hundred” in Finno-Ugric is Indo-Iranian. V. Thomsen has worked on loanwords in Finnish; in 1869, he described Germanic loanwords, and in 1890 he described loanwords from what is now called Baltic, a group including Lithuanian and Latvian, which are still spoken today.

The importance of this work is significant. The author’s mastery, the reliability of the evidence on which he based his conclusions, and the soundness of his conclusions were such that his results immediately became absorbed into the body of scientific knowledge. Published in 1869, his work on Germanic loanwords in Finnish was translated into German in 1870 by Sievers, a then young linguist, who was destined to become one of the most important Germanists of his generation. Indeed, the dissertation related as much to Germanic as it did to Finnish and contained facts shedding light on both. Borrowed during a very early period, before the existence of any written texts in Germanic, the words borrowed by Finnish from Germanic are as archaic or even more archaic than those found in the oldest Gothic or Scandinavian monuments.

But the use specialists found in V. Thomsen’s important initial work, as valuable as it is, is nothing compared to the more general significance of this type of research.

Before Thomsen’s work, linguists observed the (so to speak) linear development of a language considered as an isolated entity, without attributing much importance to outside influences. In fact, they were more interested in guessing at the initial “primitive” forms of these languages than in closely studying their development over the centuries. Thomsen’s 1869 work was a new and singular development in linguistics in that he focused only on words borrowed from one group of languages to another, showed how this phenomenon can be used to observe the development of the two groups, thus shedding light not only on the history of linguistic facts but also the history of nations themselves, and showed the ways in which one civilization influenced another.

Though it did not cause as much commotion, V. Thomsen’s second work on contact between Baltic and Finnic, published about twenty years later in 1890, was no less valuable. The author is just as masterful, the evidence just as reliable (and less well known), and the conclusions just as certain. The work’s relevance for the study of Finnish is in fact greater, because its Baltic loanwords are older than those from Germanic and occur in a larger number of Finnish varieties. Not only that, but it is at least as relevant to Baltic as Thomsen’s first work was for Germanic. But few people are interested in “Baltic” languages. It took the Great War for the two small nations that speak them, the Lithuanians and Latvians, to gain independence: Lithuanian and Latvian are spoken almost entirely by peasants and so far have mainly been studied by a few curious linguists. Most of the languages’ historical range has gone to other languages; the third known Baltic language, Old Prussian, which was still commonly spoken in the province of Eastern Prussia in the 16th century, has been replaced by German–so much so that it is now the most German region in Europe, even though its linguistic composition was completely different three centuries ago–and all we have left of Old Prussian is a paltry number of old documents. To the east and the south, the Baltic domain was reduced by Slavic; the city of Vilnius, whose name is obviously Lithuanian, is no longer in Lithuanian linguistic territory, and the limit of the region where Lithuanian is spoken lies a little west of Vilnius. But Baltic-speaking populations were formerly very powerful. We know that the Lithuanian princes extended their empire just beyond Kiev, and it is from them that the Jagiellonian dynasty (and later, after the definitive union of Poland and Lithuania, Poland) inherited their domination over the Western Russian populations.
By showing how ancient Finnish populations were strongly influenced by Baltic speakers related to today’s Lithuanians and Latvians, but not influenced at all by the ancient Slavs, V. Thomsen has highlighted a fact barely taught by historians: the immense role played by Baltic speakers during the centuries immediately preceding and following the beginning of the Christian era.

Despite their linguistic and historical interest, these particular consequences resulting from V. Thomsen’s work are not of universal value. There is a much more important theoretical consequence that has been emphasized by modern developments in linguistics that were absent in 1869 or even 1890. The more closely the development of languages has been studied, the more important the role of civilizational influences (manifested in borrowings from one language to another) has proven to be. French is undoubtedly the form taken by Latin in particular historical conditions, and the main source of vocabulary in French consists of Latin words transmitted from generation to generation while changing in form and meaning. At the same time, it is almost impossible to write a phrase in French without using loanwords from written Latin; while the verb entendre “to hear” is an old word, related words like the abstract noun audition “hearing” and the agentic noun auditeur “hearer” are borrowed from written Latin. It was initially thought that vernacular varieties were pure and that studying them closely would show the results of the evolution of Latin on French soil; when they were finally meticulously examined, and when Gilliéron’s Atlas linguistique de la France uncovered the history of many local words through the process of comparison, it turned out that the vocabulary of dialects is to a large extent composed of borrowings and that the influence of literary French on dialects is much stronger than the other way around.

A more personal discovery later placed V. Thomsen among the ranks of the great decipherers and demonstrated his brilliance, rigorous methodology, and insight.

Inscriptions in an unknown alphabet were discovered in Siberia on the banks of the Orkhon and Yenisei Rivers and were subsequently published, but no one knew how to read or interpret them. In 1893, V. Thomsen published a method to decipher the alphabet and, shortly thereafter, provided a complete transcription and translation of all known texts. He single-handedly succeeded in determining the value of every alphabetical sign. Though not a Turcologist, Thomsen recognized Turkish in the texts; as all Turkish varieties are similar and have not changed much during the eight or ten centuries they have been written down, his translation was practically definitive on the first try.

The method used by V. Thomsen to decipher this unknown alphabet is the same as that used by the German Grotefend, who began to decipher the cuneiform inscriptions of Darius and Xerxes in 1802, thus laying the foundation for the decipherment of all cuneiform texts, and by Champollion, who discovered how to read the Egyptian hieroglyphs. This method consists of determining which words are likely to be already known proper names; Grotefend deciphered “Darius” and “Xerxes”, Champollion deciphered “Ptolemy” and “Cleopatra”, and V. Thomsen deciphered the name of a Turkish king known from Chinese texts. Once the values of a few characters are determined, then, if the text is in a known language, it becomes possible to identify words that can then provide the values of other characters; each new letter facilitates the discovery of other intelligible words and thus of the values of more undeciphered characters. In this way, the value of every character in the alphabet can be determined.
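
The bootstrapping procedure described above can be sketched in a few lines of Python. This is only a toy illustration under invented assumptions: the "script" is a set of numbered signs, the "known language" is a handful of English words, and the single proper name is "darius". None of this data comes from the Orkhon inscriptions or from Thomsen's actual working notes, and the names `fits` and `decipher` are hypothetical.

```python
def fits(signs, word, mapping):
    """Check whether a known-language word could be the reading of a sign sequence."""
    if len(signs) != len(word):
        return False
    trial = dict(mapping)
    used = set(mapping.values())
    for sign, letter in zip(signs, word):
        if sign in trial:
            if trial[sign] != letter:   # contradicts a value already established
                return False
        else:
            if letter in used:          # assume each letter has exactly one sign
                return False
            trial[sign] = letter
            used.add(letter)
    return True


def decipher(texts, cribs, lexicon):
    """Start from proper names (cribs), then extend the sign values word by word."""
    mapping = {}
    # Step 1: proper names give the first sign values.
    for signs, name in cribs.items():
        for sign, letter in zip(signs, name):
            mapping[sign] = letter
    # Step 2: any word with exactly one possible reading yields new sign values,
    # which in turn make further words readable.
    changed = True
    while changed:
        changed = False
        for signs in texts:
            if all(sign in mapping for sign in signs):
                continue
            candidates = [w for w in lexicon if fits(signs, w, mapping)]
            if len(candidates) == 1:
                for sign, letter in zip(signs, candidates[0]):
                    if sign not in mapping:
                        mapping[sign] = letter
                        changed = True
    return mapping


# Invented toy data: signs are numbers, the known language is English.
cribs = {(1, 2, 3, 4, 5, 6): "darius"}
lexicon = {"said", "sand", "king", "wing", "dark"}
texts = [(8, 4, 7, 9), (6, 2, 7, 1), (1, 2, 3, 8)]

mapping = decipher(texts, cribs, lexicon)
print("".join(mapping[s] for s in texts[0]))  # prints: king
```

In this toy run the first word is ambiguous at the start (it could be "king" or "wing"), and it only becomes readable after the other two words have supplied the values of two more signs, which mirrors the way each new letter opens up further words.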

In order for such a discovery to be possible, the unknown writing must at least partially represent a known language. To a large extent, the values of all the characters in the cuneiform inscriptions of the Persian kings were determined because the Old Persian in which they were written could be interpreted by means of modern Persian, as well as closely related ancient languages, such as the language of the Avesta and the less closely related Sanskrit. Nevertheless, because Old Persian is significantly different from all these languages, it took the efforts of many scholars — including, among others, the Danish linguist Rask — and forty-five years of work for the inscriptions of the Persian Achaemenid kings to be read in their entirety. If the language V. Thomsen encountered had not been so close to already well known Turkish varieties, the alphabet would not have been deciphered so quickly or so perfectly on the first try. But it was a triumph both for the method and V. Thomsen’s mastery that he was able to add an entirely new domain to Turcology without being a Turcologist himself, as well as to provide, by interpreting ancient texts, fundamentally important information for the oldest known historical period of Turkish.

There are numerous examples of the difficulty, and even the near impossibility, of reading a text in a familiar writing system but in an unknown language. In Cyprus, texts in a previously unknown writing system have been found; although the principle of the writing system is very different from that of the Classical Greek alphabet, those texts written in Greek have been deciphered, and they have provided Hellenists with curious information. But this writing system was also used to write another language. Though these texts written in an unknown language can be read because of the phonetic values provided by the Greek texts, it has so far been impossible to interpret them meaningfully. The Lycian inscriptions found in Asia Minor and the large number of Etruscan inscriptions in Italy can be read in their entirety, but as their languages do not resemble any known language, it has not been possible to interpret these inscriptions except to a very limited extent. V. Thomsen himself has worked on these texts, but though the resulting fruits of his ingenuity, wisdom, and accurate judgment are very interesting, they are quite paltry in comparison to his achievements in the realm of Turkish.

These important works are not the only ones published by the illustrious master of Copenhagen; they are only the ones that made him famous. His other studies on various subjects are no less a testament to the power of his discerning and bold spirit. All of his work is significant. For example, in one article he proposed the only plausible explanation for the origin of the French verb aller “to go”–namely, that a word can have an unusual pronunciation in certain unique circumstances and thus evolve in an unusual way. This has proved to be a fruitful principle. Thus, modern linguistics bears the mark of V. Thomsen’s ideas.

And this great scientist has not isolated himself in Copenhagen. Besides the Scandinavist Wimmer and the renowned Romanist Nyrop, who is one of the leading experts on the history of French, V. Thomsen also has among his students two of the most original contemporary linguists: Jespersen and H. Pedersen. Jespersen has mainly focused on the history of English. He has shown how the development of Indo-European languages led to English, which is the most divergent Indo-European language, and how, contrary to the bizarre ideas of the Romantics, language becomes clearer, easier to use, and more useful as it sheds the complexity of ancient grammatical forms like those found in the Vedas and in Homer.

Pedersen has an exceptional breadth of knowledge. From Slavic to Celtic and Albanian, he has tackled some of the most difficult issues, with his ideas leaving a mark everywhere. He has been the first to write a comparative grammar of those Celtic varieties whose obscurity has discouraged so many linguists. Today, V. Thomsen is the most well known name in linguistics. But it must also be taken into account that he is a successor of the great Rask, a follower of Verner, and the teacher of Jespersen and Pedersen. All these scholars combine complete independence of spirit with methodological rigor, and a sense of reality with a powerful imagination. Without their work, it would be difficult to imagine what modern linguistics would be like today.

A. Meillet.

1. [In modern terms, one might say that languages that are less prestigious tend to borrow words from languages that are more prestigious.–Fancua]

“Don’t You Think?”: The Irregularities of Subject-Auxiliary Inversion

Suppose you’re on vacation and hiking in the forest with a friend who asks you:

“Man, this place sure is beautiful, don’t you think?”

Being an outdoorsy type, you answer:

“Oh, yeah, absolutely.”

Nothing out of the ordinary here.

But suppose that your friend instead said, “This place is exceedingly beautiful, do not you think?”

You glance at him and wonder why you’re still friends with an artiste who constantly insists on talking like a knockoff Bulwer-Lytton.

But after this useful reflection on where you went wrong in life, you start to think about the sentence itself. Logically, one would think that *Do not you think? would simply be the uncontracted form of Don’t you think? with the only difference being one of formality. Yet *Do not you think? is not just strange; it’s downright ungrammatical. The question is: why is this the case?

As it turns out, there is a paper by Zwicky and Pullum titled “Cliticization vs. Inflection: English N’T” that provides a useful explanation.

Essentially, the rule at work is subject-auxiliary inversion. To form a question from the sentence You do not think, we have to invert the subject You and the auxiliary do to get the grammatical, though odd, Do you not think? rather than the ungrammatical *Do not you think? In You don’t think, n’t is attached to the auxiliary, such that the whole unit must be moved. This is why we get Don’t you think? and not *Do youn’t think?

But subject-auxiliary inversion doesn’t work this way in the case of other contractions, such as the contracted forms of the auxiliaries have (’ve) and is (’s), as shown by the ungrammatical *Should’ve he done it?, which is what we would get by moving should’ve as a unit in front of the subject in He should’ve done it.

In the paper, Zwicky and Pullum argue that “Syntactic rules can affect affixed words, but cannot affect clitic groups.” As a result, they argue that n’t, unlike ’ve and ’s, is an inflectional affix rather than a clitic. For example, the following sentences show that subject-auxiliary inversion cannot apply to ’ve or to the word it is attached to, which is why it (like ’s) is classed as a clitic (though, of course, in spoken English, elision may lead to sentences sounding roughly like the first “ungrammatical” one below):

You have done it.
Have you done it?
You’ve done it.
*’Ve you done it?

He should have read the book.
Should he have read the book?
He should’ve read the book.
*Should’ve he read the book?

By contrast, in the case of n’t, the contraction can move with the auxiliary when the subject and auxiliary are inverted, with n’t therefore being an affix:

You do not agree.
Do you not agree?
You don’t agree.
Don’t you agree?
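
To make the contrast concrete, here is a minimal Python sketch of the affix-versus-clitic analysis. It is my own toy model, not anything from Zwicky and Pullum's paper: it assumes a one-word subject followed directly by the auxiliary, treats forms in n't as single words, treats 've and 's as separate words leaning on a host, and (as a further simplifying assumption) spells stranded clitics out in full. The names `CLITICS`, `tokenize`, and `invert` are hypothetical.

```python
# Clitics are separate syntactic words; n't forms like "don't" are single inflected words.
CLITICS = {"'ve": "have", "'s": "is"}


def tokenize(sentence):
    """Split a sentence into syntactic words, detaching clitics from their hosts."""
    words = []
    for chunk in sentence.rstrip(".?").split():
        chunk = chunk.replace("’", "'")
        for clitic in CLITICS:
            if chunk.lower().endswith(clitic) and len(chunk) > len(clitic):
                words.extend([chunk[:-len(clitic)], clitic])
                break
        else:
            words.append(chunk)   # includes "don't": the affix stays inside the word
    return words


def invert(sentence):
    """Subject-auxiliary inversion: move exactly one word, the auxiliary, to the front."""
    subject, aux, *rest = tokenize(sentence)
    # Assumption: a clitic that loses its host is spelled out as the full form.
    aux = CLITICS.get(aux, aux)
    rest = [CLITICS.get(word, word) for word in rest]
    if subject != "I":
        subject = subject.lower()
    return " ".join([aux.capitalize(), subject, *rest]) + "?"


print(invert("You don't agree."))             # prints: Don't you agree?
print(invert("You do not agree."))            # prints: Do you not agree?
print(invert("He should've read the book."))  # prints: Should he have read the book?
```

Because the rule moves a single word, the sketch can never produce *Should've he read the book? (should and 've are two words) or *Do not you agree? (do and not are two words), while Don't you agree? falls out automatically because don't counts as one word.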

Despite all this, sentences like Do not you think? are attested in English texts from the 19th century and earlier.

Searching the phrase on mainly results in works of fiction.

William Shakespeare, A Midsummer Night’s Dream (1595-1596):

Are you sure
That we are yet awake? It seems to me
That yet we sleep, we dream.—Do not you think
The duke was here, and bid us follow him?

Jane Austen, Mansfield Park (1814):

“Do not you think Edmund would have been in town again long ago, but for this illness?”

Beauty and the Beast: A Tale for the Entertainment of Juvenile Readers (1818-1825?):

“No, (replied the Beast,) you alone are mistress here; you need only bid me be gone, if my presence is troublesome, and I will immediately withdraw: but tell me, do not you think me very ugly?”

By contrast, searching for the phrase on Google Books mainly turns up Parliamentary documents, as in this publication from 1868:

1662. Do not you think, for example, that experience has made you a more effective officer?

3924. Do not you think that besides the operation of that Act, persons have less inclination to be drunken now than they used to have?

How common were such sentences in the past? I don’t know. Perhaps they were simply individual instances of hypercorrection. In any case, it’s an interesting topic that should be investigated further. Do not you think so?

Regional Variation and Diachronic Change in Guarani as Reflected in 17th-Century Spanish-Guarani Dictionaries

The open access journal CORPUS (described as Archivos virtuales de la alteridad americana) has a paper by Graciela Chamorro from 2014 called PHRASES SELECTAS: Un diccionario manuscrito castellano-guaraní anónimo (“Select Phrases: An Anonymous Spanish-Guarani Manuscript Dictionary”)1, which is about a revised version of Antonio Ruiz de Montoya’s 1639 dictionary, Tesoro de la lengua guaraní. According to the bilingual abstract:

Phrases Selectas es un manuscrito de autoría jesuítica, adjudicada a Pablo Restivo, fechado en 1687. El mismo es una versión reestructurada y actualizada del Tesoro de la Lengua Guaraní de Antonio Ruiz de Montoya (1639). Este artículo pretende ser una introducción a este documento inédito, al presentar el debate sobre su autoría, la forma en que se estructura y sus posibilidades de uso en la investigación de la lengua, la historia y la antropología guaraní.
Phrases Selectas is a manuscript of Jesuit authorship, attributed to Pablo Restivo and dated in 1687. It is a restructured and updated version of the Treasure of Guarani Language by Antonio Ruiz de Montoya. This article is intended to be an introduction to this unpublished document by introducing the debate on its authorship, the way in which the piece is organized and its possibilities of use for the linguistic research and in the historical and anthropological studies of the Guarani.

Some of the main points in the paper are that: more work needs to be done to determine whether Pablo Restivo (who also wrote under the pseudonym Blas Pretovio) really wrote Phrases Selectas (“Select Phrases”); the purpose of the creation of the dictionary was to update Montoya’s work by removing words no longer in use (as well as supposedly improper words related to bodily functions) and adding words used only in specific localities; historical context needs to be taken into account to figure out which words fell out of use because of cultural changes.

Because the paper is so interesting, I thought I would translate a few excerpts.

In this paragraph, Chamorro discusses the purpose of this revised version of the Tesoro:

We can see that the author of the manuscript seems to have had two criteria: (1) to select the most frequently used terms while discarding those no longer understood by contemporary speakers, and (2) to record words used in specific reductions or villages so that missionaries could communicate more effectively with the inhabitants of those places. The first criterion is temporal, as the goal is to avoid archaisms and record contemporary words and manners of speech. The second criterion is spatial. In both cases, the author relies on the logic of usage.

De lo expuesto hasta aquí, el autor del manuscrito parece haber tenido dos criterios: 1) escoger los términos más usados, dejando fuera los vocablos que ya no se entendían, y 2) registrar los vocablos particulares en uso en una reducción o pueblo, para que cuando el misionero estuviera allá pudiera entender a sus interlocutores y ser entendido por ellos. En el primer caso, el criterio es temporal, pues se pretende evitar los arcaísmos y registrar los vocablos y las formas de hablar actuales. El segundo criterio es espacial. En ambos casos, el autor se apoya en la lógica del uso.

She also discusses the issue of the authorship of Phrases Selectas:

In the case of Phrases Selectas, it is possible that the author did not sign his name because he did not consider himself to be the author. He saw the work as being predominantly by Ruiz de Montoya and the indigenous people who provided contemporary examples for the dictionary. This is what the author’s words in “To the fervent and apostolic missionary fathers” appear to indicate: “I humbly and submissively offer to you the opulent Treasure of the venerable father Antonio Ruiz, reduced to a simpler method.”


In this introductory article, we have discussed the question of the manuscript’s authorship. We conclude that one of the things that should be done in order to make progress in this regard is to compare the PhS with Restivo’s Vocabulario and Arte, as well as the manuscripts that were supposedly the drafts of these two published works. These documents have linguistic data that could help to determine whether the work was written by Restivo or another Jesuit. As this is a subject for future research, we consider the manuscript to be of anonymous authorship for now.

En el caso de las Phrases Selectas, el autor pudo no haber firmado porque no se consideraba autor. Reconocía en la obra la prevalencia de la autoría de Ruiz de Montoya y de los indígenas que cedieron ejemplos contemporáneos para el diccionario. Es lo que parecen indicar las palabras del autor A los fervorosos y apostólicos padres misioneros…: “les ofrezco humilde, y rendido el Tesoro opulentísimo del V. Pe. Antonio Ruiz reducido a más fácil método”.


En este artículo introductorio a las PhS problematizamos el tema de la autoría. Concluimos que para avanzar en ese debate, entre otros pasos a dar, hay que comparar las PhS con el Vocabulario y el Arte de Restivo y con los manuscritos que supuestamente fueron el borrador de estas dos obras publicadas. En el interior de estos documentos pueden encontrarse datos lingüísticos que ayudan a fundamentar la adjudicación de la obra a Restivo o a otro jesuita. Como eso es algo por hacer, optamos por mantener el documento como de autoría anónima.

Chamorro also talks about the topic of dialectal difference in Guarani and its relation to the creation of the Tesoro and Phrases Selectas:

The missionary’s experience among the Guarani is a key part of the debate over the authorship of the manuscript, as the author mentions several times his interactions with the reduced natives, as in the passage below:

“I do not say this without having any experience of it; in San Javier they have ways of speaking specific to that area, such that people did not understand me when I used them in Santa María and other villages, and I had to find other manners of speaking more commonly used in those villages. The same thing occurs in confessionals, where one hears ways of speaking specific to the village in question” (Phrases Selectas, first page of Amigo en Cristo y Benévolo Lector).

In other passages, the author notes that the Guarani used in the Tesoro was not familiar to all speakers and that Guarani was not linguistically homogeneous; rather, the various villages of reduced Indians presented linguistic variation. He states that “when I read paragraphs [from Ruiz de Montoya’s Tesoro] to the natives, they did not understand me”, such that “if such diverse manners of speaking can be found in a distance of only four leagues, one can only imagine what the venerable father encountered in all the leagues he traveled and in all the factions, communities, and villages worthy of his care and assistance.” The author gives various examples of this variation in the body of the dictionary. For example, in the entry for “fatty meat”, he writes: “this will be explained with terms used in San Javier” and lists three words with example sentences: mesẽ, so’o mesẽ; kambeũ, so’o kambeũ; joy, ijoy so’o. The author concludes: “I have not seen or found these three ways of speaking in the Tesoro, but they are used in San Javier and Santa María” (Phrases Selectas, 168).

What is clear is that the author was not a native speaker of Spanish or Guarani. In his sixth note to the “benevolent reader”, he writes: “When you encounter barbarisms in my Spanish or Guarani, do not be surprised, as they are not my native languages; rather, I learned them”. This was precisely the case of Restivo, but also of many other missionaries of the period.


The last observation we would like to make about the content of the PhS concerns the dialectal differences noted by the author of the manuscript among the Guarani languages of his period, and between these and the forms recorded some eight decades earlier in the Tesoro. In our opinion, these differences could be the result of the fact that these works relied on ethnic communities who did not speak exactly the same variety of Guarani and in some cases learned Guarani as a second language in the reductions. Montoya relied on the language used in the villages of the Guairá region (Melià 1998), while the forms on which PhS is based were probably from the Tape missions and the reductions on the banks of the Uruguay River (Melià 2003, p. 112, Chamorro 2009, p. 70). Specifically, the reductions of Santa María la Mayor and San Xavier are the ones most frequently mentioned in the manuscript. This fact, combined with the constant use of the expression “not used in Santa María or in San Javier”, indicates, in our opinion, that the author of PhS used the varieties of these villages as a reference. The various Guarani languages spoken today by, among others, the indigenous Avá-Guaraní (Ñandéva), Kaiowá or Paĩ-Tavyterã, and Mbyá (who live along the Paraná, Paraguay, and Uruguay rivers and their tributaries in Argentina, Brazil, and Paraguay) help us to imagine the linguistic diversity encountered by missionaries in the 17th century. Their ancestors had contact (though partial and with varying degrees of interaction) with the Jesuits or were reduced by them and certainly brought their speech to the new social order that was being established.

La experiencia entre los guaraníes es muy importante en el debate sobre la autoría del manuscrito, pues el autor menciona varias veces su interacción in situ con los indígenas reducidos, como en el pasaje abajo:

“No digo esto sin alguna experiencia, porque en San Javier se usan unos modos de hablar tan particulares, que valiéndome yo de ellos en Santa María, y en otros pueblos, no me entendían, y fue necesario mudar de rumbo, y buscar otros usuales en aquel pueblo. Lo mismo suele acontecer en los confesionarios, donde se oyen particulares modos propios de aquel pueblo y no de otros” (Phrases Selectas, primera página de Amigo en Cristo y Benévolo Lector).

En otros pasajes, el autor registra su constatación de que el guaraní del Tesoro no era conocido por todos y que no se hablaba el guaraní de una única manera sino que había una diversidad lingüística en los pueblos de indios reducidos. Afirma: “leyendo algunos párrafos [del Tesoro de Ruiz de Montoya] a los naturales, no me entendían”. De modo que “si en distancia de solo cuatro leguas se hallan modos de hablar tan diversos entre sí, qué será en distancia de tantas leguas, cuantas anduvo el venerable padre y de tantas parcialidades, cuantas comunidades y pueblos tan diversos, que merecieron su cuidado y asistencia”. Y el autor da varios ejemplos en el cuerpo del diccionario. Así, en la entrada “carne gorda”, escribe: “explícase con términos usados en San Javier” y lista tres vocablos y sus ocurrencias en frases que corresponden a la entrada: mesẽ, so’o mesẽ; kambeũ, so’o kambeũ; joy, ijoy so’o. El autor concluye: “Estos tres modos de hablar no los he visto, ni hallado en el Thesoro, pero los usan los de San Javier y Santa María” (Phrases Selectas, 168).

Una cosa sí es clara y es que el autor no era ni hispano-hablante ni guaraní-hablante. En su sexta advertencia a su “benévolo lector”, escribe: “cuando encontrares algunos barbarismos en el idioma castellano, o guaraní, no te admires; porque ninguno de ellos me es connatural, sino aprendido”. Este era precisamente el caso de Restivo, pero también de muchos otros misioneros de la época.


La última observación que nos gustaría presentar en cuanto al contenido de las PhS son las diferencias dialectales notadas por el autor del manuscrito entre las lenguas guaraníes de su época y entre estas y las formas registradas unas ocho décadas antes en el Tesoro. A nuestro modo de ver, ellas pueden derivar del hecho de que esas obras se apoyaron en comunidades étnicas que no hablaban exactamente la misma lengua guaraní y en algunos casos aprendieron el guaraní como segunda lengua en las reducciones26. Montoya se apoyó en la lengua de los pueblos de la región del Guairá (Melià 1998), mientras que las formas de hablar que están en la base de las PhS probablemente fueron las del Tape y las de las reducciones ribereñas del río Uruguay (Melià 2003, p. 112, Chamorro 2009, p. 70). En particular, las reducciones de Santa María la Mayor y San Xavier son las más nombradas en el manuscrito. Ese hecho sumado al uso constante de la expresión “no se usa en Santa María ni en San Javier” indica, en nuestra opinión, que el autor de las PhS tomó como referencia las formas de hablar de esos pueblos. Las varias lenguas guaraníes habladas hasta hoy, entre otros, por los pueblos indígenas Avá-Guaraní (Ñandéva), Kaiowá o Paĩ-Tavyterã y Mbyá —de las regiones bañadas por los ríos Paraná, Paraguay y Uruguay y sus afluentes, en Argentina, Brasil y Paraguay— nos ayudan a imaginar la diversidad lingüística en la zona misionera, en el siglo XVII. Aunque de forma parcial y en distintos niveles de interacción, sus ancestros tuvieron contacto con los jesuitas o fueron reducidos por ellos y ciertamente habrán aportado sus modos de hablar al nuevo orden social que se establecía.

Chamorro notes possible sources for Phrases Selectas beyond the Tesoro:

What did the author of the PhS base his work on? He frequently attributes his skill in Guarani to his master, “the venerable father Antonio Ruiz”. However, during this period, the Jesuit reductions of Paraguay had missionaries and indigenous people who knew Guarani well and wrote in that language; furthermore, works written in Guarani by previous generations were also available. As one can read in Restivo’s Arte, the Jesuits relied on the work of Alonzo de Aragona, the notes of Simón Bandini (“the prince of this language”), and several compositions by “very capable Indians”.


In his Arte, Restivo cites (in addition to Ruiz and Bandini) Mendoza, Pompeyo, Insauralde, Martinez and Nicolás Yapuguay and praises them, saying that “they are all first class” (Restivo [1724] 1996, p. 1). The works of all these authors and their anonymous collaborators were certainly consulted by the author of the PhS. Furthermore, the author collected data for the work directly from the indigenous inhabitants of the reductions, using the speech of Santa María la Mayor and San Javier as a reference. This means that, contrary to what the title indicates, the phrases are not solely taken from Ruiz de Montoya’s Tesoro.

Ahora bien, ¿en qué se ha basado el autor de PhS? Él adjudica constantemente su pericia en la lengua guaraní a su maestro, “el venerable padre Antonio Ruiz”. Sin embargo, en esta época había en las reducciones jesuíticas del Paraguay misionarios e indígenas que conocían bien la lengua guaraní y escribían en esa lengua; además había obras en guaraní heredadas de otras generaciones. Como se puede leer en el Arte de Restivo, los jesuitas contaban con la obra de Alonzo de Aragona, los apuntes de Simón Bandini —“príncipe de esta lengua”— y varias composiciones de “indios muy capaces”.


En su Arte, Restivo cita, además de Ruiz y Bandini, a Mendoza, Pompeyo, Insauralde, Martinez y Nicolás Yapuguay, elogiándolos con la mención “todos son de primera clase” (Restivo [1724] 1996, p. 1). Las obras de todos estos autores y de los colaboradores anónimos ciertamente fueron consultadas por el autor del manuscrito en foco. Además, el autor recogió datos para su obra directamente con los hablantes indígenas de las reducciones, teniendo como referencia el modo de hablar de Santa María la Mayor y San Javier. De modo que, al contrario de lo que dice el título, las frases no son seleccionadas solamente del Tesoro de Ruiz de Montoya.

The paper notes that the linguistic changes reflected in Phrases Selectas may be indicative of broader cultural shifts:

In terms of differences in content, the examples from the Tesoro discarded in the PhS are: roads related to ants, wild animals in general, vipers, tapirs, as well as the movement of the stars. Compared to the PhS, the Tesoro has a significantly larger quantity of words for roads, which can be short, long, full of holes, well made, flat, wide, narrow, hilly, marshy, heavily used, little used, or abandoned. They can lead to a river, to a forest, to home, to a harbor, to another road, to the sky, to hell, to God, or to the Devil. The examples also give some indication of the experiences of those who used roads.


In this context, it is necessary to bear in mind that when the authors speak of a word being in “disuse”, this is because of social and cultural change during the period of the creation of the PhS that was probably on a larger scale than that of the change that occurred during the creation of the Tesoro. Thus, we can suppose that if there are no expressions in the PhS that describe roads in terms of their relation to animals and stars, this is because the environment, references, and the social uses of this cultural element had changed. Furthermore, the fact that fewer types of baskets are recorded in the PhS compared to the Tesoro may indicate a decrease in technological diversity and the uses of that cultural element.

As a result, studying these sources to compare cultural aspects of the missionary-era Guarani people in two different periods could be an interesting contribution to the social and cultural history of Guarani in the reductions, as well as to the history of the speakers of the language. In this regard, it should be noted that writing is a historical form of social relations that eventually leads to the creation of institutions (Orlandi 2001, p. 8); the use of sources like Phrases Selectas in historical anthropological studies is clearly necessary and promising. The question is how to do this when the available sources (like the one studied here) are mostly short words and phrases taken out of context, without further information about the natural, social, psychological, linguistic, historical, and political environment.

En cuanto a las diferencias en el contenido, los ejemplos que PhS no retoma del Tesoro son: los caminos relacionados con las hormigas, con los animales salvajes en general, con las víboras, con los antas, así como con el camino de los astros. El Tesoro tiene muchos más vocablos que las PhS para describir los caminos. Ellos pueden ser cortos, largos, llenos de hoyos, bien hechos, planos, anchos, estrechos, con subidas, pantanosos, muy usados, poco usados, abandonados. Pueden conducir al río, al bosque, a la casa, al puerto, al otro, al cielo, al infierno, a Dios, al diablo. Los ejemplos dicen también de las experiencias de quienes frecuentan los caminos.


En este contexto hay que tener presente que cuando los autores hablan del “desuso” de un vocablo estamos delante de un cambio social y cultural, que en la época de las PhS era probablemente mayor que el cambio ocurrido en los años de elaboración del Tesoro. Así, podemos imaginarnos que si en las PhS no tenemos expresiones que relacionen los caminos con animales y astros es porque el medio ambiente, las referencias y los usos sociales de ese elemento cultural habían cambiado. Y el hecho de registrarse menos variedades de cestos en las PhS que en el Tesoro puede indicar que disminuyó la diversidad tecnológica y los usos de ese elemento cultural.

De modo que estudiar esas fuentes, comparando diversos aspectos culturales de los guaraníes misioneros en dos épocas distintas, puede ser un aporte interesante para la historia social y cultural de la lengua guaraní en las reducciones, así como para la historia de los hablantes de esa lengua. En ese sentido, cabe recordar que la escritura es una forma histórica de relación social, que acaba dando forma a las instituciones (Orlandi 2001, p. 8); el uso de fuentes como las Phrases Selectas en los estudios histórico-antropológicos es obviamente necesario y promisorio. La cuestión es cómo hacerlo si lo que tenemos en fuentes como la estudiada aquí son palabras y frases, en su mayoría cortas y descontextualizas, sueltas, sin más datos sobre el ambiente natural, social, psicológico, lingüístico, histórico y político.

Finally, the paper ends by noting the need to study the non-lexical aspects of diachronic change in Guarani:

The text was written by a non-Indigenous individual with help from “first class” Indigenous people. A more detailed linguistic analysis of this work would have to take into account the author’s insistence that the language had changed and that the Tesoro had become obsolete. What were these changes? Apparently, they were only lexical ones. Nevertheless, the possibility of syntactic, morphological, and semantic change must also be taken into account. What was the intensity or proportion of these changes? Which social changes were implicated in or implied by these changes? These are questions for a future study.

Estamos ante de un texto escrito por un no indígena con apoyo de indígenas “de primera clase”. Un estudio más minucioso de esta fuente en el ámbito de la lingüística tendría que tener en cuenta la insistencia del autor de este documento en el hecho de que la lengua había cambiado y que el Tesoro se había vuelto obsoleto. ¿A cuáles cambios se refería él? Aparentemente solo a cambios lexicales. Sin embargo, tendría que considerarse también la posibilidad de cambios sintácticos, morfológicos y semánticos. ¿Cuál fue la intensidad o proporción de esos cambios? ¿Cuáles cambios sociales estaban implicados o indicados en esos cambios lingüísticos? Son cuestiones para otro trabajo.

1. The paper is freely licensed under the CC BY-NC 2.5 AR license.

Why Does the Word “Mosquito” Come from Spanish?

A few years ago, it occurred to me to wonder where the word “mosquito” came from, and I realized it was almost certainly from Spanish. I mused that, unlike borrowings from Latin and French, “mosquito” was probably a colonial loanword, as Spanish loanwords in English tend to be post-Columbian borrowings from Latin American Spanish into North American English.

Looking up the etymology of the word, I found that “mosquito” was in fact a post-Columbian borrowing from Spanish, with the earliest occurrence of the word in English dating from the 1580s (although the source I consulted does not specify whether the borrowing occurred in the Americas or Europe). But were mosquitoes present in Europe before colonization, and if so, what were they called in English?

Recently, I decided to dig deeper. Mosquitoes apparently did exist in Europe before Columbus, being a thorn in the side of the Byzantines, Ancient Greeks, and Romans. Given that mosquitoes were present in pre-Columbian Britain and that the English word “mosquito” is of post-Columbian origin, the original word for the animal must have been something else. I found a page on the Maryland Department of Agriculture website stating the English word for “mosquito” was originally “gnat”, but it cites no sources. Nevertheless, coming across that page led me to look up “gnat” in the Historical Thesaurus of English, and as a result, I found that the word has indeed been used since Old English times to refer to mosquitoes, though only to certain genera (I assume the genera present in Britain during that period). The first word to be used to refer to the entire Culicidae family was “mosquito” in about 1583.

But the fact that the word “mosquito” is used throughout the English-speaking world rather than only in North America puzzled me. There is nothing unusual about North American anglophones in close contact with Spanish speakers borrowing Spanish equivalents for existing words, but for British English to do so strikes me as odd. Why would British English borrow a term for an animal that already existed in Europe from Spanish (presumably Latin American Spanish via North American English) rather than from French or Latin, which are geographically and culturally closer and are its usual sources of loanwords?

Given that “gnat” originally referred to a specific subset of mosquitoes, I assume that English speakers felt the need to refer to the mosquitoes they encountered in the Americas as “mosquitoes” rather than “gnats” because they were different from European “gnats” in some noticeable way, with this semantic difference thus providing a reason for the adoption of the word in British English. In what specific way these species might have differed, I don’t know, as I am not a biologist and am not familiar with the historical differences between European and American mosquitoes. If anyone has more information, please let me know.

Another possibility is that English speakers in the Americas started using “mosquito” more frequently than “gnat” not because of the biological differences between European and American mosquito species, but rather solely because of their more extensive contact with Spanish, with this new synonym of “gnat” then spreading from North American English to British English. I am less confident in this being the case. It seems more plausible to me for “mosquito” to have initially been used in English to refer specifically to American species, with this usage later extending to refer to European species as well. Perhaps the British adoption of “mosquito” was influenced by the French mousquite (modern moustique), which is also a loanword from Spanish from around the same period.

(Incidentally, this whole topic reminds me of this WordReference thread I found while trying to figure out how to translate the Spanish word zancudo, which can apparently refer to any of several different types of biting insects depending on the region, and which occurs in Voyage à la Sierra-Nevada de Sainte-Marthe by Élisée Reclus. I eventually decided to leave the word as is, since it was a foreign word in the source text to begin with, and I did not know which of the various possible species was meant.)