JAT  
Updated 2000-09-01
JAT Bulletin 184-185, July-August 2000

Notae Philologicae — I
ETYMOLOGY AND JAPANESE (1)

Roger Machin

Note on the transliteration of Japanese: My transliteration follows the conventional kana spellings as closely as possible. Thus, long vowels are written double with the exception of /o:/, which is rendered <ou> or <oo> according to derivation. I write <si>, <ti>, <tu> etc. in preference to <shi>, <chi>, <tsu>. The first element of the double consonants conventionally rendered <kk>, <ss>, <tt> etc. (ie small っ) is written universally as <Q> (small capital), and the syllabic nasal as <N> (small capital). Morae in bold font are pronounced high. Words written without any bold font are unaccented (have no fall in pitch), which is to say the first mora is low, the second (if any) is high, and any following word begins high.
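
The conventions in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the kana table is a small hand-picked subset, the names KANA, SMALL_Y and transliterate are hypothetical, and pitch accent (the bold font) is not modelled at all. Because the scheme follows the kana spelling, long vowels need no special rule.

```python
# Partial kana table (illustrative subset only, not a full syllabary).
KANA = {
    "あ": "a", "い": "i", "う": "u", "お": "o",
    "か": "ka", "き": "ki", "く": "ku", "け": "ke",
    "ぐ": "gu",
    "し": "si", "そ": "so",          # <si>, not <shi>
    "じ": "zi", "ず": "zu",
    "ち": "ti", "つ": "tu", "て": "te",  # <ti>, <tu>, not <chi>, <tsu>
    "な": "na", "の": "no",
    "は": "ha", "ひ": "hi",
    "ぶ": "bu", "べ": "be",
    "ま": "ma", "み": "mi",
    "や": "ya",
    "り": "ri", "ろ": "ro",
    "っ": "Q",  # first element of a double consonant (small っ)
    "ん": "N",  # syllabic nasal
}

# Small ya/yu/yo combine with a preceding kana ending in -i (じ + ょ gives zyo).
SMALL_Y = {"ゃ": "ya", "ゅ": "yu", "ょ": "yo"}


def transliterate(kana: str) -> str:
    """Transliterate a hiragana string kana by kana, following the spelling."""
    out = []
    for ch in kana:
        if ch in SMALL_Y and out and out[-1].endswith("i"):
            # Replace the final -i of the previous syllable with ya/yu/yo.
            out[-1] = out[-1][:-1] + SMALL_Y[ch]
        else:
            # Raises KeyError for kana outside the illustrative subset.
            out.append(KANA[ch])
    return "".join(out)


print(transliterate("みずうみ"))        # mizuumi
print(transliterate("かなまじりぶん"))  # kanamaziribuN
print(transliterate("はちじょうじま"))  # hatizyouzima
```

A form such as きって (my own example, not one taken from the article) would come out as kiQte under this sketch, with <Q> standing in for the small っ.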


A recent correspondent to the JAT list asked about the etymology of the Japanese equivalent of St John's wort, otogirisou. A respondent was able to provide the story behind the characters 弟切草 conventionally used to write the word. Although it may be tempting to accept colourful explanations of this sort, I have my doubts and would tentatively assign them to the realm of fantasy. It is probably an example of what is called 'folk etymology', a common enough phenomenon in any language. There is, for instance, an English word sparrowgrass. No longer in common use, it used to be an alternative name for asparagus, from which it is derived by association of sound. For all I know there may be a story or two connected with the word, but in fact it is a folk etymology which will have come into being as the result of people's attempts to make some sense of a difficult and foreign-sounding word.

I think it was Voltaire who said that in etymology the consonants count for little, and the vowels for nothing at all. I can see what he meant. It may be obvious that the English word six, the Latin sex (as in English sextuplet or sexagenarian) and the German sechs all share the same origin, although before you can spot even that, you have to understand that <chs> in German is just a way of representing the same sound as is spelt <x> in English and Latin. It may be somewhat more difficult to see the connection with zes in Dutch, hex in Greek (cf. English hexagon) or chwech in Welsh. Still on the subject of numerals, I daresay it requires more than a little imagination to accept that English five, Dutch vijf, Latin quinque, Welsh pump and Greek pente, however different they may look or sound, are all essentially the same word.

Etymology is an exact science, and like all exact sciences it is governed by sets of rules which can be demonstrated to be true. But ample care must be taken to guard against false friends. Let me give an example. The verb 'to have' in Latin is habere, and in German haben. 'Ah,' says the would-be etymologist, who is fully aware that classical Latin is related to modern German as an uncle is to a nephew (or rather as an aunt to a niece, since languages tend to be feminine in gender), 'I can see they're obviously from the same root.' Indeed, it looks pretty much like a cut and dried case. But he would be mistaken, for in this case the resemblance is purely fortuitous. The Latin root which corresponds etymologically to the German hab- is actually cap- (eg the verb capere, 'to take', or English words such as capture or, with modification of the vowel, the -cep- in accept, both loanwords from Latin). Meanwhile, Latin hab- corresponds to German geb- (in geben, 'to give'). Incidentally, these examples provide us with another, fairly simple correspondence, that between German /b/ and English /v/ in haben, geben as against have, give.

Coincidental resemblances outside the same family of languages are by no means uncommon either. The word for 'eye' in modern Greek is mati, and the Malay or Indonesian word mata means the same thing (Mata Hari, the pseudonym of the Dutch dancer, courtesan and First World War spy Margaretha Geertruida Zelle, is literally 'eye of the sun'). But few would venture to suggest on this slender evidence that there is any connection between the two languages, and indeed we have historical evidence to show that the modern Greek word is derived by a process of syncope, or shortening, from the classical Greek ommation. Nearer to home, the Japanese sou そう can often be translated so in English and German, and what about the English equivalent of the element -bone in sebone 背骨?

The study of Japanese etymology is full of uncertainties and fraught with pitfalls for the unwary. To begin with there is the imprecise nature of the Japanese writing system. Please do not misunderstand me: this is not a criticism. In fact, I happen to believe that kanamaziribuN 仮名混じり文 is as near perfect an instrument for recording the Japanese language as you are likely to get. But the fact remains that it is next to useless when it comes to studying the phonological history of words, and that is really what etymology is all about.

Of course, there are substantial parts of the Japanese lexicon where the immediate etymology is plain for all to see: the vast numbers of loanwords and loan elements from Chinese, and the newer imports from Portuguese, Spanish, Dutch and English. But when it comes to the native vocabulary, the picture remains very unclear. In short, we still do not know much about the origin of Japanese. Numerous theories have been proposed, some of which are more plausible than others. Perceived connections with Tibetan dialects and the Dravidian languages of southern India can perhaps be dismissed as being based on too little factual evidence. What resemblances there may appear to be are probably coincidental. There have also been the inevitable attempts to link Japanese to other languages of disputed pedigree, of which a fair number exist in various parts of the world. But there are two strong contestants: the northern link and the southern link, and the truth of the matter would seem to lie in a combination of the two. Japan's geographical situation argues for a link with Korean to the north, and beyond that to Mongolian and related languages, while the string of islands stretching like stepping-stones from Hatizyouzima 八丈島 south through Ogasawara 小笠原 to Saipan, Guam and beyond cannot but argue for a connection with the languages of Micronesia. Some scholars now suggest that Japanese has developed from a creole, a mixture of these two strains.

Be that as it may (and the link with Korean is easily observed in the very similar grammatical structure of the two languages), it remains true to say that much of the native vocabulary of Japanese continues to present something of a mystery. It is not possible to map out a genealogy for words of the sort that we have for English and many other languages. As we have already seen, one of the problems lies in the nature of the Japanese script. In any etymological argument we may discount kaNzi 漢字 immediately. Interesting though it may be to speculate why a given Chinese character was assigned to a given Japanese word, this has nothing whatsoever to do with the origin of the word in question. All that should concern us is how the word is written in kana 仮名. Because in a syllabary most symbols stand for a combination of sounds (restricted to consonant plus vowel in Japanese, but other scripts of this type permit of more complex combinations) rather than a single sound as in an alphabet, there is considerably less room for change or adaptation to mirror changing pronunciations. We know, for instance, that the /h/ sound in はひへほ was originally a /p/, which remains to this day after っ, and accounts for why the corresponding voiced sound (dakuon 濁音) is /b/. What we cannot tell, because the kana symbols remain the same, is when and how the change took place. In this case, as luck would have it, external evidence is provided by the 16th-century Jesuit missionaries, who used the letter <f> to represent the sound, suggesting that it was similar to the allophone still heard in ふ.

Let me conclude this first article with an illustration of how the Chinese character conventionally assigned to a word may hide rather than reveal its etymology. Take the word mizuumi. The fact that it is conventionally written with the single character 湖 will not blind us to its obvious derivation as a compound of mizu 水 and umi 海. The literal meaning will be '(fresh)water sea', although the existence of a blanket term covering both 'lake' and 'sea' is not uncommon in languages. We need only look at the Old English word mere, which appears in Windermere and the names of several other lakes in the English Lake District, or the German See. It is true that the German noun is treated as masculine when it means 'lake', and feminine when it means 'sea', but it is still one and the same word.

However, look a little closer and you will see that even umi 海 can be divided into two elements, u and mi. Neither has an independent existence any longer, but the first of them can also be found in usio 潮, unabara (or unabara) 海原, and perhaps uneri うねり (uneru うねる). The second element appears in a whole host of words which are all connected in one way or another with water, and we shall look at these and some other elements in detail in the second article in this series.

There is a postscript to this article. I had just finished writing the last sentence when it was time to take the dog for his afternoon walk. As we often do, we went down to the beach, and I was sitting on the shingle looking at the waves and watching some large ships go by on the horizon: the Straits of Dover is said to be the busiest stretch of water in the world. It was then that it suddenly occurred to me that the element u might also figure in the names of certain seaside places in Japan, and I thought of Ube 宇部 in Yamaguti-keN 山口県, Uno (or Uno) 宇野 in Okayama-keN 岡山県, and Uzina 宇品, the port for Hirosima 広島, as possible candidates.
