One of the factors that makes Esperanto relatively easy to learn is its method of creating new vocabulary by combining existing elements: domo “house”, dometo “cottage”, urbodomo “town hall”, samdomano “housemate” and so on. But the number of these building blocks has grown substantially over the years.
In 1887, Zamenhof’s first Esperanto book listed just over 900 roots. By 1894, his Universal Dictionary had more than 2,600, garnered from publications of the intervening few years. Kazimierz Bein’s 1910 Dictionary of Esperanto contained 5,000 headwords, the (allegedly) Complete Dictionary of 1930 offered almost 7,000, and the 2020 edition of the Complete Illustrated Dictionary presents us with nearly 17,000, many of which are extremely rare technical terms. Impressive, but perhaps excessive.
How many such components – such morphemes – do you actually need? Some language designers have experimented with reducing the number to a bare minimum. These tiny systems are termed oligosynthetic, from Greek roots meaning “putting few together”.
And which meanings are important enough to merit their own morpheme? How on earth do you decide? The highly developed language Kah comes with a comprehensive 10,000-word dictionary composed from a list of roughly 425 basic elements that includes – somewhat unexpectedly – “shrill”, “juniper”, “weasel”, “coffee” and “nonsense”.
Looking back in time, we find a very interesting system in Kenneth Searight’s Sona of 1935. He took the 1,000 semantic categories of Roget’s famous Thesaurus and whittled them down to just 360 elements. Each of these radicals thus covers a range of related meanings across several parts of speech. Bi, for example, means “use” or “tool” or “by means of”, and fa means “risk” or “luck” or “perhaps”.
Furthermore, he tended to organise his radicals into semantic fives, such that the items in each group share the same primary consonant and vowel, as in the lu series: lu “game, play”, lun “absurd, trivial”, alu “invite, guest”, ilu “laugh, merry” and ulu “mischief, monkey”!
The various aspects of a radical’s meaning come into play when it appears in compound words: te is glossed as “hand, [to] project, take”, and from this Searight formed such words as bute (smell-project) “nose”, tebi (hand-tool) “handle” and sute (flow-project) “stalactite”. Tara means “big man”; rata means “giant”. These meanings are somewhat arbitrary and unpredictable – a common problem in oligosynthetic languages.
How do designers go about selecting spellings and pronunciations for their morphemes?
Searight chose to have 180 monosyllables like ba and ban, plus a further 180 disyllabic elements such as aba, iba and uba. This meant he had to make his joining rules more complicated than they would ideally be: ban + aba produces banyaba, for instance, which is unfortunately a different word from baniaba (ba–ni–aba). But one can sometimes glimpse possible etymologies: lu is probably a truncation of Latin or Esperanto ludo, and the initial vowel of ilu may have been inspired by the word hilarious.
Krishna Amrito’s Nonlen uses 210 meaningful syllables in an almost fully populated matrix of 18 initial consonants with 12 endings: ba, bai, ban, bau, be, ben, bi, bin, bo, bon, bu, bun … This is marvellously compact but tremendously difficult to memorise, and it’s also hard to distinguish pairs like bayan and baiyan in speech. Some of these items are drawn, in distorted form, from European languages – ganwi (organ-vision) means “eye” – and many come from Chinese, e.g. rangau (nature-high) “mountain”. The name Nonlen simply means name-language.
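The arithmetic behind that matrix is easy to check. The Python sketch below rebuilds the b row from the twelve endings quoted above and counts how many cells are left empty. The other seventeen consonants are not listed here, so the code deliberately stops at a single row, and the names `ENDINGS` and `row` are my own labels rather than anything from Nonlen itself.

```python
# The twelve endings, read off the "b" row quoted in the text.
ENDINGS = ["a", "ai", "an", "au", "e", "en", "i", "in", "o", "on", "u", "un"]

N_CONSONANTS = 18   # initial consonants, per the text
N_USED = 210        # meaningful syllables, per the text


def row(consonant: str) -> list[str]:
    """Return the twelve candidate syllables for one initial consonant."""
    return [consonant + ending for ending in ENDINGS]


capacity = N_CONSONANTS * len(ENDINGS)   # cells in the full matrix
gaps = capacity - N_USED                 # cells left without a meaning

print(row("b"))
print(capacity, gaps)   # 216 6
```

So only six of the 216 possible syllables go unused, which is what “almost fully populated” amounts to.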
This system has a striking feature that seems obvious with hindsight: it allows the use of international (or at least European) terms for specialised concepts like cokolat, gitar, kilometer, plutonium and sosiologik that one would struggle to express clearly via the basic syllables. Some might regard this as cheating, but so long as one avoids words like medisin that would conflict with what can be constructed from the oligosynthetic matrix, I say it’s a neat solution.
Downsizing still further, we come to Vuyamu, a language by T. F. Yik that has only 99 syllables, each consisting of a consonant and a vowel. Oddly, it wastes an item by treating ma “each” and mo “every” as two separate concepts. As none of the other syllables means “all (taken together)”, perhaps mo is meant to have this interpretation.
Its syllables seem to be completely a priori, and they have primitive meanings like “big”, “colour”, “stone”, “water”, “part”, “fruit”, “make” and “take”. The vocabulary formed from these can be more than a little opaque, especially when it comes to abstractions: yevado (literally thought-similarity-mountain) means “respect”, and “freedom” is povade (possibility-similarity-sky). The name Vuyamu means “this language”, literally way-speak-this.
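Because every Vuyamu syllable is one consonant plus one vowel, a compound can be cut into its parts mechanically. Here is a minimal Python sketch of that idea, using only the handful of syllables glossed in this article rather than the full list of 99. It assumes each consonant is written with a single letter, which holds for the examples here but is my assumption about the spelling in general; the function names are likewise invented for illustration.

```python
# Glosses taken from the examples in the text; not the full 99-syllable list.
GLOSSES = {
    "ye": "thought", "va": "similarity", "do": "mountain",
    "po": "possibility", "de": "sky",
    "vu": "way", "ya": "speak", "mu": "this",
    "ma": "each", "mo": "every",
}


def split_syllables(word: str) -> list[str]:
    """Cut a word into consonant+vowel pairs, two letters at a time."""
    word = word.lower()
    if len(word) % 2 != 0:
        raise ValueError(f"not a whole number of CV syllables: {word!r}")
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def literal(word: str) -> str:
    """Join the gloss of each syllable; unknown syllables show as '?'."""
    return "-".join(GLOSSES.get(s, "?") for s in split_syllables(word))


print(literal("yevado"))   # thought-similarity-mountain
print(literal("povade"))   # possibility-similarity-sky
print(literal("Vuyamu"))   # way-speak-this
```

Taking a word apart is the easy half. Getting from “thought-similarity-mountain” to “respect” is where the opacity lies, and no lookup table helps with that.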
Learning the lexicon is made considerably easier by a nice touch: each syllable’s consonant identifies one of four broad categories of meaning – things, actions, descriptions and spatial/temporal relations – roughly akin to the popular misconception of what nouns, verbs, adjectives and prepositions are, although in the syntax of Vuyamu any morpheme can be used as any part of speech.
Talking of which, the grammar of oligosynthetic languages is not always particularly well explained – the designers sometimes seem to have been too focused on the vocabulary. Occasionally the syntax turns out to be hopelessly ambiguous, with no reliable way to determine the structure of sentences. In the case of Sona, the author’s aim was to produce an international auxiliary language, but his work suffered greatly from being published in a book that was far too short to allow for a decent presentation of the grammar, and overall this aspect of it remains somewhat baffling.
Perhaps the most extraordinary oligosynthetic language of them all is aUI by John Weilgart, a psychiatrist who claimed that a little green alien taught it to him when he was a child. It has just 31 morphemes (plus a further 11 used only for forming numerals), each of which is merely a vowel or a consonant. The language’s name means “space-mind-sound” and thus “cosmic language”.
Capital letters denote long vowels. The language is in fact vowel-heavy, with 12 such sounds to be found among the main 31 elements; the 11 numeral items are all vowels too, in their case nasal ones. Q has the sound of German ö.
A typical sentence looks like this:
bu wQv cEv nEm vem, Qg bu Ov rom Ib wom.
“You can be very active, if you feel healthy and strong.”
– literally Together-human Power-condition-action Existence-matter-action Quantity-matter-quality Action-movement-quality, Condition-inside Together-human Feeling-action Good-life-quality Sound-together Power-life-quality!
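Since each letter of an aUI word is a morpheme in its own right, the literal gloss can be produced by a plain letter-by-letter lookup. The Python sketch below does just that. Its table holds only the elements that can be read off the example sentence and the name aUI, not the full set of 31, and the function names are hypothetical. Note that the lookup has to be case-sensitive, because e and E (or o and O) are different morphemes.

```python
# One letter = one morpheme. Glosses are read off the literal translation
# above and the name aUI; this is a partial table, not the full inventory.
MORPHEMES = {
    "a": "space", "U": "mind", "I": "sound",
    "b": "together", "u": "human", "w": "power", "Q": "condition",
    "v": "action", "c": "existence", "E": "matter", "n": "quantity",
    "m": "quality", "e": "movement", "g": "inside", "O": "feeling",
    "r": "good", "o": "life",
}


def gloss_word(word: str) -> str:
    """Gloss one word letter by letter; unknown letters show as '?'."""
    parts = [MORPHEMES.get(ch, "?") for ch in word]
    return "-".join(parts).capitalize()


def gloss_sentence(sentence: str) -> str:
    """Gloss every word of a sentence, ignoring commas and full stops."""
    words = [w.strip(",.") for w in sentence.split()]
    return " ".join(gloss_word(w) for w in words)


print(gloss_word("aUI"))    # Space-mind-sound
print(gloss_word("wQv"))    # Power-condition-action
print(gloss_sentence("bu wQv cEv nEm vem, Qg bu Ov rom Ib wom."))
```

The last line reproduces the literal reading given above, minus the comma.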
Weilgart published a detailed textbook of aUI in the 1960s and 1970s. His daughter subsequently worked on the language, and its dictionary now contains a phenomenal 10,000 words – much the same as Kah, yet built up from a morpheme inventory less than a tenth of the size.
It can pass the time on a sleepless night to take the reduction process even further and ponder what meanings you would put in a vocabulary if you were only allowed (say) 20 items. How about you, me, person, thing, have, put, say, feel, do, go, to, now, not, same, type, tool, break, hot, eat and home?
In my next article, I’ll be looking at one of the world’s best-known tiny languages – Toki Pona, the language of good – and a couple of its recent tiny offshoots.