History of gjâ-zym-byn (gzb)
(unfinished draft)


I started developing this personal constructed language in late February or early March, 1998. Before that I had seriously been interested in constructed languages (and linguistics in general) for about two years; and several years prior to that, I had (inspired by reading about Tolkien's languages in the notes to The Book of Lost Tales) created a couple of unsophisticated fictional languages (all lexicon and not much grammar, except for derivational morphology; they were not relexes of English, but they had English phonology). Since mid-1996, I had developed several more interesting (though still somewhat naive) fictional languages for the cultures of a constructed world my brother and I had been working on. I had some ideas that did not seem suitable for a naturalistic conlang, and I wanted to have a conlang suitable for my own day-to-day use (for instance, in writing my journal). The fictional languages I was working on then seemed unsuitable for this for two reasons:

  1. being intended as naturalistic, they involved some irregularity (though not, to be honest, plausibly much), so that I didn't care to take the effort to learn them with real fluency, and
  2. being the languages of hunter-gatherer tribes or low-tech agriculturalists, they lacked vocabulary for many things I needed to think and talk about frequently, and I did not want to distort their fictional plausibility by anamundistically adding such terminology to them.

After studying Esperanto for a while, I admired the consistent, internally mnemonic construction of the table of correlatives, and I wondered whether it might be useful to try constructing other functional words (prepositions and conjunctions) with a similar morphemic matrix. I drafted a proposal for how one might do this with prepositions and posted it to the CONLANG mailing list, which led to an interesting thread (I was not the only conlanger to have this idea, though I've taken it farther than some others). However, I didn't do anything with the idea until March 1998.

Initial design

If I recall correctly, I started with the spacetime postposition system and built up the rest of the language around it. The truth-table matrix for clause conjunctions, and the number system (based on having root words for many primes and other interesting constants, and using compounds for composite numbers) were also among the earliest parts of the language developed and for a long time the most stable.
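As an illustration only, the arithmetic behind a primes-based number system can be sketched in Python. The sketch shows just the decomposition step: a prime is a candidate for a root word, and a composite number is broken into the primes that a compound would be built from. The function names `prime_factors` and `describe` are hypothetical, and the actual gzb rules for forming and ordering number compounds are not described here, so nothing below should be read as the language's real morphology.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n in ascending order, with repetition."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = []
    divisor = 2
    # Trial division: strip out each divisor as many times as it divides n.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    # Whatever remains above 1 is itself prime.
    if n > 1:
        factors.append(n)
    return factors


def describe(n: int) -> str:
    """Say whether n would be a root-word candidate or a compound (assumed scheme)."""
    factors = prime_factors(n)
    if len(factors) == 1:
        return f"{n}: prime, candidate for a root word"
    return f"{n}: compound of " + " x ".join(str(p) for p in factors)


if __name__ == "__main__":
    for number in (7, 12, 30):
        print(describe(number))
```

Under this assumed scheme, 12 would be named from the factors 2, 2 and 3, while 7 would need a root of its own.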

I had some rough design goals and ungoals:

  1. The language would be lexically parsimonious (forming a large number of words from a small number of roots and a powerful derivational morphology) — like Esperanto, only much more so.
  2. Except in a few interesting areas for which it would have many terse words for concepts requiring compounds or phrases in other languages; either lexicalizing some of the concepts I think about most, or trying out unusual distinctions to see whether and how they affect my thinking.
  3. It would be grammatically much simpler than English, French, Esperanto, or Greek (the languages I had seriously studied up to that time), and not a relex of any of them. That is,
    1. It would have no grammatical irregularity, and
    2. it would have minimal grammatical complexity.
  4. This simple grammar would also be very different from that of the other languages I had studied. That was one reason I chose to use postpositions rather than prepositions, for instance, and the few categories verbs inflect for would be different from those for which the aforementioned languages inflected.
  5. It was not to be an international auxiliary language, and could not possibly be mistaken for one.
  6. It would be fun to use.

I think it was mainly #5, though partly of course #6, that induced me to use such a baroque phoneme inventory. I was a little worried that, when and if I eventually told people about the language, it might be mistaken for an IAL proposal (not so much by fellow conlangers, as by fellow Esperanto speakers) due to its matrix construction of postpositions and conjunctions; and I figured no language with 50% more phonemes than English could be mistaken for an auxlang. That may have been an influence on the decision to use a primes-based number system, as well.

I think the main influences on the design were probably Vorlin's sparse inflectional morphology and Esperanto's luxurious derivational morphology. Lojban, which I had only studied a little bit at the time I started developing gjâ-zym-byn, was a minor influence in some areas, as was Tom Breton's AllNoun. Reading about Taneraic was probably an influence on my notion to develop a conlang for the purpose of keeping my journal in a hermetic language, though I don't think its specific features influenced the design. E-Prime and the ideas of General Semantics were an influence in that I intended the language to have no copula verb, and to have distinct ways of expressing attributes, subsetness, equality and existence. Some of Rick Morneau's essays on constructed language design, especially the discussion of verb types and case tags in The Lexical Semantics of a Machine Translation Interlingua, and the engineered language Voksigid, influenced the design of the verb and case-postposition systems.

The earliest form of the postposition system lacked the orientation prefixes {š-} and {đ-} for temporal before and after; it used a prefix {w-} for time (so {i} meant "at", but "during" was {wi}), and combined it with initial {s-} or {ŧ-} for "after" and "before" (i.e., it derived its time postpositions by a different spatial metaphor than English and related languages: the future was above, the past below). It had only the three contact/inside/part-whole suffixes {-n, -m, -ŋ}; postpositions lacking one of these specifiers were ambiguous as to "near" or "at" or "in". A couple of years later I added two more suffixes {-j, -r} for marking proximity and distance explicitly. Still later (in early 2005), after trying several options for expressing "through" (including stretching the meaning of the prefixes {k-} "between, among" and {r-} "beyond"), I added one more suffix {-l} "through".

Other than the number system, spacetime postpositions, and conjunctions, that earliest form of the language was pretty different from the current one in several important respects:

I translated a few verses from the beginning of the Gospel of John into the language at that stage. Before very long (I think it was just two or three weeks) I decided that this version had major problems. I reworked the phonology and phonotactics, dropping a few hard-to-distinguish phonemes and severely restricting the allowed syllable shapes (just C(S)V(S)(N), where S = /j/ or /w/ only, not including /l/ and /r/). This necessitated recreating almost all of the root words I had devised in the first couple of weeks. I also revised the grammar, loosening the word order and adding abstract subject and object postpositions, unique particles that didn't fit into the general postposition scheme.
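The restricted syllable shape C(S)V(S)(N) can be written as a small pattern check. The Python sketch below works on abstract skeletons (one letter per segment class) rather than on real gzb phonemes, because the full inventory is not listed here; the name `is_allowed_shape` is hypothetical, and N is treated simply as the class the formula names.

```python
import re

# One letter per segment class: C, S (/j/ or /w/ at this stage), V, N.
# Parenthesized slots in C(S)V(S)(N) are optional, hence the "?" marks.
SYLLABLE_SHAPE = re.compile(r"CS?VS?N?")


def is_allowed_shape(skeleton: str) -> bool:
    """Return True if the class skeleton fits C(S)V(S)(N) exactly."""
    return SYLLABLE_SHAPE.fullmatch(skeleton) is not None


if __name__ == "__main__":
    for skeleton in ("CV", "CSVSN", "VN", "CVC"):
        print(skeleton, is_allowed_shape(skeleton))
```

A skeleton such as "CSVSN" passes, while "VN" (no onset) and "CVC" (a coda outside the N class) are rejected.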

At first I referred to the language privately as "Persona Lingvo Nova", but by the time the grammar and lexicon had stabilized in the state mentioned just above, I had rendered this name into the language itself, {gjâ zâň-bô bâm-bô} [language idiosyncratic-ADJ new-ADJ], abbreviated gzb. I kept this name until mid-2000; when the language was no longer so new, I hunted for another name that would fit the same acronym, and came up with {gjâ-zym-byn}, [language-thought-hack].

Subsequent development

After the first period of intense work on the language in March and April 1998, I left it mostly alone for several months as I was busy with my last quarter of college, then with travelling to San Francisco for the Esperanto immersion course at San Francisco State University. A month or so after that, in August, I picked up the language again and worked on it seriously for some time, adding two new phonemes (|ķ| and |Φ|) and a considerable number of new words, and making another major revision of the grammar, bringing it fairly close to the current version:

The next period of intense work on the language began in May 1999. I added more phonemes, the voiced palatal fricative |ʝ| and the palatal affricates |č| and |ž|, and redesigned the causal conjunctions matrix; I also dropped the special adverb suffixes, and decided to make one class of modifiers serve as adjectives and adverbs according to context. It was at this time that I first started to really use the language as I intended it — for journal entries — though at first only briefly and intermittently, because I wasn't fluent enough to write extended paragraphs in it, and its vocabulary was still too small to make such use convenient.

In summer 2000, I added two more phonemes, || and |ƴ|, and relaxed the phonotactics a little, to allow /l/ and /r/ in the same positions as /j/ and /w/; at the same time I added {-j} and {-r} as postposition suffixes (for proximity and distance respectively).

My development log from this period shows I was already having trouble with the polysemy of {mĭ-i}, the topic postposition. I used it to indicate the topic of a topic-comment sentence, or the "subject" of a stative verb; but it also seemed to be the most appropriate way to mark the object of a verb of thinking, feeling or sensing. This was OK in an active sentence, where the subject of the verb would be marked with {tu-i}, "agent"; but it led to ambiguity for stative verbs which described passive sensations or emotions. For a while I tried using {mĭ-ma-i}, "meta-topic", to indicate the second topic (object of thought or sensation) in a sentence that already had one topic, the experiencer, but I wasn't satisfied with this; it was bad enough that several frequent postpositions were already two syllables, I didn't need a trisyllable of similar frequency.

Eventually I hit upon the idea of altering the word {} (previously "looking at") to mean "paying attention to" (so "looking at" became {kâ-rĭm}, attention-seeing), which allowed {kâ-i} as a suitable postposition for the object of attention. This resolved most of the aforementioned ambiguity. Later still, in mid-2004, after studying John Quijada's Ithkuil, I added a new morpheme {ʝâr}, "experiencer", which produced {ʝâr-i} to replace {mĭ-i} for the subjects of verbs of emotion and sensation (as well as many others).

The only other significant changes in the core grammar recently (both in April 2005) were the addition of another postposition suffix, {-l} "through" (see above), and the inanimate third-person pronouns {te} and {ŋe} (previously the pronouns {ƥ} and {ɱ} could indifferently refer to animates and inanimates).

I said above that I wanted the language to have no grammatical irregularity. I later modified this goal slightly, as it turned out to be in tension with my primary goal of learning to use the language. Some of the changes I made in the language in the first year or two would logically require a great many words to be redesigned. In some cases I did so redesign them; but in other cases, where I had already learned the words in the form dictated by the old grammar, I decided not to implement the change with perfect consistency. Thus arose the suffixoid morphemes, for instance, which have the phonotactic form of content-roots but act in compounds like suffixes.

Some aspects of the grammar arose not from conscious design, but emerged from my use of the language; I described them in the grammar documents later, after realizing how I was actually using the language. I think the use of serial postpositions, and the possibility of modifiers modifying postpositional phrases, were both instances of this.

Over time I have increased the root vocabulary from around 250 words in August 1998 (the earliest date at which I recorded such counts in my development log) to 1205 as of 2008/3/3. The majority of those new root words were devised to express things I had no way of expressing before, but many replaced previous compounds which seemed too long, too imprecise or too inaccurate. In a very few cases I replaced an old word because it had unwanted semantic associations due to its natural-language cognate. I think if I were starting over I would use far fewer a posteriori words, only for those concepts I had thought carefully about and decided there was no point in disassociating from the semantic bundles carried by words in natural languages (for instance, most names of animal and plant genera and families). (About a fifth of the root words in the lexicon are listed with etymologies, but this is misleadingly high because some of them come from other a priori conlangs of my own, and in some cases I've used the etymology field to show where the idea for a word came from or credit the person who suggested its form.) There are also 1384 derived and compound words listed in the lexicon as of 2008/3/3.

I have generally, though not in all areas, tried to avoid coining words (even compound words) just for the sake of having a large lexicon. To me, the language is what exists in my head, not what is described on this website; there is no point in having a word in the lexicon if I don't learn and use it. So mostly the words I've coined have been words I actually needed to use, if only once or twice, and I've made a largely though not wholly successful attempt to learn the vocabulary as fast as I devise it. Recent experiments with my flashcard program show that I know around 85-90% of the vocabulary; that is, 85-90% of the theoretical lexicon described in this website is really part of the actual language that exists in my brain.

In late 2006, I began modifying the handwritten form of the language based on a frequency analysis of my electronic corpus (including some transcribed journal entries). Beginning with the most common suffixes and postpositions, I started introducing logographic glyphs to replace common morphemes and words. I would add one, use it for a while, and then add another when I had gotten used to the last one. I'm working my way down a list of words sorted by frequency-weighted-by-length, which I periodically regenerate as I expand the corpus with new texts and sample sentences. That is, of two equally frequent words, the longer one will get a glyph to represent it sooner. So far there are eight such glyphs.
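The frequency-weighted-by-length ordering can be sketched as follows. This is a minimal Python illustration, assuming the score is simply count times length in characters; the exact weighting and the way length is measured are not specified above, and the function name `glyph_candidates` and the sample tokens are placeholders, not real gzb words.

```python
from collections import Counter


def glyph_candidates(tokens, already_have=frozenset()):
    """Rank words for new glyphs by count * length (an assumed weighting)."""
    counts = Counter(tokens)
    scored = [
        (word, count * len(word))
        for word, count in counts.items()
        if word not in already_have  # skip words that already have a glyph
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


if __name__ == "__main__":
    # Placeholder corpus: "ab" and "abcd" are equally frequent.
    corpus = ["ab"] * 3 + ["abcd"] * 3 + ["xyz"]
    for word, score in glyph_candidates(corpus):
        print(word, score)
```

With the placeholder corpus, "abcd" outranks the equally frequent "ab" because it is longer, which matches the rule that the longer of two equally frequent words gets its glyph sooner.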

Actual use of the language

This intermittent pattern of use and development has continued over the years. Judging by the dates of entries in my first language development notebook (which I used from May 1999 to October 2002; earlier notes were on a legal pad, since lost, and loose sheets of notebook paper which are mostly preserved in a file), and the dates attached to words in the lexicon, I seem (at least from 1998 to 2003) to have typically worked on the language during two periods each year of about one to two months each; more often in spring and fall than summer and winter, if that means anything. 2001 was an exception; I seem to have used the language for two periods, but one in February apparently lasted only a day and the other (in August) only a few days. By 2002 I was already able to write coherent journal entries in gzb, though the process of writing in gzb was slow enough that I frequently reverted to English or Esperanto when the pace of events required fast writing. My handwriting in gzb was much better at this period than in English or Esperanto; later, once I got more fluent in the language and wrote more quickly, my gzb orthography got similarly sloppy.

How did writing in this language affect the actual content of my journal? In two ways. I found I started writing about embarrassing and unpleasant events that I had never brought myself to mention when writing in English (or Esperanto). Also, in order to get practice with using the language and test its capabilities — and, more and more, for the sheer joy of using it — I often wrote in the journal when nothing much was actually happening in my life. Many of the journal entries in gzb from 2002-2008 are very repetitive, recording the hours at which I woke, did respiratory treatments, arrived at the office, came home, slept... Probably many of those entries would not have been written at all had I been keeping the journal exclusively in English; but the challenge of talking about these mundane events in a language I was creating and learning at the same time made them more interesting. Often these entries combine development notes on the language with comments on various things that had been happening to me or that I'd done. Because most of the development notes were written in the language itself, in paper notebooks, this web site (and, indeed, my own text file grammar documents) was usually very out of date.

Conversely, when my life was very eventful, I would often find that I could not write quickly enough in gzb in the scant free time I had for journaling, so I reverted to English or Esperanto. By 2004, though, I could write pretty quickly in gzb when writing about routine occurrences. (More unusual events often required me to pause to think of how to express concepts I had not expressed before in the language, or not recently enough to remember offhand how I had done so before; sometimes it required coining new words, but more and more often I found I was able to make the existing root stock and affix system do the work.) I would often coin a word while writing, underlining the word and making an indication in the margin; then, once a week or so, go over recent journal entries and put the recently coined words into the lexicon — sometimes modifying them, and making a "nonce word" entry for the word I had coined off the cuff while writing. Sometimes, too, I would notice errors in the journal, and make lexicon entries for variant forms of words; my intention is that the lexicon should contain every word that has ever appeared in the journal, even if they aren't part of the current standard language. (These erroneous/variant form entries only occur in the tab-delimited text lexicon, not the HTML lexica.)

Although the periods of serious work on the language (and extensive writing in it) were still intermittent, by about 2002 or 2003 I had reached the point that I was no longer ceasing to use the language entirely between periods of development work. I would still think brief thoughts in the language, and use it for prayer, even when I might go weeks or months without writing very much in it. From time to time I would get more interested in developing it, start re-reading my older development notes and journal entries to refresh my memory, and continue where I left off.

Besides journal entries, I also made to-do lists and grocery lists in the language, and exercised its capacities by translating things into it (several passages from the Gospels, the standard Tower of Babel story, Psalm 95, one of Æsop's fables, an essay by G.K. Chesterton; some are on this website, others may show up here eventually), writing original dialogues and stories (mostly not very good; you probably won't see them here), and, after about 2000, most of the development notes on the language — thoughts about changes in or further specification of the grammar, definitions of new words, and so forth. I haven't, however, ever attempted to rewrite the grammar from scratch in the language itself, as some conlangers have done. More recently (2007-2008) I've tried writing first drafts of difficult scenes in stories I'm working on in gzb, before rewriting them in English.

When I first started creating the language, I thought of it as a secret private language, and had no intention of publishing anything about it. After I changed my mind about this, an Esperanto-speaking pen pal in Brazil asked to see it, and I uploaded my notes (which were in mixed Esperanto and English) as plain text files without much editing. It took me a long time to clean up the mixed-language documents and create a purely English version; I hope to create purely Esperanto versions later, but it's lower priority than keeping the English version up to date, as probably a majority of the (very few) people who have expressed any interest in the language don't speak Esperanto.

In summer 2007, gzb was used by David J. Peterson in the first Inverse Translation Relay -- each of the participants in which translated a text into a language created by the next person. I wrote to David at the time:

 It was a great pleasure to read your text and understand
 it at once.  I said when we were planning the relay,
 > Besides, you get to enjoy the fun of reading
 > text in your conlang that you didn't write
 > -- that would be a big plus in this kind of relay for
 > those who are trying to become fluent in their conlangs.
 > When you reread text you wrote yourself,
 > how can you tell how much of your comprehension
 > of it is really due to parsing and understanding
 > it linguistically, and how much due to just being
 > reminded of what you were thinking when you wrote it?
 And it was nice to have this additional confirmation that
 gjâ-zym-byn really "works", even though I've been
 using it for years.

As of this writing (2008/3/3), gjâ-zym-byn is ten years old. I can't say I'm perfectly fluent in it; when I'm thinking and writing in the language I still hesitate over a word choice or sentence structure far more often than in my fluent languages, English and Esperanto, and make mistakes more frequently than in those languages. But I'm more fluent in gzb than in Greek or French, which I've been studying for fifteen and fourteen years respectively. I occasionally have dreams with fragments of gzb writing or speech in them, though, like most dreams, unfortunately, I remember them only vaguely on waking. In the last few days I've found myself thinking in gzb a lot more, not just during prayer; my internal monologue is in gzb almost as much as in English, and about as much as in Esperanto. I can probably say I've accomplished the goals I set out to accomplish, as far as they are accomplishable; I'm probably unlikely to become significantly more fluent than this with no one else to talk to in the language.


Off and on through the years, especially from about 2003-2006 when I was pretty active in editing Wikipedia, I've gotten involved in deletion debates over various articles, especially about conlangs. In most cases, where the article wasn't written by the inventor of the conlang and where there's some evidence that a significant number of people other than the language's inventor were interested in it, I argued for keeping said articles; in a few cases, of conlangs apparently of interest to nobody but their creators, I argued and voted for deletion. I was pretty active in trying to get consensus for a general policy in the English Wikipedia on how to determine what conlangs are notable enough to be included; that draft policy was never adopted.

In August 2006, I found out that Dr. Roberto da Silva Ribeiro had created an article about gjâ-zym-byn in the Esperanto Wikipedia. I thanked him for his interest in the language, but nominated the article for deletion, in consistency with the principles I'd argued for in the 2005 debate on conlang notability in the English Wikipedia. I was outvoted 2-1. I found this incident vastly amusing; if the English Wikipedia is plagued by deletionists who want to get rid of every marginal article, the Esperanto Vikipedio might have the opposite problem. Or maybe not. Marcos Cramer wrote:

La notindeco de artikoloj dependas de la kulturo de la uzantoj de la Vikipedio. Por Esperantistoj planlingvoj estas interesaj, do ni ne forigu artikolojn pri ili ĉi tie.

The notability of articles depends on the culture of the users of the Wikipedia. For Esperantists, constructed languages are interesting, so we shouldn't delete articles about them here.

From: Jim Henry 
Date: Wed, Jul 11, 2007 at 7:56 PM
Subject: Re: [CONLANG] Toki Pona un-deletion in Wikipedia
To: Constructed Languages List 

On 7/11/07, Sai Emrys  wrote:
> On 7/11/07, Jim Henry  wrote:
> > > Yet another war with the WP deletionists (who seem to be almost
> > > entirely non-conlangers). Sigh. :|
> >
> > This apparently isn't so much of a problem on other language's Wikipedias;
> > some of them have the opposite problem, including non-notable conlangs.
> > E.g.,
> >
> > http://eo.wikipedia.org/wiki/Gj%C3%A2-zym-byn
> >
> > (I nominated it for deletion last year but was outvoted. :)
> .... you AfDed your own conlang? O.o

Applying even the more generous criteria proposed in the WP:CONLANGS
draft policy (which was never adopted anyway), it still seemed to
me that gjâ-zym-byn wasn't encyclopedically notable.  But apparently
the criteria for conlangs (and perhaps minor topics in general?)
on the E-o Vikipedio are even more generous than the most
inclusionist conlangers on the English Wikipedia ever dared
to propose -- the idea being that Esperanto speakers are
on average more interested in conlangs than English speakers.

More on my fluency or lack thereof

In general, I am more fluent in the written form of the language than in the spoken form. The aspects of the spoken language that give me the most trouble are the uvular plosive /q/ |ķ|, especially when it follows a closed syllable; the bilabial trill /ʙ/ |Φ|, which is hard to pronounce when one's lips are dry; and the nasal vowel harmony rule, when it results in nasalization of vowels in closed syllables. When I first started devising and learning the language, the front rounded vowels, the nasal vowels, and the velar fricatives were all a bit of a challenge — the palatal fricatives and affricates, somehow, not so much — but I've long since gotten used to them.

Aspects of the grammar that I've fluently internalized include the spacetime postposition system and most of the theta-role postpositions, the verb inflection system, the causal conjunctions matrix, and much of the derivational suffix system. (As complex as the postposition system is, I can report that it's not unnatural to the language centers of the human brain; I find myself occasionally coining new theta-role postpositions off-the-cuff without conscious planning, much as any fluent speaker of Esperanto unconsciously coins words on the fly using its derivation and compounding system.) Aspects I still have to consciously think about to apply include relative clauses, conditional statements, and comparatives. (I have been seriously reworking the system of conditional particles, and may majorly revise the comparative system as well; Thomas Payne in Describing Morphosyntax does not mention any natural language that works like gzb in this area, and it's possible the system is too unnatural for me to really learn.)

The evidentiality adverbs are another aspect of the grammar I haven't yet come to use without conscious thought, nor have I fully internalized the habit of using attitudinal adverbs automatically whenever they would be appropriate, although I use them more often than evidentiality adverbs. (This is probably partly because, as 95% or more of the gzb corpus consists of journal entries, and probably 99% of the clauses in the journal would have direct experience evidentiality if gzb were the kind of language where evidentiality were obligatorily marked, I haven't often felt the need for these adverbs. Still, there are occasions when I could/should use them and don't, through absentmindedly forgetting that they're available.) I have, however, long since formed a strong habit of using the (much older and more stable) attitudinal suffixes with many nouns, especially proper names, and occasionally with modifiers or verbs. I tend also to use a couple of these suffixes as interjections, {la} (the affectionate suffix) and {ħa} (the contemptful suffix).

I've pretty well internalized, and use fluently, the number-words up to about twenty, as well as {tĭm} (100); expressing specific numbers larger than that requires a bit of conscious thought.

I also, of course, make mistakes of word choice from time to time, both from phonological and semantic near-collisions. A look at the uses of the error-correction particle {Φej} in various passages of my journal reveals the following mistakes among others:

Most of these mistakes should be classified as grammatical rather than lexical, I reckon. The reasons for most of these mistakes are obvious; the last one is less so, the only thing the words having in common being their initial phoneme. Other mistakes I've observed myself making more than once include {wǒj} (because) for {wǒn} (therefore) and vice versa, using {mǒj} (but, however) as a postpositive clitic rather than a clausal conjunction, using {-gla} (the time-ordinal suffix) on cardinal numbers where the suffix {-bô} is required, and (oddly) substituting {mwĭl} (sleep) for {vlym} (clothing). I've also sometimes mixed up {ble} (the rest of) and {ʝǒ} (others), or {be} (maybe re: one's plans) and {še} (maybe re: events outside one's control). This last error is one I make so often, even though both these particles have been part of the language since at least 1999, that I occasionally wonder if it's an unnatural distinction that my brain can never learn to reflexively, unconsciously use. (If I recall correctly, I first thought of making this distinction when, in mid-1999, I was looking through the lexicon and realized I had two different particles glossed simply as "maybe"; instead of considering them redundant and getting rid of one of them, or leaving them both in the lexicon but marking one of them as an archaic or nonce word (as I've done in some other such cases of accidental semantic duplication) I came up with an interesting distinction between them — which, somehow, I've never fully internalized in all the years since.)

 > *23. Can such a language function?

I don't know yet.  Sometimes I can write in gjâ-zym-byn fast enough
that I hope I will be thinking fluently in it in just a few more
months.  Other times I despair, thinking I must have included some
hopelessly unnatural feature that I am trying futilely to learn to

The earliest version of the gjâ-zym-byn website that the Internet Archive stored is from April 2001.

Last update May 2010 (just adding links, and notes about what to cover in future updates; overall this essay reflects the state of the language in March 2008)