
C-IPA underlying principles and methods

From: Christophe Grandsire <christophe.grandsire@...>
Date: Tuesday, February 25, 2003, 13:32
OK, since it seems that people are actually interested in my IPA-ASCII scheme,
here is a presentation of its underlying principles and a glimpse of what it
currently looks like (only a glimpse because not all symbols are chosen yet).
Basically, this is C-IPA (for "Christophe's IPA" or "Conlanger's IPA") version
0.1 Beta ;))) .

OK, the underlying idea behind the C-IPA was to provide a scheme that doesn't
make some simple characters of IPA look like line noise in ASCII (unlike what
X-SAMPA and Kirshenbaum often do). The reason is that conlangers often
use "exotic" sounds in their conlangs and may want a scheme that doesn't make
the ASCII transliteration of the pronunciation of their words look like some
monstrosity which makes Klingon's transliteration in Roman characters look
like a paragon of aesthetics ;)) .

The basic principles of C-IPA are thus:
- only the characters of the 7-bit ASCII are usable (of course! ;)) ), namely:
a-z A-Z 0-9 ! " ' ( ) , - . / : ; ? [ ] { } # $ % & * + < = > @ \ ^ _ ` | ~ and
the space.
- simple lowercase characters from the IPA are taken straight from it. Small
capitals from the IPA are taken straight as capitals in C-IPA (this means that
the uvular nasal in IPA becomes N in C-IPA, but the velar nasal doesn't have a
simple equivalent. I know it is unusual to give an uncommon sound a simple
transliteration, but my point here is to transliterate the IPA in a way
that lets us quickly recover the IPA form, not to "correct" design mistakes
of the IPA).
- this is the main principle of C-IPA: each place of articulation, manner of
articulation or position used by the IPA to organise its sounds (or almost all
of them) is provided with a diacritic (written as a non-letter character
after the modified character), which can be used after any meaningful
character to move it to that place, manner or position *without
changing its other parameters*. Basically those diacritics are shortcuts to
move in the IPA tables.
- this is the second main principle of C-IPA: some IPA diacritics have their
use extended so that they work like the diacritics of the previous principle.
This is difficult to explain but easy with an example: IPA has diacritics
for "advanced" and "retracted" used with vowels. Those are adopted as place of
articulation changers for both vowels *and* consonants. Basically, the
diacritic for "advanced" is used to advance the place of articulation by one
rank towards the front of the mouth (so for instance it can be used to derive
dental consonants from alveolar ones). In the same way, the "retracted"
diacritic is used to retract the PoA by one rank towards the back of the mouth
(thus retracting bilabials to labiodentals for instance). With vowels, they
serve to move them from front to center to back and vice versa. In the same
way, the diacritics for "raised" and "lowered" are used to move between manners
of articulation (this is a feature already present in IPA itself).
- the previous principle may (rarely) bring some ambiguities: what if I really
want to use the "retracted" diacritic for its actual IPA meaning? That's where
the universal tie bar-diacritic marker comes in handy. It's the same principle
as in X-SAMPA, which uses _ both as tie bar and diacritic marker, one of its
good features. C-IPA uses the same feature (but not necessarily the same
character for it).
- there's more than one way to do it! (and I swear I didn't know it was Perl's
motto when I first uttered it :)) ) As you may have realised by now, with
such principles many IPA characters have various ways to be rendered in C-IPA.
Indeed, and all of them are valid! The point is that redundancy is a good
thing here: it lets you reduce ambiguity, choose a rendering of an IPA letter
which also shows some phenomenon happening in the language, or just fit the
aesthetics of the writer. But all those ways are transparent once you know the
principles.
- the last principle is also borrowed from X-SAMPA: go for similarity with the
actual IPA shapes, but don't try to make a geometric equivalent of them.
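
To make the first and third principles a bit more concrete, here is a minimal
Python sketch (not part of C-IPA itself). It splits a transcribed word into
segments, each one base letter plus the non-letter marks written after it. The
name tokenize and the rule that only ASCII letters count as base characters
are assumptions made for illustration; the scheme also allows a few non-letter
base symbols, which this sketch ignores.

```python
import string

def tokenize(word):
    """Split a C-IPA word into [base letter + trailing marks] segments."""
    segments = []
    for ch in word:
        # Principle 1: only printable 7-bit ASCII (no space inside a word).
        if not ("!" <= ch <= "~"):
            raise ValueError(f"not a printable 7-bit ASCII character: {ch!r}")
        if ch in string.ascii_letters:
            segments.append(ch)       # a letter starts a new segment
        elif segments:
            segments[-1] += ch        # a non-letter modifies the letter before it
        else:
            raise ValueError(f"mark {ch!r} has no base character before it")
    return segments

print(tokenize("s+az0"))  # ['s+', 'a', 'z0']
```

Because every mark follows the character it modifies, a single left-to-right
pass is enough to group them.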

Those are the main principles at work in the C-IPA. Now, the actual
implementation of those is not stable yet. I want the result to be aesthetic
and somewhat mnemonic, and it's difficult to choose between the available
characters. But I can give you an example of what I mean with those complicated
principles, although you must remember that the choice of actual characters is
not set in stone and I actually don't find it that satisfying.

Now for those PoA, MoA, etc... diacritics:
stop: |
nasal: ~
trill: (no diacritic)
tap: *
fricative: \ (I don't like it at all!)
lateral fricative: (no diacritic?)
approximant: < (supposed to show that they are more "open")
lateral approximant: (no diacritic)
devoicing: 0 (not much choice)
voicing: _ (not quite what I want, but can't find better yet :(( )
"advanced": +
"retracted": -
"raised": {
"lowered": } (not quite satisfied with those two)
retroflex: `
click: !
implosive, ejective: / (should I provide two different symbols for those?)
roundedness, unroundedness, laxness: (no idea yet :(( )
tiebar-diacritic marker: ^

How do they work now? Easy: take any simple character, for instance the
voiceless alveolar fricative s (taken straight from IPA, since it's a simple
character). If you advance it: s+, you get the voiceless *dental* fricative (T
in X-SAMPA). If you retract it: s-, you get the voiceless *postalveolar*
fricative (S in X-SAMPA, although I'm thinking of taking it as S in C-IPA too,
not that it will make the previous version wrong anyway :)) ). If you voice it:
s_, you get the *voiced* alveolar fricative (also z in C-IPA. As I said,
there's more than one way to do it. And conversely, z0 is equivalent to s :)) ).
If you raise it: s{, you get a stop, that is to say the same as t. And you're
allowed to use diacritics more than once, so s++ is equivalent to f, but that's
a bit stupid isn't it? ;))) Another example is that p! is the bilabial click,
while p\ is the voiceless bilabial fricative (like X-SAMPA, but its voiced form
is simply b\ here - or p\_, or v+, whatever you want :)) ) and p/ is the
bilabial ejective.
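
The examples above can be checked mechanically. What follows is only a rough
Python sketch under stated assumptions: the place and manner scales are
simplified (three manners only), the handful of base characters is just enough
for the examples, and the names PLACES, MANNERS, BASES and resolve are invented
here. The diacritic characters are the provisional ones listed above.

```python
# Simplified scales, assumed for this sketch (front of the mouth first).
PLACES = ["bilabial", "labiodental", "dental", "alveolar", "postalveolar",
          "retroflex", "palatal", "velar", "uvular"]
MANNERS = ["stop", "fricative", "approximant"]  # most to least constricted

# base character -> (voiced, place, manner)
BASES = {
    "p": (False, "bilabial", "stop"),
    "b": (True, "bilabial", "stop"),
    "f": (False, "labiodental", "fricative"),
    "v": (True, "labiodental", "fricative"),
    "t": (False, "alveolar", "stop"),
    "d": (True, "alveolar", "stop"),
    "s": (False, "alveolar", "fricative"),
    "z": (True, "alveolar", "fricative"),
}

def resolve(segment):
    """Apply each mark in turn to the features of the base character."""
    voiced, place, manner = BASES[segment[0]]
    p, m = PLACES.index(place), MANNERS.index(manner)
    for mark in segment[1:]:
        if mark == "+":
            p -= 1                          # advanced: one place forward
        elif mark == "-":
            p += 1                          # retracted: one place back
        elif mark == "{":
            m -= 1                          # raised: more constricted
        elif mark == "}":
            m += 1                          # lowered: less constricted
        elif mark == "_":
            voiced = True
        elif mark == "0":
            voiced = False
        elif mark == "|":
            m = MANNERS.index("stop")
        elif mark == "\\":
            m = MANNERS.index("fricative")
        elif mark == "<":
            m = MANNERS.index("approximant")
        else:
            raise ValueError(f"mark not handled in this sketch: {mark!r}")
        if not (0 <= p < len(PLACES) and 0 <= m < len(MANNERS)):
            raise ValueError(f"{segment!r} moves off the table")
    return voiced, PLACES[p], MANNERS[m]

for seg in ["s+", "s-", "s_", "s{", "s++"]:
    print(seg, resolve(seg))
```

With these tables, resolve("s++") gives the same features as resolve("f"), and
resolve("p\\_"), resolve("b\\") and resolve("v+") all agree, which is the "more
than one way to do it" principle in action.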

As you can see above, for many PoA and some MoA I didn't provide diacritics.
That's because I only have a limited number of characters, and the way the
simple characters are distributed, I can easily reach most if not all the IPA
characters. There's no diacritic for trills, because they all correspond to
simple characters here: B, r and R. From them you can get the others: r* (or
r}) is the alveolar flap, r< (or z<, or d<, whatever you want) is the alveolar
approximant, R< (or g<) is the velar approximant. There's no diacritic for
laterals, since they are all simple to reach: l is the alveolar lateral, l` the
retroflex one, L+ the palatal one and L the velar one (since small capital L is
velar in IPA). And l\ is the voiced lateral fricative (which can also be l{),
although here I'm thinking of using $ for the voiceless lateral fricative, and
thus the voiced one can also be $_.
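
Since several spellings can name the same sound, going back to the IPA can be
pictured as a simple table lookup. The sketch below is an illustration only: it
covers just the trill, tap, approximant and lateral spellings from the
paragraph above, pairs them with the IPA characters I understand them to stand
for (written as Unicode escapes), and uses the made-up names TABLE and to_ipa.

```python
# C-IPA spelling -> IPA character, limited to the examples discussed above.
TABLE = {
    "B": "\u0299",                                    # bilabial trill
    "r": "r",                                         # alveolar trill
    "R": "\u0280",                                    # uvular trill
    "r*": "\u027e", "r}": "\u027e",                   # alveolar tap/flap
    "r<": "\u0279", "z<": "\u0279", "d<": "\u0279",   # alveolar approximant
    "R<": "\u0270", "g<": "\u0270",                   # velar approximant
    "l": "l",                                         # alveolar lateral
    "l`": "\u026d",                                   # retroflex lateral
    "L+": "\u028e",                                   # palatal lateral
    "L": "\u029f",                                    # velar lateral
    "l\\": "\u026e", "l{": "\u026e",                  # voiced lateral fricative
}

def to_ipa(text):
    """Convert a C-IPA string to IPA, always trying the longest spelling first."""
    keys = sorted(TABLE, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(TABLE[key])
                i += len(key)
                break
        else:
            raise ValueError(f"no entry for {text[i:]!r} in this sketch")
    return "".join(out)

print(to_ipa("r<") == to_ipa("z<") == to_ipa("d<"))  # True
```

Trying longer spellings first matters because "L+" must not be read as "L"
followed by a stray "+".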

And what about vowels? Well, on those I didn't work that much, but from the
simple ones you can already get many. For instance, e} is X-SAMPA E (although
I'm thinking of taking it as E in C-IPA too), a{ is ae-ligature (which is also
& in C-IPA :)) ) and i- is barred-i (see how the principles behind C-IPA
provide some very mnemonic results in some places :)) ). As for slashed-o
(close-mid rounded front vowel), I could always render it o++ if I'm not afraid
of being ridiculous :)) . And of course, rhoticity is simply rendered by `, as
retroflexion. No ambiguity possible.
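
The same idea of moving around a table applies to the vowel examples. Here is a
small Python sketch, again only under assumptions: the height scale is
shortened to five steps (near-close and mid are left out), only five base
vowels are listed, and the names HEIGHTS, BACKNESS, VOWELS and shift are mine.

```python
# Shortened vowel grid, assumed for this sketch.
HEIGHTS = ["close", "close-mid", "open-mid", "near-open", "open"]
BACKNESS = ["front", "central", "back"]

# base character -> (height, backness, rounded)
VOWELS = {
    "i": ("close", "front", False),
    "e": ("close-mid", "front", False),
    "a": ("open", "front", False),
    "o": ("close-mid", "back", True),
    "u": ("close", "back", True),
}

def shift(segment):
    """Move a base vowel around the grid according to its marks."""
    height, backness, rounded = VOWELS[segment[0]]
    h, b = HEIGHTS.index(height), BACKNESS.index(backness)
    for mark in segment[1:]:
        if mark == "+":
            b -= 1        # advanced: towards the front
        elif mark == "-":
            b += 1        # retracted: towards the back
        elif mark == "{":
            h -= 1        # raised: towards close
        elif mark == "}":
            h += 1        # lowered: towards open
        else:
            raise ValueError(f"mark not handled in this sketch: {mark!r}")
        if not (0 <= h < len(HEIGHTS) and 0 <= b < len(BACKNESS)):
            raise ValueError(f"{segment!r} moves off the grid")
    return HEIGHTS[h], BACKNESS[b], rounded

for seg in ["e}", "a{", "i-", "o++"]:
    print(seg, shift(seg))
```

Rounding is carried along untouched, which is why o++ ends up as a rounded
front vowel.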

And as I said, using the tiebar sign restores the IPA value of the diacritic,
so if i- is barred-i, i^- is simply retracted i (underlined i in IPA). And by
the rule that says that things have to look like the IPA, it's simple to
indicate palatalisation: ^j, labialisation: ^w, lateral release: ^l, etc...
(the same rule provides us with ? for the glottal stop of course :) )
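
The difference between i- and i^- can be pictured as a small parsing rule: a
mark on its own is a table shortcut, while a character introduced by the tie
bar keeps its plain IPA diacritic reading. The Python sketch below assumes the
provisional ^ marker from the list above and that exactly one character follows
each marker; the name parse and the shape of its result are invented for the
example.

```python
def parse(segment, tie="^"):
    """Return (base, shortcut marks, literal IPA diacritics) for one segment."""
    if not segment:
        raise ValueError("empty segment")
    base, shortcuts, literal = segment[0], [], []
    i = 1
    while i < len(segment):
        if segment[i] == tie:
            if i + 1 >= len(segment):
                raise ValueError("tie bar with nothing after it")
            literal.append(segment[i + 1])   # keeps its IPA value
            i += 2
        else:
            shortcuts.append(segment[i])     # moves the base in the table
            i += 1
    return base, shortcuts, literal

print(parse("i-"))   # ('i', ['-'], [])
print(parse("i^-"))  # ('i', [], ['-'])
print(parse("t^j"))  # ('t', [], ['j'])
```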

As for suprasegmentals, although I didn't work much on them, I know that length
is obviously :, primary stress ', secondary stress , and syllable break . .

As you see, this system probably makes for more digraphs than X-SAMPA, but it
also makes for fewer trigraphs and a lot fewer other polygraphs, and doesn't
make "exotic" sounds any more difficult to write than they already are in IPA
itself. Also, since the principles behind the formation of those digraphs are
straightforward, you don't need to actually learn them all, as long as you know
the rules. The system is kind of "agglutinative" :) .

So, what do you all think of such a scheme? As I told you, only the principles
are set, the actual implementation is not stable yet (and not complete anyway).
If you have some ideas that may make it more aesthetic, I'd be happy to hear
them :) . And suggestions for diacritics of roundedness, unroundedness and
laxness would be more than welcome (although I could always consider that
X-SAMPA U is rendered u{ or Y-- ;))) ).

I'm eager to hear your comments! :) (praises are welcome too ;)) )


Take your life as a movie: do not let anybody else play the leading role.

