Re: Phonetics

From:	Benct Philip Jonsson <conlang@...>
Date:	Friday, March 23, 2007, 15:06
Abel Chiaro isnerq:

 >> >Also you may want to use precomposed
 >> >ăĕĭŏŭĂĔĬŎŬ etc. rather than combining
 >> >diacritics, since those still are likely to look better
 >> >in most Windows applications, especially in Upper Case.
 >
 > Actually, I *do* use precomposed ǣ, Ǣ, ă, ĕ, ĭ, ŏ,
 > ŭ, Ă, Ĕ, Ĭ, Ŏ, Ŭ... I'm confused.

And I wasn't using my brain. MediaWiki has a 'normalization
feature' for Unicode. It bit me before when I wanted i with
breve/macron and ogonek. Since I wanted the i's to be
undotted I used precomposed i-breve and i-macron with
combining ogonek, which MediaWiki 'normalized' to
precomposed i-ogonek (with dot) + combining breve/macron --
i.e. exactly what I *didn't* want. I had to put a zero width
joiner, most easily entered with the HTML entity &zwj;,
before the combining ogonek. Most irritating!
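
For the curious, the effect is easy to reproduce outside the
wiki. Here is a minimal sketch in Python, assuming that
MediaWiki's 'normalization' is plain Unicode NFC (the helper
names are mine, purely for illustration):

```python
import unicodedata

I_MACRON = "\u012B"  # LATIN SMALL LETTER I WITH MACRON
I_BREVE = "\u012D"   # LATIN SMALL LETTER I WITH BREVE
OGONEK = "\u0328"    # COMBINING OGONEK
ZWJ = "\u200D"       # ZERO WIDTH JOINER (what &zwj; produces)


def codepoints(text):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def nfc(text):
    """Apply Unicode Normalization Form C, assumed to be what the wiki does."""
    return unicodedata.normalize("NFC", text)


for base in (I_MACRON, I_BREVE):
    plain = nfc(base + OGONEK)          # ogonek directly after the letter
    guarded = nfc(base + ZWJ + OGONEK)  # joiner placed before the ogonek
    print(codepoints(base + OGONEK), "->", codepoints(plain))
    print(codepoints(base + ZWJ + OGONEK), "->", codepoints(guarded))
```

Without the joiner the i-macron comes back as U+012F U+0304,
i.e. i-ogonek plus combining macron. With it the sequence is
left alone, since the joiner is not a combining mark and the
ogonek cannot be reordered past it.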

 > By the way, that's exactly why I had to resort to the
 > macron for stressed <æ>, since there's no LATIN
 > SMALL/CAPITAL LETTER AE WITH BREVE... >:-( If at least
 > they had another COMBINING BREVE, suited for uppercase
 > letters... Is there a better way around it?

Well, there are fonts and software that do diacritic
stacking right(1), but you can't rely on wiki visitors
having them.

(1) E.g. the DejaVu fonts (Info: <http://tinyurl.com/yamuto>,
     Download: <http://tinyurl.com/2e8v3u>) and Charis SIL
     (<http://scripts.sil.org/CharisSILfont>; follow the
     download link in the box on the right) seem to do the
     right thing with Firefox
     <http://en-us.www.mozilla.com/en-US/firefox/>, at least
     on Windows XP.

My advice for now is to use the acute accent, which comes
precomposed with Ææ at U+01FC U+01FD, as a stress mark,
though I must say I'm biased in favor of the acute to mark
stress or length. (See
<http://wiki.frath.net/User:Melroch/Accents>!)
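
If you want to check for yourself which accents come
precomposed with Ææ, here is a small sketch using Python's
standard unicodedata module. The function name is made up for
the occasion, and the test is simply whether NFC fuses the
letter and the combining mark into a single code point:

```python
import unicodedata

AE_SMALL, AE_CAPITAL = "\u00E6", "\u00C6"
MARKS = {
    "acute": "\u0301",   # COMBINING ACUTE ACCENT
    "macron": "\u0304",  # COMBINING MACRON
    "breve": "\u0306",   # COMBINING BREVE
}


def composes(base, mark):
    """True if NFC fuses base + combining mark into one code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


for name, mark in MARKS.items():
    print(name, composes(AE_SMALL, mark), composes(AE_CAPITAL, mark))
```

Acute and macron compose for both cases; breve does not, which
is the gap you ran into.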

 >
 >> >Is this really your first language? Really impressive!
 >
 > Hey, thanks for that! I'd love some suggestions, if I
 > may ask!

I guess it's only that back when I started there was a
dominant association between constructed languages and
international auxiliary languages which confused the mind,
and I can assure you my first conlang didn't contain any
phonemes not found in Swedish or German (the two languages I
spoke natively), and one or two only found in Swedish. Also
the only 'exotic' morphological feature it contained was a
morphological plural (exotic from a Germanic POV)!
Admittedly I was very young at the time.

I'm afraid I can't give you very much structural advice ATM.
I'm in the midst of struggling with an overhaul of the case
and verbal agreement systems of my own conlang Kijeb
<http://wiki.frath.net/Kijeb>, so all you would get would
probably be the same ideas as I'm considering for Kijeb
(which might or might not be a good thing! :-) The one big
piece of advice for beginners is not to fall into the trap
of relexifying their native language (see
<http://en.wikipedia.org/wiki/Relexification>), but you
seem to be safe on that point -- unlike me when I began, I
might add!

 >> >The breve as a stress mark is a good deal confusing! :-)
 >
 > Haha, I thought it could be. :-) But it does look exotic,
 > doesn't it?

Sure!

 >> >Is there any special (concultural?) reason you prefer it
 >> >to the acute?
 >
 > Yes, there is: the stress mark used in the Ályis script
 > (called Ánvalyis "the writing", by the way) is identical
 > to a breve.

I thought as much. At one point I considered using overdot
on c/s z g e for /S dZ G &/ in my conlang Sohlob on the
model of the 'native' script
<http://wiki.frath.net/Sohlob_writing> but decided against
it because I wanted to stick with a Latin-1 clean
Romanization. There is an ASCII clean Romanization too, but
I hardly ever use it any more. The g-dot for /G/ would also
have been inaccurate, since in Sohlob writing /G/ is a-dot!

 > For the romanization, I did consider using the acute as
 > the preferred default, since to me it's much easier to
 > input (I'm on an XWindow System with a BR-ABNT2 keyboard
 > mapping), but then the sample texts gave me the impression
 > of shouting... :-)

Why so? I do agree with those that feel that ALL CAPS is
shouting, but I never had that feeling WRT acute. One Dutch
guy on this list used the acute for emphasis, and I guess
that if you are used to that it might feel like shouting,
but as I have read a lot of Icelandic in my day I'm glad I
don't have that association! :-)

 > On the other hand, at least with XWindow and my keyboard
 > layout, and in OOo Writer, the breve is entered with
 > (behold!) AltGr+Shift+\ and the macron with AltGr+Shift+[,
 > so it's not really that much of a trouble to write it...

Can't you make custom keyboards rather easily on XWindow?

 >
 > Thus, the final word is: acutes (or graves or carets or
 > diaeresis or carons, for that matter, as one just needs to
 > mark the stressed vowel) are just fine, but breves are
 > more accurate (at least graphically speaking).
 >
 > In the mean time, I'm working on updating the glyphs for
 > the ánvalyis script to my wiki.

I look forward to it.

 >> >And I just can't get my head around Ŋ for /ɲ/ (CXS
 >> >/J/)! :-) If the issue is with capitalization, there
 >> >actually is an Ɲ U+019D LATIN CAPITAL LETTER N WITH
 >> >LEFT HOOK in Unicode, although I readily agree that
 >> >neither ɲ nor Ɲ are very aesthetically pleasing -- how
 >> >is one to write them in cursive to start with? I for one
 >> >vastly prefer diacritics to IPA symbols in Romanization
 >> >exactly because of the issues of cursive writing and
 >> >capitalization, but I'm aware there are those who abhor
 >> >diacritics! :-)
 >
 > Yeah, this LATIN LETTER N WITH LEFT HOOK is indeed ugly...
 > and I'm aware that ŋ is the IPA letter for the velar
 > nasal, but again I chose the LATIN LETTER ENG for ease of
 > input: with XWindow+BR_ABNT2, all I need is
 > AltGr+[Shift]+G... and the eng is really beautiful when in
 > italics, especially with the Gentium font by SIL (great
 > font for linguists, by the way).

Agree. Too bad most capital ENGs look terrible. I prefer the
form looking like an enlarged lowercase eng to the one
looking like an NJ ligature. As for Gentium I can only
agree. I only hope it'll soon get the same upgrading as
Charis SIL got.

 > Speaking again of graphical proximity to ánvalyis, I
 > would have preferred to use ogoneks with all the basic
 > consonants (that is, b, p, d, t, z, s, g, k, l, r, m, n),
 > because there all their "h-counterparts" have descenders,
 > but the COMBINING OGONEK looks poor with most of them (and
 > typing AltGr+Shift+= (the ogonek) doesn't cut it, as there
 > aren't mappings for all those consonants with the
 > ogonek...) -- and I'd have to think of something else
 > about g, which already has a descender... H and Y will
 > have to do.

What about comma below? You have a rather full set for that:

* Ḑḑ U+1E10 U+1E11 LATIN ... LETTER D WITH CEDILLA
* Ģģ U+0122 U+0123 LATIN ... LETTER G WITH CEDILLA
* Ķķ U+0136 U+0137 LATIN ... LETTER K WITH CEDILLA
* Ļļ U+013B U+013C LATIN ... LETTER L WITH CEDILLA
* Ņņ U+0145 U+0146 LATIN ... LETTER N WITH CEDILLA
* Șș U+0218 U+0219 LATIN ... LETTER S WITH COMMA BELOW
* Țț U+021A U+021B LATIN ... LETTER T WITH COMMA BELOW

Unicode doesn't distinguish very clearly between cedilla and
comma below. The canonical shape used in Latvian and
Romanian is comma, while the Turkish is cedilla. The
confusing names are a holdover from a time when one thought
the Turkish and Romanian forms could be considered variants
of one another. In fact you should take care *not* to use

* Şş U+015E U+015F LATIN ... LETTER S WITH CEDILLA
* Ţţ U+0162 U+0163 LATIN ... LETTER T WITH CEDILLA

since those have the cedilla shape. T-cedilla actually was
proposed for use in French at one time long ago, but no
language actually uses it AFAIK. The problem with the Dd +
comma (U+1E10 U+1E11) is that most fonts don't cover them.
Of course you can set up your wiki to select suitable fonts
-- in fact you should, since MS Internet Exploder is stupid
WRT font substitution.
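
The naming muddle shows up if you ask the Unicode database
directly. Here is a rough sketch using Python's standard
unicodedata module; the list holds just the capital letters
mentioned above, and the helper name is mine. Glyph shape is up
to the font, so the script can only report names and canonical
decompositions:

```python
import unicodedata

# Capital letters discussed above: D G K L N, S/T comma below, S/T cedilla.
LETTERS = [0x1E10, 0x0122, 0x0136, 0x013B, 0x0145,
           0x0218, 0x021A, 0x015E, 0x0162]


def below_mark(ch):
    """Name of the combining mark in the canonical decomposition of ch."""
    parts = unicodedata.decomposition(ch).split()
    return unicodedata.name(chr(int(parts[-1], 16)))


for cp in LETTERS:
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)} -> {below_mark(ch)}")
```

By name and decomposition D G K L N all count as cedilla
letters, whatever shape the font draws; only U+0218 and U+021A
decompose to COMBINING COMMA BELOW.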

Alas there is no precomposed Hh with comma below. You could
be radical and use Ȝȝ U+021C U+021D LATIN ... LETTER YOGH
(which look reasonably like the thing Egyptologists use for
'ayin') for /h\/, or if you feel French, use Ŗŗ U+0156
U+0157 LATIN ... LETTER R WITH CEDILLA for /G/ and g with
cedilla for /h\/, or the other way around! :-)


 > As for diacritics (or other markings) on consonants... I
 > do like them, but I wanted ályis transliteration as clean
 > as possible; hence I prefer to use LY to Ł, NY to Ŋ. I
 > also stick with TH and DH because the thorn (þ Þ) and
 > the eth (ð Ð) still look a bit weird to me. :-) Bear in
 > mind that those consonant variants in the transliteration
 > chart are the *optional* ones I would use to have letter-to-
 > letter correspondence between ánvalyis and the latin
 > alphabet. Personally, I think (or ) "our speak" looks much
 > better that way than <ðinăłis> or <ðináłis>...

Being an Islandophile I have no problem with þ and ð,
although word initial ð does look weird to me too. For some
reason word-internal þ goes down much more easily --
perhaps because it marginally occurs in compound words in
Icelandic.

 > Again, thank you for your comments!

Thanks for listening!

BTW the key to our CXS phone*ic transcription is at
<http://www.theiling.de/ipa/>