Re: TECH: schcompile (Was: More Þrjótran)
From: Benct Philip Jonsson <bpj@...>
Date: Sunday, April 23, 2006, 11:17
Henrik Theiling skrev:
>>> sub step_NAME($;$$)
>
>
> Functions are generated according to this template, so that's the way
> to invoke them:
>
> $result= step_NAME ($input);
> $result= step_NAME ($input, 100);
> $result= step_NAME ($input, 100, 1000);
> $result= step_NAME ($input, undef, 1000);
>
> (Replace NAME with the name of the step, of course.)
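If I read the prototype right, ($;$$) means one mandatory scalar followed by up to two optional ones. Here is a minimal sketch of how the four calls above arrive inside such a function. Note that step_demo is a made-up stand-in, not code generated by schcompile, and I'm not assuming anything about what the two optional arguments mean:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a generated step function. The real generated
# code and the meaning of the two optional arguments are not shown here.
sub step_demo($;$$) {
    my ($input, $opt1, $opt2) = @_;
    # Report each argument as received, writing 'undef' for missing ones.
    return join ', ', map { defined $_ ? $_ : 'undef' } ($input, $opt1, $opt2);
}

print step_demo('word'), "\n";
print step_demo('word', 100), "\n";
print step_demo('word', 100, 1000), "\n";
print step_demo('word', undef, 1000), "\n";
```

So an omitted argument and an explicit undef look the same from inside the function, which is presumably why the last call form works for skipping the second argument.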
I've been wondering about one thing: you seem to have
the same rules (mostly fixes, as I understand it) in several
different steps in s17.sch (and I'm beginning to understand
why! :-) but won't they conflict with one another if you run
the output of one step through another step, as you are obviously
supposed to do?
>>BTW you wonder in s17.sch where Modern Icelandic words in vo-
>>come from; they generally come from OIc vá-.
>
>
> Ah! That's interesting -- this probably means I'm missing a rule in
> the OIc > Ic step. ...types... Indeed, some forms changed. Is this
> transformation a generally valid rule or are there constraints or even
> random?
I think it's general. At least a quick comparison of the vo-
section of a MIc dictionary with the vá- section of an
ON dictionary gives the impression that it's general.
Remember that ON |á| was [Q:]; it doesn't seem strange
that diphthongization to [aw] didn't occur after [w].
>>I'll have more comments and questions on your rules later...
>>Maybe we should take that on Germaniconlang?
>
>
> Sure! I'd appreciate it! :-)
>
>
>>Question: I derive 3 or 4 "dialects" from Kijeb. They'll have some
>>sound changes in common, either between two of them or between all
>>of them. How would that be handled?
>
>
> It depends on your taste: either use different .sch files, or use
> different steps in the same file. It does not matter technically,
> it's up to you.
I suppose I can also use options and condition flags for this
purpose -- that would correspond to what I have done in my old
(much less sophisticated) Sohlob soundchange script, e.g.:
    unless ($dialect_C) { s/(sr|rs)y/hl/g; }
    if ($dialect_B) { s/(sr|rs)/hr/g; }
    elsif ($dialect_A) { s/(sr|rs)/hl/g; }
Wouldn't that translate into the following?
MATCH > TRANSLATION / CONTEXT => CONDITION_FLAG
(sr,rs)y > hl / _ => !dialect_C
(sr,rs) > hr / _ => dialect_B
(sr,rs) > hl / _ => dialect_A
(Assuming that you *can* negate a condition flag
-- you don't do that anywhere in your example rules file...)
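For comparison, here is the old-script approach as a self-contained, runnable sketch. The sub name apply_sr_rules, the single-letter dialect names and the test words are made up for illustration. It also assumes that a word belongs to exactly one dialect:

```perl
use strict;
use warnings;

# Apply the shared sr/rs changes to one word for one dialect.
# The dialect names 'A', 'B', 'C' and the sub name are illustrative only.
sub apply_sr_rules {
    my ($word, $dialect) = @_;
    # Every dialect except C turns sry/rsy into hl first.
    $word =~ s/(?:sr|rs)y/hl/g unless $dialect eq 'C';
    # Any remaining sr/rs becomes hr in B and hl in A; C keeps it.
    if    ($dialect eq 'B') { $word =~ s/(?:sr|rs)/hr/g }
    elsif ($dialect eq 'A') { $word =~ s/(?:sr|rs)/hl/g }
    return $word;
}

for my $dialect (qw(A B C)) {
    my @forms = map { apply_sr_rules($_, $dialect) } qw(asrya arsa);
    print "$dialect: @forms\n";
}
```

The order matters here just as in the rule list: the rule with the trailing y has to run before the general one, or the general rule would eat the cluster first.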
BTW since you have "backward|regressive|reverse" as
synonyms, wouldn't it make sense to have "progressive" as a
synonym of "forward"? After all "progressive assimilation"
is the term historical linguists use.
Also, what do the square brackets mean in a syllable
selector like:
syllable first heavy !h_end, [ umlaut_i !h_start ]
? I don't think you say that in the documentation.
>>Question: can one use special characters like þ ð åäö throughout
>>a .sch file, or does one *have* to use ASCII-IPA as you do?
>
>
> Because the file is translated to Perl with the literal strings left
> as is, you can use whatever your Perl installation accepts. I.e.,
> basically anything.
OK, but how do I make Perl know that an input file is in UTF-8?
I tried with a simple program:
    while (<INFILE>) {
        chomp;
        print length($_) . "\n";
    }
Which printed 12 when the line in the input file was really
six UTF-8 characters! So I guess there must be some way of
telling Perl in what encoding INFILE is. I looked in the
Perl Unicode introduction and found no answer -- at least
none that I understood! :-/
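One thing that might do it, assuming Perl 5.8 or later with PerlIO layers, is to put an :encoding(UTF-8) layer on the filehandle so that the bytes are decoded into characters as they are read. A small sketch follows; the helper line_length and the in-memory "file" are made up for illustration and stand in for a real input file:

```perl
use strict;
use warnings;

# Six characters (þ ð å ä ö é) spelled out as their twelve UTF-8 bytes,
# so that the example does not depend on how this script itself is saved.
my $bytes = "\xC3\xBE\xC3\xB0\xC3\xA5\xC3\xA4\xC3\xB6\xC3\xA9\n";

# Read the first line of an in-memory "file" through the given I/O layer
# and return its length after chomp.
sub line_length {
    my ($layer) = @_;
    open(my $fh, "<$layer", \$bytes) or die "Cannot open: $!";
    my $line = <$fh>;
    close($fh);
    chomp $line;
    return length($line);
}

my $raw_length     = line_length(':raw');              # counts bytes
my $decoded_length = line_length(':encoding(UTF-8)');  # counts characters

print "raw: $raw_length, decoded: $decoded_length\n";
```

With an already opened bareword handle like INFILE above, calling binmode(INFILE, ':encoding(UTF-8)') before the read loop should have the same effect.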
BTW do you know the Sort::ArbBiLex module? It should come
in handy if you should ever want to make a Þrjótran
dictionary -- or indeed an alphabetical list of words from
any lang that uses its own sorting order!
>
> **Henrik
>
>
--
/BP 8^)>
--
Benct Philip Jonsson -- melroch at melroch dot se
"Maybe" is a strange word. When mum or dad says it
it means "yes", but when my big brothers say it it
means "no"!
(Philip Jonsson jr, age 7)