TECH: a possible bug and Latin stress rules in schcompile
From: Benct Philip Jonsson <bpj@...>
Date: Tuesday, April 25, 2006, 14:11
I have come upon a possible bug in schcompile, or rather in its
output files: when I call a subroutine from a schcompile-generated
module I have to qualify it with the package name, i.e.
$w = step_stress($w);
doesn't work. I have to say
$w = latinstress::step_stress($w);
Is this the expected behaviour, an error in schcompile or its
output, or ActivePerl/Windows acting up? Naturally I
suspect the last, but I would like to know, especially
whether there is a workaround.
The good news is that I've succeeded in formulating the
rules for Latin stress placement. The formulation *is*
quite similar to my earlier formulation in plain, messy
Perl regexps, but it does look a lot better in sch syntax! :-)
rule "stress marks before vowels" -- Rule (1)
V > 'V / 'C+ _
-- This is so that the user can input words with stress already
-- assigned and written IPA style with the mark before the
-- first consonant of the syllable, but this is *not* the
-- notation we want internally: the mark must go immediately
-- before the vowel for rules like (5b/c), where we have to
-- include the group macro V in the translation, to work.
rule "delete excess stress marks" -- Rule (2)
'C > C
-- I don't know whether V > 'V / 'C+ _ in rule (1) and 'C > C in
-- rule (2) will bleed each other, but it doesn't seem so in
-- testing. It is *not* possible to have them as subrules of
-- the same rule, since then once V > 'V / 'C+ _ has applied,
-- 'C > C will not be carried out, which is not what we want.
-- Unfortunately it is not possible to do the whole move in one
-- step as 'C+ > C+', since the + sign will be copied literally
-- into the translation.
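
For readers without sch at hand, here is a rough Python sketch of what rules (1) and (2) do together when run as two separate passes. It is not schcompile output. The character classes CONS and VOWEL and the function names are assumptions made for illustration, since the group definitions of the sch file are not shown here.

```python
import re

# Assumed stand-ins for the sch groups C and V (not taken from the sch file).
CONS = "bcdfghjklmnpqrstvxz"
VOWEL = "aeiouy"

def mark_before_vowel(word):
    """Rule (1), V > 'V / 'C+ _ : copy a stress mark onto the vowel
    that follows a marked consonant cluster."""
    return re.sub(rf"('[{CONS}]+)([{VOWEL}])", r"\1'\2", word)

def drop_mark_before_consonant(word):
    """Rule (2), 'C > C : delete a stress mark standing before a consonant."""
    return re.sub(rf"'([{CONS}])", r"\1", word)

def normalize_marks(word):
    """Run rule (1) and then rule (2) as two separate passes."""
    return drop_mark_before_consonant(mark_before_vowel(word))

print(normalize_marks("a'mi:cus"))  # IPA-style input, mark before the onset
print(normalize_marks("'stella"))   # mark before a two-consonant onset
```

Running the two passes in sequence is what moves the mark from the syllable onset to the vowel: the first pass adds the new mark, the second removes the old one.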
rule "delete excess stress marks" -- Rule (3)
-- only one stress mark per word!
' > 0 / ' (C,V)* _
'' > '
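
The effect of rule (3), keeping only the leftmost stress mark, can be sketched in Python as follows. The function name is hypothetical, and the sketch works on the plain string rather than on sch groups.

```python
def keep_first_mark(word):
    """Keep the leftmost stress mark and delete every later one."""
    head, mark, tail = word.partition("'")
    # 'mark' is empty when the word has no stress mark at all.
    return head + mark + tail.replace("'", "")

print(keep_first_mark("d'om'inus"))
print(keep_first_mark("''amo"))
```

This covers both subrules at once: a second mark later in the word and a doubled mark are each reduced to the single leftmost mark.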
rule "input orthography fix" -- Rule (4)
ngu > ngv / _ V
qu > qv / _
x > cs
-- This rule is AFAIK necessary in order for the identification
-- of heavy and light syllables in rule (5b) below to work out
-- right. At least x > cs is: with the other two it actually
-- comes down to the same thing either way!
-- In a *real* Romlang sch file this rule should clearly be
-- external to the stress assignment step...
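
A small Python sketch of the orthography fix in rule (4), for illustration only. The vowel class is an assumption, and qu > qv is treated here as unconditional, which is how the subrule reads as written above.

```python
import re

def fix_orthography(word):
    """Rewrite input spellings so that every consonant is one symbol."""
    word = re.sub(r"ngu(?=[aeiouy])", "ngv", word)  # ngu > ngv / _ V
    word = word.replace("qu", "qv")                 # qu > qv
    word = word.replace("x", "cs")                  # x > cs
    return word

for w in ("lingua", "aqua", "rex", "angulus"):
    print(w, "->", fix_orthography(w))
```

The point of x > cs is that a single letter x would otherwise be counted as one consonant when the syllable weight is checked, although it stands for two.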
rule "stress assignment" -- Rule (5)
(C,V)* ' (C,V)* > _ -- Rule (5a)
-- Leave the word alone if there is any stress mark in the
-- input, i.e. stress marked in the input should override
-- automatic stress assignment. This seems to work as intended
-- when tested.
-- antepenultimate_stress -- Rule (5b)
V > 'V / _ C* Vshort C? V C* #
-- penultimate_stress -- Rule (5c)
V > 'V / _ C* V C* #
-- Since rule (5c) may apply even if rule (5b) has applied, the
-- fixing rule (3) becomes necessary to remove the non-leftmost
-- stress mark(s) if there is more than one in the word.
-- Still this is a more economical and safe way to do things:
-- the alternative would be to make a subrule for each possible
-- combination of long vowels, short vowels, single consonants
-- and consonant clusters in the three last syllables, and to
-- put them in the right order so that the different subrules
-- don't block each other. This way is *much* easier!
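
The logic of rules (5a) to (5c) followed by the cleanup of rule (3) can be sketched in Python like this. It is only an approximation of the sch rules under stated assumptions: long vowels are written with a following colon (as in "a:"), C, V and V_SHORT are my own stand-ins for the sch groups, diphthongs are not treated specially, and the input has already gone through the orthography fix of rule (4).

```python
import re

# Assumed stand-ins for the sch groups (not taken from the sch file).
C = "[bcdfghjklmnpqrstvxz]"
V = "[aeiouy]:?"           # any vowel, short or long
V_SHORT = "[aeiouy](?!:)"  # a vowel not followed by the length mark

# (5b) V > 'V / _ C* Vshort C? V C* #
ANTEPENULT = re.compile(rf"({V})(?={C}*{V_SHORT}{C}?{V}{C}*$)")
# (5c) V > 'V / _ C* V C* #
PENULT = re.compile(rf"({V})(?={C}*{V}{C}*$)")

def assign_stress(word):
    if "'" in word:                      # (5a) stress given in the input wins
        return word
    word = ANTEPENULT.sub(r"'\1", word)  # (5b) light penult: stress antepenult
    word = PENULT.sub(r"'\1", word)      # (5c) stress the penult regardless
    # Rule (3): keep only the leftmost mark.
    head, mark, tail = word.partition("'")
    return head + mark + tail.replace("'", "")

for w in ("dominus", "ami:cus", "puella", "rosa", "a'mi:cus"):
    print(w, "->", assign_stress(w))
```

As in the sch version, (5c) is allowed to fire after (5b), and the surplus mark is removed afterwards instead of enumerating every combination of syllable weights.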
rule "monosyllable stress" -- Rule (6)
syllable first last
V > 'V
-- This rule should perhaps be optional, since most
-- monosyllables will be clitics that should normally be
-- unstressed. BTW there are IIRC also disyllabic clitics like
-- _ante_, so perhaps there should be a way to flag a word as
-- clitic in the input, e.g. to prepend it with some sign, then
-- have a subrule for it at the beginning of the stress assignment
-- rule, since if it applies it will also prevent the
-- following subrules for automatic stress assignment from
-- applying.
-- Probably one should define two groups for stressed and
-- unstressed vowels, i.e. "'a, 'a:, 'e, 'e:" etc., and
-- also include the stressed vowel spellings in the vowel/V
-- group, so that one doesn't have to insert a '? every time
-- one wants to do something to a CV sequence regardless of
-- whether the vowel is stressed or not.
-- TODO: test if including "'a" etc. in the vowel/V group will
-- interfere negatively with the operation of the stress
-- assignment rules.
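
The monosyllable rule (6) together with the clitic flag idea could look roughly like this in Python. The "=" prefix is a purely hypothetical choice of flag (the notes above only say "some sign"), the function name is made up, and diphthongs are not handled: two adjacent vowel letters count as two syllables here.

```python
import re

VOWEL = re.compile(r"[aeiouy]:?")  # assumed vowel spellings, long ones with ':'

def stress_monosyllable(word, clitic_flag="="):
    """Stress the only vowel of a monosyllable unless it is flagged as clitic."""
    if word.startswith(clitic_flag):
        # Hypothetical clitic flag: strip it and leave the word unstressed.
        return word[len(clitic_flag):]
    if "'" in word:
        return word                # already stressed in the input
    if len(VOWEL.findall(word)) != 1:
        return word                # not a monosyllable
    return VOWEL.sub(lambda m: "'" + m.group(0), word, count=1)

for w in ("recs", "re:s", "=et", "=ante", "rosa"):
    print(w, "->", stress_monosyllable(w))
```

Checking the flag first mirrors the suggestion of putting the clitic subrule at the very beginning, so that a flagged word never reaches the automatic assignment.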
Benct Philip Jonsson -- melroch at melroch dot se
"Maybe" is a strange word. When mum or dad says it
it means "yes", but when my big brothers say it it
(Philip Jonsson jr, age 7)