Re: TECH: Sound Change program

From: Paul Bennett <paul-bennett@...>
Date: Wednesday, April 6, 2005, 1:02
On Tue, 05 Apr 2005 12:24:17 -0400, Stephen Mulraney
<ataltane.conlang@...> wrote:

> On Mar 25, 2005 7:22 PM, Paul Bennett <paul-bennett@...> wrote:
>> ----- Original Message -----
>
> First, it all sounds wonderful. I hope it works as planned!
>
>> From: Benct Philip Jonsson <bpj@...>
>>
>> > Paul Bennett skrev:
>> > > I think I like single-character variable names, to be honest.
>> >
>> > 2) you run out of *meaningful* single letter abbreviations even
>> > sooner.
>>
>> Yes.
>
> I think single-letter var names would be terribly restrictive too.
> Especially when you consider, as you mentioned earlier, that you may
> want the variables to change over time, e.g. if the vowel inventory
> changes, you might need "V_time0=a,i,u" and "V_time1=a,e,i,o,u". Of
> course, if you implemented the wonderful ability to redefine variables
> throughout the input file, it would make this a bit easier.
That's entirely the plan, expressed via features.
> However, it would still be unnecessarily confusing (e.g. V for all
> vowels, F for front vowels, B for back vowels, where V, Vfront and
> Vback or whatever would help keep users sane while editing complex
> files).
What I've devised allows you to combine variables and features (frontness, or whatever). At a given point in the sound change definition file, you can put (e.g.)

    i=V[+front][+close][-round]

and later on, something like

    k > c / $V[+front]

At least, that's the plan.

I'm already overdue on handing the assignment back, so I'm going to go without variables, and just use the existing regular-expression syntax that's in place in the language. However, I'm fascinated enough that I will almost certainly finish this project "properly".
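As a rough illustration of how a definition like i=V[+front][+close][-round] and a rule like k > c / $V[+front] might fit together, here is a minimal Java sketch. It is not the program described in this thread. The class and method names are hypothetical, symbols are assumed to be single characters, and the environment $V[+front] is read here as "immediately before a front vowel", which is only one possible interpretation of the notation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch, not the program discussed in this thread. */
final class FeatureRuleSketch {
    // Matches one bracketed feature such as [+front] or [-round].
    private static final Pattern FEATURE = Pattern.compile("\\[([+-])(\\w+)\\]");

    // symbol -> class name, e.g. 'i' -> "V"
    private final Map<Character, String> classOf = new HashMap<>();
    // symbol -> (feature -> true for '+', false for '-'); absent means unmarked
    private final Map<Character, Map<String, Boolean>> featuresOf = new HashMap<>();

    /** Reads a definition such as "i=V[+front][+close][-round]" (single-character symbol assumed). */
    void define(String line) {
        String trimmed = line.trim();
        char symbol = trimmed.charAt(0);
        String spec = trimmed.substring(trimmed.indexOf('=') + 1);
        int bracket = spec.indexOf('[');
        classOf.put(symbol, bracket < 0 ? spec : spec.substring(0, bracket));
        Map<String, Boolean> features = new HashMap<>();
        Matcher m = FEATURE.matcher(spec);
        while (m.find()) {
            features.put(m.group(2), m.group(1).equals("+"));
        }
        featuresOf.put(symbol, features);
    }

    /** True if the symbol is in the class and carries the feature with the given sign. */
    boolean matches(char symbol, String className, String feature, boolean value) {
        Map<String, Boolean> features = featuresOf.get(symbol);
        return className.equals(classOf.get(symbol))
                && features != null
                && Boolean.valueOf(value).equals(features.get(feature));
    }

    /** Rewrites 'from' as 'to' wherever the following symbol is className[+feature] (assumed reading). */
    String apply(String word, char from, char to, String className, String feature) {
        StringBuilder out = new StringBuilder(word);
        for (int i = 0; i + 1 < word.length(); i++) {
            if (word.charAt(i) == from && matches(word.charAt(i + 1), className, feature, true)) {
                out.setCharAt(i, to);
            }
        }
        return out.toString();
    }
}
```

With i defined as above and a hypothetical a=V[-front][-close][-round], calling apply("kika", 'k', 'c', "V", "front") changes only the first k, since the second one stands before a vowel marked [-front].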
> Some positive remarks: unicode support sounds brilliant.
Sounds brilliant to me, too, but when I tried to use a UTF-8 corpus of Latin with macrons, the file reader class barfed hugely. I think I tried UTF-16 with the same nasty results, IIRC.
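One common source of trouble when reading non-ASCII text in Java is relying on the platform's default charset instead of naming the encoding explicitly. Whether that is what went wrong here is not known. The following is only a sketch of reading a one-word-per-line corpus as UTF-8, with a hypothetical class name and file layout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: the corpus format (one word per line) is an assumption. */
final class CorpusReaderSketch {
    /** Reads non-blank lines, decoding bytes as UTF-8 regardless of the platform default. */
    static List<String> readCorpus(Path file) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }
}
```

For a UTF-16 corpus the same shape should work with StandardCharsets.UTF_16 in place of UTF_8.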
> Trying to find suitable chars from the feeble pool of alphanumerics
> plus the chars that taliesin listed had me tearing my hair out when
> working with the previously existing programs. And what's worse, after
> spending uncountable hours on writing the GMPs in the right format, it
> was nigh unreadable and undebuggable, and now, a year or two later, the
> whole mess is utterly impenetrable. A simple feature like unicode
> support (without ignoring case of alpha chars!)
Without?
> plus the ability to give sensible var names would make writing
> maintainable files vastly easier.
Variable names with features I think is the way to go.
> But the thing that really has me excited is the mention of possible
> featural, umm, features.
Aha. That'll teach me to read the entire message before hitting "Reply". I shall choose not to censor myself, though.
>> > If V is "vowel" then you want the varnames for e.g.
>> > "rounded vowel" and "front vowel" to be VR and VF.
>>
>> Yes, although with features, that could be V[+round] and V[+front]. I
>> think that's the notation I prefer reading.
>
> But there's likely to be a need for variables that don't have a
> featural description, too. Not everyone will want to use a strict
> featural approach. For that matter, in the past I've used these kind of
> programs to implement things other than phonological changes: flipping
> between orthographical systems is quite a natural use, too, but it
> would be a pain to try to bludgeon such an algorithm into a featural
> mold.
Features will be entirely ad-hoc, and absolutely voluntary. If you want

    i=X[+shlrdu][-qwerty]
    d=X[+shlrdu][+foo]

you will be able to have it (with |i| unmarked as to fooness, and |d| unmarked as to qwertyhood), and if you want

    a,i,u=V
    b,d,g,k,p,t=C

then you will be able to have that, too.
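The "unmarked" idea can be pictured as a three-way value per feature: plus, minus, or never mentioned. Below is a small Java sketch of one way to hold that, accepting both the bracketed definitions and the plain comma lists. All names are hypothetical and the parsing is deliberately naive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of ad-hoc, optional features. */
final class AdHocFeaturesSketch {
    private static final Pattern FEATURE = Pattern.compile("\\[([+-])(\\w+)\\]");

    private final Map<String, String> classes = new HashMap<>();
    private final Map<String, Map<String, Boolean>> marks = new HashMap<>();

    /** Accepts "i=X[+shlrdu][-qwerty]" as well as plain lists like "a,i,u=V". */
    void define(String line) {
        String[] sides = line.trim().split("=", 2);
        String spec = sides[1];
        int bracket = spec.indexOf('[');
        String className = bracket < 0 ? spec : spec.substring(0, bracket);
        Map<String, Boolean> features = new HashMap<>();
        Matcher m = FEATURE.matcher(spec);
        while (m.find()) {
            features.put(m.group(2), m.group(1).equals("+"));
        }
        for (String symbol : sides[0].split(",")) {
            classes.put(symbol, className);
            marks.put(symbol, new HashMap<>(features));
        }
    }

    /** Class name for a symbol, or null if the symbol was never defined. */
    String classOf(String symbol) {
        return classes.get(symbol);
    }

    /** Empty means unmarked: the feature was never mentioned for this symbol. */
    Optional<Boolean> value(String symbol, String feature) {
        Map<String, Boolean> features = marks.get(symbol);
        return features == null ? Optional.empty() : Optional.ofNullable(features.get(feature));
    }
}
```

Under this sketch, value("i", "foo") comes back empty after the first pair of definitions, while value("d", "foo") holds true.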
> Ooh! Feature request!!! :). Don't worry, it's easy: the ability to
> produce output which is correctly formatted as input to a further run
> of the program. That not only saves time writing awkward little sed
> scripts or whatnot, but also makes it very easy to generate related
> langs, like:
Eh? The program reads a corpus file and a sound change file, applies the sound changes in the order they're written in the sound change file (stopping at the specified year), and outputs the changed results, either to stdout or a Text Box (depending on whether you're doing the GUI thing). I suppose I could also include a "Copy Output To Input" button in the GUI.
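The flow just described (corpus in, changes applied in file order up to a cut-off year, changed words out) could be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the actual program: each change is assumed to be a dated regular expression with a replacement, the change list is assumed to be in chronological order, and the cut-off year is treated as inclusive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the corpus-plus-changes pipeline. */
final class SoundChangePipelineSketch {
    /** One dated change: a regular expression and its replacement (assumed shape). */
    static final class Change {
        final int year;
        final String regex;
        final String replacement;

        Change(int year, String regex, String replacement) {
            this.year = year;
            this.regex = regex;
            this.replacement = replacement;
        }
    }

    /** Applies changes in list order to every word, stopping after stopYear (inclusive). */
    static List<String> run(List<String> corpus, List<Change> changes, int stopYear) {
        List<String> out = new ArrayList<>(corpus);
        for (Change change : changes) {
            if (change.year > stopYear) {
                break; // relies on the assumed chronological ordering
            }
            for (int i = 0; i < out.size(); i++) {
                out.set(i, out.get(i).replaceAll(change.regex, change.replacement));
            }
        }
        return out;
    }
}
```

Because run returns a plain word list of the same shape as its input, its result can be passed straight back in with a different change list, which is the programmatic counterpart of a "Copy Output To Input" button.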
> A = ancestor lang, X, Y, Z child langs, and B=common-X-Y-Z (not
> necessarily a fully developed lang, but containing all the changes
> from A common to X, Y and Z). Have soundchange files A_to_B.sc and
> then one of B_to_X.sc, B_to_Y.sc, or B_to_Z.sc. Then run the
> appropriate combination of sc files to get whatever child you like.
Yep, yep.
> Another approach to this is the one used in Geoff's Sound Change
> Applier, where you preface each line in the soundchange file with
> "X", "Y", "Z", "XY", "XYZ", or whatever; indicating the list of
> children langs that the line affects. But this requires more
> complexity in the parser.
Not feasible at this stage. I'm thinking about XMLifying the Sound Change format, despite my hatred for XML, because I think it would be a workable way to store several child languages with each other. More thinking is required.
> By the way, what platform will it be for? Personally, I'm easy(*), but
> a unix text mode version would be the answer to all my prayers :)
>
> * Well, once it's windows or unix :)
It shall be Java, with GUI completely optional. Thus, hopefully, if you've got it, it'll run on it (sufficiently wimpy devices need not apply).
> But it all sounds great. I await with trembling fingers.
As do I, sir. As do I.

Like I say, the drop-dead date for new function has long since passed (for now), but depending on what my workload is like this summer I will keep working on it until it is perfect.

Paul
