
Re: IPA speech synthesizer

From:Gary Shannon <fiziwig@...>
Date:Friday, February 20, 2009, 20:49
--- On Fri, 2/20/09, Philip Newton <philip.newton@...> wrote:

> > Would it not be possible to create a machine that could read and
> > reproduce spectrograms? (Or is that what Alex's "bigrams" meant?)
>
> I interpreted "bigrams" as combinations of phones.
>
> So that, for example, "conlang" would be, conceptually, produced by
> splicing together the bigrams for [kQ] [Qn] [nl] [l&] [&N]. (Not sure
> whether [#k] and [N#] would also get bigrams; possibly so.)

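The bigram reading quoted above can be illustrated with a small sketch. It only shows how a phone sequence would be cut into overlapping pairs, not how the corresponding audio would be spliced. The function name `phone_bigrams` and the `mark_boundaries` flag are hypothetical, and whether word-boundary bigrams such as [#k] and [N#] are used at all is left open in the quoted message.

```python
def phone_bigrams(phones, mark_boundaries=False):
    """Return the overlapping phone pairs for a word.

    phones: sequence of phone symbols, e.g. ["k", "Q", "n", "l", "&", "N"].
    mark_boundaries: if True, add "#" at both ends so that the
    word-initial and word-final pairs ([#k], [N#]) are included.
    This flag is an assumption made for illustration only.
    """
    phones = list(phones)
    if mark_boundaries:
        phones = ["#"] + phones + ["#"]
    # Pair each phone with its right-hand neighbour.
    return [a + b for a, b in zip(phones, phones[1:])]


conlang = ["k", "Q", "n", "l", "&", "N"]
print(phone_bigrams(conlang))
print(phone_bigrams(conlang, mark_boundaries=True))
```

Without boundary marks this yields the five pairs listed in the quoted message; with them, two extra pairs appear at the edges of the word.
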
The biggest problem to my (uninformed) mind is that of intonation. Bigrams, even very well done bigrams (or trigrams, or N-grams), will still produce monotone renderings.

Just out of curiosity I took a couple of seconds of natural speech from a recorded class lecture and filtered out the higher harmonics to emphasize the tonal changes. Then, to make those tonal changes easier to pick out, I doubled the frequency and put the original speech on the left channel and the tonal tracking on the right channel.

You can hear how much frequency variation there is in even such a short sample of natural speech by clicking on this mp3 file:

<http://fiziwig.com/he_was_born_etc.mp3>

(Pan from left to right to hear each track separately.)

Just this one quickie experiment suggests to me that the most significant problem in natural-sounding synthesis is going to be patterns of intonation.

--gary
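
For readers who want to try something similar, here is a rough sketch of the three steps described above: low-pass filtering, frequency doubling, and pairing the result with the original as left/right channels. The message does not say which tools or methods were used, so everything here is an assumption: the one-pole filter, the 300 Hz cutoff, and especially the doubling step, which squares the signal (for a pure sine, sin² contains only a constant and a component at twice the frequency). All function names are hypothetical.

```python
import math


def low_pass(samples, rate, cutoff_hz):
    """One-pole low-pass filter: attenuates content above cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / rate
    alpha = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way toward the input
        out.append(y)
    return out


def double_frequency(samples):
    """Square the signal and remove its mean (assumed doubling method).

    For a sinusoid this leaves a component at twice the frequency.
    """
    squared = [s * s for s in samples]
    mean = sum(squared) / len(squared)
    return [2.0 * (s - mean) for s in squared]


def tonal_track_stereo(speech, rate, cutoff_hz=300.0):
    """Return (left, right) frames: original speech and tonal track."""
    tonal = double_frequency(low_pass(speech, rate, cutoff_hz))
    return list(zip(speech, tonal))


def tone_strength(samples, rate, freq_hz):
    """Magnitude of a single DFT bin, used to check where the energy is."""
    w = 2.0 * math.pi * freq_hz / rate
    re = sum(s * math.cos(w * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    return math.hypot(re, im) / len(samples)


# Synthetic stand-in for speech: a 100 Hz "pitch" plus a 2000 Hz harmonic.
rate = 8000
speech = [math.sin(2 * math.pi * 100 * n / rate)
          + 0.5 * math.sin(2 * math.pi * 2000 * n / rate)
          for n in range(rate)]
right = [r for _, r in tonal_track_stereo(speech, rate)]
print(tone_strength(right, rate, 200.0) > tone_strength(right, rate, 100.0))
```

On real speech, squaring also produces intermodulation between whatever harmonics survive the filter, so the right channel would only approximate a clean pitch track. Writing the frames to an audio file is left out of the sketch.
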
