Re: IPA speech synthesizer
From: Eric Christopherson <rakko@...>
Date: Sunday, February 22, 2009, 4:29
On Feb 20, 2009, at 10:55 AM, Roger Mills wrote:
> BP Jonsson wrote:
>> On 2009-02-19 Arnt Richard Johansen wrote:
>>> the individual segments in a speech stream affect each
>> other so much that you can't just splice together phones
>> and get a result that sounds like speech.
>>
>> Or to put it otherwise 'segments' are just the
>> wave-tops in the stream, corresponding to when
>> the speech-organs are closest to their target
>> positions, separated by troughs/transitions
>> which actually take up most of the stream.
>> The discreteness between segments which we think
>> we perceive is a product of the analysis
>> of the sound stream which our brain performs
>> before the perceived signal even reaches
>> our consciousness.
>>
> Would it not be possible to create a machine that could read and
> reproduce spectrograms? (Or is that what Alex's "bigrams" meant?)
I think he means transitions between two segments; Wikipedia calls
them "diphones" (<http://en.wikipedia.org/wiki/Diphone>).
>
> But on second thought, that's redundant-- to create a spectrogram
> you have to (usually) make a recording first, so why not just go with
> the recording...?
I hadn't known about it before, but WP also mentions a machine that
did just that (<http://en.wikipedia.org/wiki/Pattern_playback>). But
yeah, it seems like it would make sense to just use the recording.
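(Out of curiosity, doing the Pattern Playback trick numerically is
pretty routine nowadays. Here is a sketch in Python, assuming librosa
is installed and making up the file names; since the spectrogram only
keeps magnitudes, the phase has to be guessed back, which is what the
Griffin-Lim iteration does, and it tends to sound slightly robotic:)

# Sketch: round-trip a recording through its magnitude spectrogram,
# reconstructing the discarded phase with Griffin-Lim. File names
# are hypothetical.
import librosa
import soundfile as sf

y, sr = librosa.load("speech.wav", sr=None)   # original recording
S = abs(librosa.stft(y))                      # magnitude spectrogram only
y_hat = librosa.griffinlim(S)                 # invert by iterative phase guessing
sf.write("reconstructed.wav", y_hat, sr)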
Sai wrote:
> To recast it a bit: how hard would it be to make an IPA synthesizer
> that is at least as good as a single-language speaker, linguistically
> naïve, sounding out arbitrary IPA transcribed words using
> <http://www.phonetics.ucla.edu/course/chapter1/chapter1.html>? It should
> be good enough to be recognizable, but it doesn't need to be perfect.
You'd still have to strip the supporting /A/s from the consonant
samples.
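(If anyone felt like trying, I imagine the crude approach would be to
cut each /Ca/ recording where the vowel's sustained energy begins.
A sketch, with guessed thresholds, made-up file names, and no claim
that it would survive contact with real recordings:)

# Sketch: trim the carrier /a/ off a mono [ta] sample by cutting at the
# first stretch that stays loud for ~100 ms (a stop burst is loud too,
# but only for a frame or two). Threshold and frame size are guesses.
import numpy as np
import soundfile as sf

y, sr = sf.read("ta.wav")               # hypothetical [ta] sample, mono
frame = int(sr * 0.025)                 # 25 ms analysis frames
rms = np.array([np.sqrt(np.mean(y[i:i + frame] ** 2))
                for i in range(0, len(y) - frame, frame)])
loud = rms > 0.3 * rms.max()
cut = len(y)                            # default: keep everything
for i in range(len(loud) - 4):
    if loud[i:i + 4].all():             # four loud frames in a row = vowel
        cut = i * frame
        break
sf.write("t_only.wav", y[:cut], sr)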