Text that is to be synthesized into human speech is first converted into an intermediate form representation that describes the acoustic-prosodic resolution of the spoken version of the text. The intermediate form can be generated manually or by an intermediate form generation program at a server computer, ...

Patent US6510413 - Distributed synthetic speech generation
http://www.google.es/patents/US6510413?utm_source=gb-gplus-share
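
As a rough illustration of converting text into an intermediate form before any audio is produced, the sketch below maps text to a list of units carrying timing and pitch hints, then serializes that list so it could be sent from a server. Everything here is an assumption made for illustration: the `ProsodicUnit` fields, the letter-per-unit mapping, the duration and pitch values, and the JSON encoding are hypothetical and are not taken from the patent.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ProsodicUnit:
    """One entry of a hypothetical intermediate form (not the patent's format)."""
    symbol: str       # placeholder sound unit; here simply a lowercase letter
    duration_ms: int  # assumed duration hint for the unit
    pitch_hz: float   # assumed target pitch hint for the unit


def text_to_intermediate(text: str) -> List[ProsodicUnit]:
    """Toy stand-in for an intermediate form generation program.

    Each alphabetic character becomes one unit. The last unit of every
    word is lengthened and lowered in pitch, a crude proxy for prosody.
    """
    units: List[ProsodicUnit] = []
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        for i, ch in enumerate(letters):
            is_last = i == len(letters) - 1
            units.append(
                ProsodicUnit(
                    symbol=ch,
                    duration_ms=120 if is_last else 80,
                    pitch_hz=110.0 if is_last else 120.0,
                )
            )
    return units


def serialize(units: List[ProsodicUnit]) -> str:
    """Encode the intermediate form as JSON for transfer (assumed wire format)."""
    return json.dumps([asdict(u) for u in units])


if __name__ == "__main__":
    print(serialize(text_to_intermediate("Hi there")))
```

A real system would use phonemes and measured prosodic targets instead of letters and fixed numbers. The point of the sketch is only that the intermediate form is compact, structured data and not audio.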