In article <Pine.LNX.4.62.0604130852320.28756 at localhost.localdomain>, Willem van der Walt <wvdwalt at csir.co.za> wrote:

> I think there is a bug in the espeak program when reading long text
> files. To me it is no problem as I am using speech-dispatcher which
> sends smaller chunks of text at a time, but others might be using
> the file feature. The program segfaults after a time.

That's puzzling, and worrying. I often speak large text files with

   speak -f textfile

and I've not had any problem like that. Were you using any other options?

> I am interested in the process of creating a new language using
> espeak. Where can I get more detail on that?

Firstly, read the Documents section of the web site (http://espeak.sourceforge.net): Dictionary, Phonemes, and Phoneme Tables (and download phonsource.zip, which is referenced from there).

There is another program, which I haven't released yet, which compiles the Phoneme Tables together with sound recordings of consonants and formant specifications of vowels. It also includes an editor for the vowels. The interface needs tidying up a bit, but the biggest job is writing user instructions so that others can use it. I hope to do this, though. If you want to try it out without instructions, I could make it available fairly soon :-)

It would be very interesting if someone did do another language implementation. It would help to identify language-dependent features. Depending on which language, I might need to add some new features to the speech engine.

Firstly you need to get a phonological description of your language (eg. which phonemes it uses). Looking up "yourlanguage language" in Wikipedia might give some useful information. It may be that, as a first approximation, you can use already-provided phonemes from the list in the Phonemes.html document.
You can try out example words in your new language by giving phoneme codes enclosed within double square brackets, eg:

   speak "[[h@l'oU w'3:ld]]"

would say "hello world" in English, and

   speak "[[g'yt@n t'A:g]]"

would say "güten tag" in German, using the [y] phoneme, which isn't used in English but is already provided in eSpeak. Perhaps you can find a set of passable phonemes for your language (you can implement more accurate versions later). A Bantu language would be more of a challenge (eg. a tonal language, click consonants).

Then you can start constructing pronunciation rules in the <language>_rules file. The <language>_list file gives exceptions, and also those common words which are usually unstressed ("the", "is", "in", etc). See the "data/" directory in eSpeak's "source.zip" download package for examples. Hopefully your language's spelling rules won't be as difficult as English!

Set up a Voice in espeak-data/voices for your language (specifying your language's dictionary, but keeping the default phoneme set for now) and compile the dictionary files with

   speak --compile=yourvoice

That should give you a very rudimentary implementation of your language. It might even be intelligible :-)

eSpeak is written in C++. You can write a new Translator class for your language which can keep the functions of the base Translator class, or can set options to vary their effect (eg. which syllable of a word usually takes the main stress, or the difference in length between stressed and unstressed syllables), or can override them with replacement functions (eg. a new sentence intonation algorithm).

Now your language should be sounding better. As you listen to it speaking, notice problems and make adjustments to the rules, the phoneme realizations, and the various tuning parameters.

If you're serious about implementing a language, then I'll be happy to help with support, program features, information and documentation.
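To make the Voice setup step concrete, a minimal voice file along these lines should be enough to start with (a sketch only: "xx" and "mylanguage" are placeholder names, and you should check the attribute list in the Voices documentation for the exact set your eSpeak version supports):

```
// espeak-data/voices/xx  -- minimal starting voice (names are placeholders)
name mylanguage
language xx

// use your own dictionary files (xx_rules, xx_list),
// but keep the default phoneme table for now
dictionary xx
```

With that file in place, "speak --compile=xx" should compile xx_rules and xx_list from the dictsource directory into the dictionary data that the voice uses.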