Ben Klang wrote: <snip>
Is it really required to use res_speech? If so, can we change the interfaces that ARI presents? Over the last few years we've evaluated res_speech against the various UniMRCP applications (SynthAndRecog primarily). We've always come to the conclusion that the res_speech API either couldn't give us what we needed or wasn't as performant. SynthAndRecog isn't perfect, but it does a couple of crucial things, perhaps the most important being the combined lifecycle of TTS + ASR, so that you can "barge" into a TTS playback before it is finished.
The res_speech module and its API are a very thin wrapper over common speech recognition concepts. It does some helpful things, like handling transcoding and maintaining a state machine, but otherwise it relies on the underlying speech technology to do everything. It doesn't provide anything to the dialplan, nor does it even know about channels.
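To make "thin wrapper with a state machine" concrete, here is a minimal Python model of the lifecycle res_speech mediates. The real API is C (include/asterisk/speech.h), so the class and method names here are illustrative, not the actual functions; only the state names echo the real AST_SPEECH_STATE_* values.

    from enum import Enum, auto


    class SpeechState(Enum):
        """Mirrors res_speech's states (AST_SPEECH_STATE_*)."""
        NOT_READY = auto()  # engine allocated, not yet listening
        READY = auto()      # listening; audio should be fed in
        WAIT = auto()       # speech ended, engine still processing
        DONE = auto()       # results are available


    class StubEngine:
        """Stand-in for a pluggable engine backend; the real module
        dispatches to whatever speech technology is registered."""
        def start(self): pass
        def feed(self, frame: bytes): pass
        def results(self): return ["hello world"]


    class SpeechWrapper:
        """The thin layer: track state and forward audio. Everything
        else is the engine's problem."""
        def __init__(self, engine):
            self.engine = engine
            self.state = SpeechState.NOT_READY

        def start(self):
            self.engine.start()
            self.state = SpeechState.READY

        def write(self, frame: bytes):
            # The real module also transcodes the frame into a format
            # the engine accepts before forwarding it (elided here).
            if self.state is SpeechState.READY:
                self.engine.feed(frame)

        def results(self):
            if self.state is SpeechState.DONE:
                return self.engine.results()
            return None

Note there is nothing channel- or dialplan-specific in that shape; that is the point being made above.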
What you probably found limiting was the interface provided to the dialplan/AGI for speech recognition, with the dialplan applications taking care of things. Those wouldn't be used in ARI; we're free to make the interface there whatever we want.
During lunch, though, I gave this some more thought, and I think speech recognition should always be a passive action on a channel (or, heck, a bridge). It would sit in the media path, feeding audio to the speech recognition engine and raising events, but it would not block. This would let it cooperate easily with everything else in ARI without requiring a developer to use a Snoop channel and manage it. It also keeps the "well, if they start speaking, what do I do?" logic out of Asterisk - it gives that power to the developer.
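A rough sketch of what an ARI application might look like under this proposal, in Python. The playback call, the playback delete, and the events WebSocket are real ARI; the /speech endpoint, its "engine" parameter, and the SpeechDetected/SpeechResult event names are invented for illustration, since this interface doesn't exist yet.

    import json
    import requests
    import websocket  # pip install websocket-client

    ARI = "http://localhost:8088/ari"
    AUTH = ("asterisk", "secret")
    APP = "speechdemo"


    def handle_channel(channel_id):
        # Start a prompt playback; this is a real ARI endpoint and
        # returns a Playback object we can stop later.
        playback = requests.post(
            f"{ARI}/channels/{channel_id}/play",
            params={"media": "sound:welcome"},
            auth=AUTH,
        ).json()

        # Hypothetical: turn on passive recognition for the channel.
        # Nothing blocks; Asterisk just feeds the media path to the
        # engine and raises events into the application.
        requests.post(
            f"{ARI}/channels/{channel_id}/speech",  # assumed endpoint
            params={"engine": "myengine"},          # assumed parameter
            auth=AUTH,
        )

        ws = websocket.create_connection(
            f"ws://localhost:8088/ari/events?app={APP}"
            "&api_key=asterisk:secret"
        )
        while True:
            event = json.loads(ws.recv())
            # Hypothetical event names for the proposed interface.
            if event["type"] == "SpeechDetected":
                # Barge-in lives in the application now: the caller
                # started talking, so we stop the prompt ourselves.
                requests.delete(
                    f"{ARI}/playbacks/{playback['id']}", auth=AUTH
                )
            elif event["type"] == "SpeechResult":
                print("Recognized:", event.get("result"))
                break

The barge-in case Ben raised falls out naturally here: because recognition is passive and the playback is a separate resource, "barge" is just the application deleting the playback when speech starts, rather than a combined lifecycle baked into Asterisk.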
Thoughts?

--
Joshua Colp
Digium, Inc. | Senior Software Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - US
Check us out at: www.digium.com & www.asterisk.org