Asterisk DTMF recognizer

mjordan at digium.com (Matthew Jordan) · Wed, 11 Dec 2013 21:34:24 -0600

On Wed, Dec 11, 2013 at 5:22 PM, Ben Langfeld <ben at langfeld.me> wrote:

> The vast majority of IVR platforms (mainly VoiceXML) permit handling DTMF
> in a consistent manner to speech recognition, that is by way of a DTMF
> grammar. Asterisk, to my knowledge, does not currently include an
> SRGS-based DTMF recognizer.
>
> FreeSWITCH recently got one as part of mod_rayo, and Chris Rienzo has
> stated that he believes making a majority of this into a library general
> enough to be used in Asterisk also would be plausible, leaving limited
> integration effort in Asterisk.
>

Today, Asterisk provides a module, res_speech, which acts as a generic
speech detection engine interface. Asterisk uses this module as a way of
providing to its various user interfaces - notably AGI and dialplan - a
mechanism to manipulate any speech recognition engine that registers itself
with the generic engine interface. In general, the overall module stack
usually looks something like this (warning, bad ASCII art):

     _____________         __________
    |  app_speech |       | res_agi  |
    |_____________|       |__________|
            |__________________|
                 _____|______
                |            |
                | res_speech |
                |____________|
                      |
                      |
           ___________|_________
          |                     |
          |                     |
 _____________________   ________________
| res_speech_lumenvox | |   res_cepstral |
|_____________________| |________________|

   - app_speech provides the dialplan application and uses res_speech to
   send commands/interface with a speech detection engine
   - res_agi does the same thing, only for the AGI interface
   - res_speech registers engine bridges and passes commands down to the
   speech engine bridges. This is the API that other things in Asterisk use
   to manipulate a speech recognition engine.
   - res_speech_lumenvox/res_cepstral are speech engine bridges that
   register themselves with res_speech and interface to those speech
   recognition engines. They do the actual work of informing the speech
   recognition engines of when to load the appropriate grammar, handle the
   start of audio being fed to the engine, etc.
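
To make the layering above concrete, here is a minimal sketch in C of the
register-then-dispatch pattern. It is not the actual res_speech code: every
name in it (speech_engine, engine_register, engine_find, speech_load_grammar,
demo_engine) is hypothetical and chosen only for illustration. The idea it
shows is that a bridge hands the generic layer a table of callbacks, and
callers only ever talk to the generic layer by engine name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an engine bridge: a name plus callbacks.
 * This is NOT the real Asterisk structure, only an illustration. */
struct speech_engine {
    const char *name;
    int (*load_grammar)(const char *grammar_name, const char *path);
};

#define MAX_ENGINES 8
static const struct speech_engine *engines[MAX_ENGINES];
static size_t engine_count;

/* Look up a registered bridge by name; returns NULL if none matches. */
const struct speech_engine *engine_find(const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < engine_count; i++) {
        if (strcmp(engines[i]->name, name) == 0)
            return engines[i];
    }
    return NULL;
}

/* Called by a bridge when it loads. Returns 0 on success, -1 if the
 * engine is invalid, already registered, or the table is full. */
int engine_register(const struct speech_engine *engine)
{
    assert(engine_count <= MAX_ENGINES);
    if (engine == NULL || engine->name == NULL)
        return -1;
    if (engine_count == MAX_ENGINES || engine_find(engine->name) != NULL)
        return -1;
    engines[engine_count++] = engine;
    return 0;
}

/* Generic entry point used by the upper layers (the dialplan application
 * or AGI in the diagram): find the bridge and pass the command down. */
int speech_load_grammar(const char *engine_name, const char *grammar_name,
                        const char *path)
{
    const struct speech_engine *engine = engine_find(engine_name);
    if (engine == NULL || engine->load_grammar == NULL)
        return -1;
    return engine->load_grammar(grammar_name, path);
}

/* A toy bridge that only remembers the last grammar it was asked to load. */
char demo_last_grammar[64];

static int demo_load_grammar(const char *grammar_name, const char *path)
{
    (void)path; /* a real bridge would hand this to its engine */
    snprintf(demo_last_grammar, sizeof(demo_last_grammar), "%s", grammar_name);
    return 0;
}

const struct speech_engine demo_engine = { "demo", demo_load_grammar };
```

With this shape, a caller would register demo_engine once and then call
speech_load_grammar("demo", ...) without knowing anything about the bridge
behind that name.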

The res_speech module does have the capability to indicate DTMF to the
engine bridges. Currently, this only happens from the SpeechBackground
dialplan application. If a user presses a DTMF key, that DTMF is relayed
directly to the engine interface for processing. It's a relatively simple
call (ast_speech_dtmf) which passes a DTMF frame down to the engine.
Whether or not the bridges actually do anything with it is up to them.
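
As a rough illustration of that last point, the sketch below relays digits
to a bridge through an optional callback. It is a simplified model and not
the real ast_speech_dtmf implementation: the names dtmf_engine,
dtmf_session, relay_dtmf, collecting_engine and ignoring_engine are all
hypothetical, and treating a missing handler as "success, nothing done" is
an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dtmf_session;

/* Hypothetical engine bridge with an optional DTMF handler. */
struct dtmf_engine {
    const char *name;
    int (*dtmf)(struct dtmf_session *session, const char *digits); /* may be NULL */
};

/* Hypothetical per-call state: the bridge in use plus collected digits. */
struct dtmf_session {
    const struct dtmf_engine *engine;
    char collected[32];
};

/* Relay digits to the bridge. A bridge without a handler simply never
 * sees them; in this sketch that still counts as success (returns 0). */
int relay_dtmf(struct dtmf_session *session, const char *digits)
{
    if (session == NULL || session->engine == NULL || digits == NULL)
        return -1;
    if (session->engine->dtmf == NULL)
        return 0;
    return session->engine->dtmf(session, digits);
}

/* Example handler: append the digits to the session buffer,
 * refusing input that would overflow it. */
static int collecting_dtmf(struct dtmf_session *session, const char *digits)
{
    size_t used = strlen(session->collected);
    size_t extra = strlen(digits);

    assert(used < sizeof(session->collected));
    if (used + extra >= sizeof(session->collected))
        return -1;
    memcpy(session->collected + used, digits, extra + 1);
    return 0;
}

/* One bridge that acts on DTMF and one that ignores it. */
const struct dtmf_engine collecting_engine = { "collecting", collecting_dtmf };
const struct dtmf_engine ignoring_engine = { "ignoring", NULL };
```

A DTMF grammar matcher of the kind Ben describes would, in this model, live
inside a handler like collecting_dtmf, which is why the bridge is the
natural place to add it.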

What speech recognition engine does mod_rayo/Chris interface to? I think
Ben and/or Chris mentioned it at Adhearsion Conf. If it is a separate
engine, this may be as straightforward as writing a bridge to that
particular engine and passing the DTMF through to it, as well as deciding
how (or if) there's a better way to interface with a speech engine through
AMI/ARI.

> It is interesting to me in order to simplify the implementation of
> Adhearsion atop Asterisk, since right now we have this in Ruby based on AMI
> DTMF events. Is there any appetite among the Asterisk core team to
> investigate the addition of this to Asterisk core or as a module?
>

Yes!

>
> If this gets agreement in principle, I'd love to talk more thoroughly
> about the kind of API that would be most useful for us, and how we can move
> forward with specification and implementation.
>
>
And I agree in principle :-)

Matt

-- 
Matthew Jordan
Digium, Inc. | Engineering Manager
445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
Check us out at: http://digium.com & http://asterisk.org