Re: Stream audio to Speech to Text engine


On 18.06.20 15:56, Mišo Belica wrote:


A few days ago I started experimenting with the PJSUA Python binding, and I am now able to receive a call and play a WAV file back. But my goal is to receive the call and redirect the incoming RTP stream to a Speech-to-Text engine that gives me the text. In response I will play WAV recordings back as I get the utterances. The problem is that I am not able to find the right approach. My idea is to modify the SDP payload in the callback "Call.onCallSdpCreated" to point to an open Speech-to-Text RTP stream, but I have had no luck with it. I wanted to simply replace the port in the SDP payload, because the app is running on the same host. Can you suggest how to do that, and whether I am going in the right direction? Or if there is a better way, please point me towards it.

I already searched the documentation, GitHub and StackOverflow, and the best I could find feels hacky; I also have no idea how I would redirect the stream from that file to a socket. Also, I need to receive more than one call, and I guess they would all be mixed into the file, so it's not the right way to go.

The code below is my poor attempt, but I always get some error from SWIG. For example: TypeError: Attempt to append a non SwigPyObject

    def onCallSdpCreated(self, prm: pj.OnCallSdpCreatedParam):
        log("SDP created: ", dir(prm.sdp.pjSdpSession))
        log("SDP created: ", prm.sdp.wholeSdp)
        prm.sdp.pjSdpSession.append("m=audio 10000 RTP/AVP 0 101")

Thanks in advance for the answer and thanks for your work :)

First, let me say that I'm not familiar with the Python bindings, so I'm not sure this will help...
Here's how I would do it in C (and maybe the functions necessary for my solution are exported to Python...):

If I understand correctly, your app sits between a SIP caller and something generating an RTP stream.
With the raw C API you can actually manually create Media Streams and add them to the conference bridge.
That way you can simply connect the two streams on the confbridge and voila.

First you need to create a (UDP) transport that the stream will use for sending/receiving RTP packets:
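Roughly like this (the local port 4000 and the transport name are arbitrary choices for this sketch; pjsua_get_pjmedia_endpt() gives you pjsua's media endpoint):

```c
#include <pjsua-lib/pjsua.h>

pjmedia_transport *tp;
pj_status_t status;

/* Local RTP port 4000 is an arbitrary example;
 * RTCP goes on port+1 automatically. */
status = pjmedia_transport_udp_create(pjsua_get_pjmedia_endpt(),
                                      "stt-rtp", 4000, 0, &tp);
```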

Create a media (RTP) stream with pjmedia_stream_create():
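You describe the stream with a pjmedia_stream_info first. In this sketch the codec (plain G.711 u-law) and the remote address 127.0.0.1:10000 (the port from your SDP experiment) are just example values, and `tp` is the transport created in the previous step:

```c
pj_pool_t *pool = pjsua_pool_create("stt", 512, 512);
pjmedia_stream_info si;
pjmedia_stream *stream;
pj_str_t remote = pj_str("127.0.0.1");

pj_bzero(&si, sizeof(si));
si.type  = PJMEDIA_TYPE_AUDIO;
si.proto = PJMEDIA_TP_PROTO_RTP_AVP;
si.dir   = PJMEDIA_DIR_ENCODING_DECODING;

/* Where your STT engine listens for RTP -- example address/port */
pj_sockaddr_in_init(&si.rem_addr.ipv4, &remote, 10000);

/* G.711 u-law (payload type 0) as an example codec */
si.fmt.type          = PJMEDIA_TYPE_AUDIO;
si.fmt.pt            = PJMEDIA_RTP_PT_PCMU;
si.fmt.encoding_name = pj_str("PCMU");
si.fmt.clock_rate    = 8000;
si.fmt.channel_cnt   = 1;
si.tx_pt             = PJMEDIA_RTP_PT_PCMU;

status = pjmedia_stream_create(pjsua_get_pjmedia_endpt(), pool, &si,
                               tp /* transport from the previous step */,
                               NULL, &stream);
```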

After creation, get the media port using pjmedia_stream_get_port():
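With `stream` being the stream from the previous step:

```c
pjmedia_port *stream_port;
status = pjmedia_stream_get_port(stream, &stream_port);
```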

Add the media port to the confbridge with pjsua_conf_add_port():
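Using the pool and the media port from the steps above; this gives you a conference slot id for the stream:

```c
pjsua_conf_port_id stt_slot;
status = pjsua_conf_add_port(pool, stream_port, &stt_slot);
```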

And then start the stream with pjmedia_stream_start():
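```c
status = pjmedia_stream_start(stream);
```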

Start the underlying media transport with pjmedia_transport_media_start():
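As far as I remember, the UDP transport ignores the SDP arguments here, so passing NULLs works (pjsip's own streamutil sample does the same):

```c
status = pjmedia_transport_media_start(tp, pool, NULL, NULL, 0);
```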

Connect the two streams using pjsua_conf_connect().
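Here `call_slot` would be the conference slot of the incoming SIP call, which you can get with pjsua_call_get_conf_port(call_id), and `stt_slot` is the slot you got when adding the stream's port to the bridge. Connect both directions if you also want to play audio back to the caller:

```c
pjsua_conf_connect(call_slot, stt_slot);  /* caller's audio -> STT stream */
pjsua_conf_connect(stt_slot, call_slot);  /* STT side -> caller */
```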

Tinkering with the SDP seems error prone to me, which is why I try to avoid it whenever I can.

All the best,

pjsip mailing list
