Re: Stream audio to Speech to Text engine

On 18.06.20 15:56, Mišo Belica wrote:

Hello,

a few days ago I started experimenting with the PJSUA Python bindings, and I am now able to receive a call and play a WAV file back. My goal, however, is to receive the call and redirect the incoming RTP stream to a Speech-to-Text engine that returns the text. In response I will play WAV recordings back as I get the utterances. The problem is that I cannot find the right approach. My idea is to modify the SDP payload in the callback "Call.onCallSdpCreated" so that it points to the open Speech-to-Text RTP stream, but I have had no luck with it. I wanted to simply replace the port in the SDP payload, because the app is running on the same host. Can you suggest how to do that, and tell me whether I am going in the right direction? Or, if there is a better way, point me towards it?

I have already searched the documentation, GitHub and StackOverflow, and the best I could find is https://stackoverflow.com/questions/31023274/how-to-catch-and-translate-incoming-audio-stream-in-other-languages-for-an-ios-c but that feels hacky, and I have no idea how I would redirect the stream from that file to a socket. I also need to receive more than one call, and I guess all of them would be mixed in the file, so it's not the right way to go.

The code below is my poor attempt, but I always get some error from SWIG. For example: TypeError: Attempt to append a non SwigPyObject

    def onCallSdpCreated(self, prm: pj.OnCallSdpCreatedParam):
        log("SDP created: ", dir(prm.sdp.pjSdpSession))
        log("SDP created: ", prm.sdp.wholeSdp)
        prm.sdp.pjSdpSession.append("m=audio 10000 RTP/AVP 0 101")

Thanks in advance for the answer and thanks for your work :)


First, let me say that I'm not familiar with the Python bindings, so I'm not sure this will help...
Here's how I would do it in C (and maybe the functions necessary for my solution are exported to Python...):

If I understand correctly, your app sits between a SIP caller and something generating an RTP stream.
With the raw C API you can manually create media streams and add them to the conference bridge.
That way you can simply connect the two streams on the conference bridge, and voilà.

First you need to create a (UDP) transport, that the stream will use for sending/receiving RTP packets:
https://www.pjsip.org/docs/latest-2/pjmedia/docs/html/group__PJMEDIA__TRANSPORT__UDP.htm

Create a media (RTP) stream with pjmedia_stream_create():
https://www.pjsip.org/docs/latest-2/pjmedia/docs/html/group__PJMED__STRM.htm#ga67575c8e7b15e325b98ebaa89639b550

After creation, get the media port using pjmedia_stream_get_port():
https://www.pjsip.org/docs/latest-2/pjmedia/docs/html/group__PJMED__STRM.htm#gae3cb31df5aa921ef3085d5eb539af063

Add the media port to the confbridge with pjsua_conf_add_port():
https://www.pjsip.org/docs/latest-2/pjsip/docs/html/group__PJSUA__LIB__MEDIA.htm#ga833528c1019f4ab5c8fb216b4b5f788b

And then start the stream with pjmedia_stream_start():
https://www.pjsip.org/docs/latest-2/pjmedia/docs/html/group__PJMED__STRM.htm#ga93d59e3be009de86a3823303784d31a2

Start the underlying media transport with pjmedia_transport_media_start():
https://www.pjsip.org/docs/latest-2/pjmedia/docs/html/group__PJMEDIA__TRANSPORT.htm#ga74ab1c1b9b09d75865a231519bb58aa7

Connect the two streams using pjsua_conf_connect().
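
Whichever way the stream ends up being forwarded, it can help to check what actually arrives on the Speech-to-Text side. The following is a minimal Python sketch, not part of PJSIP, that parses the fixed RTP header as laid out in RFC 3550 and listens on a local UDP port. The names RtpPacket, parse_rtp and receive_rtp are hypothetical, and the sketch assumes plain RTP without SRTP; decoding the payload (for example PCMU, payload type 0) is left out.

```python
import socket
import struct
from typing import Callable, NamedTuple


class RtpPacket(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Parse the fixed RTP header (RFC 3550) and return it with the payload."""
    if len(datagram) < 12:
        raise ValueError("datagram shorter than the 12-byte RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    version = b0 >> 6
    if version != 2:
        raise ValueError(f"unsupported RTP version {version}")
    # Skip the optional CSRC list (4 bytes per entry).
    offset = 12 + 4 * (b0 & 0x0F)
    if b0 & 0x10:  # header extension present
        if len(datagram) < offset + 4:
            raise ValueError("truncated RTP header extension")
        ext_words = struct.unpack("!H", datagram[offset + 2:offset + 4])[0]
        offset += 4 + 4 * ext_words
    end = len(datagram)
    if b0 & 0x20:  # padding: last byte holds the padding length
        if end <= offset:
            raise ValueError("padding flag set on an empty payload")
        end -= datagram[-1]
    if offset > end:
        raise ValueError("malformed RTP packet")
    # The top bit of the second byte is the marker; the rest is the payload type.
    return RtpPacket(version, b1 & 0x7F, seq, ts, ssrc, datagram[offset:end])


def receive_rtp(port: int, handle: Callable[[RtpPacket], None],
                host: str = "127.0.0.1") -> None:
    """Hypothetical receive loop: bind a UDP socket and hand packets to handle()."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            datagram, _addr = sock.recvfrom(2048)
            handle(parse_rtp(datagram))
```

Calling receive_rtp(10000, print) would dump the packets arriving on port 10000, which is enough to confirm that the stream is flowing before wiring in the real engine.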

Tinkering with the SDP seems error prone to me, which is why I try to avoid it whenever I can.
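
To illustrate what a port replacement means at the text level, here is a minimal Python sketch that works on the SDP as a plain string. The helper name replace_audio_port is hypothetical, and whether the Python binding picks up a modified SDP string from the callback is an assumption that this thread does not confirm.

```python
def replace_audio_port(sdp: str, new_port: int) -> str:
    """Return the SDP text with the port of the first m=audio line replaced."""
    if not 0 < new_port < 65536:
        raise ValueError("port out of range")
    lines = sdp.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("m=audio "):
            # Media line layout: m=<media> <port> <proto> <fmt> ...
            fields = line.split(" ")
            if len(fields) < 4:
                raise ValueError("malformed m=audio line")
            # Note: this also overwrites a "<port>/<count>" form, if present.
            fields[1] = str(new_port)
            lines[i] = " ".join(fields)
            break
    else:
        raise ValueError("no m=audio line found")
    # SDP lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Even this small helper shows where things can go wrong: it leaves any "a=rtcp:" attribute and the "c=" connection address untouched, and it only handles the first audio line, so an SDP with several media sections would need more care.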


All the best,
Andreas
