PJMedia subsystem gets stuck on media creation with the epoll ioqueue


Hi there,

While migrating to PJSIP version 2.8 I found a serious regression in the media subsystem: all active media pause whenever a new media is created. I traced the cause to the changes made for ticket #2097. Details follow.

I am developing an application for Linux that must support several simultaneous calls. I use the epoll ioqueue, which is the best choice on Linux: a new socket can be added to epoll while epoll is in the wait state, so the new socket is polled right away. With the select ioqueue, by contrast, it has to wait for the poll timeout before the FD_SETs can be rebuilt.

The select and epoll ioqueue implementations differ in three functions: pj_ioqueue_register_sock, ioqueue_add_to_set and ioqueue_remove_from_set.
The select implementation is straightforward: a registered socket is not added to any FD_SET at registration; it is added and removed only by the _add_ and _remove_ functions.
The epoll implementation sets the EPOLLIN and EPOLLERR events when the socket is registered. The add_to and remove_from functions toggle only the EPOLLOUT event for the polled socket. As a result, the epoll ioqueue always polls a socket for read and error.
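
To make the epoll side of that difference concrete, here is a minimal sketch using plain Linux epoll calls. It is not the PJSIP source; the demo_* helper names are made up for illustration, and the only assumption is the behavior described above (EPOLLIN and EPOLLERR set at registration, only EPOLLOUT toggled afterwards).

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: register fd with read and error interest already
 * set, as the epoll backend is described to do at registration time. */
int demo_register_always_read(int epfd, int fd)
{
    struct epoll_event ev = {0};

    assert(epfd >= 0 && fd >= 0);
    ev.events = EPOLLIN | EPOLLERR;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: the add/remove step only switches EPOLLOUT on or
 * off; EPOLLIN and EPOLLERR stay in the mask the whole time. */
int demo_toggle_write(int epfd, int fd, int enable)
{
    struct epoll_event ev = {0};

    ev.events = EPOLLIN | EPOLLERR | (enable ? EPOLLOUT : 0);
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Non-blocking poll of an epoll set that holds a single fd: returns the
 * event bits reported for it, or 0 if nothing is ready. */
unsigned demo_poll_events(int epfd)
{
    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, 0);

    return n > 0 ? out.events : 0;
}
```

With a socket registered this way, incoming data is reported by epoll even if demo_toggle_write() was never called, because read interest was never optional.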

With version 2.7.2 everything was fine, but with 2.8 all active media pause whenever one more media is created. After two days of debugging I found that the media subsystem receives no RTP packets for 500 msec after a new media is created; after that, all the delayed packets arrive quickly one after another. The media worker thread was sleeping for those 500 msec: see the pj_thread_sleep call at the end of ioqueue_epoll.c, which is described as "Special case" in its comment. This "Special case" is triggered by the changes in ticket #2097.

Before 2.8, registering the socket and setting it up for async read were both done in pjmedia_transport_udp_attach, so the ioqueue started reading data from the socket as soon as the transport was attached.
In 2.8, the socket is set up for async read later, in transport_media_start. In the interval between these two calls the socket is already polled for read but no read has been scheduled on it: the read_list in pj_ioqueue_key is empty at that moment. That is the "Special case": the epoll result is >0, but no events are marked to be processed. The media thread is then put to sleep for 500 msec (the poll timeout) so that the media can be started and the socket can be scheduled for reading.
By the way, the select ioqueue implementation has no such problem.
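
The situation can be reproduced outside PJSIP with a small model. The sketch below is only an illustration under stated assumptions: struct demo_key and demo_poll_iteration are hypothetical names, read_pending stands in for a non-empty read_list, and the fd is assumed to be registered in the epoll set with EPOLLIN. It shows that level-triggered epoll keeps reporting a readable socket on every poll as long as nobody reads from it.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for an ioqueue key: the fd plus a flag saying
 * whether the application has scheduled a read on it yet. */
struct demo_key {
    int fd;
    int read_pending;   /* 0 models an empty read_list */
};

/* One non-blocking poll iteration over an epoll set holding key->fd.
 * Returns:
 *   0 - epoll reported nothing
 *   1 - a read event was reported and dispatched (one datagram consumed)
 *   2 - an event was reported but there was nothing to dispatch it to,
 *       i.e. the state the post calls the "Special case" */
int demo_poll_iteration(int epfd, struct demo_key *key)
{
    struct epoll_event ev;
    char buf[64];
    int n;

    assert(key != NULL);
    n = epoll_wait(epfd, &ev, 1, 0);
    if (n <= 0)
        return 0;

    if ((ev.events & EPOLLIN) && key->read_pending) {
        (void)recv(key->fd, buf, sizeof(buf), 0);
        return 1;
    }

    /* The data is still queued, so the next epoll_wait() reports the same
     * fd again. A real loop has to back off here or it would spin. */
    return 2;
}
```

In this model, result 2 repeats on every iteration until read_pending is set, which corresponds to the window between transport attach and media start.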

The suggested patch makes ioqueue_epoll behave like ioqueue_select: the epoll event mask is set explicitly. The socket is still added to epoll when the transport is created, but it is not actually polled for read until the media is started (which is the goal of ticket #2097).
One thing about my patch must be noted. epoll provides no API to query the events currently scheduled for a socket, so for any modification we must know the current event mask and set or clear the necessary bit. I therefore had to store the set of scheduled events somewhere, and I added a field to the pj_ioqueue_key_t structure for that purpose.
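
The idea can be sketched with plain epoll calls as follows. This is not the attached patch: struct demo_tracked_key, its ev_mask field and the demo_tracked_* functions are hypothetical names chosen for the example. The sketch registers the fd with an empty mask and keeps its own copy of the mask so that single bits can be set or cleared later with EPOLL_CTL_MOD.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical key: remembers the mask last handed to epoll_ctl(),
 * because epoll offers no call to read it back. */
struct demo_tracked_key {
    int epfd;
    int fd;
    uint32_t ev_mask;
};

/* Register with an empty mask: the fd is known to epoll but is not yet
 * polled for read or write. (The kernel still reports EPOLLERR and
 * EPOLLHUP even when they are not requested.) */
int demo_tracked_register(struct demo_tracked_key *key, int epfd, int fd)
{
    struct epoll_event ev = {0};

    assert(key != NULL);
    key->epfd = epfd;
    key->fd = fd;
    key->ev_mask = 0;
    ev.events = key->ev_mask;
    ev.data.ptr = key;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Push a new mask to epoll; the stored copy is updated only on success. */
static int demo_tracked_apply(struct demo_tracked_key *key, uint32_t mask)
{
    struct epoll_event ev = {0};
    int rc;

    ev.events = mask;
    ev.data.ptr = key;
    rc = epoll_ctl(key->epfd, EPOLL_CTL_MOD, key->fd, &ev);
    if (rc == 0)
        key->ev_mask = mask;
    return rc;
}

/* Set the given event bits (for example EPOLLIN) on top of the stored mask. */
int demo_tracked_add(struct demo_tracked_key *key, uint32_t bits)
{
    return demo_tracked_apply(key, key->ev_mask | bits);
}

/* Clear the given event bits from the stored mask. */
int demo_tracked_remove(struct demo_tracked_key *key, uint32_t bits)
{
    return demo_tracked_apply(key, key->ev_mask & ~bits);
}
```

With this layout, a datagram that arrives before demo_tracked_add(key, EPOLLIN) is called stays queued in the socket and does not wake the poller; it is reported once read interest is switched on.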

Please review my patch and merge it into the mainline.

PS: I have not tested the patch with TCP.

Alexey Ermoshin

Attachment: ioqueue_epoll.c.patch
Description: Binary data

pjsip mailing list
