Deadlocks: solved


 



On Tue, Jul 8, 2008 at 6:11 AM, Turnaev Eugeny <turnaev at t72.ru> wrote:

>
> Hi.
>
>        I have found why my program was deadlocking.
>
>        in py_pjsua.c
>
>        in function
>                static PyObject *py_pjsua_handle_events(PyObject *pSelf,
>                                                        PyObject *pArgs)
>
>        I have commented out the lines
>        Py_BEGIN_ALLOW_THREADS
>        and Py_END_ALLOW_THREADS
>
>        Those are Python macros which allow other Python threads to run
>        while the current thread is in some long-running I/O operation,
>        for example.
>
>
Thanks for the info. Frankly, I've forgotten why those macros are there, but
reading the code comments there I'm not too sure that removing them is a good
idea. According to the comments, and if I recall correctly, without those
macros the readline() call (used by pjsua_app.py to operate the console menu)
will block forever. So are you sure that's a good idea?

If so, then we can probably also remove the
PyGILState_Ensure()/PyGILState_Release() calls in the
ENTER_PYTHON()/LEAVE_PYTHON() macros.

One thing is for sure: I was unsure about which threading model works best
when I created py_pjsua.c, and I experimented with several threading models.
So it could be that some of these are artifacts from those experiments.



>        Also, I got rid of the worker_thread that polled with
>        handle_events(). Now I am polling from the thread where all other
>        calls to the py_pjsua lib are located.
>
>
That would be the best way indeed. But now that you invoke pjsua from one
thread, the original handle_events() (with Py_BEGIN_ALLOW_THREADS) should
work, shouldn't it?
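
For readers who want to try the single-thread arrangement discussed above, here is a minimal sketch of one way to structure it. It is not taken from pjsua_app.py: `run_pjsua_loop`, `fake_handle_events` and `fake_make_call` are hypothetical names, and the two `fake_*` functions are stand-ins for the real py_pjsua calls. Other threads never touch the library; they only post requests to a queue that the polling thread drains between polls.

```python
import queue
import threading
import time


def run_pjsua_loop(commands, handle_events, stop):
    """Run every library call and every poll on this one thread."""
    while not stop.is_set():
        # Drain requests posted by other threads; they never call pjsua directly.
        while True:
            try:
                func, args = commands.get_nowait()
            except queue.Empty:
                break
            func(*args)
        handle_events(10)  # poll for up to 10 ms (stand-in for the real call)


def demo():
    log = []  # (thread name, uri), appended only from the loop thread
    commands = queue.Queue()
    stop = threading.Event()

    def fake_handle_events(timeout_ms):  # hypothetical stand-in for polling
        time.sleep(timeout_ms / 1000.0)

    def fake_make_call(uri):  # hypothetical stand-in for a library call
        log.append((threading.current_thread().name, uri))

    loop = threading.Thread(target=run_pjsua_loop, name="pjsua",
                            args=(commands, fake_handle_events, stop))
    loop.start()
    # Requests from this (non-pjsua) thread are queued, not executed here.
    commands.put((fake_make_call, ("sip:alice@example.com",)))
    commands.put((stop.set, ()))  # ask the loop to exit after the call
    loop.join(timeout=5)
    return log
```

Calling `demo()` shows that the queued call is executed on the thread named "pjsua" rather than on the thread that requested it, so library calls and polling can never interleave.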



>        Maybe the deadlocking situation was like this:
>
>                A worker thread called handle_events().
>                        Inside handle_events(), some pjsua internal
>                        functions are called to get events; mutexes are
>                        acquired.
>
>                Now a context switch happens (because we have other Python
>                threads and Py_BEGIN_ALLOW_THREADS was called).
>
>                The context switches to another Python thread calling
>                another py_pjsua function, so another pjsua internal
>                function is entered.
>
>                This should not be a problem as long as mutexes are
>                acquired in the same order, and the FAQ states that pjsip
>                acquires mutexes in one order.
>

Yep, normally that shouldn't be a problem, assuming that the application
doesn't have its own mutex. Of course in this case the application, i.e.
Python, does have a mutex, so this is what might have caused the problem
(probably because Py_BEGIN_ALLOW_THREADS releases the Python mutex?).
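
To make that guess concrete, here is a small simulation of one possible interleaving. It is a sketch only and not a confirmed diagnosis: two ordinary `threading.Lock` objects stand in for the Python interpreter lock and for a pjsua internal mutex, and the names `simulate`, `worker` and `caller` are hypothetical. The worker models handle_events() holding a pjsua mutex and then needing the interpreter lock for a callback; the caller models another Python thread that holds the interpreter lock and enters pjsua. Timeouts are used so the script reports the deadlock instead of hanging.

```python
import threading


def simulate(timeout=0.2):
    gil = threading.Lock()          # stand-in for the Python interpreter lock
    pjsua_mutex = threading.Lock()  # stand-in for a pjsua internal mutex
    sync = threading.Barrier(2)     # reusable rendezvous for the two threads
    got = {}

    def worker():
        # Order: pjsua mutex first, interpreter lock second.
        with pjsua_mutex:
            sync.wait()  # both threads now hold their first lock
            got["worker"] = gil.acquire(timeout=timeout)
            sync.wait()  # keep holding until both attempts have finished
            if got["worker"]:
                gil.release()

    def caller():
        # Opposite order: interpreter lock first, pjsua mutex second.
        with gil:
            sync.wait()
            got["caller"] = pjsua_mutex.acquire(timeout=timeout)
            sync.wait()
            if got["caller"]:
                pjsua_mutex.release()

    threads = [threading.Thread(target=worker),
               threading.Thread(target=caller)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got
```

Neither thread gets its second lock: each is waiting for the lock the other one holds. Without the timeouts, both would wait forever, which matches the symptom of a hang that only shows up under load.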




>
>                I don't know; I think I saw 2 different macros in pjsua, a
>                try_to_get_mutex and a get_mutex. Maybe the problem is in
>                this.
>
>
That's just for deadlock detection, shouldn't be a problem.


>
>
>        OK. I am not stating that pjsua has a deadlock bug; maybe this was
>        my bad application design, or I messed up somewhere else.
>        But my app design somewhat mimics the design of the app in the
>        example
>
> http://svn.pjsip.org/repos/pjproject/trunk/pjsip-apps/src/py_pjsua/pjsua_app.py
>        (1 worker thread, plus calls from another thread),
>        so I assume that the example also has a deadlock problem.
>        It is just a matter of load: the architecture given in the example
>        worked for me with 1 simultaneous call, but when I started 16
>        simultaneous calls it deadlocked in about 1-5 minutes (and as far
>        as I can see from debugging, it is deadlocked in a call to py_pjsua).
>

Yeah, could be. I'm now also working on the wrapper again so I'll look into
that.


>
>        Btw, in the example another thread (main in the example, not main
>        in my app) is not registered with py_pjsua.thread_register(), so I
>        am not getting exactly which threads must be registered.
>

Sorry, I don't get you. But the main thread doesn't need to be registered,
since calling pj_init() implicitly registers the thread.


>
>        Now I have no worker thread. I am polling right from the thread
>        where all other calls to py_pjsua are located, and I also removed
>        Py_BEGIN_ALLOW_THREADS from handle_events() to disallow interleaving
>        of threads while I am in handle_events().
>
>
Try putting sleep() in the other thread (the one that doesn't call pjsua),
and see if that thread blocks forever. ;-)
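
A rough way to run that experiment is sketched below. `count_heartbeats` is a hypothetical helper, and `poll` is whatever callable does the polling in your application. In the usage line the stand-in is `time.sleep`, which releases the interpreter lock while it waits, so the heartbeat thread keeps ticking. If the real handle_events() were passed in without the Py_BEGIN_ALLOW_THREADS macros, the concern above is that the count would stay at or near zero.

```python
import threading
import time


def count_heartbeats(poll, polls=5, interval=0.01):
    """Call poll() repeatedly and count how often another thread gets to run."""
    ticks = 0
    stop = threading.Event()

    def heartbeat():
        # The "other thread" that never calls pjsua: it only sleeps and counts.
        nonlocal ticks
        while not stop.is_set():
            time.sleep(interval)
            ticks += 1

    t = threading.Thread(target=heartbeat)
    t.start()
    for _ in range(polls):
        poll()
    stop.set()
    t.join()
    return ticks


# Stand-in poll that releases the interpreter lock while it waits.
ticks = count_heartbeats(lambda: time.sleep(0.05))
```

A healthy result is a tick count roughly proportional to the total polling time; a count that barely moves suggests the polling call is starving the other Python threads.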



>
> So anyone working from python with pjsua can do the same if having
> unexpected problems with deadlocks :)
>
> Best wishes for developers of pjsua, nice work :)
>
>
Thanks.

I'm now working on an object-oriented Python wrapper on top of the C Python
module; hopefully that will make it easier to use. Stay tuned! :)

Cheers
 Benny