http://trac.pjsip.org/repos/wiki/PJSUA_Locks Is this guide line useful? Indeed, it is still possible to create some SIP message flow to cause PJSUA or PJSIP-UA deadlock.But this is another topic. regards, Gang On Mon, May 18, 2009 at 4:11 PM, M.S. <hamstiede at yahoo.de> wrote: > thank you for your detailed debug summary. I am using linux but __FILE__ > and __LINE__ is working too. > For the thread tagging, i will use the linux pids (process-id). > > I thought i have a deadlock because: > - if i have 1 or 2 connections(calls) i have < 5 % processor use (small arm > system). > -if i have 4 connections if get (only for one thread) 80-100% processor > use. (after a view seconds). > (for each media stream i use a separate conference bridge. all this threads > will used only (<2% cpu)) > - after this i get some deadlock informations form the stack like: > > possible deadlock ........ > > Questions: > it is possible that this could happened if any stack state machine checks > his own mutex like "bool IsPjsuaLocked()" ? > > I think i will build this stuff your wrote in the pjlib (thread tagging and > detaild mutex logging with __FILE__ and __LINE__) > > > thank you > > Mark > > > ------------------------------ > *Von:* David Clark <vdc1048 at tx.rr.com> > *An:* pjsip list <pjsip at lists.pjsip.org>; pjsip list < > pjsip at lists.pjsip.org> > *Gesendet:* Samstag, den 16. Mai 2009, 09:08:42 Uhr > *Betreff:* [pjsip] debug tips for cpu usage going wrong, deadlock issues, > within pjsip or not. win32 specific thread profiling function used. > > No 100% might well be a bug but most likely not deadlock. deadlock is > where you are locked by the OS waiting for a mutex you will never get. > This locked state uses 0% cpu. The usual symptom I see is after x number > of calls. pjsip just does not communicate with anyone. Even SJPhone > the sip phone application is ignored. > > But how to resolve your cpu goes to 100% issue. Yea not putting sleep(1) > in some of your loops might be the cause of it. But if you have a > multithread > program in windows that is using 100% of the cpu and you don't know why. I > got some tricks to tell you where. Tell you which thread is using > that cpu and rule out others. > > The solution is to create a linked list of every thread in the box yorus > and pjsip threads. You can do this with a wrapper function around > CreateThread() > for your stuff. You can do this by having pjsip's win32 create thread > function call a function which simply adds the thread pjsip created to your > linked list. > > Ok got your linked list. Next step. > At regular intervals like say once a minute call a function which for each > thread in your linked list calls this function: > /************************************************************************/ > /* */ > /* Given a thread handle this function will determine the percent of */ > /* time the thread has spent in user mode and kernal mode. */ > /* */ > /************************************************************************/ > int thread_percent(HANDLE thread, int *user_mode, int *kernel_mode) > { > __int64 user_percent; > __int64 kernel_percent; > > LARGE_INTEGER process_utime; > LARGE_INTEGER process_ktime; > LARGE_INTEGER thread_utime; > LARGE_INTEGER thread_ktime; > > FILETIME process_creation_time; > FILETIME process_exit_time; > FILETIME process_user_time; > FILETIME process_kernel_time; > > FILETIME thread_creation_time; > FILETIME thread_exit_time; > FILETIME thread_user_time; > FILETIME thread_kernel_time; > > BOOL retval1, retval2; > int retval=0; // assume failure. > > > retval1=GetProcessTimes(GetCurrentProcess(), > &process_creation_time, > &process_exit_time, > &process_kernel_time, > &process_user_time); > > retval2=GetThreadTimes(thread, > &thread_creation_time, > &thread_exit_time, > &thread_kernel_time, > &thread_user_time); > > if ((retval1) && (retval2)) // if both functions worked. > { > memcpy(&process_utime, &process_user_time, sizeof(FILETIME)); > memcpy(&process_ktime, &process_kernel_time, sizeof(FILETIME)); > memcpy(&thread_utime, &thread_user_time, sizeof(FILETIME)); > memcpy(&thread_ktime, &thread_kernel_time, sizeof(FILETIME)); > > if (process_utime.QuadPart==0) > user_percent=0; > else > > user_percent=thread_utime.QuadPart*100/process_utime.QuadPart; > if (process_ktime.QuadPart==0) > kernel_percent=0; > else > > kernel_percent=thread_ktime.QuadPart*100/process_ktime.QuadPart; > > *user_mode=(int)user_percent; > *kernel_mode=(int)kernel_percent; > retval=1; > } > return(retval); > } > You will find most threads will be 0 and 0 and one will be significantly > higher say 30 and 40. The one that is significantly higher is the offender. > > Hope this helps and makes sense. > > On the deadlock debugging I did do. I did this. I moved pjsip debug level > at compile time and runtime to 6. > But that was still not enough to tell me what I needed to know. So I > augumented the mutex lock/unlock/try/destroy functions in pjilib > with a version that gave the callers __FILE__, and __LINE__. __FILE__ > gives the source module of the caller, and __LINE__ gives the > line number the function was called from. > > I did something like this: > #define mutex_lock(lock) _mutex_lock(lock, __FILE__, __LINE__) > > _mutex_lock(multex *lock, char *file, int line) > { > // in here I added file, and line to the level 6 debug output. > } > > Run that and you know where you blocked and where you locked prior to that > which gives the complete story. > > David Clark > ps. Note this does produce a ton of debug logs. We are talking > gigabytes. In my applications I have been able to > give mutex locking information as part of the thread usage reports. I list > what thread is blocked on which mutex > where the block happened and where ownership was last successfully > obtained. The complete story on one line. > I have not put this into pjsip yet. It involved replacing all mutex > functions, and since my application never uses > a try mutex function, I don't have that one. And pjsip uses that heavily. > So I would need to add that function. > > At 02:54 AM 5/15/2009, M.S. wrote: > > i think you have great skills to debug deadlock situations in pjsip. > can you give me a good advice to debug pjsip/pjsua mutex stuff > (log-level, debug outputs etc.) > > i have a situation that pjsip got 100% processor use. i think i have a > deadlock. > > regards > mark > > > *Von:* David Clark <vdc1048 at tx.rr.com> > *An:* pjsip list <pjsip at lists.pjsip.org>; pjsip list < > pjsip at lists.pjsip.org> > *Gesendet:* Donnerstag, den 14. Mai 2009, 22:22:56 Uhr > *Betreff:* [pjsip] pjsip version 1.1 gotcha #2 with work around. > > This is similar to gotcha #1. So I will go into less detail. If the > on_call_state() handler which gets hangup > notifications calls pjsua_recorder_destroy(). The box can stop > communicating after that for the following reason. > 1) pjsua_recorder_destroy() locks pjsua mutex. > 2) pjsua_recorder_destroy() locals conf mutex indirectly by calling > pjsua_conf_remove_port(). > > But if conf mutex is locked in port audio clock thread get_frame() at the > point of the call. > > You will block waiting for conf mutex and other threads will never get > pjsua mutex. > > Work around: don't call pjsua_recorder_destroy() in on_call_state() signal > the application thread to wait up and do it > in the application thread. > > David Clark > > _______________________________________________ > Visit our blog: http://blog.pjsip.org > > pjsip mailing list > pjsip at lists.pjsip.org > http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org > > > > _______________________________________________ > Visit our blog: http://blog.pjsip.org > > pjsip mailing list > pjsip at lists.pjsip.org > http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/attachments/20090520/0ec817fd/attachment-0001.html>