No 100% might well be a bug but most likely not deadlock. deadlock is where you are locked by the OS waiting for a mutex you will never get. This locked state uses 0% cpu. The usual symptom I see is after x number of calls. pjsip just does not communicate with anyone. Even SJPhone the sip phone application is ignored. But how to resolve your cpu goes to 100% issue. Yea not putting sleep(1) in some of your loops might be the cause of it. But if you have a multithread program in windows that is using 100% of the cpu and you don't know why. I got some tricks to tell you where. Tell you which thread is using that cpu and rule out others. The solution is to create a linked list of every thread in the box yorus and pjsip threads. You can do this with a wrapper function around CreateThread() for your stuff. You can do this by having pjsip's win32 create thread function call a function which simply adds the thread pjsip created to your linked list. Ok got your linked list. Next step. At regular intervals like say once a minute call a function which for each thread in your linked list calls this function: /************************************************************************/ /* */ /* Given a thread handle this function will determine the percent of */ /* time the thread has spent in user mode and kernal mode. */ /* */ /************************************************************************/ int thread_percent(HANDLE thread, int *user_mode, int *kernel_mode) { __int64 user_percent; __int64 kernel_percent; LARGE_INTEGER process_utime; LARGE_INTEGER process_ktime; LARGE_INTEGER thread_utime; LARGE_INTEGER thread_ktime; FILETIME process_creation_time; FILETIME process_exit_time; FILETIME process_user_time; FILETIME process_kernel_time; FILETIME thread_creation_time; FILETIME thread_exit_time; FILETIME thread_user_time; FILETIME thread_kernel_time; BOOL retval1, retval2; int retval=0; // assume failure. retval1=GetProcessTimes(GetCurrentProcess(), &process_creation_time, &process_exit_time, &process_kernel_time, &process_user_time); retval2=GetThreadTimes(thread, &thread_creation_time, &thread_exit_time, &thread_kernel_time, &thread_user_time); if ((retval1) && (retval2)) // if both functions worked. { memcpy(&process_utime, &process_user_time, sizeof(FILETIME)); memcpy(&process_ktime, &process_kernel_time, sizeof(FILETIME)); memcpy(&thread_utime, &thread_user_time, sizeof(FILETIME)); memcpy(&thread_ktime, &thread_kernel_time, sizeof(FILETIME)); if (process_utime.QuadPart==0) user_percent=0; else user_percent=thread_utime.QuadPart*100/process_utime.QuadPart; if (process_ktime.QuadPart==0) kernel_percent=0; else kernel_percent=thread_ktime.QuadPart*100/process_ktime.QuadPart; *user_mode=(int)user_percent; *kernel_mode=(int)kernel_percent; retval=1; } return(retval); } You will find most threads will be 0 and 0 and one will be significantly higher say 30 and 40. The one that is significantly higher is the offender. Hope this helps and makes sense. On the deadlock debugging I did do. I did this. I moved pjsip debug level at compile time and runtime to 6. But that was still not enough to tell me what I needed to know. So I augumented the mutex lock/unlock/try/destroy functions in pjilib with a version that gave the callers __FILE__, and __LINE__. __FILE__ gives the source module of the caller, and __LINE__ gives the line number the function was called from. I did something like this: #define mutex_lock(lock) _mutex_lock(lock, __FILE__, __LINE__) _mutex_lock(multex *lock, char *file, int line) { // in here I added file, and line to the level 6 debug output. } Run that and you know where you blocked and where you locked prior to that which gives the complete story. David Clark ps. Note this does produce a ton of debug logs. We are talking gigabytes. In my applications I have been able to give mutex locking information as part of the thread usage reports. I list what thread is blocked on which mutex where the block happened and where ownership was last successfully obtained. The complete story on one line. I have not put this into pjsip yet. It involved replacing all mutex functions, and since my application never uses a try mutex function, I don't have that one. And pjsip uses that heavily. So I would need to add that function. At 02:54 AM 5/15/2009, M.S. wrote: >i think you have great skills to debug deadlock situations in pjsip. >can you give me a good advice to debug pjsip/pjsua mutex stuff >(log-level, debug outputs etc.) > >i have a situation that pjsip got 100% processor use. i think i have >a deadlock. > > regards > mark > > >Von: David Clark <vdc1048 at tx.rr.com> >An: pjsip list <pjsip at lists.pjsip.org>; pjsip list <pjsip at lists.pjsip.org> >Gesendet: Donnerstag, den 14. Mai 2009, 22:22:56 Uhr >Betreff: [pjsip] pjsip version 1.1 gotcha #2 with work around. > >This is similar to gotcha #1. So I will go into less detail. If >the on_call_state() handler which gets hangup >notifications calls pjsua_recorder_destroy(). The box can stop >communicating after that for the following reason. >1) pjsua_recorder_destroy() locks pjsua mutex. >2) pjsua_recorder_destroy() locals conf mutex indirectly by calling >pjsua_conf_remove_port(). > >But if conf mutex is locked in port audio clock thread get_frame() >at the point of the call. > >You will block waiting for conf mutex and other threads will never >get pjsua mutex. > >Work around: don't call pjsua_recorder_destroy() in on_call_state() >signal the application thread to wait up and do it >in the application thread. > >David Clark > >_______________________________________________ >Visit our blog: http://blog.pjsip.org > >pjsip mailing list >pjsip at lists.pjsip.org >http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/attachments/20090516/6111d4df/attachment-0001.html>