And attaching gdb to your program to dump the thread stacks may be another
helpful method.

regards,
Gang

On Wed, May 20, 2009 at 3:47 PM, Gang Liu <gangban.lau at gmail.com> wrote:

> Everyone has his own preferred way to debug and trace problems; I say
> nothing against that.
>
> But how many people have read this document before using the pjsua API?
> Not many, I think.
>
> I will keep quiet if you don't like the advice.
>
> I tried a dual quad-core Xeon server with 1000 cps and 20000 channels.
> Is that enough?
>
> I know there may be differences between the pjsip-ua API and the pjsua
> API, but the key is that our own program logic must follow this guide,
> and that we spend enough time to understand why we need to follow it.
> Then life becomes much easier.
>
> regards,
> Gang
>
> On Wed, May 20, 2009 at 3:26 PM, M.S. <hamstiede at yahoo.de> wrote:
>
>> Hello Gang,
>>
>> Reading the wiki guideline is the first step towards resolving these
>> problems, but if you have a multi-threading/hyper-threading/
>> multi-processor system and many connections per agent, the race
>> conditions will increase.
>> I think I have read the whole pjsip mail archive, and every month I
>> hear about deadlock situations, infinite loops, etc.
>> In my own applications I often use __FILE__ and __LINE__ debug output
>> for the mutex code. It works fine!
>>
>> regards
>> mark
>>
>> ------------------------------
>> From: Gang Liu <gangban.lau at gmail.com>
>> To: pjsip list <pjsip at lists.pjsip.org>
>> Sent: Wednesday, May 20, 2009, 05:30:51
>> Subject: Re: [pjsip] debug tips for cpu usage going wrong, deadlock
>> issues, within pjsip or not. win32 specific thread profiling function used.
>>
>> http://trac.pjsip.org/repos/wiki/PJSUA_Locks
>>
>> Is this guideline useful?
>> Indeed, it is still possible to construct a SIP message flow that
>> causes PJSUA or PJSIP-UA to deadlock, but that is another topic.
>>
>> regards,
>> Gang
>>
>> On Mon, May 18, 2009 at 4:11 PM, M.S. <hamstiede at yahoo.de> wrote:
>>
>>> Thank you for your detailed debugging summary. I am using Linux, but
>>> __FILE__ and __LINE__ work there too.
>>> For the thread tagging I will use the Linux PIDs (process IDs).
>>>
>>> I thought I had a deadlock because:
>>> - with 1 or 2 connections (calls) I have < 5% processor use (small
>>> ARM system);
>>> - with 4 connections I get 80-100% processor use, for one thread
>>> only, after a few seconds (each media stream uses a separate
>>> conference bridge, and all of those threads use < 2% CPU);
>>> - after this I get deadlock information from the stack like:
>>>
>>> possible deadlock ........
>>>
>>> Question:
>>> Is it possible that this could happen if a stack state machine checks
>>> its own mutex with something like "bool IsPjsuaLocked()"?
>>>
>>> I think I will build the things you wrote about into pjlib (thread
>>> tagging and detailed mutex logging with __FILE__ and __LINE__).
>>>
>>> thank you
>>>
>>> Mark
>>>
>>> ------------------------------
>>> From: David Clark <vdc1048 at tx.rr.com>
>>> To: pjsip list <pjsip at lists.pjsip.org>
>>> Sent: Saturday, May 16, 2009, 09:08:42
>>> Subject: [pjsip] debug tips for cpu usage going wrong, deadlock
>>> issues, within pjsip or not. win32 specific thread profiling function used.
>>>
>>> No, 100% CPU might well be a bug, but most likely it is not a
>>> deadlock. A deadlock is where you are blocked by the OS waiting for a
>>> mutex you will never get, and that blocked state uses 0% CPU.
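[The 0%-CPU blocked state described above typically comes from a lock-order
inversion. A minimal Win32 sketch is shown below; the names lock_a and
lock_b are hypothetical stand-ins for, say, the pjsua and conf mutexes
discussed in the gotcha further down the thread.]

    /* Illustration only: two threads take two locks in opposite order, and
     * both end up blocked in EnterCriticalSection() using 0% CPU. */
    #include <windows.h>

    static CRITICAL_SECTION lock_a;   /* think: pjsua mutex */
    static CRITICAL_SECTION lock_b;   /* think: conf mutex  */

    static DWORD WINAPI thread1(LPVOID arg)
    {
        (void)arg;
        EnterCriticalSection(&lock_a);
        Sleep(10);                        /* widen the race window */
        EnterCriticalSection(&lock_b);    /* blocks forever once thread2 holds lock_b */
        LeaveCriticalSection(&lock_b);
        LeaveCriticalSection(&lock_a);
        return 0;
    }

    static DWORD WINAPI thread2(LPVOID arg)
    {
        (void)arg;
        EnterCriticalSection(&lock_b);
        Sleep(10);
        EnterCriticalSection(&lock_a);    /* blocks forever once thread1 holds lock_a */
        LeaveCriticalSection(&lock_a);
        LeaveCriticalSection(&lock_b);
        return 0;
    }

    int main(void)
    {
        HANDLE t1, t2;
        InitializeCriticalSection(&lock_a);
        InitializeCriticalSection(&lock_b);
        t1 = CreateThread(NULL, 0, thread1, NULL, 0, NULL);
        t2 = CreateThread(NULL, 0, thread2, NULL, 0, NULL);
        WaitForSingleObject(t1, INFINITE);   /* never returns: both threads are deadlocked */
        WaitForSingleObject(t2, INFINITE);
        return 0;
    }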
>>> The usual symptom I see is that after some number of calls pjsip simply
>>> stops communicating with anyone; even SJphone, the SIP phone
>>> application, is ignored.
>>>
>>> But how do you resolve your CPU-goes-to-100% issue? Yes, not putting
>>> sleep(1) in some of your loops might be the cause of it. But if you
>>> have a multithreaded program on Windows that is using 100% of the CPU
>>> and you don't know why, I have some tricks to tell you where: which
>>> thread is using that CPU, ruling out the others.
>>>
>>> The solution is to keep a linked list of every thread in the box, both
>>> yours and pjsip's. For your own threads you can do this with a wrapper
>>> function around CreateThread(); for pjsip's threads you can have its
>>> Win32 create-thread function call a function which simply adds the
>>> thread pjsip created to your linked list.
>>>
>>> OK, you have your linked list. Next step: at regular intervals, say
>>> once a minute, call a function which, for each thread in your linked
>>> list, calls this function:
>>>
>>> /************************************************************************/
>>> /* Given a thread handle, this function determines what percentage of  */
>>> /* the process's user-mode and kernel-mode time the thread has used.   */
>>> /************************************************************************/
>>> int thread_percent(HANDLE thread, int *user_mode, int *kernel_mode)
>>> {
>>>     __int64 user_percent;
>>>     __int64 kernel_percent;
>>>
>>>     LARGE_INTEGER process_utime;
>>>     LARGE_INTEGER process_ktime;
>>>     LARGE_INTEGER thread_utime;
>>>     LARGE_INTEGER thread_ktime;
>>>
>>>     FILETIME process_creation_time;
>>>     FILETIME process_exit_time;
>>>     FILETIME process_user_time;
>>>     FILETIME process_kernel_time;
>>>
>>>     FILETIME thread_creation_time;
>>>     FILETIME thread_exit_time;
>>>     FILETIME thread_user_time;
>>>     FILETIME thread_kernel_time;
>>>
>>>     BOOL retval1, retval2;
>>>     int retval = 0;   // assume failure.
>>>
>>>     retval1 = GetProcessTimes(GetCurrentProcess(),
>>>                               &process_creation_time,
>>>                               &process_exit_time,
>>>                               &process_kernel_time,
>>>                               &process_user_time);
>>>
>>>     retval2 = GetThreadTimes(thread,
>>>                              &thread_creation_time,
>>>                              &thread_exit_time,
>>>                              &thread_kernel_time,
>>>                              &thread_user_time);
>>>
>>>     if ((retval1) && (retval2))   // if both functions worked.
>>>     {
>>>         memcpy(&process_utime, &process_user_time, sizeof(FILETIME));
>>>         memcpy(&process_ktime, &process_kernel_time, sizeof(FILETIME));
>>>         memcpy(&thread_utime, &thread_user_time, sizeof(FILETIME));
>>>         memcpy(&thread_ktime, &thread_kernel_time, sizeof(FILETIME));
>>>
>>>         if (process_utime.QuadPart == 0)
>>>             user_percent = 0;
>>>         else
>>>             user_percent = thread_utime.QuadPart * 100 / process_utime.QuadPart;
>>>
>>>         if (process_ktime.QuadPart == 0)
>>>             kernel_percent = 0;
>>>         else
>>>             kernel_percent = thread_ktime.QuadPart * 100 / process_ktime.QuadPart;
>>>
>>>         *user_mode = (int)user_percent;
>>>         *kernel_mode = (int)kernel_percent;
>>>         retval = 1;
>>>     }
>>>     return retval;
>>> }
>>>
>>> You will find that most threads report 0 and 0, and one will be
>>> significantly higher, say 30 and 40. The one that is significantly
>>> higher is the offender.
>>>
>>> Hope this helps and makes sense.
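[A minimal sketch of the periodic check described above, using a plain array
of HANDLEs instead of a linked list. The names g_threads, g_num_threads and
report_thread_usage() are hypothetical; thread_percent() is the function from
the post above.]

    #include <windows.h>
    #include <stdio.h>

    int thread_percent(HANDLE thread, int *user_mode, int *kernel_mode);

    #define MAX_TRACKED_THREADS 64

    static HANDLE g_threads[MAX_TRACKED_THREADS];  /* filled by a CreateThread() wrapper */
    static int    g_num_threads;

    static void report_thread_usage(void)
    {
        int i, user_pct, kernel_pct;

        for (i = 0; i < g_num_threads; ++i) {
            if (thread_percent(g_threads[i], &user_pct, &kernel_pct))
                printf("thread %d: user=%d%% kernel=%d%%\n",
                       i, user_pct, kernel_pct);
        }
    }

    /* Call it from a low-priority housekeeping loop, e.g. once a minute:
     *
     *     for (;;) {
     *         Sleep(60 * 1000);
     *         report_thread_usage();
     *     }
     */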
>>> On the deadlock debugging I did do, here is what I did. I raised the
>>> pjsip debug level at compile time and run time to 6. But that was
>>> still not enough to tell me what I needed to know, so I augmented the
>>> mutex lock/unlock/try/destroy functions in pjlib with versions that
>>> are passed the caller's __FILE__ and __LINE__. __FILE__ gives the
>>> source module of the caller, and __LINE__ gives the line number the
>>> function was called from.
>>>
>>> I did something like this:
>>>
>>> #define mutex_lock(lock) _mutex_lock(lock, __FILE__, __LINE__)
>>>
>>> void _mutex_lock(pj_mutex_t *lock, const char *file, int line)
>>> {
>>>     // in here I added file and line to the level 6 debug output,
>>>     // then performed the real lock.
>>> }
>>>
>>> Run that and you know where you blocked, and where you last locked
>>> prior to that, which gives the complete story.
>>>
>>> David Clark
>>>
>>> PS: Note this does produce a ton of debug logs; we are talking
>>> gigabytes. In my own applications I have been able to give mutex
>>> locking information as part of the thread usage reports: I list which
>>> thread is blocked on which mutex, where the block happened, and where
>>> ownership was last successfully obtained. The complete story on one
>>> line. I have not put this into pjsip yet. It involved replacing all
>>> the mutex functions, and since my application never uses a try-mutex
>>> function, I don't have that one. pjsip uses it heavily, so I would
>>> need to add that function.
>>>
>>> At 02:54 AM 5/15/2009, M.S. wrote:
>>>
>>> I think you have great skill at debugging deadlock situations in
>>> pjsip. Can you give me good advice on debugging pjsip/pjsua mutex
>>> issues (log level, debug output, etc.)?
>>>
>>> I have a situation where pjsip reaches 100% processor use; I think I
>>> have a deadlock.
>>>
>>> regards
>>> mark
>>>
>>> ------------------------------
>>> From: David Clark <vdc1048 at tx.rr.com>
>>> To: pjsip list <pjsip at lists.pjsip.org>
>>> Sent: Thursday, May 14, 2009, 22:22:56
>>> Subject: [pjsip] pjsip version 1.1 gotcha #2 with work around.
>>>
>>> This is similar to gotcha #1, so I will go into less detail. If the
>>> on_call_state() handler, which gets hangup notifications, calls
>>> pjsua_recorder_destroy(), the box can stop communicating after that,
>>> for the following reason:
>>> 1) pjsua_recorder_destroy() locks the pjsua mutex.
>>> 2) pjsua_recorder_destroy() locks the conf mutex indirectly by calling
>>> pjsua_conf_remove_port().
>>>
>>> But if the conf mutex is already held by the PortAudio clock thread in
>>> get_frame() at the point of the call, you will block waiting for the
>>> conf mutex, and other threads will never get the pjsua mutex.
>>>
>>> Work-around: don't call pjsua_recorder_destroy() in on_call_state();
>>> signal the application thread to wake up and do it in the application
>>> thread.
>>>
>>> David Clark
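[A minimal sketch of that work-around, assuming a single pending recorder and
a polling application loop. pending_recorder, app_poll() and
my_recorder_id_for_call() are illustrative names, not pjsua API, and the
hand-off is left unsynchronized for brevity; a real application should
protect it with a lock or a queue.]

    #include <pjsua-lib/pjsua.h>

    static pjsua_recorder_id my_recorder_id_for_call(pjsua_call_id call_id); /* hypothetical lookup */

    /* Hand-off from the callback to the application thread: the callback
     * only records which recorder to destroy; the application thread
     * destroys it later, outside pjsua's callback context. */
    static pjsua_recorder_id pending_recorder = PJSUA_INVALID_ID;

    static void on_call_state(pjsua_call_id call_id, pjsip_event *e)
    {
        pjsua_call_info ci;
        PJ_UNUSED_ARG(e);

        pjsua_call_get_info(call_id, &ci);
        if (ci.state == PJSIP_INV_STATE_DISCONNECTED) {
            /* Do NOT call pjsua_recorder_destroy() here; just remember it. */
            pending_recorder = my_recorder_id_for_call(call_id);
        }
    }

    /* Called periodically from the application's own thread / main loop. */
    static void app_poll(void)
    {
        if (pending_recorder != PJSUA_INVALID_ID) {
            pjsua_recorder_destroy(pending_recorder);
            pending_recorder = PJSUA_INVALID_ID;
        }
    }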