I encountered a deadlock assertion failure in glibc's pthread_mutex_lock():

pthread_mutex_lock.c:314: __pthread_mutex_lock_full: Assertion `(e) != 45 || (kind != PTHREAD_MUTEX_ERRORCHECK_NP && kind != PTHREAD_MUTEX_RECURSIVE_NP)' failed.

(45 is EDEADLK on MIPS.)

Environment:
  glibc: 2.16
  linux: 3.10.87-rt80-Cavium-Octeon
  arch:  MIPS

The sequence of events:

1. ThreadA called __pthread_mutex_lock_full(mutex). The type of the mutex is PTHREAD_MUTEX_PI_RECURSIVE_NP or PTHREAD_MUTEX_PI_ERRORCHECK_NP.

2. ThreadA found the value of mutex->__data.__lock to be another task ThreadB's tid, so it entered the Linux kernel via the futex system call. (The automatic variable 'oldval' in __pthread_mutex_lock_full, holding that value, was stored on the stack.)

3. The kernel, however, found the value of mutex->__data.__lock to be ThreadA's own tid, at the check `if ((unlikely((uval & FUTEX_TID_MASK) == vpid)))` in futex_lock_pi_atomic(), so it returned -EDEADLK.

4. __pthread_mutex_lock_full() saw e == EDEADLK for an errorcheck/recursive kind, and the assertion above fired.

5. A core dump was generated. In the core file the value of mutex->__data.__lock is 0, and ThreadB is at the start of its entry function, e.g. waiting for the next message to process (i.e. it has already released the lock):

$5 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 33, __nusers = 0,
      {__spins = 0, __list = {__next = 0x0}}},
  __size = '\000' <repeats 15 times>, "!\000\000\000\000\000\000\000", __align = 0}

(__kind = 33 = 0x20 | 0x1 = PTHREAD_MUTEX_PRIO_INHERIT_NP | PTHREAD_MUTEX_RECURSIVE_NP, i.e. a PI recursive mutex.)

ThreadA and ThreadB belong to the same process but run on different CPUs (SMP).

To debug this issue I added a printk to the kernel, and it confirms that ThreadA "deadlocked" against itself; the displayed uaddr is &mutex->__data.__lock:

@@ -997,8 +1093,13 @@ static int futex_lock_pi_atomic(u32 __us
 	/*
 	 * Detect deadlocks.
 	 */
-	if ((unlikely((uval & FUTEX_TID_MASK) == vpid)))
+	if ((unlikely((uval & FUTEX_TID_MASK) == vpid))) {
+		printk(KERN_ERR "uaddr:%p, uval:%u, vpid:%u, task:%s(%d),prio:%d,normal:%d, current:%s(%d),prio:%d,normal:%d\n", uaddr, (unsigned)uval, (unsigned)vpid, task->comm, task_pid_nr(task), task->prio, task->normal_prio, current->comm, task_pid_nr(current), current->prio, current->normal_prio);
+		show_stack(task, NULL);
+		if (current != task)
+			show_stack(current, NULL);
 		return -EDEADLK;
+	}

The relevant fragment of __pthread_mutex_lock_full():

      int newval = id;
#ifdef NO_INCR
      newval |= FUTEX_WAITERS;
#endif
      oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
                                                    newval, 0);
      if (oldval != 0)
        {
          /* The mutex is locked.  The kernel will now take care of
             everything.  */
          int private = (robust
                         ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
                         : PTHREAD_MUTEX_PSHARED (mutex));
          INTERNAL_SYSCALL_DECL (__err);
          int e = INTERNAL_SYSCALL (futex, __err, 4, &mutex->__data.__lock,
                                    __lll_private_flag (FUTEX_LOCK_PI,
                                                        private), 1, 0);

The corresponding MIPS disassembly; the CAS is implemented as a load-linked (LL) / store-conditional (SC) retry loop (a plain-C rendering of this loop is sketched at the end of this mail):

0x77f4ed98 <+616>:	sw	zero,0(sp)
0x77f4ed9c <+620>:	ll	v1,0(s0)	// v1 = mutex->__data.__lock (load-linked)
0x77f4eda0 <+624>:	bnez	v1,0x77f4edb4 <__pthread_mutex_lock_full+644>	// nonzero -> already locked, skip SC
0x77f4eda4 <+628>:	move	at,s1
0x77f4eda8 <+632>:	sc	at,0(s0)	// store-conditional: mutex->__data.__lock = current task's tid
0x77f4edac <+636>:	beqz	at,0x77f4ed9c <__pthread_mutex_lock_full+620>	// SC failed -> retry from LL
0x77f4edb0 <+640>:	nop
0x77f4edb4 <+644>:	beqz	v1,0x77f4eea0 <__pthread_mutex_lock_full+880>
0x77f4edb8 <+648>:	sw	v1,0(sp)	// spill oldval to the stack
0x77f4edbc <+652>:	bnez	a4,0x77f4edcc <__pthread_mutex_lock_full+668>
0x77f4edc0 <+656>:	li	v0,128
0x77f4edc4 <+660>:	lw	v0,12(s0)

I cannot come up with a scenario that leads to three different values of the same variable mutex->__data.__lock being observed at three points: ThreadB's tid in user space, ThreadA's own tid in the kernel, and 0 in the core dump.

It is very difficult to reproduce this issue (roughly one occurrence every one to several months), and we failed to reproduce it with a small standalone application (a sketch of the kind of test we tried follows at the end of this mail).

Any help is welcome!
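For readers less familiar with MIPS, here is a plain-C rendering of the LL/SC fast path shown in the disassembly above. This is only a sketch using C11 atomics, not glibc's actual implementation (glibc 2.16 uses its own internal macros); the function name is mine:

#include <stdatomic.h>

/* Sketch of the userspace fast path: atomically install our tid into
   mutex->__data.__lock if it is 0 (unlocked).  Returns the old value:
   0 on success, the current owner's tid otherwise, in which case
   glibc falls back to the FUTEX_LOCK_PI system call.  This mirrors
   the ll/bnez/sc/beqz retry loop in the disassembly.  */
static int
try_lock_fast_path (_Atomic int *lock, int tid)
{
  int expected = 0;
  /* On failure, 'expected' is overwritten with the value actually
     seen in *lock, i.e. the owner's tid.  */
  atomic_compare_exchange_strong_explicit (lock, &expected, tid,
                                           memory_order_acquire,
                                           memory_order_relaxed);
  return expected;  /* plays the role of 'oldval' above */
}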
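And for reference, a minimal sketch of the sort of small test application we tried when attempting to reproduce (the thread structure and names are illustrative assumptions, not our production code); it hammers a process-private PI recursive mutex from two threads but never triggered the assertion for us:

/* Build: gcc -O2 -pthread stress.c -o stress */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t m;

static void *
worker (void *arg)
{
  (void) arg;
  for (;;)
    {
      /* Recursive PI mutex: lock twice, unlock twice.  */
      if (pthread_mutex_lock (&m) != 0)
        abort ();
      if (pthread_mutex_lock (&m) != 0)
        abort ();
      pthread_mutex_unlock (&m);
      pthread_mutex_unlock (&m);
    }
  return NULL;
}

int
main (void)
{
  pthread_mutexattr_t a;
  pthread_t t1, t2;

  pthread_mutexattr_init (&a);
  /* PTHREAD_PRIO_INHERIT + PTHREAD_MUTEX_RECURSIVE is what glibc maps
     to the internal PTHREAD_MUTEX_PI_RECURSIVE_NP kind (the __kind = 33
     seen in the core dump).  */
  pthread_mutexattr_setprotocol (&a, PTHREAD_PRIO_INHERIT);
  pthread_mutexattr_settype (&a, PTHREAD_MUTEX_RECURSIVE);
  pthread_mutex_init (&m, &a);

  pthread_create (&t1, NULL, worker, NULL);
  pthread_create (&t2, NULL, worker, NULL);
  pthread_join (t1, NULL);  /* runs forever; interrupt to stop */
  pthread_join (t2, NULL);
  return 0;
}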
B.R. Yimin