On Wed, 30 Jan 2019, Sebastian Sewior wrote: > On 2019-01-30 18:56:54 [+0100], Thomas Gleixner wrote: > > TBH, no clue. Below are some more traceprintks which hopefully shed some > > light on that mystery. See kernel/futex.c line 30 ... > > The robust list it somehow buggy. In the last trace we had the > handle_futex_death() of uaddr 3ff9e880140 as the last action. That means > it was an entry in 56496's ->list_op_pending entry. This makes sense > because it tried to acquire the lock, failed, got killed. The robust list of the failing task seems to be correct. > According to uaddr pid 56956 is the owner. So 56956 invoked one of > pthread_mutex_lock() / pthread_mutex_timedlock() / > pthread_mutex_trylock() and should have obtained the lock in userland. > Depending on where it got killed, that mutex should be either recorded in > ->list_op_pending or the robust_list (or both if it didn't clear > ->list_op_pending yet). But it is not. > Similar for pthread_mutex_unlock(). > We don't have a trace_point if we abort processing the list. The only reason why it would abort is due a page fault because that cannot be handled in the exit code anymore. > On the other hand, it didn't trigger on x86 for hours. Could the atomic s/hours/days/ .... > ops be the culprit? The glibc code does: THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, (void *) (((uintptr_t) &mutex->__data.__list.__next) | 1)); .... lock in user space or lock in kernel space ENQUEUE_MUTEX_PI (mutex); THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL); ENQUEUE_MUTEX_PI() resolves to a THREAD_GETMEM() which reads the list head from TLS, some list manipulation operations and the final THREAD_SETMEM() which stores the new list head Now on x86 THREAD_GETMEM() and THREAD_SETMEM() are resolving to asm volatile ("movX .....") on s390 they are descr->member based operations. Now the important part of the robust list is the store sequence, i.e. the list head and final update to the TLS visible part need to come _before_ list_op_pending is cleared. I might be missing something, but there is no compiler barrier in that code which would prevent the compiler from reordering the stores. It can rightfully do so because there is no compiler visible dependency of these two operations. On x8664 the asm volatile might prevent it by chance, but it does not have a 'memory' specified which would guarantee a compiler barrier. On s390 there is certainly nothing. So assumed that clearing list_op_pending comes before the list head update, then the robust exit code in the kernel will fail to see either of them. FAIL. I might be wrong as usual, but this would definitely explain the fail very well. Thanks, tglx