On Mon, Nov 19, 2007 at 01:27:37PM -0800, Kaz Kylheku wrote:

> >> From time to time, on 2.6.17.7, I see a deadlock situation go off.
> >> The soft lockup tick occurs in the middle of do_futex, which is
> >> heavily inlined. The system is actually hosed; it's not one of those
> >> recoverable CPU busy situations that can sometimes trigger the lockup
> >> detector.
> >
> > Can you reproduce this hang also if you're not running in a
> > binary compat mode, that is either running o32 binaries on a
> > 32-bit kernel or 64-bit binaries on a 64-bit kernel?
>
> I have hacked up a little test program which hosed my board within
> seconds.  The system is not completely hung.  However:

Cute.  So looking again at the futex code this morning it was quite obvious
what happened.  The ll/sc loops in __futex_atomic_op() had the usual fixups
necessary for memory accesses to userspace from kernel space installed:

        __asm__ __volatile__(
        "       .set    push                            \n"
        "       .set    noat                            \n"
        "       .set    mips3                           \n"
        "1:     ll      %1, %4  # __futex_atomic_op     \n"
        "       .set    mips0                           \n"
        "       " insn  "                               \n"
        "       .set    mips3                           \n"
        "2:     sc      $1, %2                          \n"
        "       beqz    $1, 1b                          \n"
        __WEAK_LLSC_MB
        "3:                                             \n"
        "       .set    pop                             \n"
        "       .set    mips0                           \n"
        "       .section .fixup,\"ax\"                  \n"
        "4:     li      %0, %6                          \n"
        "       j       2b                              \n"     <-----
        "       .previous                               \n"
        "       .section __ex_table,\"a\"               \n"
        "       "__UA_ADDR "\t1b, 4b                    \n"
        "       "__UA_ADDR "\t2b, 4b                    \n"
        "       .previous                               \n"
        : "=r" (ret), "=&r" (oldval), "=R" (*uaddr)
        : "0" (0), "R" (*uaddr), "Jr" (oparg), "i" (-EFAULT)
        : "memory");

Notice the branch at the end of the fixup code: it goes back to the SC
instruction.  The SC instruction took an exception, so it will not have
changed $1, so the loop will continue endlessly unless by coincidence the
value to be stored from $1 happened to be zero.

Obviously this one was MIPS specific and may hit all supported ABIs.  So my
initial suspicion that this might be the issue David Miller recently
discovered in the binary compat code isn't true.  And it's a local DoS,
probably for all of 2.6.16 and up.

Patch below.
It fixes your test case on a 32-bit kernel for me.

  Ralf

Signed-off-by: Ralf Baechle <ralf@xxxxxxxxxxxxxx>

diff --git a/include/asm-mips/futex.h b/include/asm-mips/futex.h
index 3e7e30d..17f082c 100644
--- a/include/asm-mips/futex.h
+++ b/include/asm-mips/futex.h
@@ -35,7 +35,7 @@
                "       .set    mips0                           \n"     \
                "       .section .fixup,\"ax\"                  \n"     \
                "4:     li      %0, %6                          \n"     \
-               "       j       2b                              \n"     \
+               "       j       3b                              \n"     \
                "       .previous                               \n"     \
                "       .section __ex_table,\"a\"               \n"     \
                "       "__UA_ADDR "\t1b, 4b                    \n"     \
@@ -61,7 +61,7 @@
                "       .set    mips0                           \n"     \
                "       .section .fixup,\"ax\"                  \n"     \
                "4:     li      %0, %6                          \n"     \
-               "       j       2b                              \n"     \
+               "       j       3b                              \n"     \
                "       .previous                               \n"     \
                "       .section __ex_table,\"a\"               \n"     \
                "       "__UA_ADDR "\t1b, 4b                    \n"     \
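
For anyone who wants to see the control flow without MIPS hardware, here is
a rough userspace model of the two fixup variants.  It is only a sketch and
not kernel code: run_model(), enum fixup_target, MODEL_EFAULT and MAX_STEPS
are made-up names, and it assumes the sc at label 2 faults on every attempt
(say, a user page that stays unwritable).  It does not model the case where
$1 happens to be zero.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EFAULT 14   /* stand-in for the kernel's EFAULT (assumption) */
#define MAX_STEPS 1000    /* cap so the model itself cannot spin forever */

enum fixup_target {
    FIXUP_TO_SC,      /* "j 2b": jump back to the faulting sc */
    FIXUP_PAST_LOOP   /* "j 3b": jump past the ll/sc loop */
};

/*
 * Model one __futex_atomic_op() call in which the sc at label 2 faults
 * on every attempt.  Returns the number of sc attempts made, or -1 if
 * MAX_STEPS was reached, which stands in for "never returns".
 */
static int run_model(enum fixup_target target, int *ret)
{
    int attempts = 0;

    assert(ret != NULL);
    *ret = 0;                         /* "0" (0): ret starts out as 0 */

    for (;;) {
        if (attempts == MAX_STEPS)
            return -1;                /* the real code would keep spinning */
        attempts++;                   /* 2: sc faults, __ex_table sends us to 4 */
        *ret = -MODEL_EFAULT;         /* 4: li %0, %6 */
        if (target == FIXUP_PAST_LOOP)
            return attempts;          /* j 3b: leave the loop with -EFAULT */
        /* j 2b: go round again and hit the same faulting sc */
    }
}
```

With FIXUP_TO_SC the model only stops because of the MAX_STEPS cap; with
FIXUP_PAST_LOOP it returns after a single faulting sc with ret set to
-MODEL_EFAULT, which is the behaviour the patch is after.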