I've been banging my head on the desk over gcc PR61538 [1] for the last
few months. After talking to the gcc people, I went looking through the
R10000 manual again to see whether some kind of erratum sticks out. I
found this bit:

"""
Load Linked and Store Conditional instructions (LL, LLD, SC, and SCD) do
not implicitly perform SYNC operations in the R10000. Any of the
following events that occur between a Load Linked and a Store
Conditional will cause the Store Conditional to fail: an exception;
execution of an ERET, a load, a store, a SYNC, a CacheOp, a prefetch, or
an external intervention/invalidation on the block containing the linked
address. Instruction cache misses do not cause the Store Conditional to
fail.
"""

The regression happens inside glibc's __lll_lock_wait_private routine:

    void
    __lll_lock_wait_private (int *futex)
    {
      if (*futex == 2)
        lll_futex_wait (futex, 2, LLL_PRIVATE);

      while (atomic_exchange_acq (futex, 2) != 0)
        lll_futex_wait (futex, 2, LLL_PRIVATE);
    }

It appears to hang forever in the "atomic_exchange_acq" call.

Disassembling a statically-built copy of the "sln" binary generated by
glibc's compile phase shows slight differences in how gcc-4.7 and
gcc-4.8 compile the __lll_lock_wait_private function. The key
differences in the output asm are marked with '*':

gcc-4.7:
       x+4    <START>
       ...
       x+24   bne     v1,v0,<x+56>
       ...
       x+32   0x7c03e83b  /* rdhwr */
       x+36   li      a2,2
       x+40   lw      a1,-29832(v1)
       x+44   move    a3,zero
       x+48   li      v0,4238
       x+52   syscall
    *  x+56   li      v0,2
    *  x+60   ll      v1,0(s0)
    *  x+64   move    a0,v0
    *  x+68   sc      a0,0(s0)
       x+72   beqzl   a0,<x+56>
       x+76   nop
       x+80   sync
       x+84   bnez    v1,<x+32>

gcc-4.8:
       x+4    <START>
       ...
       x+24   bne     v1,v0,<x+56>
       ...
       x+32   0x7c03e83b  /* rdhwr */
       x+36   li      a2,2
       x+40   lw      a1,-29832(v1)
       x+44   move    a3,zero
       x+48   li      v0,4238
       x+52   syscall
    *  x+56   ll      v0,0(s0)
    *  x+60   li      at,2
    *  x+64   sc      at,0
       x+68   beqzl   at,<x+56>
       x+72   nop
       x+76   sync
       x+80   bnez    v0,<x+32>

Using gdb, if I step through 'sln', the gcc-4.7 copy never calls
__lll_lock_wait_private, so I have no idea how the insns are being
executed there. But the 4.8 copy does get into this function, and
stepping one instruction at a time yields this execution path:

       x+4    <START>
       ...
       x+24   bne     v1,v0,<x+56>
       x+56   ll      v0,0(s0)
       x+68   beqzl   at,<x+56>   /* beqzl check fails -> x+76 */
       x+76   sync
       x+80   bnez    v0,<x+32>
       x+32   0x7c03e83b  /* rdhwr */
       x+36   li      a2,2
       x+40   lw      a1,-29832(v1)
       x+44   move    a3,zero
       x+48   li      v0,4238
       x+52   syscall
       x+56   ll      v0,0(s0)
       <HANG>

Executing the 'bnez' insn puts us at the rdhwr insn (x+32). Stepping on
from there, the 'syscall' (x+52) returns and leaves us at the 'll'
(x+56) a second time, where the program just hangs.

I am guessing at a few things here:

  - Because ll/sc are atomic, gdb doesn't let you step through them,
    which is why the instruction pointer jumps over the 'li' and 'sc'
    insns.

  - The 'li' after 'll' triggers the 'sc' to fail on R10K.

Does this look correct for an R10000, given the above statement from the
manual? I'm not sure how or why this would cause the program to hang,
but it seems to directly correlate.

Is anyone from Debian able to test building gcc-4.8 (or greater) and
glibc-2.19 on an R10K system, and see if it hangs at the end of glibc's
compile phase when the 'sln' binary is used to generate symlinks? I've
run into this on R12000 and R14000 systems, and I am assuming it'll
happen on an R10000 system as well.

1: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538

-- 
Joshua Kinard
Gentoo/MIPS
kumba@xxxxxxxxxx
4096R/D25D95E3 2011-03-28

"The past tempts us, the present confuses us, the future frightens us.
And our lives slip away, moment by moment, lost in that vast, terrible
in-between."

--Emperor Turhan, Centauri Republic