Regarding the issues with IP27, one thing I am noticing a lot when I
disassemble an address found in epc or ErrEPC of a "stuck" kernel is
that I keep getting directed back to
arch/mips/include/asm/atomic.h:181:

    #define ATOMIC_OPS(op, c_op, asm_op)                              \
        ATOMIC_OP(op, c_op, asm_op)                                   \
        ATOMIC_OP_RETURN(op, c_op, asm_op)                            \
        ATOMIC_FETCH_OP(op, c_op, asm_op)

    ATOMIC_OPS(add, +=, addu)    <-- HERE
    ATOMIC_OPS(sub, -=, subu)

So looking at the code, what stands out to me is that the
"(kernel_uses_llsc && R10000_LLSC_WAR)" inline asm code:

    if (kernel_uses_llsc && R10000_LLSC_WAR) {                        \
        int temp;                                                     \
                                                                      \
        __asm__ __volatile__(                                         \
        "   .set    arch=r4000                          \n"           \
        "1: ll      %0, %1      # atomic_" #op "        \n"           \
        "   " #asm_op " %0, %2                          \n"           \
        "   sc      %0, %1                              \n"           \
        "   beqzl   %0, 1b                              \n"           \
        "   .set    mips0                               \n"           \
        : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)          \
        : "Ir" (i));                                                  \

is substantially different from the standard "kernel_uses_llsc" inline
asm:

    } else if (kernel_uses_llsc) {                                    \
        int temp;                                                     \
                                                                      \
        do {                                                          \
            __asm__ __volatile__(                                     \
            "   .set    "MIPS_ISA_LEVEL"                \n"           \
            "   ll      %0, %1      # atomic_" #op "\n"               \
            "   " #asm_op " %0, %2                      \n"           \
            "   sc      %0, %1                          \n"           \
            "   .set    mips0                           \n"           \
            : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter)      \
            : "Ir" (i));                                              \
        } while (unlikely(!temp));                                    \

(Above is from "atomic_##op" -> #define ATOMIC_OP, starting on line 44
in current git)

My understanding of what R10000_LLSC_WAR handles, in most cases, is the
use of a "beqzl" instruction over "beqz", due to a silicon bug in
earlier R10000 CPUs.  R10K CPUs with silicon rev >~3.0, R12K, R14K, and
R16K are all unaffected and should be able to safely use the
non-R10000_LLSC_WAR branch.

Current upstream, however, does not distinguish between different
members of the R10K family, so it forces ALL R10K CPUs to take the
R10000_LLSC_WAR path.  I've got a patch that splits R10000 support up
into plain "R10000" and then "R12K/R14K/R16K", with the latter case
//disabling// the R10000_LLSC_WAR flag.
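
To make the second shape quoted above concrete (one ll/sc attempt per
pass, with the retry done by a C loop rather than by a branch inside the
asm), here is a minimal portable sketch.  It uses C11 <stdatomic.h>
instead of MIPS inline asm, so it only illustrates the control flow and
says nothing about the instructions actually emitted.  The sketch_*
names are made up for this example and are not kernel API, and the
relaxed memory ordering is an assumption of the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's atomic_t; not the kernel type. */
typedef struct {
	_Atomic int counter;
} sketch_atomic_t;

/*
 * Same control flow as "do { ll; addu; sc; } while (!temp)":
 * read the old value, compute the new one, attempt a conditional
 * store, and retry from C if the store did not succeed.
 */
static inline void sketch_atomic_add(int i, sketch_atomic_t *v)
{
	int old = atomic_load_explicit(&v->counter, memory_order_relaxed);
	bool stored;

	do {
		/* On failure, 'old' is refreshed with the current value. */
		stored = atomic_compare_exchange_weak_explicit(
				&v->counter, &old, old + i,
				memory_order_relaxed, memory_order_relaxed);
	} while (!stored);
}

/* Variant that also returns the new value, like atomic_add_return(). */
static inline int sketch_atomic_add_return(int i, sketch_atomic_t *v)
{
	int old = atomic_load_explicit(&v->counter, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(
			&v->counter, &old, old + i,
			memory_order_relaxed, memory_order_relaxed))
		;	/* 'old' was reloaded by the failed exchange; retry */

	return old + i;
}
```

A failed weak compare-exchange plays the role of a failed sc here: the
loop simply goes around again.  GCC also documents a -mfix-r10000 option
that makes its own generated ll/sc sequences use branch-likely
instructions; whether that would be usable for the kernel is an open
question that this sketch does not answer.
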
Because of that patch, my systems execute the standard
"kernel_uses_llsc" inline asm code, and this newer code probably does
not play very nicely on these older CPUs.

Checking through a couple of git logs, it looks like development for
the later MIPS ISAs (R2+) on the newer CPUs has been tweaking the atomic
ops case for standard "kernel_uses_llsc" and ignoring the
R10000_LLSC_WAR block entirely.  I suspect this is an attempt to avoid
breaking what is probably assumed to be working code for systems few
people have access to.  Does this sound accurate?

I found that the first of these changes occurred almost 7 years ago this
month, between 2.6.36 and 2.6.37, w/ commit 7837314d141c:
https://git.linux-mips.org/cgit/ralf/linux.git/commit/arch/mips/include/asm/atomic.h?id=7837314d141c661c70bc13c5050694413ecfe14a

This raises the question of why the standard "kernel_uses_llsc" case was
changed but not the R10000_LLSC_WAR case.  The changes seem like they
would be applicable to the older R10K CPUs regardless, since this is
before a lot of the code for the newer ISAs (R2+) was added.

I am getting a funny feeling that a lot of these templates need to be
re-written (maybe even in plain C, given newer gcc's better
intelligence) and other useful cleanups done.  I am not fluent enough in
MIPS asm, though, to know what to change.

I'm going to experiment with backing out some of the more recent changes
specific to the newer ISAs/CPUs and set it up so that the main
difference between the R10000_LLSC_WAR case and the standard case is
just plain "beqzl" versus "beqz", and see if this makes my issues on
IP27 go away.

--
Joshua Kinard
Gentoo/MIPS
kumba@xxxxxxxxxx
6144R/F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us.
And our lives slip away, moment by moment, lost in that vast, terrible
in-between."

--Emperor Turhan, Centauri Republic