This is a note to let you know that I've just added the patch titled ARM: 7747/1: pcpu: ensure __my_cpu_offset cannot be re-ordered across barrier() to the 3.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: arm-7747-1-pcpu-ensure-__my_cpu_offset-cannot-be-re-ordered-across-barrier.patch and it can be found in the queue-3.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 509eb76ebf9771abc9fe51859382df2571f11447 Mon Sep 17 00:00:00 2001 From: Will Deacon <will.deacon@xxxxxxx> Date: Wed, 5 Jun 2013 11:20:33 +0100 Subject: ARM: 7747/1: pcpu: ensure __my_cpu_offset cannot be re-ordered across barrier() From: Will Deacon <will.deacon@xxxxxxx> commit 509eb76ebf9771abc9fe51859382df2571f11447 upstream. __my_cpu_offset is non-volatile, since we want its value to be cached when we access several per-cpu variables in a row with preemption disabled. This means that we rely on preempt_{en,dis}able to hazard with the operation via the barrier() macro, so that we can't end up migrating CPUs without reloading the per-cpu offset. Unfortunately, GCC doesn't treat a "memory" clobber on a non-volatile asm block as a side-effect, and will happily re-order it before other memory clobbers (including those in prempt_disable()) and cache the value. This has been observed to break the cmpxchg logic in the slub allocator, leading to livelock in kmem_cache_alloc in mainline kernels. This patch adds a dummy memory input operand to __my_cpu_offset, forcing it to be ordered with respect to the barrier() macro. Reviewed-by: Nicolas Pitre <nico@xxxxxxxxxx> Cc: Rob Herring <rob.herring@xxxxxxxxxxx> Signed-off-by: Will Deacon <will.deacon@xxxxxxx> Signed-off-by: Russell King <rmk+kernel@xxxxxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- arch/arm/include/asm/percpu.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) --- a/arch/arm/include/asm/percpu.h +++ b/arch/arm/include/asm/percpu.h @@ -30,8 +30,15 @@ static inline void set_my_cpu_offset(uns static inline unsigned long __my_cpu_offset(void) { unsigned long off; - /* Read TPIDRPRW */ - asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : : "memory"); + register unsigned long *sp asm ("sp"); + + /* + * Read TPIDRPRW. + * We want to allow caching the value, so avoid using volatile and + * instead use a fake stack read to hazard against barrier(). + */ + asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp)); + return off; } #define __my_cpu_offset __my_cpu_offset() Patches currently in stable-queue which might be from will.deacon@xxxxxxx are queue-3.9/arm-7742-1-topology-export-cpu_topology.patch queue-3.9/arm-7747-1-pcpu-ensure-__my_cpu_offset-cannot-be-re-ordered-across-barrier.patch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html