On Thursday, 17 March 2011 at 13:42 -0500, Christoph Lameter wrote:
> On Thu, 17 Mar 2011, Eric Dumazet wrote:
>
> > By the way, I noticed:
> >
> > DECLARE_PER_CPU(u64, xt_u64);
> > __this_cpu_add(xt_u64, 2) translates to the following x86_32 code:
> >
> >     mov    $xt_u64,%eax
> >     add    %fs:0x0,%eax
> >     addl   $0x2,(%eax)
> >     adcl   $0x0,0x4(%eax)
> >
> >
> > I wonder why we don't use:
> >
> >     addl   $0x2,%fs:xt_u64
> >     adcl   $0x0,%fs:xt_u64+4
>
> The compiler is fed the following
>
>     *__this_cpu_ptr(xt_u64) += 2
>
> __this_cpu_ptr makes it:
>
>     *(xt_u64 + __my_cpu_offset) += 2
>
> So the compiler calculates the address first and then increments it.
>
> The compiler could optimize this I think. Wonder why that does not happen.

The compiler really is forced to compute the address first, that's why.

Hmm, I think we should not fall back to the generic ops, but rather
tweak percpu_add_op():

percpu_add_op()
{
...
    case 8:
#if CONFIG_X86_64_SMP
        if (pao_ID__ == 1)                                  \
            asm("incq "__percpu_arg(0) : "+m" (var));       \
        else if (pao_ID__ == -1)                            \
            asm("decq "__percpu_arg(0) : "+m" (var));       \
        else                                                \
            asm("addq %1, "__percpu_arg(0)                  \
                : "+m" (var)                                \
                : "re" ((pao_T__)(val)));                   \
        break;                                              \
#else
        asm("addl %1, "__percpu_arg(0)                      \
            : "+m" (var)                                    \
            : "ri" ((u32)(val)));                           \
        asm("adcl %1, "__percpu_arg(0)                      \
            : "+m" ((char *)var+4)                          \
            : "ri" ((u32)(val>>32)));                       \
        break;                                              \
#endif
....
}
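
For illustration, the arithmetic behind the addl/adcl pair in the 32-bit
branch can be modelled in plain user-space C. This is only a sketch under
stated assumptions: split_u64, split_add() and split_read() are hypothetical
names that do not exist in the kernel, and the model covers only the carry
propagation between the two 32-bit halves, not the %fs-relative per-cpu
addressing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a u64 counter as x86_32 sees it in memory:
 * two 32-bit words, low word first (little-endian).
 */
struct split_u64 {
    uint32_t lo;    /* bytes 0..3, the target of addl */
    uint32_t hi;    /* bytes 4..7, the target of adcl */
};

/* Add a 64-bit value using only 32-bit operations. */
static inline void split_add(struct split_u64 *c, uint64_t val)
{
    uint32_t val_lo = (uint32_t)val;            /* (u32)(val)       */
    uint32_t val_hi = (uint32_t)(val >> 32);    /* (u32)(val >> 32) */
    uint32_t old_lo = c->lo;
    uint32_t carry;

    /* addl: add the low halves; the result wraps modulo 2^32. */
    c->lo = old_lo + val_lo;

    /* The carry flag is set exactly when the low add wrapped. */
    carry = (c->lo < old_lo) ? 1u : 0u;

    /* adcl: add the high halves plus the carry from the low add. */
    c->hi = c->hi + val_hi + carry;
}

/* Reassemble the two halves into one 64-bit value. */
static inline uint64_t split_read(const struct split_u64 *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}
```

Starting from lo = 0xffffffff, hi = 0 and calling split_add() with 2 leaves
lo = 1, hi = 1, that is 0x100000001. The model also shows why the adcl has
to consume the carry produced by the addl: if anything modified the flags
between the two instructions, the high word would be wrong.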