On Thu, Oct 21, 2010 at 06:55:16PM -0700, Andrew Morton wrote: > On Thu, 21 Oct 2010 17:45:36 -0700 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > > this_cpu_add_return() isn't really needed in this application. > > > > { > > this_cpu_add(*fbc->counters, amount); > > if (unlikely(abs(this_cpu_read(*fbc->counters)) > fbc->batch)) > > out_of_line_stuff(); > > } > > > > will work just fine. > > Did that. Got alarmed at a few things. > > The compiler cannot CSE the above code - it has to reload the percpu > base each time. Doing it by hand: > > > { > long *p; > > p = this_cpu_ptr(fbc->counters); > *p += amount; > if (unlikely(abs(*p) > fbc->batch)) > out_of_line_stuff(); > } > > generates better code. > > So this: > > static __always_inline void percpu_counter_add_batch(struct percpu_counter *fbc, > s64 amount, long batch) > { > long *pcounter; > > preempt_disable(); > pcounter = this_cpu_ptr(fbc->counters); > *pcounter += amount; > if (unlikely(abs(*pcounter) >= batch)) > percpu_counter_handle_overflow(fbc); > preempt_enable(); > } > > when compiling this: > > --- a/lib/proportions.c~b > +++ a/lib/proportions.c > @@ -263,6 +263,11 @@ void __prop_inc_percpu(struct prop_descr > prop_put_global(pd, pg); > } > > +void foo(struct prop_local_percpu *pl) > +{ > + percpu_counter_add(&pl->events, 1); > +} > + > /* > * identical to __prop_inc_percpu, except that it limits this pl's fraction to > * @frac/PROP_FRAC_BASE by ignoring events when this limit has been exceeded. > > comes down to > > .globl foo > .type foo, @function > foo: > pushq %rbp # > movslq percpu_counter_batch(%rip),%rcx # percpu_counter_batch, batch > movq 96(%rdi), %rdx # <variable>.counters, tcp_ptr__ > movq %rsp, %rbp #, > #APP > add %gs:this_cpu_off, %rdx # this_cpu_off, tcp_ptr__ > #NO_APP > movq (%rdx), %rax #* tcp_ptr__, D.11817 > incq %rax # D.11817 > movq %rax, (%rdx) # D.11817,* tcp_ptr__ > cqto > xorq %rdx, %rax # tmp67, D.11817 > subq %rdx, %rax # tmp67, D.11817 > cmpq %rcx, %rax # batch, D.11817 > jl .L33 #, > call percpu_counter_handle_overflow # > .L33: > leave > ret > > > But what's really alarming is that the compiler (4.0.2) is cheerily > ignoring the inline directives and was generating out-of-line versions > of most of the percpu_counter.h functions into lib/proportions.s. > That's rather a worry. You you have the "ignore inlining" and "compile for size" turned on? They often suck. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html