Re: [PATCH] percpu_counter : add percpu_counter_add_fast()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 21 Oct 2010 17:45:36 -0700 Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> this_cpu_add_return() isn't really needed in this application.
> 
> {
> 	this_cpu_add(*fbc->counters, amount);
> 	if (unlikely(abs(this_cpu_read(*fbc->counters)) > fbc->batch))
> 		out_of_line_stuff();
> }	
> 
> will work just fine.

Did that.  Got alarmed at a few things.

The compiler cannot CSE the above code - it has to reload the percpu
base each time.  Doing it by hand:


{
	long *p;

	p = this_cpu_ptr(fbc->counters);
	*p += amount;
	if (unlikely(abs(*p) > fbc->batch))
		out_of_line_stuff();
}	

generates better code.

So this:

static __always_inline void percpu_counter_add_batch(struct percpu_counter *fbc,
						s64 amount, long batch)
{
	long *pcounter;

	preempt_disable();
	pcounter = this_cpu_ptr(fbc->counters);
	*pcounter += amount;
	if (unlikely(abs(*pcounter) >= batch))
		percpu_counter_handle_overflow(fbc);
	preempt_enable();
}

when compiling this:

--- a/lib/proportions.c~b
+++ a/lib/proportions.c
@@ -263,6 +263,11 @@ void __prop_inc_percpu(struct prop_descr
 	prop_put_global(pd, pg);
 }
 
+void foo(struct prop_local_percpu *pl)
+{
+	percpu_counter_add(&pl->events, 1);
+}
+
 /*
  * identical to __prop_inc_percpu, except that it limits this pl's fraction to
  * @frac/PROP_FRAC_BASE by ignoring events when this limit has been exceeded.

comes down to

.globl foo
	.type	foo, @function
foo:
	pushq	%rbp	#
	movslq	percpu_counter_batch(%rip),%rcx	# percpu_counter_batch, batch
	movq	96(%rdi), %rdx	# <variable>.counters, tcp_ptr__
	movq	%rsp, %rbp	#,
#APP
	add %gs:this_cpu_off, %rdx	# this_cpu_off, tcp_ptr__
#NO_APP
	movq	(%rdx), %rax	#* tcp_ptr__, D.11817
	incq	%rax	# D.11817
	movq	%rax, (%rdx)	# D.11817,* tcp_ptr__
	cqto
	xorq	%rdx, %rax	# tmp67, D.11817
	subq	%rdx, %rax	# tmp67, D.11817
	cmpq	%rcx, %rax	# batch, D.11817
	jl	.L33	#,
	call	percpu_counter_handle_overflow	#
.L33:
	leave
	ret


But what's really alarming is that the compiler (4.0.2) is cheerily
ignoring the inline directives and was generating out-of-line versions
of most of the percpu_counter.h functions into lib/proportions.s. 
That's rather a worry.

lib/proportions.o got rather larger as a result of inlining things and
it's not obvious that it's all a net benefit.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux