On 06/18, Frederic Weisbecker wrote:
>
> On Tue, Jun 18, 2013 at 04:42:25PM +0200, Oleg Nesterov wrote:
> >
> > Simplest example,
> >
> > 	for_each_possible_cpu(cpu)
> > 		total_count += per_cpu(per_cpu_count, cpu);
> >
> > Every per_cpu() likely means a cache miss. Not to mention that we need
> > additional math to calculate the address of the local counter.
> >
> > 	for_each_possible_cpu(cpu)
> > 		total_count += bootmem_or_kmalloc_array[cpu];
> >
> > is much better in this respect.
> >
> > And note also that per_cpu_count above can share the cacheline with
> > another "hot" per-cpu variable.
>
> Ah I see, that's good to know.
>
> But these variables are supposed to be touched only from the slow path
> (perf events syscall, ptrace breakpoint creation, etc...), right?
> So this is probably not a problem?

Yes, sure. But please note that this can also penalize other CPUs.
For example, toggle_bp_slot() writes to per_cpu(nr_cpu_bp_pinned), and
this invalidates the cacheline, which can also contain another per-cpu
variable.
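
To illustrate the cacheline sharing described above, here is a minimal
user-space sketch, not kernel code. The struct names, hot_counter, and
the 64-byte line size are assumptions made for the example; only
nr_cpu_bp_pinned comes from the code under discussion, and the layout
shown is not the kernel's actual per-cpu layout. The helpers also assume
that the struct itself starts on a cacheline boundary.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE 64	/* assumed line size, differs between machines */

/* Hypothetical per-cpu area: two unrelated variables packed together. */
struct packed_area {
	unsigned int nr_cpu_bp_pinned;	/* written from the slow path */
	unsigned long hot_counter;	/* stand-in for a "hot" variable */
};

/* The same two variables, each forced onto its own cacheline. */
struct padded_area {
	alignas(CACHELINE) unsigned int nr_cpu_bp_pinned;
	alignas(CACHELINE) unsigned long hot_counter;
};

/* With both members line-aligned the struct spans exactly two lines. */
static_assert(sizeof(struct padded_area) == 2 * CACHELINE,
	      "padded_area should occupy two cachelines");

/* Return 1 if two byte offsets fall into the same cacheline. */
int same_cacheline(size_t a, size_t b)
{
	return a / CACHELINE == b / CACHELINE;
}

/* Packed layout: a write to one member invalidates the other's line. */
int packed_shares_cacheline(void)
{
	return same_cacheline(offsetof(struct packed_area, nr_cpu_bp_pinned),
			      offsetof(struct packed_area, hot_counter));
}

/* Padded layout: the two members never share a line. */
int padded_shares_cacheline(void)
{
	return same_cacheline(offsetof(struct padded_area, nr_cpu_bp_pinned),
			      offsetof(struct padded_area, hot_counter));
}
```

In the packed layout a remote write to nr_cpu_bp_pinned forces the owning
CPU to refetch the line before it can touch hot_counter again. The padded
layout avoids that at the cost of one extra cacheline per area.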

But let me clarify. I agree, this is all minor, and I am not trying to
say this change can actually improve performance.

The main point of this patch is to make the code look a bit better,
and you seem to agree. The changelog mentions s/percpu/array/ only
as a potential change which obviously needs more discussion; I didn't
mean that we should necessarily do this.

Although yes, personally I really dislike per-cpu in this case, but
of course this is subjective and I won't argue ;)

Oleg.
