Re: [PATCH] cache align vm_stat

On Thu, Oct 27, 2011 at 10:50:25AM +0200, Mel Gorman wrote:
> On Mon, Oct 24, 2011 at 11:10:35AM -0500, Dimitri Sivanich wrote:
> > Avoid false sharing of the vm_stat array.
> > 
> > This was found to adversely affect tmpfs I/O performance.
> > 
> 
> I think this fix is overly simplistic. It is moving each counter into
> its own cache line. While I accept that this will help the performance
> of the tmpfs-based workload, it will adversely affect workloads that
> touch a lot of counters because of the increased cache footprint.

To reiterate, this change is as follows:
  -atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS];
  +atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS] __cacheline_aligned_in_smp;

I think you've misunderstood the effect of the fix.  It does not move
each counter into its own cache line.  It forces the overall array to
start on a cache line boundary.  From 'nm -n vmlinux' output:
	ffffffff81e04200 d hash_lock
	ffffffff81e04240 D vm_stat
	ffffffff81e04340 d nr_files
Unless I'm missing something, vm_stat is using 4 cachelines, not 24.
And yes, the addresses of the first 3 counters are, in fact,
ffffffff81e04240, ffffffff81e04248, and ffffffff81e04250.
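
The arithmetic behind those addresses can be checked with a small userspace
sketch.  It is only an illustration, not kernel code: the 64-byte cache line
size, the 32-slot array (chosen to fill the 256 bytes between the vm_stat and
nr_files symbols; the real NR_VM_ZONE_STAT_ITEMS may be smaller), and the
names vm_stat_demo, cache_lines_spanned and counter_offset are all
assumptions, and int64_t stands in for atomic_long_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration: 64-byte cache lines, 32 eight-byte slots. */
#define CACHE_LINE_SIZE 64
#define DEMO_NR_ITEMS   32

/*
 * Userspace stand-in for vm_stat[].  The attribute aligns the start of the
 * array only; the elements stay packed 8 bytes apart.
 */
static int64_t vm_stat_demo[DEMO_NR_ITEMS]
	__attribute__((aligned(CACHE_LINE_SIZE)));

/* Number of cache lines touched by the byte range [addr, addr + len). */
static size_t cache_lines_spanned(uint64_t addr, uint64_t len)
{
	assert(len > 0);
	uint64_t first = addr / CACHE_LINE_SIZE;
	uint64_t last = (addr + len - 1) / CACHE_LINE_SIZE;
	return (size_t)(last - first + 1);
}

/* Byte offset of counter i from the start of the array. */
static size_t counter_offset(size_t i)
{
	assert(i < DEMO_NR_ITEMS);
	return (size_t)((uintptr_t)&vm_stat_demo[i] -
			(uintptr_t)&vm_stat_demo[0]);
}
```

Feeding in the nm addresses above, cache_lines_spanned(0xffffffff81e04240,
0x100) covers the 0x100 bytes up to nr_files, and counter_offset(1) and
counter_offset(2) give the 8-byte stride seen in the counter addresses.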

Also, while this patch does make some difference in performance, it is
only an incremental step in improving vm_stat contention.  More will need
to be done.

In addition, I've found that separating counters into their own cachelines
has less of an effect on performance than one might think.  It really has
more to do with the number of updates that each of these counters
(especially NR_FREE_PAGES) is receiving.

> 
> 1. Is it possible to rearrange the vmstat array such that two hot
>    counters do not share a cache line?

See my comment above.

> 2. Has Andrew's suggestion to alter the per-cpu threshold based on the
>    value of the global counter to reduce conflicts been tried?

Altering the per-cpu threshold (the value returned by calculate*threshold in
mm/vmstat.c) works well, but see my previous posts about the size of
threshold required to achieve reasonable scaling performance.  The values
needed are much higher than the currently allowed 125 (more on the order of
1000-2000).
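
To make the threshold trade-off concrete, here is a rough single-threaded
sketch of per-cpu batching: each CPU accumulates a private delta and folds it
into the shared counter only once the delta exceeds the threshold.  It is
modelled loosely on the per-cpu differential scheme in mm/vmstat.c but is not
the kernel implementation; struct demo_counter, demo_mod_state, demo_run and
DEMO_NR_CPUS are hypothetical names, and there are no atomics or real per-cpu
variables here.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for the sketch */

struct demo_counter {
	long global;			/* stands in for the shared vm_stat entry */
	long cpu_diff[DEMO_NR_CPUS];	/* per-cpu pending deltas */
	long threshold;			/* fold once |diff| exceeds this */
	unsigned long global_updates;	/* writes to the shared cache line */
};

/* Add delta on behalf of one CPU, touching the shared counter only on overflow. */
static void demo_mod_state(struct demo_counter *c, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	long x = c->cpu_diff[cpu] + delta;

	if (x > c->threshold || x < -c->threshold) {
		c->global += x;		/* the contended write */
		c->global_updates++;
		x = 0;
	}
	c->cpu_diff[cpu] = x;
}

/* Count shared-counter writes for a run of +1 updates issued from CPU 0. */
static unsigned long demo_run(long threshold, long increments)
{
	struct demo_counter c = { .threshold = threshold };

	for (long i = 0; i < increments; i++)
		demo_mod_state(&c, 0, 1);
	return c.global_updates;
}
```

In this model, 100000 increments with a threshold of 125 produce 793 writes
to the shared counter, while a threshold of 1000 produces 99.  The cost is
that each CPU can hold back up to the threshold's worth of updates from the
global value at any moment.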

> 
> (I'm at Linux Con at the moment so will be even slower to respond than
> usual)
> 
> -- 
> Mel Gorman
> SUSE Labs
