On Tue, 2015-09-22 at 20:12 -0400, John David Anglin wrote:
> I question the atomic hash changes as the original defines are
> taken directly from generic code.

It's about scaling. The fewer locks, the more contention in a hash
lock system. The interesting question is where the line tips over so
that we see less speedup from adding more locks.

> Optimally, we want one spinlock per cacheline. Why do we care about
> the size of atomic_t?

OK, so I think we're not using the term 'line size' in the same way.
When Linux says 'line size' it generally means the cache ownership
line size: the minimum block that inter-CPU coherence operates on.
Most of the architectural evidence for PA systems suggests that this
is 16 bytes. We should be able to get this definitively: it's however
many lower bits of a virtual address the LCI instruction truncates.

128 bytes seems to be the cache burst fill size (the number of bytes
that will be pulled into the cache by a usual operation touching any
byte in the area).

For streaming operations, the burst fill size is what we want to use,
but for coherence operations it's the ownership line size. The reason
is that different CPUs can own adjacent ownership lines uncontended,
so one spinlock per ownership line is optimal. The disadvantage of
padding things out to the cache burst fill size is that we blow the
cache footprint: data is spread too far apart and we use far more
cache than we should, meaning the cache thrashes much sooner as you
load up the CPU.

On SMP systems, Linux uses SMP_CACHE_BYTES == L1_CACHE_BYTES for
padding on tons of critical structures; if it's too big, we'll waste
a lot of cache footprint for no gain.

James
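
P.S. To make the "one spinlock per ownership line" point concrete,
here's a rough sketch of what a hashed lock array padded that way
could look like. It's untested and the names and sizes are invented
for illustration (16 bytes is the assumed PA ownership line size from
above; the real atomic hash code obviously differs):

	/* Pad each lock of the hash out to the coherence ownership
	 * line size, so CPUs spinning on different locks never bounce
	 * the same line between them.  Padding to the 128 byte burst
	 * fill size instead would multiply the array's cache footprint
	 * by 8 for no coherence gain.
	 */
	#include <linux/spinlock.h>

	#define OWNERSHIP_LINE_SIZE	16	/* assumed PA value */
	#define ATOMIC_HASH_SIZE	64	/* arbitrary power of 2 */

	static struct {
		arch_spinlock_t slock;
	} __attribute__((__aligned__(OWNERSHIP_LINE_SIZE)))
	atomic_hash[ATOMIC_HASH_SIZE] = {
		[0 ... ATOMIC_HASH_SIZE - 1] = { __ARCH_SPIN_LOCK_UNLOCKED }
	};

	/* Hash an address to a lock, discarding the low bits that fall
	 * inside one ownership line: words sharing a line share a lock,
	 * which costs nothing since they'd contend on the line anyway.
	 */
	static inline arch_spinlock_t *atomic_hash_lock(volatile void *addr)
	{
		unsigned long l = (unsigned long)addr / OWNERSHIP_LINE_SIZE;

		return &atomic_hash[l & (ATOMIC_HASH_SIZE - 1)].slock;
	}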