On Tue, 2015-09-22 at 20:12 -0400, John David Anglin wrote:
> I question the atomic hash changes as the original defines are
> taken directly from generic code.

It's about scaling. The fewer locks, the more contention in a hash
lock system. The interesting question is where the line tips over so
that we see less speedup from adding more locks.

> Optimally, we want one spinlock per cacheline. Why do we care about
> the size of atomic_t?

OK, so I think we're not using the term 'line size' in the same way.
When Linux says 'line size' it generally means the cache ownership
line size: the minimum block that inter-CPU coherence operates on.
Most of the architectural evidence for PA systems suggests that this
is 16 bytes. We should be able to get this definitively: it's however
many lower bits of a virtual address the LCI instruction truncates.

128 bytes seems to be the cache burst fill size (the number of bytes
that will be pulled into the cache by a usual operation touching any
byte in the area).

For streaming operations, the burst fill size is what we want to use,
but for coherence operations it's the ownership line size. The reason
is that different CPUs can own adjacent ownership lines uncontended,
so one spinlock per ownership line is optimal. The disadvantage of
padding things out to the cache burst fill size is that we blow the
cache footprint: data is spread too far apart and we use far more
cache than we should, meaning the cache thrashes much sooner as you
load up the CPU.

On SMP systems, Linux uses SMP_CACHE_BYTES == L1_CACHE_BYTES for
padding on tons of critical structures; if it's too big, we'll waste
a lot of cache footprint for no gain.

James
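
P.S. To make the "one spinlock per ownership line" point concrete,
here's a rough sketch of what a hashed lock array padded that way
could look like. It's untested and the names and sizes are invented
for illustration (16 bytes is the assumed PA ownership line size from
above; the real atomic hash code obviously differs):

	/* Pad each lock of the hash out to the coherence ownership
	 * line size, so CPUs spinning on different locks never bounce
	 * the same line between them.  Padding to the 128 byte burst
	 * fill size instead would multiply the array's cache footprint
	 * by 8 for no coherence gain.
	 */
	#include <linux/spinlock.h>

	#define OWNERSHIP_LINE_SIZE	16	/* assumed PA value */
	#define ATOMIC_HASH_SIZE	64	/* arbitrary power of 2 */

	static struct {
		arch_spinlock_t slock;
	} __attribute__((__aligned__(OWNERSHIP_LINE_SIZE)))
	atomic_hash[ATOMIC_HASH_SIZE] = {
		[0 ... ATOMIC_HASH_SIZE - 1] = { __ARCH_SPIN_LOCK_UNLOCKED }
	};

	/* Hash an address to a lock, discarding the low bits that fall
	 * inside one ownership line: words sharing a line share a lock,
	 * which costs nothing since they'd contend on the line anyway.
	 */
	static inline arch_spinlock_t *atomic_hash_lock(volatile void *addr)
	{
		unsigned long l = (unsigned long)addr / OWNERSHIP_LINE_SIZE;

		return &atomic_hash[l & (ATOMIC_HASH_SIZE - 1)].slock;
	}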