On Wed, Nov 13, 2024 at 03:53:54PM +0100, Mateusz Guzik wrote:
> On Wed, Nov 13, 2024 at 02:28:16PM +0000, Lorenzo Stoakes wrote:
> > On Tue, Nov 12, 2024 at 11:46:32AM -0800, Suren Baghdasaryan wrote:
> > > Back when per-vma locks were introduced, vm_lock was moved out of
> > > vm_area_struct in [1] because of the performance regression caused by
> > > false cacheline sharing. Recent investigation [2] revealed that the
> > > regression is limited to a rather old Broadwell microarchitecture and
> > > even there it can be mitigated by disabling adjacent cacheline
> > > prefetching, see [3].
> >
> > I don't see a motivating reason as to why we want to do this? We increase
> > memory usage here which is not good, but later lock optimisation mitigates
> > it, but why wouldn't we just do the lock optimisations and use less memory
> > overall?
> >
>
> Where would you put the lock in that case though?
>
> With the patchset it sticks with the affected vma, so no false-sharing
> woes concerning other instances of the same struct.
>
> If you make them separately allocated and packed, they false-share
> between different vmas using them (in fact this is currently happening).
> If you make sure to pad them, that's 64 bytes per obj, majority of which
> is empty space.

'I don't see a motivating reason' = I don't see it in the commit message.

I'm saying put motivating reasons, like the above, in the commit message.
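
For illustration only, the three placements discussed above (embedded in
the object, separately allocated and packed, separately allocated and
padded) can be sketched in userspace C. This is not kernel code: the
demo_* types, the 4-byte lock and the CACHELINE constant are all
hypothetical stand-ins, and the 64-byte line size is an assumption taken
from the "64 bytes per obj" figure quoted above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 64-byte cache lines, as in the figure quoted above. */
#define CACHELINE 64

/* Hypothetical, deliberately tiny lock; NOT the kernel's real vm_lock type. */
struct demo_lock {
	int count;
};

/* Option 1: the lock lives inside the object it protects. */
struct demo_vma_embedded {
	unsigned long start, end;
	struct demo_lock lock;
};

/* Option 2: the lock is allocated separately and packed with other locks. */
struct demo_vma_ptr {
	unsigned long start, end;
	struct demo_lock *lock;
};

/* Option 3: separately allocated, but padded out to a whole cache line. */
struct demo_lock_padded {
	alignas(CACHELINE) struct demo_lock lock;
};

static_assert(sizeof(struct demo_lock_padded) == CACHELINE,
	      "padded lock should occupy exactly one cache line");

/* Number of packed locks of the given size that share one cache line. */
size_t locks_per_line(size_t lock_size)
{
	return CACHELINE / lock_size;
}

/* Bytes in each padded lock that carry no data. */
size_t padding_waste(void)
{
	return sizeof(struct demo_lock_padded) - sizeof(struct demo_lock);
}

/* Print the sizes so the trade-off is visible on the build machine. */
void print_layout_report(void)
{
	printf("embedded object size:     %zu bytes\n",
	       sizeof(struct demo_vma_embedded));
	printf("packed locks per line:    %zu (locks of unrelated objects share a line)\n",
	       locks_per_line(sizeof(struct demo_lock)));
	printf("padded lock size:         %zu bytes, %zu of them padding\n",
	       sizeof(struct demo_lock_padded), padding_waste());
}
```

Calling print_layout_report() from a main() shows the sizes for whatever
compiler and ABI are in use. The real lock is larger than this toy one, so
the exact numbers will differ, but the shape of the trade-off is the same:
packed locks of unrelated objects can land on one line, while padded locks
spend most of each line on empty space.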