On Wed, Nov 13, 2024 at 03:45:24PM +0100, Vlastimil Babka wrote:
> On 11/13/24 15:28, Lorenzo Stoakes wrote:
> > On Tue, Nov 12, 2024 at 11:46:32AM -0800, Suren Baghdasaryan wrote:
> >> Back when per-vma locks were introduced, vm_lock was moved out of
> >> vm_area_struct in [1] because of the performance regression caused by
> >> false cacheline sharing. Recent investigation [2] revealed that the
> >> regression is limited to a rather old Broadwell microarchitecture and
> >> even there it can be mitigated by disabling adjacent cacheline
> >> prefetching, see [3].
> >
> > I don't see a motivating reason as to why we want to do this? We increase
> > memory usage here which is not good, but later lock optimisation mitigates
>
> I'd say we don't normally split logically single structures into multiple
> structures just because they might pack better in multiple slabs vs a
> single slab, because that means more complicated management, extra pointer
> dereferences etc. And if the split-away part is a lock, it complicates
> things even further. So the only motivation for doing that split was that
> it appeared to perform better, but that was found to be misleading.
>
> But sure, it can be described better, and include the new
> SLAB_TYPESAFE_BY_RCU conversion as part of the motivation - that would
> likely be impossible to do with a split-away lock.

Right, my point is that there is no justification given here at all, and we
should give one. I understand the provenance of why we split the lock, but
there has to be a motivating reason if everything is working fine right now.

The SLAB_TYPESAFE_BY_RCU one seems to be the key one, but also something
along the lines of complicated management, concern about the ordering of
allocating/freeing things, etc.

Just needs some extra explanation here.

>
> > it, but why wouldn't we just do the lock optimisations and use less
> > memory overall?
>
> If the lock is made much smaller then the packing benefit of the split
> might disappear, as is the case here.

Yeah, but the lock would be smaller so we'd still save space, right?

> >> This patchset moves vm_lock back into vm_area_struct, aligning it at the
> >> cacheline boundary and changing the cache to be cache-aligned as well.
> >> This causes VMA memory consumption to grow from 160 (vm_area_struct) + 40
> >> (vm_lock) bytes to 256 bytes:
> >>
> >> slabinfo before:
> >> <name>           ... <objsize> <objperslab> <pagesperslab> : ...
> >> vma_lock         ...        40          102              1 : ...
> >> vm_area_struct   ...       160           51              2 : ...
> >
> > Pedantry, but it might be worth mentioning how much this can vary by
> > config.
> >
> > For instance, on my machine:
> >
> > vm_area_struct    125238 138820    184   44
> >
> >>
> >> slabinfo after moving vm_lock:
> >> <name>           ... <objsize> <objperslab> <pagesperslab> : ...
> >> vm_area_struct   ...       256           32              2 : ...
> >>
> >> Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages,
> >> which is 5.5MB per 100000 VMAs. This memory consumption growth can be
> >> addressed later by optimizing the vm_lock.
> >
> > Yes, grabbing this back is of critical importance I'd say! :)
>
> I doubt it's that critical. We'll have to weigh that against introducing
> another non-standard locking primitive.

Avoiding unnecessary VMA overhead is important, and a strong part of the
motivation for the guard pages series for instance, so I don't think we
should be unconcerned about unnecessary extra memory usage.

I'm guessing from what you say you're not in favour of the subsequent
'non-standard' locking changes...
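(FWIW, the cover letter's numbers do check out if you work them through
from the slabinfo figures quoted above, assuming 4K pages:

  before: ceil(1000 / 51)  = 20 slabs * 2 pages = 40 pages (vm_area_struct)
        + ceil(1000 / 102) = 10 slabs * 1 page  = 10 pages (vma_lock), ~50
  after:  ceil(1000 / 32)  = 32 slabs * 2 pages = 64 pages

So +14 pages per 1,000 VMAs, i.e. ~1,400 pages or ~5.5MB per 100,000 VMAs.)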
>
> > Functionally it looks ok to me but would like to see a stronger
> > justification in the commit msg! :)
>
> >>
> >> [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@xxxxxxxxxx/
> >> [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
> >> [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@xxxxxxxxxxxxxx/
> >>
> >> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> >> ---
> >>  include/linux/mm.h       | 28 +++++++++++++----------
> >>  include/linux/mm_types.h |  6 +++--
> >>  kernel/fork.c            | 49 ++++------------------------------------
> >>  3 files changed, 25 insertions(+), 58 deletions(-)
> >>
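For reference, my mental model of the layout change is something like the
below - a rough sketch from the cover letter description rather than the
actual diff, so the exact member placement and slab flags are my
assumptions:

	struct vm_area_struct {
		/* ... vm_start, vm_end, vm_mm and the other hot fields ... */

		/*
		 * Back in the main struct rather than in a separate
		 * vma_lock slab; kept on its own cacheline so lock
		 * traffic doesn't false-share with the fields above.
		 */
		struct vma_lock vm_lock ____cacheline_aligned_in_smp;
	} ____cacheline_aligned_in_smp;

	/* ...with the cache made cache-aligned to match, e.g. in fork.c: */
	vm_area_cachep = KMEM_CACHE(vm_area_struct,
				    SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT);

That rounds the object up to 256 bytes (4 cachelines on x86-64), which is
where the 160 + 40 -> 256 growth comes from.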