On 2/18/22 20:21, Liam Howlett wrote:
> * Jakub Matěna <matenajakub@xxxxxxxxx> [220218 07:21]:
>> Motivation
>> In the current kernel it is impossible to merge two anonymous VMAs
>> if one of them was moved. That is because VMA's page offset is
>> set according to the virtual address where it was created and in
>> order to merge two VMA's page offsets need to follow up.
>> Another problem when merging two VMA's is their anon_vma. In
>> current kernel these anon_vmas have to be the one and the same.
>> Otherwise merge is again not allowed.
>> Missed merge opportunities increase the number of VMAs of a process
>> and in some cases can cause problems when a max count is reached.
>
> Does this really happen that much? Is it worth trying even harder to

Let me perhaps clarify. Maybe not in general, but some mremap() heavy
workloads fragment VMA space a lot, have to increase the vma limits etc.
While the original motivation was a proprietary workload, there are e.g.
allocators such as jemalloc that rely on mremap().

But yes, it might turn out that the benefit is not universal and we might
consider some ways to make more aggressive merging opt-in.

> merge VMAs? I am not really sure the VMA merging today is worth it - we
> are under a lock known to be a bottleneck while examining if it's

I'd be afraid that scaling back existing merging would break some
userspace expectations inspecting e.g. /proc/pid/maps.

> possible to merge. Hard data about how often and the cost of merging
> would be a good argument to try harder or give up earlier.
>
>>
>> Solution
>> Following series of these patches solves the first problem with
>> page offsets by updating them when the VMA is moved to a
>> different virtual address (patch 2). As for the second
>> problem merging of VMAs with different anon_vma is allowed
>> (patch 3). Patch 1 refactors function vma_merge and
>> makes it easier to understand and also allows relatively
>> seamless tracing of successful merges introduced by the patch 4.
>>
>> Limitations
>> For both problems solution works only for VMAs that do not share
>> physical pages with other processes (usually child or parent
>> processes). This is checked by looking at anon_vma of the respective
>> VMA. The reason why it is not possible or at least not easy to
>> accomplish is that each physical page has a pointer to anon_vma and
>> page offset. And when this physical page is shared we cannot simply
>> change these parameters without affecting all of the VMAs mapping
>> this physical page. Good thing is that this case amounts only for
>> about 1-3% of all merges (measured for internet browsing and
>> compilation use cases) that fail to merge in the current kernel.
>
> It sounds like you have data for some use cases on the mergers already.
> Do you have any results on this change?
>
>>
>> This series of patches and documentation of the related code will
>> be part of my master's thesis.
>> This patch series is based on tag v5.17-rc4.
>>
>> Jakub Matěna (4):
>>   mm: refactor of vma_merge()
>>   mm: adjust page offset in mremap
>>   mm: enable merging of VMAs with different anon_vmas
>>   mm: add tracing for VMA merges
>>
>>  include/linux/rmap.h        |  17 ++-
>>  include/trace/events/mmap.h |  55 +++++++++
>>  mm/internal.h               |  11 ++
>>  mm/mmap.c                   | 232 ++++++++++++++++++++++++++----------
>>  mm/rmap.c                   |  40 +++++++
>>  5 files changed, 290 insertions(+), 65 deletions(-)
>>
>> --
>> 2.34.1
>>
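
For readers following the thread, the page-offset constraint from the quoted
Motivation (and the idea behind patch 2) can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not code from
the series: struct toy_vma and every toy_* helper are hypothetical names, the
page size is assumed to be 4 KiB, and the anon_vma and shared-page checks
described under Limitations are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, heavily simplified stand-in for a kernel VMA. */
struct toy_vma {
	unsigned long start; /* first virtual address (inclusive) */
	unsigned long end;   /* end virtual address (exclusive) */
	unsigned long pgoff; /* page offset of the first page */
};

static unsigned long toy_vma_pages(const struct toy_vma *v)
{
	return (v->end - v->start) >> TOY_PAGE_SHIFT;
}

/* Anonymous VMA: page offset is derived from the address at creation. */
static struct toy_vma toy_create(unsigned long start, unsigned long len)
{
	struct toy_vma v = { start, start + len, start >> TOY_PAGE_SHIFT };
	return v;
}

/* Move that keeps the old page offset (behaviour described as current). */
static struct toy_vma toy_move(struct toy_vma v, unsigned long new_start)
{
	unsigned long len = v.end - v.start;

	v.start = new_start;
	v.end = new_start + len;
	return v;
}

/* Move that also rewrites the page offset (the idea behind patch 2). */
static struct toy_vma toy_move_adjust(struct toy_vma v, unsigned long new_start)
{
	v = toy_move(v, new_start);
	v.pgoff = new_start >> TOY_PAGE_SHIFT;
	return v;
}

/* Mergeable only if addresses are adjacent and page offsets follow up. */
static bool toy_can_merge(const struct toy_vma *prev, const struct toy_vma *next)
{
	return prev->end == next->start &&
	       prev->pgoff + toy_vma_pages(prev) == next->pgoff;
}

/* Walks through the scenario from the cover letter. */
static void toy_demo(void)
{
	struct toy_vma a = toy_create(0x10000UL, 0x2000UL); /* two pages */
	struct toy_vma b = toy_create(0x40000UL, 0x1000UL); /* one page */
	struct toy_vma moved = toy_move(b, a.end);
	struct toy_vma fixed = toy_move_adjust(b, a.end);

	assert(!toy_can_merge(&a, &moved)); /* stale pgoff blocks the merge */
	assert(toy_can_merge(&a, &fixed));  /* updated pgoff allows it */
}
```

Calling toy_demo() from a main() exercises both cases: the moved VMA sits right
after the first one in address space, yet it only becomes mergeable in this
model once its page offset is rewritten to match the new address.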