* Vlastimil Babka <vbabka@xxxxxxx> [220225 05:31]:
> On 2/18/22 20:21, Liam Howlett wrote:
> > * Jakub Matěna <matenajakub@xxxxxxxxx> [220218 07:21]:
> >> Motivation
> >> In the current kernel it is impossible to merge two anonymous VMAs
> >> if one of them was moved. That is because VMA's page offset is
> >> set according to the virtual address where it was created and in
> >> order to merge two VMA's page offsets need to follow up.
> >> Another problem when merging two VMA's is their anon_vma. In
> >> current kernel these anon_vmas have to be the one and the same.
> >> Otherwise merge is again not allowed.
> >> Missed merge opportunities increase the number of VMAs of a process
> >> and in some cases can cause problems when a max count is reached.
> >
> > Does this really happen that much?  Is it worth trying even harder to
>
> Let me perhaps clarify. Maybe not in general, but some mremap() heavy
> workloads fragment VMA space a lot, have to increase the vma limits etc.
> While the original motivation was a proprietary workload, there are e.g.
> allocators such as jemalloc that rely on mremap().
>
> But yes, it might turn out that the benefit is not universal and we might
> consider some ways to make more aggressive merging opt-in.
>
> > merge VMAs?  I am not really sure the VMA merging today is worth it - we
> > are under a lock known to be a bottleneck while examining if it's
>
> I'd be afraid that by scaling back existing merging would break some
> userspace expectations inspecting e.g. /proc/pid/maps

Is that a risk considering how many things stop the merging of VMAs?  We
just added another (names).  Not all the information can be in
/proc/pid/maps - otherwise the tracing patch wouldn't really be
necessary?

>
> > possible to merge.  Hard data about how often and the cost of merging
> > would be a good argument to try harder or give up earlier.
> >
> >>
> >> Solution
> >> Following series of these patches solves the first problem with
> >> page offsets by updating them when the VMA is moved to a
> >> different virtual address (patch 2). As for the second
> >> problem merging of VMAs with different anon_vma is allowed
> >> (patch 3). Patch 1 refactors function vma_merge and
> >> makes it easier to understand and also allows relatively
> >> seamless tracing of successful merges introduced by the patch 4.
> >>
> >> Limitations
> >> For both problems solution works only for VMAs that do not share
> >> physical pages with other processes (usually child or parent
> >> processes). This is checked by looking at anon_vma of the respective
> >> VMA. The reason why it is not possible or at least not easy to
> >> accomplish is that each physical page has a pointer to anon_vma and
> >> page offset. And when this physical page is shared we cannot simply
> >> change these parameters without affecting all of the VMAs mapping
> >> this physical page. Good thing is that this case amounts only for
> >> about 1-3% of all merges (measured for internet browsing and
> >> compilation use cases) that fail to merge in the current kernel.
> >
> > It sounds like you have data for some use cases on the mergers already.
> > Do you have any results on this change?
> >
> >>
> >> This series of patches and documentation of the related code will
> >> be part of my master's thesis.
> >> This patch series is based on tag v5.17-rc4.
> >>
> >> Jakub Matěna (4):
> >>   mm: refactor of vma_merge()
> >>   mm: adjust page offset in mremap
> >>   mm: enable merging of VMAs with different anon_vmas
> >>   mm: add tracing for VMA merges
> >>
> >>  include/linux/rmap.h        |  17 ++-
> >>  include/trace/events/mmap.h |  55 +++++++++
> >>  mm/internal.h               |  11 ++
> >>  mm/mmap.c                   | 232 ++++++++++++++++++++++++++----------
> >>  mm/rmap.c                   |  40 +++++++
> >>  5 files changed, 290 insertions(+), 65 deletions(-)
> >>
> >> --
> >> 2.34.1
> >>
>
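
For readers following the thread, the page-offset condition from the
quoted cover letter can be illustrated with a toy user-space model. This
is only a sketch and not kernel code: the Vma class, new_anon_vma(),
move_vma() and offsets_allow_merge() are hypothetical names made up for
this illustration, 4 KiB pages are assumed, and anon_vma compatibility,
flags and every other merge condition are deliberately ignored.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass
class Vma:
    """Toy stand-in for a VMA; not the kernel's struct vm_area_struct."""
    start: int  # first byte of the mapping
    end: int    # one past the last byte
    pgoff: int  # page offset, counted in pages

    @property
    def pages(self) -> int:
        return (self.end - self.start) >> PAGE_SHIFT


def new_anon_vma(start: int, length: int) -> Vma:
    # As described in the cover letter: the page offset of an anonymous
    # VMA is derived from the virtual address where it was created.
    return Vma(start, start + length, start >> PAGE_SHIFT)


def move_vma(vma: Vma, new_start: int, adjust_pgoff: bool) -> Vma:
    # Hypothetical model of a move. With adjust_pgoff=False the old
    # offset is kept; with adjust_pgoff=True it is recomputed from the
    # new address, which is the idea the cover letter gives for patch 2.
    length = vma.end - vma.start
    pgoff = (new_start >> PAGE_SHIFT) if adjust_pgoff else vma.pgoff
    return Vma(new_start, new_start + length, pgoff)


def offsets_allow_merge(prev: Vma, nxt: Vma) -> bool:
    # Only the adjacency and page-offset continuity checks are modelled.
    return prev.end == nxt.start and prev.pgoff + prev.pages == nxt.pgoff


if __name__ == "__main__":
    a = new_anon_vma(0x10000000, 0x4000)
    b = new_anon_vma(0x20000000, 0x2000)
    print(offsets_allow_merge(a, move_vma(b, a.end, adjust_pgoff=False)))
    print(offsets_allow_merge(a, move_vma(b, a.end, adjust_pgoff=True)))
```

In this model a VMA moved directly behind another one keeps a stale
offset and fails the continuity check, while recomputing the offset from
the new address lets the check pass. Whether that pays off on real
workloads is exactly the data asked for above.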