The patch titled
     Subject: mm: prevent vm_area_struct::anon_name refcount saturation
has been added to the -mm tree.  Its filename is
     mm-prevent-vm_area_struct-anon_name-refcount-saturation.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-prevent-vm_area_struct-anon_name-refcount-saturation.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-prevent-vm_area_struct-anon_name-refcount-saturation.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Subject: mm: prevent vm_area_struct::anon_name refcount saturation

The anon_vma_name refcount in a deep process chain with many vmas could
grow really high.  With default sysctl_max_map_count (64k) and default
pid_max (32k) the max number of vmas in the system is 2147450880 and the
refcounter has headroom of 1073774592 before it reaches
REFCOUNT_SATURATED (3221225472).

Therefore it's unlikely that an anonymous name refcounter will overflow
with these defaults.  Currently the max for pid_max is PID_MAX_LIMIT
(4194304) and for sysctl_max_map_count it's INT_MAX (2147483647).  In this
configuration anon_vma_name refcount overflow becomes theoretically
possible (though that still requires heavy sharing of that anon_vma_name
between processes).

The kref refcounting interface used in the anon_vma_name structure detects
a counter overflow when it reaches the REFCOUNT_SATURATED value, but only
generates a warning about the broken refcounter.

To ensure anon_vma_name refcount does not overflow, stop anon_vma_name
sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still
leaves INT_MAX/2 (1073741823) values before the counter reaches
REFCOUNT_SATURATED.  This should provide enough headroom for raising the
refcounts temporarily.

Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@xxxxxxxxxx
Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: Alexey Gladkov <legion@xxxxxxxxxx>
Cc: Chris Hyser <chris.hyser@xxxxxxxxxx>
Cc: Christian Brauner <brauner@xxxxxxxxxx>
Cc: Colin Cross <ccross@xxxxxxxxxx>
Cc: Cyrill Gorcunov <gorcunov@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Peter Collingbourne <pcc@xxxxxxxxxx>
Cc: Sasha Levin <sashal@xxxxxxxxxx>
Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Xiaofeng Cao <caoxiaofeng@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/mm_inline.h |   18 ++++++++++++++----
 mm/madvise.c              |    3 +--
 2 files changed, 15 insertions(+), 6 deletions(-)

--- a/include/linux/mm_inline.h~mm-prevent-vm_area_struct-anon_name-refcount-saturation
+++ a/include/linux/mm_inline.h
@@ -161,15 +161,25 @@ static inline void anon_vma_name_put(str
 	kref_put(&anon_name->kref, anon_vma_name_free);
 }
 
+static inline
+struct anon_vma_name *anon_vma_name_reuse(struct anon_vma_name *anon_name)
+{
+	/* Prevent anon_name refcount saturation early on */
+	if (kref_read(&anon_name->kref) < REFCOUNT_MAX) {
+		anon_vma_name_get(anon_name);
+		return anon_name;
+
+	}
+	return anon_vma_name_alloc(anon_name->name);
+}
+
 static inline void dup_anon_vma_name(struct vm_area_struct *orig_vma,
 				     struct vm_area_struct *new_vma)
 {
 	struct anon_vma_name *anon_name = anon_vma_name(orig_vma);
 
-	if (anon_name) {
-		anon_vma_name_get(anon_name);
-		new_vma->anon_name = anon_name;
-	}
+	if (anon_name)
+		new_vma->anon_name = anon_vma_name_reuse(anon_name);
 }
 
 static inline void free_anon_vma_name(struct vm_area_struct *vma)
--- a/mm/madvise.c~mm-prevent-vm_area_struct-anon_name-refcount-saturation
+++ a/mm/madvise.c
@@ -113,8 +113,7 @@ static int replace_anon_vma_name(struct
 	if (anon_vma_name_eq(orig_name, anon_name))
 		return 0;
 
-	anon_vma_name_get(anon_name);
-	vma->anon_name = anon_name;
+	vma->anon_name = anon_vma_name_reuse(anon_name);
 	anon_vma_name_put(orig_name);
 
 	return 0;
_

Patches currently in -mm which might be from surenb@xxxxxxxxxx are

mm-fix-use-after-free-bug-when-mm-mmap-is-reused-after-being-freed.patch
mm-refactor-vm_area_struct-anon_vma_name-usage-code.patch
mm-prevent-vm_area_struct-anon_name-refcount-saturation.patch
mm-fix-use-after-free-when-anon-vma-name-is-used-after-vma-is-freed.patch
mm-count-time-in-drain_all_pages-during-direct-reclaim-as-memory-pressure.patch
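
For readers who want to check the headroom figures and the share-or-copy
policy outside the kernel, the following is a minimal userspace sketch in
C.  It is not kernel code: struct model_name and every model_* or MODEL_*
identifier are hypothetical stand-ins invented for this illustration, a
plain integer replaces the kref, and the model is single-threaded, so it
says nothing about concurrent kref_read()/kref_get() races.  The two
constants copy the values quoted in the changelog.  The changelog says
"64k" for sysctl_max_map_count; 65535 is an assumption, chosen because it
is the value that reproduces the quoted product 2147450880.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Values quoted in the changelog; the MODEL_ names exist only in this sketch. */
#define MODEL_REFCOUNT_MAX        2147483647ULL
#define MODEL_REFCOUNT_SATURATED  3221225472ULL

/* Hypothetical userspace stand-in for struct anon_vma_name. */
struct model_name {
	uint32_t refs;		/* plain counter standing in for the kref */
	char name[];		/* NUL-terminated copy of the name */
};

/* Upper bound on vmas system-wide: per-process map limit times pid limit. */
static uint64_t model_max_vmas(uint64_t max_map_count, uint64_t pid_max)
{
	return max_map_count * pid_max;
}

/* Distance between a given reference count and the saturation value. */
static uint64_t model_headroom(uint64_t refs)
{
	return MODEL_REFCOUNT_SATURATED - refs;
}

/* Allocate a new name object holding one reference; NULL on failure. */
static struct model_name *model_name_alloc(const char *name)
{
	size_t len = strlen(name) + 1;
	struct model_name *n = malloc(sizeof(*n) + len);

	if (n) {
		n->refs = 1;
		memcpy(n->name, name, len);
	}
	return n;
}

/*
 * Mirror of the policy in the patch: share the existing object while its
 * count is below the cap, otherwise hand out a fresh copy of the name.
 */
static struct model_name *model_name_reuse(struct model_name *n)
{
	if (n->refs < MODEL_REFCOUNT_MAX) {
		n->refs++;
		return n;
	}
	return model_name_alloc(n->name);
}
```

Calling model_max_vmas(65535, 32768) and passing the result to
model_headroom() gives the two numbers from the first paragraph of the
changelog.  Calling model_name_reuse() on an object whose refs field has
been set to MODEL_REFCOUNT_MAX returns a distinct object with one
reference and leaves the original count untouched, which is the behaviour
the patch relies on to stay clear of REFCOUNT_SATURATED.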