On Mon 21-02-22 21:40:24, Suren Baghdasaryan wrote: > A deep process chain with many vmas could grow really high. This would really benefit from some numbers. With default sysctl_max_map_count (64k) and default pid_max (32k) the INT_MAX could be theoretically reached but I find it impractical because not all vmas can be anonymous same as all available pids can be consumed for a theoretical attack (if my counting is proper). On the other hand any non-default configuration with any of the values increased could hit this theoretically. > kref > refcounting interface used in anon_vma_name structure will detect > a counter overflow when it reaches REFCOUNT_SATURATED value but will > only generate a warning about broken refcounting. > To ensure anon_vma_name refcount does not overflow, stop anon_vma_name > sharing when the refcount reaches INT_MAX, which still leaves INT_MAX/2 > values before the counter reaches REFCOUNT_SATURATED. This should provide > enough headroom for raising the refcounts temporarily. > > Suggested-by: Michal Hocko <mhocko@xxxxxxxx> > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx> > --- > include/linux/mm_inline.h | 18 ++++++++++++++---- > mm/madvise.c | 3 +-- > 2 files changed, 15 insertions(+), 6 deletions(-) > > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h > index 70b619442d56..b189e2638843 100644 > --- a/include/linux/mm_inline.h > +++ b/include/linux/mm_inline.h > @@ -156,15 +156,25 @@ static inline void anon_vma_name_get(struct anon_vma_name *anon_name) > > extern void anon_vma_name_put(struct anon_vma_name *anon_name); > > +static inline > +struct anon_vma_name *anon_vma_name_reuse(struct anon_vma_name *anon_name) > +{ > + /* Prevent anon_name refcount saturation early on */ > + if (kref_read(&anon_name->kref) < INT_MAX) { REFCOUNT_MAX seems to be defined by the kref framework. Other than that looks good to me. > + anon_vma_name_get(anon_name); > + return anon_name; > + > + } > + return anon_vma_name_alloc(anon_name->name); > +} > + > static inline void dup_vma_anon_name(struct vm_area_struct *orig_vma, > struct vm_area_struct *new_vma) > { > struct anon_vma_name *anon_name = vma_anon_name(orig_vma); > > - if (anon_name) { > - anon_vma_name_get(anon_name); > - new_vma->anon_name = anon_name; > - } > + if (anon_name) > + new_vma->anon_name = anon_vma_name_reuse(anon_name); > } > > static inline void free_vma_anon_name(struct vm_area_struct *vma) > diff --git a/mm/madvise.c b/mm/madvise.c > index f81d62d8ce9b..a395884aeecb 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -122,8 +122,7 @@ static int replace_vma_anon_name(struct vm_area_struct *vma, > if (anon_vma_name_eq(orig_name, anon_name)) > return 0; > > - anon_vma_name_get(anon_name); > - vma->anon_name = anon_name; > + vma->anon_name = anon_vma_name_reuse(anon_name); > anon_vma_name_put(orig_name); > > return 0; > -- > 2.35.1.473.g83b2b277ed-goog -- Michal Hocko SUSE Labs