On Tue, Feb 22, 2022 at 1:17 AM Michal Hocko <mhocko@xxxxxxxx> wrote: > > On Mon 21-02-22 21:40:24, Suren Baghdasaryan wrote: > > A deep process chain with many vmas could grow really high. > > This would really benefit from some numbers. With default > sysctl_max_map_count (64k) and default pid_max (32k) the INT_MAX could > be theoretically reached but I find it impractical because not all vmas > can be anonymous same as all available pids can be consumed for a > theoretical attack (if my counting is proper). > On the other hand any non-default configuration with any of the values > increased could hit this theoretically. re: This would really benefit from some numbers Should I just add the details you provided above into the description? Would that suffice? > > > kref > > refcounting interface used in anon_vma_name structure will detect > > a counter overflow when it reaches REFCOUNT_SATURATED value but will > > only generate a warning about broken refcounting. > > To ensure anon_vma_name refcount does not overflow, stop anon_vma_name > > sharing when the refcount reaches INT_MAX, which still leaves INT_MAX/2 > > values before the counter reaches REFCOUNT_SATURATED. This should provide > > enough headroom for raising the refcounts temporarily. > > > > Suggested-by: Michal Hocko <mhocko@xxxxxxxx> > > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx> > > --- > > include/linux/mm_inline.h | 18 ++++++++++++++---- > > mm/madvise.c | 3 +-- > > 2 files changed, 15 insertions(+), 6 deletions(-) > > > > diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h > > index 70b619442d56..b189e2638843 100644 > > --- a/include/linux/mm_inline.h > > +++ b/include/linux/mm_inline.h > > @@ -156,15 +156,25 @@ static inline void anon_vma_name_get(struct anon_vma_name *anon_name) > > > > extern void anon_vma_name_put(struct anon_vma_name *anon_name); > > > > +static inline > > +struct anon_vma_name *anon_vma_name_reuse(struct anon_vma_name *anon_name) > > +{ > > + /* Prevent anon_name refcount saturation early on */ > > + if (kref_read(&anon_name->kref) < INT_MAX) { > > REFCOUNT_MAX seems to be defined by the kref framework. Ah, indeed. I missed that. Will change to use it. > > Other than that looks good to me. Thanks for the review! > > > + anon_vma_name_get(anon_name); > > + return anon_name; > > + > > + } > > + return anon_vma_name_alloc(anon_name->name); > > +} > > + > > static inline void dup_vma_anon_name(struct vm_area_struct *orig_vma, > > struct vm_area_struct *new_vma) > > { > > struct anon_vma_name *anon_name = vma_anon_name(orig_vma); > > > > - if (anon_name) { > > - anon_vma_name_get(anon_name); > > - new_vma->anon_name = anon_name; > > - } > > + if (anon_name) > > + new_vma->anon_name = anon_vma_name_reuse(anon_name); > > } > > > > static inline void free_vma_anon_name(struct vm_area_struct *vma) > > diff --git a/mm/madvise.c b/mm/madvise.c > > index f81d62d8ce9b..a395884aeecb 100644 > > --- a/mm/madvise.c > > +++ b/mm/madvise.c > > @@ -122,8 +122,7 @@ static int replace_vma_anon_name(struct vm_area_struct *vma, > > if (anon_vma_name_eq(orig_name, anon_name)) > > return 0; > > > > - anon_vma_name_get(anon_name); > > - vma->anon_name = anon_name; > > + vma->anon_name = anon_vma_name_reuse(anon_name); > > anon_vma_name_put(orig_name); > > > > return 0; > > -- > > 2.35.1.473.g83b2b277ed-goog > > -- > Michal Hocko > SUSE Labs