Re: [PATCH bpf-next v3 4/6] memcg: Use trylock to access memcg stock_lock.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 20, 2024 at 08:10:47AM -0800, Alexei Starovoitov wrote:
> On Fri, Dec 20, 2024 at 12:24 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > > +static inline bool gfpflags_allow_spinning(const gfp_t gfp_flags)
> > > +{
> > > +       /*
> > > +        * !__GFP_DIRECT_RECLAIM -> direct claim is not allowed.
> > > +        * !__GFP_KSWAPD_RECLAIM -> it's not safe to wake up kswapd.
> > > +        * All GFP_* flags including GFP_NOWAIT use one or both flags.
> > > +        * try_alloc_pages() is the only API that doesn't specify either flag.
> >
> > I wouldn't be surprised if we had other allocations like that. git grep
> > is generally not very helpful as many/most allocations use gfp argument
> > of a sort. I would slightly reword this to be more explicit.
> >           /*
> >            * This is stronger than GFP_NOWAIT or GFP_ATOMIC because
> >            * those are guaranteed to never block on a sleeping lock.
> >            * Here we are enforcing that the allaaction doesn't ever spin
> >            * on any locks (i.e. only trylocks). There is no highlevel
> >            * GFP_$FOO flag for this use try_alloc_pages as the
> >            * regular page allocator doesn't fully support this
> >            * allocation mode.
> 
> Makes sense. I like this new wording. Will incorporate.
> 
> > > +        */
> > > +       return !(gfp_flags & __GFP_RECLAIM);
> > > +}
> > > +
> > >  #ifdef CONFIG_HIGHMEM
> > >  #define OPT_ZONE_HIGHMEM ZONE_HIGHMEM
> > >  #else
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index f168d223375f..545d345c22de 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -1768,7 +1768,7 @@ static bool consume_stock(struct mem_cgroup
> > > *memcg, unsigned int nr_pages,
> > >                 return ret;
> > >
> > >         if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> > > -               if (gfp_mask & __GFP_TRYLOCK)
> > > +               if (!gfpflags_allow_spinning(gfp_mask))
> > >                         return ret;
> > >                 local_lock_irqsave(&memcg_stock.stock_lock, flags);
> > >         }
> > >
> > > If that's acceptable then such an approach will work for
> > > my slub.c reentrance changes too.
> >
> > It certainly is acceptable for me.
> 
> Great.
> 
> > Do not forget to add another hunk to
> > avoid charging the full batch in this case.
> 
> Well. It looks like you spotted the existing bug ?
> 
> Instead of
> +       if (!gfpflags_allow_blockingk(gfp_mask))
> +               batch = nr_pages;
> 
> it should be unconditional:
> 
> +               batch = nr_pages;
> 
> after consume_stock() returns false.
> 
> Consider:
>         stock_pages = READ_ONCE(stock->nr_pages);
>         if (memcg == READ_ONCE(stock->cached) && stock_pages >= nr_pages) {
> 
> stock_pages == 10
> nr_pages == 20
> 
> so after consume_stock() returns false
> the batch will stay == MEMCG_CHARGE_BATCH == 64
> and
> page_counter_try_charge(&memcg->memsw, batch,...
> 
> will charge too much ?
> 
> and the bug was there for a long time.
> 
> Johaness,
> 
> looks like it's mostly your code?
> 
> Pls help us out.

I think the code is fine as the overcharge amount will be refilled into
the stock (old one will be flushed). 

	if (gfpflags_allow_spinning(gfp_mask))
		batch = nr_pages;

The above code will just avoid the refill and flushing the older stock.
Maybe Michal's suggestion is due to that reason.

BTW after the done_restock tag in try_charge_memcg(), we will another
gfpflags_allow_spinning() check to avoid schedule_work() and
mem_cgroup_handle_over_high(). Maybe simply return early for
gfpflags_allow_spinning() without checking high marks.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux