On 1/15/25 03:17, Alexei Starovoitov wrote: > From: Alexei Starovoitov <ast@xxxxxxxxxx> > > Teach memcg to operate under trylock conditions when spinning locks > cannot be used. > > local_trylock might fail and this would lead to charge cache bypass if > the calling context doesn't allow spinning (gfpflags_allow_spinning). > In those cases charge the memcg counter directly and fail early if > that is not possible. This might cause a pre-mature charge failing > but it will allow an opportunistic charging that is safe from > try_alloc_pages path. > > Acked-by: Michal Hocko <mhocko@xxxxxxxx> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx> Acked-by: Vlastimil Babka <vbabka@xxxxxxx> > --- > mm/memcontrol.c | 24 ++++++++++++++++++++---- > 1 file changed, 20 insertions(+), 4 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 7b3503d12aaf..e4c7049465e0 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -1756,7 +1756,8 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, > * > * returns true if successful, false otherwise. > */ > -static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) > +static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages, > + gfp_t gfp_mask) > { > struct memcg_stock_pcp *stock; > unsigned int stock_pages; > @@ -1766,7 +1767,11 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) > if (nr_pages > MEMCG_CHARGE_BATCH) > return ret; > > - local_lock_irqsave(&memcg_stock.stock_lock, flags); > + if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) { > + if (!gfpflags_allow_spinning(gfp_mask)) > + return ret; > + local_lock_irqsave(&memcg_stock.stock_lock, flags); The last line can practially only happen on RT, right? On non-RT irqsave means we could only fail the trylock from a nmi and then we should have gfp_flags that don't allow spinning. So suppose we used local_trylock(), local_lock() and local_unlock() (no _irqsave) instead, as I mentioned in reply to 3/7. The RT implementation would be AFAICS the same. On !RT the trylock could now fail from a IRQ context in addition to NMI context, but that should also have a gfp_mask that does not allow spinning, so it should work fine. It would however mean converting all users of the lock, i.e. also consume_obj_stock() etc., but AFAIU that will be necessary anyway to have opportunistic slab allocations? > + } > > stock = this_cpu_ptr(&memcg_stock); > stock_pages = READ_ONCE(stock->nr_pages); > @@ -1851,7 +1856,14 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) > { > unsigned long flags; > > - local_lock_irqsave(&memcg_stock.stock_lock, flags); > + if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) { > + /* > + * In case of unlikely failure to lock percpu stock_lock > + * uncharge memcg directly. > + */ > + mem_cgroup_cancel_charge(memcg, nr_pages); > + return; > + } > __refill_stock(memcg, nr_pages); > local_unlock_irqrestore(&memcg_stock.stock_lock, flags); > } > @@ -2196,9 +2208,13 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask, > unsigned long pflags; > > retry: > - if (consume_stock(memcg, nr_pages)) > + if (consume_stock(memcg, nr_pages, gfp_mask)) > return 0; > > + if (!gfpflags_allow_spinning(gfp_mask)) > + /* Avoid the refill and flush of the older stock */ > + batch = nr_pages; > + > if (!do_memsw_account() || > page_counter_try_charge(&memcg->memsw, batch, &counter)) { > if (page_counter_try_charge(&memcg->memory, batch, &counter))