On Wed, Nov 8, 2023 at 2:33 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Tue 07-11-23 10:05:24, Roman Gushchin wrote:
> > On Mon, Nov 06, 2023 at 06:57:05PM -0800, Christoph Lameter wrote:
> > > Right.. Well lets add the cgoup folks to this.
> >
> > Hello!
> >
> > I think it's the best thing we can do now. Thoughts?
> >
> > >From 5ed3e88f4f052b6ce8dbec0545dfc80eb7534a1a Mon Sep 17 00:00:00 2001
> > From: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> > Date: Tue, 7 Nov 2023 09:18:02 -0800
> > Subject: [PATCH] mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
> >
> > Objcg vectors attached to slab pages to store slab object ownership
> > information are allocated using gfp flags for the original slab
> > allocation. Depending on slab page order and the size of slab objects,
> > objcg vector can take several pages.
> >
> > If the original allocation was done with the __GFP_NOFAIL flag, it
> > triggered a warning in the page allocation code. Indeed, order > 1
> > pages should not been allocated with the __GFP_NOFAIL flag.
> >
> > Fix this by simple dropping the __GFP_NOFAIL flag when allocating
> > the objcg vector. It effectively allows to skip the accounting of a
> > single slab object under a heavy memory pressure.
>
> It would be really good to describe what happens if the memcg metadata
> allocation fails. AFAICS both callers of memcg_alloc_slab_cgroups -
> memcg_slab_post_alloc_hook and account_slab will simply skip the
> accounting which is rather curious but probably tolerable (does this
> allow to runaway from memcg limits). If that is intended then it should
> be documented so that new users do not get it wrong. We do not want to
> error ever propagate down to the allocator caller which doesn't expect
> it.

A memcg metadata allocation failure is similar to the situation we
used to have with per-memcg kmem caches for accounting slab memory:
the first allocation from a memcg triggered kmem cache creation and
was itself allowed to pass through.
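To make the "skip the accounting, never fail the caller" behaviour
concrete, here is a rough userspace sketch. It is not kernel code:
every acct_* name is hypothetical, calloc() stands in for the metadata
allocation, and acct_meta_alloc_should_fail is just a knob to simulate
memory pressure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a slab with an optional ownership vector. */
struct acct_slab {
	void **objcgs;   /* one owner slot per object, NULL until allocated */
	size_t nr_objs;  /* number of objects in the slab */
};

/* Test knob (assumption): simulate metadata allocation failure. */
static bool acct_meta_alloc_should_fail;

/* Placeholder owner; only its address is used. */
static int acct_owner_token;

/* Allocate the ownership vector; returns 0 on success, -1 on failure. */
static int acct_alloc_vec(struct acct_slab *s)
{
	assert(s->nr_objs > 0);
	if (acct_meta_alloc_should_fail)
		return -1;
	s->objcgs = calloc(s->nr_objs, sizeof(*s->objcgs));
	return s->objcgs ? 0 : -1;
}

/*
 * Post-allocation hook: record the owner of object idx.
 * Returns true if the object was accounted. A metadata failure is
 * not reported as an allocation error; the object simply stays
 * unaccounted.
 */
static bool acct_post_alloc_hook(struct acct_slab *s, size_t idx, void *owner)
{
	if (!s->objcgs && acct_alloc_vec(s) != 0)
		return false;  /* skip accounting, do not fail the caller */
	s->objcgs[idx] = owner;
	return true;
}
```

In this model the only visible effect of a metadata failure is an
object with no recorded owner, which is the part that would need to be
documented for new callers.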
>
> Btw. if the large allocation is really necessary, which hasn't been
> explained so far AFAIK, would vmalloc fallback be an option?
>

For this specific scenario, large allocation is kind of unexpected,
like a large (multi-order) slab having tiny objects. Roman, do you
know the slab settings where this failure occurs?

Anyways, I think kvmalloc is a better option. Most of the time we
should have order 0 allocation here and for weird settings we fallback
to vmalloc.
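
Roughly what I have in mind, as a userspace sketch rather than a
patch. All kv_* names and flag values are made up for illustration,
calloc() stands in for both the contiguous and the vmalloc path, and
kv_contig_fails_above_one_page is a knob that simulates a fragmented
system where multi-page contiguous allocations fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
#define KV_PAGE_SIZE   4096u
#define KV_GFP_NOFAIL  0x1u  /* models __GFP_NOFAIL */
#define KV_GFP_ACCOUNT 0x2u  /* models some other gfp bit */

enum kv_backend { KV_NONE, KV_CONTIG, KV_VMAP };

struct kv_vec {
	void *ptr;
	enum kv_backend backend;  /* which path satisfied the request */
	unsigned gfp_used;        /* flags after dropping NOFAIL */
};

/* Test knob (assumption): make multi-page contiguous requests fail. */
static int kv_contig_fails_above_one_page;

/* Models the physically contiguous (kmalloc-like) attempt. */
static void *kv_contig_alloc(size_t size)
{
	if (size > KV_PAGE_SIZE && kv_contig_fails_above_one_page)
		return NULL;
	return calloc(1, size);
}

/* Models the virtually contiguous (vmalloc-like) fallback. */
static void *kv_vmap_alloc(size_t size)
{
	return calloc(1, size);
}

/* Clear the NOFAIL bit and keep every other flag. */
static unsigned kv_drop_nofail(unsigned gfp)
{
	return gfp & ~KV_GFP_NOFAIL;
}

/* Allocate a vector of nr_objs pointers: contiguous first, then fallback. */
static struct kv_vec kv_alloc_objcg_vec(size_t nr_objs, unsigned gfp)
{
	struct kv_vec v = { NULL, KV_NONE, kv_drop_nofail(gfp) };
	size_t size;

	assert(nr_objs > 0);
	size = nr_objs * sizeof(void *);

	v.ptr = kv_contig_alloc(size);
	if (v.ptr) {
		v.backend = KV_CONTIG;
		return v;
	}
	v.ptr = kv_vmap_alloc(size);
	if (v.ptr)
		v.backend = KV_VMAP;
	return v;
}
```

With a small object count the vector fits in one page and the
contiguous path is taken; only the unusual many-tiny-objects case ends
up on the fallback path.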