On Fri 29-03-19 16:16:38, Catalin Marinas wrote:
> On Fri, Mar 29, 2019 at 01:02:37PM +0100, Michal Hocko wrote:
> > On Thu 28-03-19 14:59:17, Catalin Marinas wrote:
> > [...]
> > > >From 09eba8f0235eb16409931e6aad77a45a12bedc82 Mon Sep 17 00:00:00 2001
> > > From: Catalin Marinas <catalin.marinas@xxxxxxx>
> > > Date: Thu, 28 Mar 2019 13:26:07 +0000
> > > Subject: [PATCH] mm: kmemleak: Use mempool allocations for kmemleak objects
> > >
> > > This patch adds mempool allocations for struct kmemleak_object and
> > > kmemleak_scan_area as slightly more resilient than kmem_cache_alloc()
> > > under memory pressure. The patch also masks out all the gfp flags passed
> > > to kmemleak other than GFP_KERNEL|GFP_ATOMIC.
> >
> > Using mempool allocator is better than inventing its own implementation
> > but there is one thing to be slightly careful/worried about.
> >
> > This allocator expects that somebody will refill the pool in a finite
> > time. Most users are OK with that because objects in flight are going
> > to return in the pool in a relatively short time (think of an IO) but
> > kmemleak is not guaranteed to comply with that AFAIU. Sure ephemeral
> > allocations are happening all the time so there should be some churn
> > in the pool all the time but if we go to an extreme where there is a
> > serious memory leak then I suspect we might get stuck here without any
> > way forward. Page/slab allocator would eventually back off even though
> > small allocations never fail because a user context would get killed
> > sooner or later but there is no fatal_signal_pending backoff in the
> > mempool alloc path.
>
> We could improve the mempool code slightly to refill itself (from some
> workqueue or during a mempool_alloc() which allows blocking) but it's
> really just a best effort for a debug tool under OOM conditions. It may
> be sufficient just to make the mempool size tunable (via
> /sys/kernel/debug/kmemleak).
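
To make the concern quoted above concrete, here is a rough userspace
sketch of the mempool pattern (untested against any kernel tree). It is
a toy model, not the kernel's mempool code: every toy_* name, the
backing_fails switch and TOY_POOL_MIN are made up for illustration. It
only shows the order of operations: try the backing allocator, fall
back to the reserve, and mark the spot where a sleepable caller would
wait for somebody to free an element.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy model of a mempool; NOT the kernel implementation. */
#define TOY_POOL_MIN 2

struct toy_pool {
	void *reserve[TOY_POOL_MIN];	/* preallocated emergency elements */
	int curr_nr;			/* elements currently in the reserve */
	bool backing_fails;		/* simulate the backing allocator failing */
	size_t obj_size;
};

/* Stand-in for the backing (slab) allocator. */
static void *toy_backing_alloc(struct toy_pool *p)
{
	return p->backing_fails ? NULL : malloc(p->obj_size);
}

/* Fill the reserve up front, while memory is still available. */
static bool toy_pool_init(struct toy_pool *p, size_t obj_size)
{
	p->curr_nr = 0;
	p->backing_fails = false;
	p->obj_size = obj_size;
	while (p->curr_nr < TOY_POOL_MIN) {
		void *obj = malloc(obj_size);

		if (!obj) {
			/* Undo the partial fill before reporting failure. */
			while (p->curr_nr > 0)
				free(p->reserve[--p->curr_nr]);
			return false;
		}
		p->reserve[p->curr_nr++] = obj;
	}
	return true;
}

static void *toy_pool_alloc(struct toy_pool *p)
{
	void *obj = toy_backing_alloc(p);

	if (obj)
		return obj;
	if (p->curr_nr > 0)
		return p->reserve[--p->curr_nr];
	/*
	 * Reserve exhausted. This toy returns NULL; a sleepable mempool
	 * allocation would instead wait here until an element is freed.
	 */
	return NULL;
}

static void toy_pool_free(struct toy_pool *p, void *obj)
{
	if (p->curr_nr < TOY_POOL_MIN)
		p->reserve[p->curr_nr++] = obj;	/* refill the reserve first */
	else
		free(obj);
}
```

If objects are leaked rather than handed back to toy_pool_free(), the
reserve is never refilled and the last branch of toy_pool_alloc() is
reached for good, which is the "stuck without any way forward" case.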

The point I've tried to make is that you really have to fail at some
point, but mempool is fundamentally about never failing as long as the
allocation is sleepable. And we cannot really break that assumption
because existing users depend on it. But as I've said, I would try it
out and see. This is just a debugging feature, and I assume that a
really fatal OOM caused by a real memory leak would be detected before
the whole thing blows up.
-- 
Michal Hocko
SUSE Labs