Re: [PATCH] kmemleak: don't use __GFP_NOFAIL

Michal Hocko <mhocko@xxxxxxxx> · Thu, 31 May 2018 20:41:04 +0200



On Thu 31-05-18 16:22:26, Catalin Marinas wrote:
> Hi Michal,
> 
> I'm catching up with this thread.
> 
> On Wed, May 30, 2018 at 02:38:26PM +0200, Michal Hocko wrote:
> > On Wed 30-05-18 07:42:59, Chunyu Hu wrote:
> > > From: "Michal Hocko" <mhocko@xxxxxxxx>
> > > > want to have a pre-populated pool of objects for those requests. The
> > > > obvious question is how to balance such a pool. It ain't easy to track
> > > > memory by allocating more memory...
> > > 
> > > This solution is going to make kmemleak trace really nofail. We can think
> > > later.
> > > 
> > > while I'm thinking about if fault inject can be disabled via flag in task.
> > > 
> > > Actually, I'm doing something like below, the disable_fault_inject() is
> > > just setting a flag in task->make_it_fail. But this will depend on if
> > > fault injection accept a change like this. CCing Akinobu 
> > 
> > You still seem to be missing my point I am afraid (or I am ;). So say
> > that you want to track a GFP_NOWAIT allocation request. So create_object
> > will get called with that gfp mask and no matter what you try here your
> > tracking object will be allocated in a weak allocation context as well
> > and disable kmemleak. So it only takes a more heavy memory pressure and
> > the tracing is gone...
> 
> create_object() indeed gets the originating gfp but it only cares
> whether it was GFP_KERNEL or GFP_ATOMIC. gfp_kmemleak_mask() masks out
> all the other flags when allocating a struct kmemleak_object (and adds
> some of its own).
> 
> This has worked ok so far. There is a higher risk of GFP_ATOMIC
> allocations failing but I haven't seen issues with kmemleak unless you
> run it under heavy memory pressure (and kmemleak just disables itself).
> With fault injection, such pressure is simulated with the side effect of
> rendering kmemleak unusable.
> 
> Kmemleak could implement its own allocation mechanism (maybe even
> skipping the slab altogether) but I feel it's overkill just to cope with
> the specific case of fault injection. Also, it's not easy to figure out
> how to balance such pool and it may end up calling alloc_pages() which
> in turn can inject a fault.
> 
> Could we tweak gfp_kmemleak_mask() to still pass __GFP_NOFAIL but in a
> compatible way (e.g. by setting __GFP_DIRECT_RECLAIM) when fault
> injection is enabled?
> 
> Otherwise, I'd prefer some per-thread temporary fault injection
> disabling.

Well, there are two issues (which boil down to the one in the end) here.
Fault injection or a GFP_NOWAIT or any other weaker than GFP_KERNEL
context is something to care about. A weaker allocation context can and
will lead to kmemleak meta data allocation failures regardless of the
fault injection. The way how those objects are allocated directly in the
allacator context makes this really hard and inherently subtle. I can
see only two ways around. Either you have a placeholder for "this object
is not tracked so do not throw false positives" or have a preallocated
pool to use if the direct context allocation failes for whatever reason.
Abusing __GFP_NOFAIL is simply a crude hack which will lead to all kind
of problems.
-- 
Michal Hocko
SUSE Labs