Re: [PATCH] Revert "kmemleak: allow to coexist with fault injection"

Qian Cai <cai@xxxxxx> · Tue, 16 Jul 2019 16:28:21 -0400

On Tue, 2019-07-16 at 22:07 +0200, Michal Hocko wrote:
> On Tue 16-07-19 15:21:17, Qian Cai wrote:
> [...]
> > Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
> > succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
> > allocations.
> 
> Well, not really. Because low order allocations with
> __GFP_DIRECT_RECLAIM basically never fail (they keep retrying) even
> without GFP_NOFAIL because that flag is actually to guarantee no
> failure. And for high order allocations the nofail mode is actively
> harmful. It completely changes the behavior of a system. A light costly
> order workload could put the system on knees and completely change the
> behavior. I am not really convinced this is a good behavior of a
> debugging feature TBH.

While I agree with your general observation about GFP_NOFAIL, I am afraid the
discussion here is only about the "struct kmemleak_object" slab cache, which is
allocated from a single call site, create_object().
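
To make the scope concrete, here is a rough userspace sketch of the flag
handling under discussion, as I described it above: a tracking-object
allocation whose caller allows direct reclaim gets the no-fail flag added.
The SK_* bit values and the helper name sketch_kmemleak_mask() are made up for
illustration and are not the kernel's definitions. Leaving the mask untouched
for callers that cannot reclaim is also an assumption of this sketch.

```c
/* Hypothetical bit values, for illustration only (not the kernel's). */
#define SK_GFP_DIRECT_RECLAIM 0x1u
#define SK_GFP_NOFAIL         0x2u

/*
 * Hypothetical helper: derive the mask used for one tracking-object
 * allocation from the mask of the allocation being tracked.
 */
static unsigned int sketch_kmemleak_mask(unsigned int gfp)
{
	/* Callers that may enter direct reclaim are asked to keep retrying. */
	if (gfp & SK_GFP_DIRECT_RECLAIM)
		return gfp | SK_GFP_NOFAIL;

	/* Assumption: everything else is passed through unchanged. */
	return gfp;
}
```

The point is only that the flag is applied to this one small, fixed-size
object type and not to arbitrary caller allocations.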

> 
> > Otherwise, one kmemleak object allocation failure would kill the
> > whole kmemleak.
> 
> Which is not great but quite likely a better than an unpredictable MM
> behavior caused by NOFAIL storms. Really, this NOFAIL patch is a
> completely broken behavior. There shouldn't be much discussion about
> reverting it. I would even argue it shouldn't have been merged in the
> first place. It doesn't have any acks nor reviewed-bys while it abuses
> __GFP_NOFAIL which is generally discouraged to be used.

Again, it seems you are talking about GFP_NOFAIL in general. I don't really see
much unpredictable MM behavior that would disrupt testing or generate
false-positive bug reports when "struct kmemleak_object" allocations use
GFP_NOFAIL, apart from some warnings. All I see is that keeping kmemleak alive
helps find real memory leaks.