On Tue 16-07-19 16:28:21, Qian Cai wrote:
> On Tue, 2019-07-16 at 22:07 +0200, Michal Hocko wrote:
> > On Tue 16-07-19 15:21:17, Qian Cai wrote:
> > [...]
> > > Thanks to this commit, there are allocation with __GFP_DIRECT_RECLAIM that
> > > succeeded would keep trying with __GFP_NOFAIL for kmemleak tracking object
> > > allocations.
> >
> > Well, not really. Because low order allocations with
> > __GFP_DIRECT_RECLAIM basically never fail (they keep retrying) even
> > without GFP_NOFAIL because that flag is actually to guarantee no
> > failure. And for high order allocations the nofail mode is actively
> > harmful. It completely changes the behavior of a system. A light costly
> > order workload could put the system on knees and completely change the
> > behavior. I am not really convinced this is a good behavior of a
> > debugging feature TBH.
>
> While I agree your general observation about GFP_NOFAIL, I am afraid the
> discussion here is about "struct kmemleak_object" slab cache from a single call
> site create_object().

OK, this makes it less harmful because the order aspect doesn't really
apply here. But it still stretches the NOFAIL semantics a lot. kmemleak
essentially asks for NORETRY | NOFAIL, which for sleeping allocations
means "never invoke the oom killer, but retry forever". This can still
lead to unexpected side effects. Just consider a call site that holds
locks and now cannot make any forward progress unless somebody else
hits the oom killer, for example.

As noted in the other email, I would simply drop the NORETRY flag as
well and live with the fact that the oom killer can be invoked. It still
wouldn't solve the NOWAIT contexts, but those need a proper solution
anyway.
-- 
Michal Hocko
SUSE Labs
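
The flag combinations discussed above can be sketched as a small userspace model. This is only an illustration of the semantics as described in this mail, not kernel code: the `SK_*` bits, `struct sk_behavior` and `sk_classify()` are hypothetical names invented for the sketch, and the bit values are not the real `__GFP_*` values. Cases the mail does not discuss are left conservative (the allocation is assumed able to fail).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in bits, NOT the kernel's real __GFP_* values. */
#define SK_DIRECT_RECLAIM 0x1u /* caller may sleep and reclaim */
#define SK_NORETRY        0x2u /* give up early, never invoke the oom killer */
#define SK_NOFAIL         0x4u /* allocation must not return failure */

struct sk_behavior {
	bool can_fail;        /* allocation may return NULL */
	bool may_invoke_oom;  /* the oom killer may be triggered */
	bool retries_forever; /* the caller can block indefinitely */
};

/* Classify a flag combination according to the semantics in the mail. */
static struct sk_behavior sk_classify(unsigned int flags)
{
	struct sk_behavior b = {
		.can_fail = true,
		.may_invoke_oom = false,
		.retries_forever = false,
	};

	/* NOWAIT context: cannot sleep, so neither variant helps here. */
	if (!(flags & SK_DIRECT_RECLAIM))
		return b;

	/* Without NORETRY the oom killer is allowed to run. */
	b.may_invoke_oom = !(flags & SK_NORETRY);

	if (flags & SK_NOFAIL) {
		b.can_fail = false;
		b.retries_forever = true;
	}
	return b;
}

/* Usage example: the current request vs. the one with NORETRY dropped. */
static void sk_check_examples(void)
{
	struct sk_behavior cur =
		sk_classify(SK_DIRECT_RECLAIM | SK_NORETRY | SK_NOFAIL);
	struct sk_behavior proposed =
		sk_classify(SK_DIRECT_RECLAIM | SK_NOFAIL);

	/* Current: retries forever with nobody allowed to free memory. */
	assert(cur.retries_forever && !cur.may_invoke_oom);
	/* Proposed: still retries, but the oom killer can make progress. */
	assert(proposed.retries_forever && proposed.may_invoke_oom);
}
```

The difference between the two cases in `sk_check_examples()` is exactly the `may_invoke_oom` field: with NORETRY set, a sleeping NOFAIL caller holding locks depends on some other context freeing memory.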