Re: [PATCH] kmemleak: don't use __GFP_NOFAIL

Chunyu Hu <chuhu@xxxxxxxxxx> · Fri, 1 Jun 2018 00:53:43 -0400 (EDT)

----- Original Message -----
> From: "Chunyu Hu" <chuhu@xxxxxxxxxx>
> To: "Catalin Marinas" <catalin.marinas@xxxxxxx>
> Cc: "Michal Hocko" <mhocko@xxxxxxxx>, "Tetsuo Handa" <penguin-kernel@xxxxxxxxxxxxxxxxxxx>, malat@xxxxxxxxxx,
> dvyukov@xxxxxxxxxx, linux-mm@xxxxxxxxx, "Akinobu Mita" <akinobu.mita@xxxxxxxxx>
> Sent: Friday, June 1, 2018 9:50:20 AM
> Subject: Re: [PATCH] kmemleak: don't use __GFP_NOFAIL
> 
> 
> 
> ----- Original Message -----
> > From: "Michal Hocko" <mhocko@xxxxxxxx>
> > To: "Catalin Marinas" <catalin.marinas@xxxxxxx>
> > Cc: "Chunyu Hu" <chuhu@xxxxxxxxxx>, "Tetsuo Handa"
> > <penguin-kernel@xxxxxxxxxxxxxxxxxxx>, malat@xxxxxxxxxx,
> > dvyukov@xxxxxxxxxx, linux-mm@xxxxxxxxx, "Akinobu Mita"
> > <akinobu.mita@xxxxxxxxx>
> > Sent: Friday, June 1, 2018 2:41:04 AM
> > Subject: Re: [PATCH] kmemleak: don't use __GFP_NOFAIL
> > 
> > On Thu 31-05-18 16:22:26, Catalin Marinas wrote:
> > > Hi Michal,
> > > 
> > > I'm catching up with this thread.
> > > 
> > > On Wed, May 30, 2018 at 02:38:26PM +0200, Michal Hocko wrote:
> > > > On Wed 30-05-18 07:42:59, Chunyu Hu wrote:
> > > > > From: "Michal Hocko" <mhocko@xxxxxxxx>
> > > > > > want to have a pre-populated pool of objects for those requests.
> > > > > > The
> > > > > > obvious question is how to balance such a pool. It ain't easy to
> > > > > > track
> > > > > > memory by allocating more memory...
> > > > > 
> > > > > This solution is going to make kmemleak trace really nofail. We can
> > > > > think
> > > > > later.
> > > > > 
> > > > > while I'm thinking about if fault inject can be disabled via flag in
> > > > > task.
> > > > > 
> > > > > Actually, I'm doing something like below, the disable_fault_inject()
> > > > > is
> > > > > just setting a flag in task->make_it_fail. But this will depend on if
> > > > > fault injection accept a change like this. CCing Akinobu
> > > > 
> > > > You still seem to be missing my point I am afraid (or I am ;). So say
> > > > that you want to track a GFP_NOWAIT allocation request. So
> > > > create_object
> > > > will get called with that gfp mask and no matter what you try here your
> > > > tracking object will be allocated in a weak allocation context as well
> > > > and disable kmemleak. So it only takes a more heavy memory pressure and
> > > > the tracing is gone...
> > > 
> > > create_object() indeed gets the originating gfp but it only cares
> > > whether it was GFP_KERNEL or GFP_ATOMIC. gfp_kmemleak_mask() masks out
> > > all the other flags when allocating a struct kmemleak_object (and adds
> > > some of its own).
> > > 
> > > This has worked ok so far. There is a higher risk of GFP_ATOMIC
> > > allocations failing but I haven't seen issues with kmemleak unless you
> > > run it under heavy memory pressure (and kmemleak just disables itself).
> > > With fault injection, such pressure is simulated with the side effect of
> > > rendering kmemleak unusable.
> > > 
> > > Kmemleak could implement its own allocation mechanism (maybe even
> > > skipping the slab altogether) but I feel it's overkill just to cope with
> > > the specific case of fault injection. Also, it's not easy to figure out
> > > how to balance such pool and it may end up calling alloc_pages() which
> > > in turn can inject a fault.
> 
> 
> it would benefit kmemleak trace, I see in my test that kmemleak even can work
> in
> user pressure cases, such as in my test, stress-ng to consume
> nearly all the swap space. kmemleak is still working. but 1M objects pool
> is consuming around 400M + memory. So this is just a experiment try, as you
> said, how to balance it's size is the issue or ther issues has to be
> resolved,
> such as when to add pool, the speed, how big, and so on ...
> 
> And I fault injected 20000 times fail_page_alloc, and 2148 times happened
> in create_object, and in such a case, kmemleak is till working after the
> 2000+ calloc failures.
> 
> [root@dhcp-12-244 fail_page_alloc]# grep create_object /var/log/messages | wc
> -l
> 2148
> 
> [60498.299412] FAULT_INJECTION: forcing a failure.
> name fail_page_alloc, interval 0, probability 80, space 0, times 2
> 
> So this way is not just for fault injection, it's about making kmemleak
> a bit stronger under memory failure case. It would be an exciting experience
> we
> see if kmemleak still work even after mem pressure, as a user, I experienced
> the good usage.
> 
> 
> > > 
> > > Could we tweak gfp_kmemleak_mask() to still pass __GFP_NOFAIL but in a
> > > compatible way (e.g. by setting __GFP_DIRECT_RECLAIM) when fault
> > > injection is enabled?
> 
> Maybe I can have a try on this..

looks like I tried this in my first report thread, and it failed. As it
can sleep in irq() ..

https://marc.info/?l=linux-mm&m=152482400222806&w=2

> 
> > > 
> > > Otherwise, I'd prefer some per-thread temporary fault injection
> > > disabling.
> 
> I tried in make_it_fail flag, kmemleak can avoid fault injection, but I
> can see kmemleak diabled itself...
> 
> > 
> > Well, there are two issues (which boil down to the one in the end) here.
> > Fault injection or a GFP_NOWAIT or any other weaker than GFP_KERNEL
> > context is something to care about. A weaker allocation context can and
> > will lead to kmemleak meta data allocation failures regardless of the
> > fault injection. The way how those objects are allocated directly in the
> > allacator context makes this really hard and inherently subtle. I can
> > see only two ways around. Either you have a placeholder for "this object
> > is not tracked so do not throw false positives" or have a preallocated
> > pool to use if the direct context allocation failes for whatever reason.
> > Abusing __GFP_NOFAIL is simply a crude hack which will lead to all kind
> > of problems.
> > --
> > Michal Hocko
> > SUSE Labs
> > 
> 
> --
> Regards,
> Chunyu Hu
> 
> 

-- 
Regards,
Chunyu Hu