On Tue, Apr 24, 2018 at 7:02 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Tue 24-04-18 12:48:50, Chunyu Hu wrote:
>>
>>
>> ----- Original Message -----
>> > From: "Michal Hocko" <mhocko@xxxxxxxxxx>
>> > To: "Chunyu Hu" <chuhu.ncepu@xxxxxxxxx>
>> > Cc: "Dmitry Vyukov" <dvyukov@xxxxxxxxxx>, "Catalin Marinas" <catalin.marinas@xxxxxxx>, "Chunyu Hu"
>> > <chuhu@xxxxxxxxxx>, "LKML" <linux-kernel@xxxxxxxxxxxxxxx>, "Linux-MM" <linux-mm@xxxxxxxxx>
>> > Sent: Tuesday, April 24, 2018 9:20:57 PM
>> > Subject: Re: [RFC] mm: kmemleak: replace __GFP_NOFAIL to GFP_NOWAIT in gfp_kmemleak_mask
>> >
>> > On Mon 23-04-18 12:17:32, Chunyu Hu wrote:
>> > [...]
>> > > So if there is a new flag, it would be the 25th bits.
>> >
>> > No new flags please. Can you simply store a simple bool into fail_page_alloc
>> > and have save/restore api for that?
>>
>> Hi Michal,
>>
>> I still don't get your point. The original NOFAIL added in kmemleak was
>> for skipping fault injection in page/slab allocation for kmemleak object,
>> since kmemleak will disable itself until next reboot, whenever it hit an
>> allocation failure, in that case, it will lose effect to check kmemleak
>> in error path rose by fault injection. But NOFAIL's effect is more than
>> skipping fault injection, it's also for hard allocation. So a dedicated flag
>> for skipping fault injection in specified slab/page allocation was mentioned.
>
> I am not familiar with the kmemleak all that much, but fiddling with the
> gfp_mask is a wrong way to achieve kmemleak specific action. I might be

I would say this is more of a slab fault injection-specific action. It
can be used by other debugging facilities as well. Slab fault injection
is a part of slab, and slab behavior is generally controlled with
gfp_mask.

> easily wrong but I do not see any code that would restore the original
> gfp_mask down the kmem_cache_alloc path.
>
>> d9570ee3bd1d ("kmemleak: allow to coexist with fault injection")
>>
>> Do you mean something like below, with the save/store api? But looks like
>> to make it possible to skip a specified allocation, not global disabling,
>> a bool is not enough, and a gfp_flag is also needed. Maybe I missed something?
>
> Yes, this is essentially what I meant. It is still a global thing which
> is not all that great and if it matters then you can make it per
> task_struct.

That really depends on the code flow here. If we go this route, it
definitely needs to be per task, and it also needs to work with
interrupts: the state has to be switched on interrupt entry and must
not be corrupted by interrupts. A gfp flag is free of these problems.
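
To make the difference concrete, here is a rough userspace sketch of the
two options. It is not kernel code and not a proposed patch. Every name
in it (sketch_task, sketch_fi_skip_save, SKETCH_GFP_NOFAULTINJECT, and
so on) is made up for illustration, and the interrupt handling reflects
only my reading of the requirement above: an allocation made from
interrupt context should not inherit the interrupted task's skip state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of task_struct. */
struct sketch_task {
	bool skip_fault_injection;
};

/* Hypothetical stand-in for the kernel's "current" task pointer. */
static struct sketch_task sketch_init_task;
static struct sketch_task *sketch_current = &sketch_init_task;

/*
 * Option A: ambient per-task state with a save/restore pair.
 * Returning the old value lets callers nest sections safely.
 */
bool sketch_fi_skip_save(void)
{
	bool old = sketch_current->skip_fault_injection;

	sketch_current->skip_fault_injection = true;
	return old;
}

void sketch_fi_skip_restore(bool old)
{
	/* A restore only makes sense while a skip section is active. */
	assert(sketch_current->skip_fault_injection);
	sketch_current->skip_fault_injection = old;
}

/*
 * Option A decision: an interrupt handler runs on top of the
 * interrupted task, so it must ignore that task's skip state.
 */
bool sketch_fi_eligible_task(bool in_interrupt)
{
	if (in_interrupt)
		return true;
	return !sketch_current->skip_fault_injection;
}

/*
 * Option B: the request travels with the allocation itself.
 * SKETCH_GFP_NOFAULTINJECT is a hypothetical bit, not a real gfp flag.
 */
#define SKETCH_GFP_NOFAULTINJECT 0x1u

bool sketch_fi_eligible_gfp(unsigned int gfp_mask)
{
	return !(gfp_mask & SKETCH_GFP_NOFAULTINJECT);
}
```

With option A, every caller has to pair save and restore correctly and
the check has to know about the execution context. With option B, the
decision depends only on the mask passed to that one allocation, so
there is no state to switch or corrupt.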