On 4/30/24 1:49 AM, Dave Chinner wrote:
> On Mon, Apr 29, 2024 at 09:59:43AM +0200, Vlastimil Babka wrote:
>> On 4/29/24 7:47 AM, Christoph Hellwig wrote:
>> > This loses flags like GFP_NOFS and GFP_NOIO that are important to avoid
>> > deadlocks as well as GFP_NOLOCKDEP that otherwise generates lockdep false
>> > positives.
>>
>> GFP_NOFS and GFP_NOIO translate to GFP_KERNEL without __GFP_FS/__GFP_IO so I
>> don't see how this patch would have helped with those.
>> __GFP_NOLOCKDEP is likely the actual issue and stackdepot solved it like this:
>>
>> https://lore.kernel.org/linux-xfs/20240418141133.22950-1-ryabinin.a.a@xxxxxxxxx/
>>
>> So we could just do the same here.
>
> Yes, it is __GFP_NOLOCKDEP that is the issue here, but
> cargo-cult-copying of that stackdepot fix is just whack-a-mole bug
> fixing without addressing the technical debt that got us here in the
> first place. Has anyone else bothered to look to see if kmemleak has
> the same problem?

Looks like you did :)

> If anyone bothered to do an audit, they would see that
> gfp_kmemleak_mask() handles the reclaim context masks correctly.
> Further, it adds NOWARN, NOMEMALLOC and
> NORETRY, which means the debug code is silent when it fails, it
> doesn't deplete emergency reserves and doesn't bog down retrying
> forever when there are sustained low memory situations.

So we do have NOWARN here. __GFP_RETRY_MAYFAIL might have been slightly
better than __GFP_NOWARN wrt "not retrying forever" but also not giving up
too soon. If we want to be really careful about reserves, it's a question
whether to keep the | GFP_ATOMIC which translates to leaving __GFP_HIGH.
OTOH if we don't keep it, these allocations might fail too easily from
an atomic context and we could miss the debugging data.

> This also points out that the page-owner/stackdepot code that strips
> GFP_ZONEMASK is completely redundant. Doing:
>
> 	gfp_flags &= GFP_KERNEL|GFP_ATOMIC|__GFP_NOLOCKDEP;
>
> strips everything but __GFP_RECLAIM, __GFP_FS, __GFP_IO,
> __GFP_HIGH and __GFP_NOLOCKDEP. This already strips the zonemask
> info, so there's no need to do it explicitly.

True.

> IOWs, the right way to fix this set of problems is to lift
> gfp_kmemleak_mask() to include/linux/gfp.h and then use it across
> all these nested allocations that occur behind the public
> memory allocation API.

Agree. But arguably these quick fixes adding __GFP_NOLOCKDEP were
appropriate for the late rc phase we're in.

> I've got a patchset under test at the moment that does this....

Great! Thanks.

> -Dave.
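
For anyone following along, the masking arithmetic discussed above can be
checked with a small user-space sketch. The bit values below are made up
for illustration and are NOT the kernel's (those depend on version and
config). Only the composite definitions (GFP_KERNEL, GFP_ATOMIC, GFP_NOFS,
GFP_NOIO, GFP_ZONEMASK) are meant to mirror the kernel's. The helper names
nested_mask(), kmemleak_style_mask() and gfp_sketch_selftest() are
hypothetical. The second helper only approximates what the thread says
gfp_kmemleak_mask() does, it is not a copy of it.

```c
#include <assert.h>

/* Kernel names reused for readability; bit positions are illustrative only. */
typedef unsigned int gfp_t;

#define __GFP_DMA            0x0001u	/* zone modifiers */
#define __GFP_HIGHMEM        0x0002u
#define __GFP_DMA32          0x0004u
#define __GFP_MOVABLE        0x0008u
#define __GFP_HIGH           0x0010u
#define __GFP_IO             0x0020u
#define __GFP_FS             0x0040u
#define __GFP_DIRECT_RECLAIM 0x0080u
#define __GFP_KSWAPD_RECLAIM 0x0100u
#define __GFP_NOWARN         0x0200u
#define __GFP_NORETRY        0x0400u
#define __GFP_NOMEMALLOC     0x0800u
#define __GFP_NOLOCKDEP      0x1000u
#define __GFP_ZERO           0x2000u

#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_ZONEMASK  (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32 | __GFP_MOVABLE)
#define GFP_ATOMIC    (__GFP_HIGH | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)
#define GFP_NOIO      (__GFP_RECLAIM)

/* The filter quoted in the thread: keep reclaim/FS/IO/HIGH/NOLOCKDEP only. */
static inline gfp_t nested_mask(gfp_t gfp)
{
	return gfp & (GFP_KERNEL | GFP_ATOMIC | __GFP_NOLOCKDEP);
}

/* Same filter, plus the NOWARN/NOMEMALLOC/NORETRY bits the thread mentions. */
static inline gfp_t kmemleak_style_mask(gfp_t gfp)
{
	return nested_mask(gfp) |
	       __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN;
}

static inline void gfp_sketch_selftest(void)
{
	gfp_t in = GFP_NOFS | __GFP_NOLOCKDEP | __GFP_DMA32 | __GFP_ZERO;

	/* Zone bits and unrelated flags are dropped without an explicit ~GFP_ZONEMASK. */
	assert((nested_mask(in) & GFP_ZONEMASK) == 0);
	/* The caller's NOFS context and NOLOCKDEP annotation survive. */
	assert(nested_mask(in) == (GFP_NOFS | __GFP_NOLOCKDEP));
	/* __GFP_FS is never added back for a NOFS caller. */
	assert((kmemleak_style_mask(in) & __GFP_FS) == 0);
}
```

Calling gfp_sketch_selftest() from a test main() exercises all three
properties: the zonemask is already gone after the AND, the caller's
reclaim context is inherited, and the extra debug-friendly bits are only
ever ORed in on top.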