On Mon, Sep 2, 2024 at 5:09 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 02-09-24 17:01:12, Yafang Shao wrote:
> > > I really do not see why GFP_NOFAIL should be any special in this
> > > specific case.
> >
> > I believe there's no way to stop it from looping, even if you
> > implement a sophisticated user space OOM killer. ;)
>
> User space OOM killer should be helping to replenish a free memory and
> we have some heuristics to help NOFAIL users out with some portion of
> memory reserves already IIRC. So we do already give them some special
> treatment in the page allocator path. Not so much in the reclaim path.

When setting GFP_NOFAIL, it's important to not only enable direct
reclaim but also the OOM killer. In scenarios where swap is off and
there is minimal page cache, setting GFP_NOFAIL without __GFP_FS can
result in an infinite loop. In other words, GFP_NOFAIL should not be
used with GFP_NOFS. Unfortunately, many call sites do combine them.
For example:

XFS:

fs/xfs/libxfs/xfs_exchmaps.c: GFP_NOFS | __GFP_NOFAIL
fs/xfs/xfs_attr_item.c: GFP_NOFS | __GFP_NOFAIL

EXT4:

fs/ext4/mballoc.c: GFP_NOFS | __GFP_NOFAIL
fs/ext4/extents.c: GFP_NOFS | __GFP_NOFAIL

This seems problematic, but I'm not an FS expert. Perhaps Dave or Ted
could provide further insight.

--
Regards
Yafang
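
P.S. To make the looping scenario above concrete, here is a toy
user-space C model. It is only a sketch under stated assumptions, not
the page allocator's actual code: the TOY_* flag values, struct
toy_system and toy_alloc() are all hypothetical names invented for this
illustration, and the retry cap stands in for what would otherwise be
an unbounded loop. The model assumes the OOM killer is only reachable
when the FS bit is set, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this toy model; NOT the kernel's values. */
#define TOY_DIRECT_RECLAIM 0x1u
#define TOY_IO             0x2u
#define TOY_FS             0x4u
#define TOY_NOFAIL         0x8u

#define TOY_GFP_NOFS   (TOY_DIRECT_RECLAIM | TOY_IO)
#define TOY_GFP_KERNEL (TOY_GFP_NOFS | TOY_FS)

enum toy_result { TOY_OK, TOY_FAILED, TOY_STUCK };

/* Simplified view of the machine: no swap is modelled at all. */
struct toy_system {
	int free_pages;        /* pages available right now */
	int reclaimable_pages; /* page cache direct reclaim could free */
	int killable_pages;    /* memory an OOM kill would release */
};

/*
 * Try to allocate one page. max_retries bounds the loop so the model
 * terminates; TOY_STUCK means "would have kept looping".
 */
static enum toy_result toy_alloc(struct toy_system *sys, unsigned int flags,
				 int max_retries)
{
	assert(sys != NULL);

	for (int tries = 0; tries < max_retries; tries++) {
		if (sys->free_pages > 0) {
			sys->free_pages--;
			return TOY_OK;
		}

		/* Direct reclaim only helps if there is page cache left. */
		if ((flags & TOY_DIRECT_RECLAIM) && sys->reclaimable_pages > 0) {
			sys->reclaimable_pages--;
			sys->free_pages++;
			continue;
		}

		/* Assumption: the OOM killer is skipped without the FS bit. */
		if ((flags & TOY_FS) && sys->killable_pages > 0) {
			sys->free_pages += sys->killable_pages;
			sys->killable_pages = 0;
			continue;
		}

		/* No progress was possible: fail unless NOFAIL is set. */
		if (!(flags & TOY_NOFAIL))
			return TOY_FAILED;
		/* NOFAIL: retry even though nothing has changed. */
	}
	return TOY_STUCK;
}
```

With a system of { 0, 0, 8 } (no free pages, no page cache, 8 pages
held by a killable task), TOY_GFP_NOFS | TOY_NOFAIL ends in TOY_STUCK,
TOY_GFP_KERNEL | TOY_NOFAIL ends in TOY_OK after the modelled OOM kill,
and plain TOY_GFP_NOFS ends in TOY_FAILED.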