On Mon, Aug 19, 2024 at 5:39 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> On Mon, Aug 19, 2024 at 9:25 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > On Mon, Aug 19, 2024 at 3:50 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Sun 18-08-24 10:55:09, Yafang Shao wrote:
> > > > On Sat, Aug 17, 2024 at 2:25 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
> > > > >
> > > > > From: Barry Song <v-songbaohua@xxxxxxxx>
> > > > >
> > > > > When users allocate memory with the __GFP_NOFAIL flag, they might
> > > > > incorrectly use it alongside GFP_ATOMIC, GFP_NOWAIT, etc. This kind of
> > > > > non-blockable __GFP_NOFAIL is not supported and is pointless. If we
> > > > > attempt and still fail to allocate memory for these users, we have two
> > > > > choices:
> > > > >
> > > > > 1. We could busy-loop and hope that some other direct reclamation or
> > > > > kswapd rescues the current process. However, this is unreliable
> > > > > and could ultimately lead to hard or soft lockups,
> > > >
> > > > That can occur even if we set both __GFP_NOFAIL and
> > > > __GFP_DIRECT_RECLAIM, right?
> > >
> > > No, it cannot! With __GFP_DIRECT_RECLAIM the allocator might take a long
> > > time to satisfy the allocation but it will reclaim to get the memory, it
> > > will sleep if necessary and it will trigger OOM killer if there is
> > > no other option. __GFP_DIRECT_RECLAIM is a completely different story
> > > than without it which means _no_sleeping_ is allowed and therefore only
> > > a busy loop waiting for the allocation to proceed is allowed.
> >
> > That could be a livelock.
> > From the user's perspective, there's no noticeable difference between
> > a livelock, soft lockup, or hard lockup.
>
> This is certainly different. A lockup occurs when tasks can't be scheduled,
> causing the entire system to stop functioning.

When a livelock occurs, your only options are to migrate your
applications to other servers or reboot the system; there’s no other
resolution (except for using oomd, which is difficult for users without
cgroup2 or swap). So, there's effectively no difference.

>
> >
> > >
> > > > So, I don't believe the issue is related
> > > > to setting __GFP_DIRECT_RECLAIM; rather, it stems from the flawed
> > > > design of __GFP_NOFAIL itself.
> > >
> > > Care to elaborate?
> >
> > I've read the documentation explaining why the busy loop is embedded
> > within the page allocation process instead of letting users implement
> > it based on their needs. However, the complexity and numerous issues
> > suggest that this design might be fundamentally flawed.
>
> I don't see "numerous issues", only two issues:
>
> 1. allocation size overflow with __GFP_NOFAIL
> 2. unsupported case: __GFP_NOWAIT/ATOMIC | __GFP_NOFAIL.
>
> for 1, it has been a BUG to require an overflowed size to always succeed.
>
> for 2, it is an unsupported case. we just need to hide __GFP_NOFAIL
> and only expose GFP_NOFAIL(which definitely includes blockable) so
> any unsupported case like vdpa will no longer occur.

I would greatly appreciate it if you or someone else could take over
this task, as I am currently extremely busy.

> >
> > --
> > Regards
> > Yafang
>
> Thanks
> Barry

--
Regards
Yafang