On 7/19/24 10:01 AM, Michal Hocko wrote: > On Fri 19-07-24 19:51:06, Barry Song wrote: >> On Fri, Jul 19, 2024 at 7:42 PM Michal Hocko <mhocko@xxxxxxxx> wrote: > [...] >> > It cannot reclaim itself and it cannot sleep to wait for the memory so >> > NOFAIL semantic is simply impossible. We have put a warning in place to >> >> this is still "right" behaviour to retry infinitely at least according >> to the doc of >> __GFP_NOFAIL. > > I do not agree that implementing busy loop in the kernel is the right > practice! I think there are multiple aspects to this. One is what's the right practice, another what's the most graceful recovery from a buggy code. We can all agree that __GFP_NOFAIL together with GFP_NOWAIT or other variants that can't sleep is wrong and unsupported. Hence we should put sufficient warnings in place to prevent such combinations from being used in the first place. But what if some callsite makes it into mainline without anyone seeing the warnings in the process and fixing them, because it's rare to hit? And then an attacker finds a way to hit it? We could fail the allocation which would probably crash afterwards, hopefully non-exploitably. We could BUG_ON() which would crash reliably, but still become a DoS vulnerability. Or we could busy loop, which might be also a DoS but maybe not so reliable because kswapd or other direct reclaiming processes could recover the situation (and the user could report the warning that was produced in the process). That wouldn't mean the busy loop is a correct and supported practice. It would just mean it's the least bad of the bad options we have to deal with an allocation that's wrong but we didn't catch soon enough in the development. Compared to the original problem at hand, kmalloc_array() of impossible size with __GFP_NOFAIL is not recoverable by busy looping so as I've already said I like most the idea of BUG_ON(). Yes it's a DoS vector if someone finds such a user triggerable allocation, but I don't see a better option. Same thing should probably happen with a __GFP_NOFAIL page allocation that requests >MAX_ORDER. >> I assume getting new memory by many retries is still >> possibly some other processes might be reclaiming or freeing memory >> then providing free memory to this one being stuck. > > No, I strongly disagree we should even pretend this is a supported > allocation strategy. NAK to any attempt to legalize it in some form. >