On Tue 27-08-24 03:05:29, Kent Overstreet wrote:
> On Tue, Aug 27, 2024 at 08:58:39AM GMT, Michal Hocko wrote:
> > On Tue 27-08-24 02:40:16, Kent Overstreet wrote:
> > > On Tue, Aug 27, 2024 at 08:01:32AM GMT, Michal Hocko wrote:
> > > > You are not really answering the main concern I have brought up though.
> > > > I.e. GFP_NOFAIL being fundamentally incompatible with NORECLAIM semantic
> > > > because the page allocator doesn't and will not support this allocation
> > > > mode. Scoped noreclaim semantic makes such a use much less visible
> > > > because it can be deep in the scoped context, therefore more error prone
> > > > to introduce, thus making the code harder to maintain.
> > >
> > > You're too attached to GFP_NOFAIL.
> >
> > Unfortunately GFP_NOFAIL is there and we need to support it. We cannot
> > just close eyes and pretend it doesn't exist and hope for the best.
>
> You need to notice when you're trying to do something impossible.

Agreed! And GFP_NOFAIL for allocations <= order 1 in the page allocator
or kvmalloc(GFP_NOFAIL) for reasonable sizes is a supported setup. It
should work as documented and shouldn't create any surprises, like
returning an unexpected failure because you have been called from within
a NORECLAIM scope which you, as the author of the code, are not even
aware of, because the scope was entered somewhere detached from your
code and you just happen to be in the callchain.

> > > GFP_NOFAIL is something we very rarely use, and it's not something we
> > > want to use. Furthermore, GFP_NOFAIL allocations can fail regardless of
> > > this patch - e.g. if it's more than 2 pages, it's not going to be
> > > GFP_NOFAIL.
> >
> > We can reasonably assume we do not have any of those users in the tree
> > though. We know that because we have a warning to tell us about that.
> > We still have legit GFP_NOFAIL users and we can safely assume we will
> > have some in the future though. And they have no way to handle the
> > failure. If they did they wouldn't have used GFP_NOFAIL in the first
> > place. So they do not check for NULL and they would either blow up or
> > worse fail in subtle and harder to detect way.
>
> No, because not all GFP_NOFAIL allocations are statically sized.

The warning is a runtime check, so it covers dynamically sized
allocations as well.
rmqueue:
	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));

> And the problem of the dynamic context overriding GFP_NOFAIL is more
> general - if you use GFP_NOFAIL from nonblocking context (interrupt
> context or preemption disabled) - the allocation has to fail, or
> something even worse will happen.

If you use __GFP_NOFAIL | GFP_KERNEL from an atomic context then you are
screwed the same way as if you used GFP_KERNEL alone - sleeping while
atomic or worse. The allocator doesn't even try to deal with this and
protect the caller by not sleeping and returning NULL.

More fundamentally, GFP_NOFAIL from a non-blocking context is an
incorrect and unsupported use of the flag. This is the crux of the whole
discussion. GFP_NOWAIT | __GFP_NOFAIL or GFP_ATOMIC | __GFP_NOFAIL is
just a bug. We can git grep for those, and surprisingly we found one
instance, which already has a patch waiting to be merged. We cannot
enforce that at compile time and that sucks, but such is life. At least
we can grep for this. Now consider a scoped (implicit) NOWAIT context,
which makes even a seemingly correct GFP_NOFAIL use a bug.

-- 
Michal Hocko
SUSE Labs