On 22.08.24 11:11, Michal Hocko wrote:
> On Thu 22-08-24 10:39:09, David Hildenbrand wrote:
> [...]
>> But then the question is: does it really make sense to differentiate
>> between a NOFAIL allocation of MAX_ORDER under memory pressure and one
>> of MAX_ORDER+1 (Linus also touched on that)? It could well take
>> minutes/hours/days to satisfy a very large NOFAIL allocation. So callers
>> should be prepared to run into effective lockups ... :/
> As pointed out in the other subthread, we shouldn't really pretend we
> support NOFAIL for order > 0, or at least anything > A_SMALL_ORDER, and
> should encourage kvmalloc for those users.
>
> A nofail order-2 allocation can kill most of userspace on a terribly
> fragmented system that is kernel-allocation heavy.
>> NOFAIL shouldn't exist, or at least not be used to that degree.
> Let's put wishful thinking aside. Unless somebody manages to go over
> all existing NOFAIL users and fix them, we should focus on providing a
> reasonable, clearly documented and enforced semantic. That would
> probably be time better spent than trying to find ways to deal with
> that mess. ;)
I think the documentation is mostly there:
"The VM implementation _must_ retry infinitely: the caller cannot handle
allocation failures. The allocation could block indefinitely but will
never return with failure. Testing for failure is pointless."
To me, that implies that if you pass in MAX_ORDER+1, the VM will "retry
infinitely". Whether that means OOPSing or actually sitting in a busy
loop, I don't care. As stated, it could effectively happen with
MAX_ORDER as well. But certainly not BUG_ON.
"Using this flag for costly allocations is _highly_ discouraged" should
be rephrased to "Using this flag with costly allocations is _highly
dangerous_ and will likely result in the allocation never succeeding and
this function never making any progress."
I do also agree that renaming NOFAIL to make some of that clearer makes
sense.
Likely, checkpatch should be updated to warn on any new NOFAIL usage.