Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Reclamation interactions with RCU

Vlastimil Babka <vbabka@xxxxxxx> · Tue, 12 Mar 2024 15:46:32 +0100

On 3/3/24 23:45, NeilBrown wrote:
> On Sat, 02 Mar 2024, Kent Overstreet wrote:
>> 
>> *nod* 
>> 
>> > I suspect that most places where there is a non-error fallback already
>> > use NORETRY or RETRY_MAYFAIL or similar.
>> 
>> NORETRY and RETRY_MAYFAIL actually weren't on my radar, and I don't see
>> _tons_ of uses for either of them - more for NORETRY.
>> 
>> My go-to is NOWAIT in this scenario though; my common pattern is "try
>> nonblocking with locks held, then drop locks and retry GFP_KERNEL".
>>  
>> > But I agree that changing the meaning of GFP_KERNEL has a potential to
>> > cause problems.  I support promoting "GFP_NOFAIL" which should work at
>> > least up to PAGE_ALLOC_COSTLY_ORDER (8 pages).
>> 
>> I'd support this change.
>> 
>> > I'm unsure how it should behave in PF_MEMALLOC_NOFS and
>> > PF_MEMALLOC_NOIO context.  I suspect Dave would tell me it should work in
>> > these contexts, in which case I'm sure it should.
>> > 
>> > Maybe we could then deprecate GFP_KERNEL.
>> 
>> What do you have in mind?
> 
> I have in mind a more explicit statement of how much waiting is
> acceptable.
> 
> GFP_NOFAIL - wait indefinitely
> GFP_KILLABLE - wait indefinitely unless fatal signal is pending.
> GFP_RETRY - may retry but deadlock, though unlikely, is possible.  So
>             don't wait indefinitely.  May abort more quickly if fatal
>             signal is pending.
> GFP_NO_RETRY - only try things once.  This may sleep, but will give up
>             fairly quickly.  Either deadlock is a significant
>             possibility, or alternate strategy is fairly cheap.
> GFP_ATOMIC - don't sleep - same as current.
> 
> I don't see how "GFP_KERNEL" fits into that spectrum.  The definition of
> "this will try really hard, but might fail and we can't really tell you
> what circumstances it might fail in" isn't fun to work with.
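
To make the proposed spectrum concrete, here is a rough userspace sketch of
how each level could answer "should the allocator try again?". None of this
is kernel API: the enum, the function names and the MAX_RETRIES bound are all
made up for illustration, and the real retry logic in the page allocator is
considerably more involved.

```c
#include <stdbool.h>

/* Hypothetical model of the proposed waiting levels; not kernel API. */
enum wait_policy {
	WAIT_NOFAIL,	/* wait indefinitely */
	WAIT_KILLABLE,	/* wait indefinitely unless a fatal signal is pending */
	WAIT_RETRY,	/* retry a bounded number of times */
	WAIT_NO_RETRY,	/* one attempt, may sleep */
	WAIT_ATOMIC,	/* one attempt, must not sleep */
};

/* Arbitrary bound chosen for this sketch only. */
#define MAX_RETRIES 16

/* May an allocation with this policy sleep at all? */
static bool policy_may_sleep(enum wait_policy p)
{
	return p != WAIT_ATOMIC;
}

/*
 * After 'attempts' failed attempts, should the allocator try again
 * rather than return NULL to the caller?
 */
static bool policy_should_retry(enum wait_policy p, unsigned int attempts,
				bool fatal_signal_pending)
{
	switch (p) {
	case WAIT_NOFAIL:
		return true;
	case WAIT_KILLABLE:
		return !fatal_signal_pending;
	case WAIT_RETRY:
		return !fatal_signal_pending && attempts < MAX_RETRIES;
	case WAIT_NO_RETRY:
	case WAIT_ATOMIC:
		return false;
	}
	return false;
}
```

Written this way, every level has a definite answer for every input, which
is exactly what is hard to state for today's GFP_KERNEL.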

The problem is if we set out to change everything from GFP_KERNEL to one of
the above, it will take many years. So I think it would be better to just
change the semantics of GFP_KERNEL too.

If we change it to remove the "too-small-to-fail" rule, we might suddenly
introduce crashes in an unknown number of places, so I don't think that's
feasible.

But if we change it to effectively mean GFP_NOFAIL (for non-costly
allocations), there should be a manageable number of places to change to a
variant that allows failure. Also if these places are GFP_KERNEL by mistake
today, and should in fact allow failure, they would be already causing
problems today, as the circumstances where too-small-to-fail is violated are
quite rare (IIRC just being an oom victim, so somewhat close to
GFP_KILLABLE). So changing GFP_KERNEL to GFP_NOFAIL should be the lowest
risk (one could argue for GFP_KILLABLE but I'm afraid many places don't
really handle that as they assume the too-small-to-fail without exceptions
and are unaware of the oom victim loophole, and failing on any fatal signal
increases the chances of this happening).
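
As a rough illustration of the three options compared above, here is a toy
userspace model of when a GFP_KERNEL allocation could return NULL under
each. The enum, struct and function names are hypothetical, and the
"current" case encodes only my IIRC above (failure limited to oom victims),
not the full allocator logic.

```c
#include <stdbool.h>

/* Same value as the kernel constant: orders up to 3 (8 pages) are non-costly. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical names for the three candidate meanings of GFP_KERNEL. */
enum gfp_kernel_semantics {
	SEM_CURRENT,	/* too-small-to-fail, except for oom victims */
	SEM_NOFAIL,	/* non-costly allocations never fail */
	SEM_KILLABLE,	/* non-costly allocations fail on any fatal signal */
};

/* Simplified view of the allocating task. */
struct task_state {
	bool oom_victim;
	bool fatal_signal_pending;
};

/* Can the caller see NULL for an allocation of the given order? */
static bool alloc_may_fail(enum gfp_kernel_semantics sem, unsigned int order,
			   struct task_state t)
{
	/* Costly orders may fail under all three variants. */
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return true;

	switch (sem) {
	case SEM_CURRENT:
		return t.oom_victim;
	case SEM_NOFAIL:
		return false;
	case SEM_KILLABLE:
		return t.oom_victim || t.fatal_signal_pending;
	}
	return false;
}
```

In this model SEM_NOFAIL only removes a failure case that callers rarely
hit today, while SEM_KILLABLE adds a new one (any fatal signal), which is
why I consider the former the lower risk.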

> Thanks,
> NeilBrown
> 
> 
>> 
>> Deprecating GFP_NOFS and GFP_NOIO would be wonderful - those should
>> really just be PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO, now that we're
>> pushing for memalloc_flags_(save|restore) more.
>> 
>> Getting rid of those would be a really nice cleanup because then gfp
>> flags would mostly just be:
>>  - the type of memory to allocate (highmem, zeroed, etc.)
>>  - how hard to try (don't block at all, block some, block forever)
>> 
> 
>
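
For readers less familiar with the scoped approach Kent mentions, the idea
can be sketched in userspace roughly as follows. This is only loosely
modeled on the memalloc_*_save()/restore() helpers: every toy_* name and
flag value below is invented for the sketch, and a single global stands in
for the per-task flags.

```c
/* Toy gfp bits and task flags; values are arbitrary, not the kernel's. */
#define TOY_GFP_IO	0x1u
#define TOY_GFP_FS	0x2u
#define TOY_GFP_KERNEL	(TOY_GFP_IO | TOY_GFP_FS)

#define TOY_PF_NOFS	0x1u
#define TOY_PF_NOIO	0x2u

/* Stand-in for the current task's flags word. */
static unsigned int toy_task_flags;

/* Enter a scope: set 'flags' and return only the bits this call newly set. */
static unsigned int toy_flags_save(unsigned int flags)
{
	unsigned int newly_set = ~toy_task_flags & flags;

	toy_task_flags |= flags;
	return newly_set;
}

/* Leave a scope: clear only the bits the matching save call set. */
static void toy_flags_restore(unsigned int newly_set)
{
	toy_task_flags &= ~newly_set;
}

/* What the allocator would actually honour for a request in this scope. */
static unsigned int toy_effective_gfp(unsigned int gfp)
{
	if (toy_task_flags & TOY_PF_NOIO)
		gfp &= ~(TOY_GFP_IO | TOY_GFP_FS);
	else if (toy_task_flags & TOY_PF_NOFS)
		gfp &= ~TOY_GFP_FS;
	return gfp;
}
```

Because restore clears only what the matching save set, scopes nest
correctly, and call sites inside the scope can keep passing the plain
"kernel" mask instead of a dedicated NOFS or NOIO variant.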