Re: [RFC 09/26] mm, slub: move disabling/enabling irqs to ___slab_alloc()

Vlastimil Babka <vbabka@xxxxxxx> · Tue, 25 May 2021 14:47:10 +0200

On 5/25/21 2:35 PM, Mel Gorman wrote:
> On Tue, May 25, 2021 at 01:39:29AM +0200, Vlastimil Babka wrote:
>> Currently __slab_alloc() disables irqs around the whole ___slab_alloc().  This
>> includes cases where this is not needed, such as when the allocation ends up in
>> the page allocator and has to awkwardly enable irqs back based on gfp flags.
>> Also the whole kmem_cache_alloc_bulk() is executed with irqs disabled even when
>> it hits the __slab_alloc() slow path, and long periods with disabled interrupts
>> are undesirable.
>> 
>> As a first step towards reducing irq disabled periods, move irq handling into
>> ___slab_alloc(). Callers will instead prevent the s->cpu_slab percpu pointer
>> from becoming invalid via migrate_disable(). This does not protect against
>> access preemption, which is still done by disabled irq for most of
>> ___slab_alloc(). As the small immediate benefit, slab_out_of_memory() call from
>> ___slab_alloc() is now done with irqs enabled.
>> 
>> kmem_cache_alloc_bulk() disables irqs for its fastpath and then re-enables them
>> before calling ___slab_alloc(), which then disables them at its discretion. The
>> whole kmem_cache_alloc_bulk() operation also disables cpu migration.
>> 
>> When  ___slab_alloc() calls new_slab() to allocate a new page, re-enable
>> preemption, because new_slab() will re-enable interrupts in contexts that allow
>> blocking.
>> 
>> The patch itself will thus increase overhead a bit due to disabled migration
>> and increased disabling/enabling irqs in kmem_cache_alloc_bulk(), but that will
>> be gradually improved in the following patches.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> 
> Why did you use migrate_disable instead of preempt_disable? There is a
> fairly large comment in include/linux/preempt.h on why migrate_disable
> is undesirable so new users are likely to be put under the microscope
> once Thomas or Peter notice it.

I understood it as while undesirable, there's nothing better for now.

> I think you are using it so that an allocation request can be preempted by
> a higher priority task but given that the code was disabling interrupts,
> there was already some preemption latency.

Yes, and the disabled interrupts will get progressively "smaller" in the series.

> However, migrate_disable
> is more expensive than preempt_disable (function call versus a simple
> increment).

That's true, I think perhaps it could be reimplemented so that on !PREEMPT_RT
and with no lockdep/preempt/whatnot debugging it could just translate to an
inline migrate_disable?

> On that basis, I'd recommend starting with preempt_disable
> and only using migrate_disable if necessary.

That's certainly possible and you're right it would be a less disruptive step.
My thinking was that on !PREEMPT_RT it's actually just preempt_disable (however
with the call overhead currently), but PREEMPT_RT would welcome the lack of
preempt disable. I'd be interested to hear RT guys opinion here.

> Bonus points for adding a comment where ___slab_alloc disables IRQs to
> clarify what is protected -- I assume it's protecting kmem_cache_cpu
> from being modified from interrupt context. If so, it's potentially a
> local_lock candidate.

Yeah that gets cleared up later :)