Re: [RFC v2 00/34] SLUB: reduce irq disabled scope and make it RT compatible

Vlastimil Babka <vbabka@xxxxxxx> · Thu, 29 Jul 2021 16:17:26 +0200

On 7/29/21 3:49 PM, Sebastian Andrzej Siewior wrote:

> now that I'm slowly catching up…

> 

> On 2021-07-02 22:25:05 [+0200], Vlastimil Babka wrote:

>> > - perf_5.10 stat -r 10 hackbench -g200 -s 4096 -l500

>> > Old:

>> > |         464.967,20 msec task-clock                #   27,220 CPUs utilized            ( +-  0,16% )

>> > New:

>> > |         422.865,71 msec task-clock                #    4,782 CPUs utilized            ( +-  0,34% )

>> 

>> The series shouldn't significantly change the memory allocator

>> interaction, though.

>> Seems there's less cycles, but more time elapsed, thus more sleeping -

>> is it locks becoming mutexes on RT?

> 

> yes, most likely since the !RT parts are mostly unchanged.

> 

>> My second guess - list_lock remains spinlock with my series, thus RT

>> mutex, but the current RT tree converts it to raw_spinlock. I'd hope

>> leaving that one as non-raw spinlock would still be much better for RT

>> goals, even if hackbench (which is AFAIK very slab intensive) throughput

>> regresses - hopefully not that much.

> 

> Yes, the list_lock seems to be the case. I picked your

> slub-local-lock-v3r0 and changed the list_lock (+slab_lock()) to use

> raw_spinlock_t and disable interrupts and CPUs utilisation went to

> ~23CPUs (plus a bunch of warnings which probably made it a little slower

> again).

I forgot to point that out in the cover letter, but with v3 this change to
raw_spinlock_t is AFAICS no longer possible (at least with
CONFIG_SLUB_CPU_PARTIAL) because in put_cpu_partial() we now take the local_lock
and it can be called from get_partial_node() which takes the list_lock.

> The difference between a sleeping lock (spinlock_t) and a mutex is

> that we attempt not to preempt a task that acquired a spinlock_t even if

> it is running for some time and the scheduler would preempt it (like it

> would do if the task had a mutex acquired. These are the "lazy preempt"

> bits in the RT patch).

> 

> By making the list_lock a raw_spinlock_t a lot of IRQ-flags dancing

> needs to be done as the page-allocator must be entered with enabled

> interrupts.

Hm but SLUB should never call the page allocator from under list_lock in my series?

> And then there is the possibility that you may need to free

> some memory even if you allocate memory which requires some extra steps

> on RT due to the IRQ-off part. All this vanishes by keeping list_lock a

> spinlock_t.

> The kernel-build test on /dev/shm remained unchanged so that is good.

> Unless there is a real-world use-case, that gets worse, I don't mind

> keeping the spinlock_t here. I haven't seen tglx complaining so far.

Good. IIRC hackbench is very close to being a slab microbenchmark, so
regressions there are expected, but should not translate to notable real world
regressions.

> Sebastian

>