Re: [PATCH] mm: slub: annotate kmem_cache_node->list_lock as raw_spinlock

Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> · Tue, 11 Apr 2023 22:25:06 +0800

On 2023/4/11 22:19, Vlastimil Babka wrote:
On 4/11/23 16:08, Qi Zheng wrote:

On 2023/4/11 21:40, Vlastimil Babka wrote:
On 4/11/23 15:08, Qi Zheng wrote:
The list_lock can be held in the critical section of
raw_spinlock, and then lockdep will complain about it
like below:

   =============================
   [ BUG: Invalid wait context ]
   6.3.0-rc6-next-20230411 #7 Not tainted
   -----------------------------
   swapper/0/1 is trying to lock:
   ffff888100055418 (&n->list_lock){....}-{3:3}, at: ___slab_alloc+0x73d/0x1330
   other info that might help us debug this:
   context-{5:5}
   2 locks held by swapper/0/1:
    #0: ffffffff824e8160 (rcu_tasks.cbs_gbl_lock){....}-{2:2}, at: cblist_init_generic+0x22/0x2d0
    #1: ffff888136bede50 (&ACCESS_PRIVATE(rtpcp, lock)){....}-{2:2}, at: cblist_init_generic+0x232/0x2d0
   stack backtrace:
   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc6-next-20230411 #7
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
   Call Trace:
    <TASK>
    dump_stack_lvl+0x77/0xc0
    __lock_acquire+0xa65/0x2950
    ? arch_stack_walk+0x65/0xf0
    ? arch_stack_walk+0x65/0xf0
    ? unwind_next_frame+0x602/0x8d0
    lock_acquire+0xe0/0x300
    ? ___slab_alloc+0x73d/0x1330
    ? find_usage_forwards+0x39/0x50
    ? check_irq_usage+0x162/0xa70
    ? __bfs+0x10c/0x2c0
    _raw_spin_lock_irqsave+0x4f/0x90
    ? ___slab_alloc+0x73d/0x1330
    ___slab_alloc+0x73d/0x1330
    ? fill_pool+0x16b/0x2a0
    ? look_up_lock_class+0x5d/0x160
    ? register_lock_class+0x48/0x500
    ? __lock_acquire+0xabc/0x2950
    ? fill_pool+0x16b/0x2a0
    kmem_cache_alloc+0x358/0x3b0
    ? __lock_acquire+0xabc/0x2950
    fill_pool+0x16b/0x2a0
    ? __debug_object_init+0x292/0x560
    ? lock_acquire+0xe0/0x300
    ? cblist_init_generic+0x232/0x2d0
    __debug_object_init+0x2c/0x560
    cblist_init_generic+0x147/0x2d0
    rcu_init_tasks_generic+0x15/0x190
    kernel_init_freeable+0x6e/0x3e0
    ? rest_init+0x1e0/0x1e0
    kernel_init+0x1b/0x1d0
    ? rest_init+0x1e0/0x1e0
    ret_from_fork+0x1f/0x30
    </TASK>

The fill_pool() can only be called in the !PREEMPT_RT kernel
or in the preemptible context of the PREEMPT_RT kernel, so
the above warning is not a real issue, but it's better to
annotate kmem_cache_node->list_lock as raw_spinlock to get
rid of such issue.

+ CC some RT and RCU people

Thanks.

AFAIK raw_spinlock is not just an annotation, but on RT it changes the
implementation from preemptible mutex to actual spin lock, so it would be

Yeah.

rather unfortunate to do that for a spurious warning. Can it be somehow
fixed in a better way?

It's indeed unfortunate for the warning in the commit message. But
functions like kmem_cache_alloc(GFP_ATOMIC) may indeed be called
in the critical section of raw_spinlock or in the hardirq context, which

Hmm, I thought they may not, actually.

will cause problem in the PREEMPT_RT kernel. So I still think it is
reasonable to convert kmem_cache_node->list_lock to raw_spinlock type.

It wouldn't be the complete solution anyway. Once we allow even a GFP_ATOMIC
slab allocation for such context, it means also page allocation can happen
to refill the slabs, so lockdep will eventually complain about zone->lock,
and who knows what else.

Oh, indeed. :(

In addition, there are many fix patches for this kind of warning in the
git log, so I also think there should be a general and better solution. :)

Maybe, but given above, I doubt it's this one.

--
Thanks,
Qi