On Mon, Jul 29, 2024 at 6:37 AM kernel test robot <oliver.sang@xxxxxxxxx> wrote:
> kernel test robot noticed "WARNING:possible_circular_locking_dependency_detected" on:
>
> commit: 17049be0e1bcf0aa8809faf84f3ddd8529cd6c4c ("[PATCH v3 2/2] slub: Introduce CONFIG_SLUB_RCU_DEBUG")
> url: https://github.com/intel-lab-lkp/linux/commits/Jann-Horn/kasan-catch-invalid-free-before-SLUB-reinitializes-the-object/20240726-045709
> patch link: https://lore.kernel.org/all/20240725-kasan-tsbrcu-v3-2-51c92f8f1101@xxxxxxxxxx/
> patch subject: [PATCH v3 2/2] slub: Introduce CONFIG_SLUB_RCU_DEBUG
[...]
> [ 136.014616][ C1] WARNING: possible circular locking dependency detected

Looking at the linked dmesg, the primary thing that actually went wrong here is something in the SLUB bulk freeing code; we got multiple messages like:

```
BUG filp (Not tainted): Bulk free expected 1 objects but found 2
-----------------------------------------------------------------------------

Slab 0xffffea0005251f00 objects=23 used=23 fp=0x0000000000000000 flags=0x8000000000000040(head|zone=2)
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.10.0-00002-g17049be0e1bc #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0xa3/0x100
 slab_err+0x15a/0x200
 free_to_partial_list+0x2c9/0x600
[...]
 slab_free_after_rcu_debug+0x169/0x280
[...]
 rcu_do_batch+0x4a4/0xc40
 rcu_core+0x36e/0x5c0
 handle_softirqs+0x211/0x800
[...]
 __irq_exit_rcu+0x71/0x100
 irq_exit_rcu+0x5/0x80
 sysvec_apic_timer_interrupt+0x68/0x80
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x16/0x40
RIP: 0010:default_idle+0xb/0x40
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 eb 07 0f 00 2d 17 ae 32 00 fb f4 <fa> c3 cc cc cc cc cc 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
RSP: 0018:ffff888104e5feb8 EFLAGS: 00200282
RAX: 4c16e5d04752e300 RBX: ffffffff813578df RCX: 0000000000995661
RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffffff813578df
RBP: 0000000000000001 R08: ffff8883aebf6cdb R09: 1ffff11075d7ed9b
R10: dffffc0000000000 R11: ffffed1075d7ed9c R12: 0000000000000000
R13: 1ffff110209ca008 R14: ffffffff87474e68 R15: dffffc0000000000
 ? do_idle+0x15f/0x400
 default_idle_call+0x6e/0x100
 do_idle+0x15f/0x400
 cpu_startup_entry+0x40/0x80
 start_secondary+0x129/0x180
 common_startup_64+0x129/0x1a7
 </TASK>
FIX filp: Object at 0xffff88814947e400 not freed
```

Ah, the issue is that I'm passing NULL as the tail pointer to do_slab_free() instead of passing in the pointer to the object again. That's the result of not being careful enough while forward-porting my patch from last year; it conflicted with vbabka's commit 284f17ac13fe ("mm/slub: handle bulk and single object freeing separately")... I'll fix that up in the next version.

I don't think the lockdep warning is caused by code I introduced; it's just that you can only hit that warning when SLUB does printk...

> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240729/202407291014.2ead1e72-oliver.sang@xxxxxxxxx
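
For illustration, here is a rough userspace model of why a NULL tail pointer can turn a single-object free into an "expected 1 objects but found 2" style mismatch. This is only a sketch, not the actual SLUB code: `toy_object` and `toy_count_bulk()` are made-up names, and it assumes the object's free-pointer slot still holds a stale non-NULL pointer at the time of the free.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a slab object; the free pointer lives inside it. */
struct toy_object {
	struct toy_object *next_free;
};

/*
 * Hypothetical consistency walk: follow the detached freelist from
 * head until tail is reached, counting objects along the way. The
 * walk stops early once it has seen more objects than expected, or
 * when it runs off the end of the list.
 */
static int toy_count_bulk(struct toy_object *head, struct toy_object *tail,
			  int expected)
{
	struct toy_object *object = head;
	int found = 0;

	assert(head != NULL);
	while (object != NULL) {
		if (++found > expected)
			break;		/* more objects than the caller claimed */
		if (object == tail)
			break;		/* reached the end of the bulk list */
		object = object->next_free;
	}
	return found;
}
```

In this model, with `a.next_free` pointing at some other object, `toy_count_bulk(&a, &a, 1)` stops at the tail right away and counts one object, while `toy_count_bulk(&a, NULL, 1)` never sees the tail, follows the stale pointer, and counts two.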