On 12/14/22 00:09, Wei Chen wrote:
> Dear Linux Developer,
>
> Recently, when using our tool to fuzz kernel, the following crash was triggered.
>
> HEAD commit: 094226ad94f4 Linux v6.1-rc5
> git tree: upstream
> compiler: clang 12.0.1
> console output:
> https://drive.google.com/file/d/1QZttkbuLed4wp6U32UR6TpxfY_HHCIqQ/view?usp=share_link
> kernel config: https://drive.google.com/file/d/1TdPsg_5Zon8S2hEFpLBWjb8Tnd2KA5WJ/view?usp=share_link
>
> Unfortunately, I didn't have a reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: Wei Chen <harperchen1110@xxxxxxxxx>
>
> =====================================================
> WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
> 6.1.0-rc5 #40 Not tainted
> -----------------------------------------------------
> syz-executor.0/27911 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> ffff888076cc4f30 (&new->fa_lock){....}-{2:2}, at: kill_fasync_rcu
> fs/fcntl.c:996 [inline]
> ffff888076cc4f30 (&new->fa_lock){....}-{2:2}, at:
> kill_fasync+0x13b/0x430 fs/fcntl.c:1017

[...]

> stack backtrace:
> CPU: 0 PID: 27911 Comm: syz-executor.0 Not tainted 6.1.0-rc5 #40
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1.1 04/01/2014
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
> print_bad_irq_dependency kernel/locking/lockdep.c:2611 [inline]
> check_irq_usage kernel/locking/lockdep.c:2850 [inline]
> check_prev_add kernel/locking/lockdep.c:3101 [inline]
> check_prevs_add+0x4e5f/0x5b70 kernel/locking/lockdep.c:3216
> validate_chain kernel/locking/lockdep.c:3831 [inline]
> __lock_acquire+0x4411/0x6070 kernel/locking/lockdep.c:5055
> lock_acquire+0x17f/0x430 kernel/locking/lockdep.c:5668
> __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:160 [inline]
> _raw_read_lock_irqsave+0xbb/0x100 kernel/locking/spinlock.c:236
> kill_fasync_rcu fs/fcntl.c:996 [inline]
> kill_fasync+0x13b/0x430 fs/fcntl.c:1017
> sg_rq_end_io+0x604/0xf50 drivers/scsi/sg.c:1403

The problem is here: sg_rq_end_io() calls kill_fasync(). But at a quick
glance, this is not the only driver calling kill_fasync() with a spinlock
held and IRQs disabled... So either there is a fundamental problem with the
kill_fasync() function, if drivers are allowed to do that, or, conversely,
all drivers calling that function with a lock held and IRQs disabled need to
be fixed.

Al, Chuck, Jeff,

Any thoughts?

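For readers less familiar with this class of lockdep warning, here is a minimal userspace sketch of the rule being checked. It is a toy model and not kernel code: struct toy_lock, its fields, and bad_irq_dependency() are hypothetical names made up for illustration, and it only looks at a direct "outer held, inner acquired" pair, whereas lockdep also follows chains of dependencies. Which lock in the truncated report plays which role is not shown above, so no real lock names are used.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of lockdep's hardirq usage tracking.
 * All names here are hypothetical; this is not the kernel's implementation.
 */
struct toy_lock {
	const char *name;
	bool used_in_hardirq;     /* ever acquired from hard-IRQ context */
	bool taken_with_irqs_on;  /* ever acquired while IRQs were enabled */
};

/* HARDIRQ-safe: the lock has been taken in hard-IRQ context. */
static bool hardirq_safe(const struct toy_lock *l)
{
	return l->used_in_hardirq;
}

/* HARDIRQ-unsafe: the lock has been taken with IRQs enabled. */
static bool hardirq_unsafe(const struct toy_lock *l)
{
	return l->taken_with_irqs_on;
}

/*
 * Acquiring `inner` while holding `outer` records the dependency
 * outer -> inner. The dependency is reported as bad when outer is
 * HARDIRQ-safe and inner is HARDIRQ-unsafe.
 */
static bool bad_irq_dependency(const struct toy_lock *outer,
			       const struct toy_lock *inner)
{
	return hardirq_safe(outer) && hardirq_unsafe(inner);
}
```

The reason such a dependency is dangerous: one CPU can hold the unsafe lock with IRQs enabled while another CPU holds the safe lock and spins on the unsafe one. If an interrupt then arrives on the first CPU and its handler tries to take the safe lock, neither CPU can make progress. For example, with a = {"a", true, false} and b = {"b", false, true}, bad_irq_dependency(&a, &b) is true while bad_irq_dependency(&b, &a) is false.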
> __blk_mq_end_request+0x2c7/0x380 block/blk-mq.c:1011
> scsi_end_request+0x4ed/0x9c0 drivers/scsi/scsi_lib.c:576
> scsi_io_completion+0xc25/0x27a0 drivers/scsi/scsi_lib.c:985
> ata_scsi_simulate+0x336e/0x3dd0 drivers/ata/libata-scsi.c:4190
> __ata_scsi_queuecmd+0x20b/0x1020 drivers/ata/libata-scsi.c:4009
> ata_scsi_queuecmd+0xa0/0x130 drivers/ata/libata-scsi.c:4052
> scsi_dispatch_cmd drivers/scsi/scsi_lib.c:1524 [inline]
> scsi_queue_rq+0x1ea6/0x2ec0 drivers/scsi/scsi_lib.c:1760
> blk_mq_dispatch_rq_list+0x104f/0x2ca0 block/blk-mq.c:1992
> __blk_mq_sched_dispatch_requests+0x382/0x490 block/blk-mq-sched.c:306
> blk_mq_sched_dispatch_requests+0xef/0x160 block/blk-mq-sched.c:339
> __blk_mq_run_hw_queue+0x1cf/0x260 block/blk-mq.c:2110
> blk_mq_sched_insert_request+0x1e2/0x430 block/blk-mq-sched.c:458
> blk_execute_rq_nowait+0x2e8/0x3b0 block/blk-mq.c:1305
> sg_common_write+0x8c0/0x1970 drivers/scsi/sg.c:832
> sg_new_write+0x61f/0x860 drivers/scsi/sg.c:770
> sg_ioctl_common drivers/scsi/sg.c:935 [inline]
> sg_ioctl+0x1c51/0x2be0 drivers/scsi/sg.c:1159
> vfs_ioctl fs/ioctl.c:51 [inline]
> __do_sys_ioctl fs/ioctl.c:870 [inline]
> __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:856
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7f153dc8bded
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f153ede2c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00007f153ddabf80 RCX: 00007f153dc8bded
> RDX: 0000000020000440 RSI: 0000000000002285 RDI: 0000000000000006
> RBP: 00007f153dcf8ce0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f153ddabf80
> R13: 00007ffc72e5108f R14: 00007ffc72e51230 R15: 00007f153ede2dc0
> </TASK>
>
> Best,
> Wei

-- 
Damien Le Moal
Western Digital Research