On Mon, Apr 6, 2020 at 7:44 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Mon, Apr 06, 2020 at 08:21:51PM +0300, Leon Romanovsky wrote:
> > + RDMA
> >
> > On Sun, Apr 05, 2020 at 11:37:15PM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 304e0242 net_sched: add a temporary refcnt for struct tcin..
> > > git tree: net
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=119dd16de00000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=8c1e98458335a7d1
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=9627a92b1f9262d5d30c
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+9627a92b1f9262d5d30c@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > sysfs group 'power' not found for kobject 'umad1'
> > > WARNING: CPU: 1 PID: 31308 at fs/sysfs/group.c:279 sysfs_remove_group fs/sysfs/group.c:279 [inline]
> > > WARNING: CPU: 1 PID: 31308 at fs/sysfs/group.c:279 sysfs_remove_group+0x155/0x1b0 fs/sysfs/group.c:270
> > > Kernel panic - not syncing: panic_on_warn set ...
> > > CPU: 1 PID: 31308 Comm: kworker/u4:10 Not tainted 5.6.0-syzkaller #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > Workqueue: events_unbound ib_unregister_work
> > > Call Trace:
> > > __dump_stack lib/dump_stack.c:77 [inline]
> > > dump_stack+0x188/0x20d lib/dump_stack.c:118
> > > panic+0x2e3/0x75c kernel/panic.c:221
> > > __warn.cold+0x2f/0x35 kernel/panic.c:582
> > > report_bug+0x27b/0x2f0 lib/bug.c:195
> > > fixup_bug arch/x86/kernel/traps.c:175 [inline]
> > > fixup_bug arch/x86/kernel/traps.c:170 [inline]
> > > do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> > > do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> > > invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> > > RIP: 0010:sysfs_remove_group fs/sysfs/group.c:279 [inline]
> > > RIP: 0010:sysfs_remove_group+0x155/0x1b0 fs/sysfs/group.c:270
> > > Code: 48 89 d9 49 8b 14 24 48 b8 00 00 00 00 00 fc ff df 48 c1 e9 03 80 3c 01 00 75 41 48 8b 33 48 c7 c7 60 c3 39 88 e8 93 c3 5f ff <0f> 0b eb 95 e8 22 62 cb ff e9 d2 fe ff ff 48 89 df e8 15 62 cb ff
> > > RSP: 0018:ffffc90001d97a60 EFLAGS: 00010282
> > > RAX: 0000000000000000 RBX: ffffffff88915620 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: ffffffff815ca861 RDI: fffff520003b2f3e
> > > RBP: 0000000000000000 R08: ffff8880a78fc2c0 R09: ffffed1015ce66a1
> > > R10: ffffed1015ce66a0 R11: ffff8880ae733507 R12: ffff88808e5ba070
> > > R13: ffffffff88915bc0 R14: ffff88808e5ba008 R15: dffffc0000000000
> > > dpm_sysfs_remove+0x97/0xb0 drivers/base/power/sysfs.c:794
> > > device_del+0x18b/0xd30 drivers/base/core.c:2687
> > > cdev_device_del+0x15/0x80 fs/char_dev.c:570
> > > ib_umad_kill_port+0x45/0x250 drivers/infiniband/core/user_mad.c:1327
> > > ib_umad_remove_one+0x18a/0x220 drivers/infiniband/core/user_mad.c:1409
> > > remove_client_context+0xbe/0x110 drivers/infiniband/core/device.c:724
> > > disable_device+0x13b/0x230 drivers/infiniband/core/device.c:1270
> > > __ib_unregister_device+0x91/0x180 drivers/infiniband/core/device.c:1437
> > > ib_unregister_work+0x15/0x30 drivers/infiniband/core/device.c:1547
> > > process_one_work+0x965/0x16a0 kernel/workqueue.c:2266
> > > worker_thread+0x96/0xe20 kernel/workqueue.c:2412
> > > kthread+0x388/0x470 kernel/kthread.c:268
> > > ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> > > Kernel Offset: disabled
> > > Rebooting in 86400 seconds..
>
> I'm not sure what could be done wrong here to elicit this:
>
> sysfs group 'power' not found for kobject 'umad1'
>
> ??
>
> I've seen another similar sysfs related trigger that we couldn't
> figure out.
>
> Hard to investigate without a reproducer.
>
> Jason

Based on all of the sysfs-related bugs I've seen, my bet would be on
some races. E.g. one thread registers devices, while another
unregisters these.
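
To make the suspected interleaving concrete, here is a minimal userspace sketch in C. It is only a toy model and not kernel code: `struct toy_kobj`, `toy_add_power_group()`, `toy_remove_power_group()` and `toy_replay()` are all hypothetical names made up for illustration, and the steps are replayed in a fixed order rather than on real threads. It assumes nothing about where in the umad or driver-core paths such a race would actually sit; it only shows that a removal running before the group exists would print the same kind of message as in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a device's sysfs directory (hypothetical, not struct kobject). */
struct toy_kobj {
	const char *name;
	bool has_power_group;	/* models presence of the 'power' group */
};

/* Models the registration step that creates the 'power' group. */
static void toy_add_power_group(struct toy_kobj *k)
{
	assert(k != NULL);
	k->has_power_group = true;
}

/* Models group removal; returns true if it had to warn. */
static bool toy_remove_power_group(struct toy_kobj *k)
{
	assert(k != NULL);
	if (!k->has_power_group) {
		printf("sysfs group 'power' not found for kobject '%s'\n",
		       k->name);
		return true;
	}
	k->has_power_group = false;
	return false;
}

/*
 * Replays one fixed interleaving of "register" and "unregister" and
 * returns the number of warnings seen. This is an assumed ordering
 * chosen for illustration, not one observed in the crash.
 */
static int toy_replay(bool unregister_runs_first)
{
	struct toy_kobj k = { .name = "umad1", .has_power_group = false };
	int warnings = 0;

	if (unregister_runs_first) {
		/* Teardown gets in before registration created the group. */
		warnings += toy_remove_power_group(&k);
		toy_add_power_group(&k);
	} else {
		/* Expected order: create the group, then remove it. */
		toy_add_power_group(&k);
		warnings += toy_remove_power_group(&k);
	}
	return warnings;
}
```

Calling `toy_replay(false)` returns 0 and prints nothing, while `toy_replay(true)` returns 1 and prints the warning text. A second removal of the same group would trip the same check, so this sketch does not tell the two cases apart.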