On 06.12.22 16:38, Alan Stern wrote:
Hi,
Oliver:
This looks like a bug in the anchor API.
Yes, it does.
On Tue, Dec 06, 2022 at 02:43:41AM -0800, syzbot wrote:
Hello,
syzbot found the following issue on:
HEAD commit: ef4d3ea40565 afs: Fix server->active leak in afs_put_server
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=100b244d880000
kernel config: https://syzkaller.appspot.com/x/.config?x=8e7e79f8a1e34200
dashboard link: https://syzkaller.appspot.com/bug?extid=712fd0e60dda3ba34642
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ef790e7777cd/disk-ef4d3ea4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/2ed3c6bc9230/vmlinux-ef4d3ea4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/f1dbd004fa88/bzImage-ef4d3ea4.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+712fd0e60dda3ba34642@xxxxxxxxxxxxxxxxxxxxxxxxx
xpad 3-1:179.65: xpad_irq_in - usb_submit_urb failed with result -19
xpad 3-1:179.65: xpad_irq_out - usb_submit_urb failed with result -19
==================================================================
BUG: KASAN: use-after-free in register_lock_class+0x8d2/0x9b0 kernel/locking/lockdep.c:1338
Read of size 1 at addr ffff88807a58b091 by task kworker/u4:3/46
CPU: 0 PID: 46 Comm: kworker/u4:3 Not tainted 6.1.0-rc7-syzkaller-00103-gef4d3ea40565 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Workqueue: bat_events batadv_nc_worker
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
print_address_description+0x74/0x340 mm/kasan/report.c:284
print_report+0x107/0x220 mm/kasan/report.c:395
kasan_report+0x139/0x170 mm/kasan/report.c:495
register_lock_class+0x8d2/0x9b0 kernel/locking/lockdep.c:1338
__lock_acquire+0xe4/0x1f60 kernel/locking/lockdep.c:4934
lock_acquire+0x1a7/0x400 kernel/locking/lockdep.c:5668
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
__wake_up_common_lock kernel/sched/wait.c:136 [inline]
__wake_up+0xf8/0x1c0 kernel/sched/wait.c:156
__usb_hcd_giveback_urb+0x3a0/0x530 drivers/usb/core/hcd.c:1674
This is the call to usb_anchor_resume_wakeups(). The call is made after
the completion handler callback. Evidently the xpad driver deallocated
the anchor during that time window. This can happen if the driver is
just waiting for its last URB to complete before freeing all its memory.
Yes, complete() had run.
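To make the window concrete, here is a minimal userspace sketch of the ordering described above. It is a model only, not the code in drivers/usb/core/hcd.c: `model_anchor`, `model_urb`, `model_giveback` and the two completion handlers are hypothetical names, and the "free" is represented by a flag so that the example stays well defined. It also folds the freeing into the completion handler for simplicity, whereas in the report the memory is released by whoever was waiting for the last URB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct usb_anchor / struct urb. */
struct model_anchor {
	bool freed;   /* set once the owning driver data has been "freed" */
	int wakeups;  /* counts resume-wakeups calls made on a live anchor */
};

struct model_urb {
	struct model_anchor *anchor;
	void (*complete)(struct model_urb *urb);
};

/* Models a driver whose last completion lets it release all its memory,
 * including the anchor embedded in its private data. */
void model_last_urb_complete(struct model_urb *urb)
{
	urb->anchor->freed = true;
}

/* Models a completion that leaves the driver data alone. */
void model_plain_complete(struct model_urb *urb)
{
	(void)urb;
}

/* Models the tail of the giveback path: the anchor pointer is sampled
 * before the callback and used again after it. Returns true if that
 * second use would have hit freed memory. */
bool model_giveback(struct model_urb *urb)
{
	assert(urb != NULL && urb->anchor != NULL);

	struct model_anchor *anchor = urb->anchor;

	urb->complete(urb);

	if (anchor->freed)
		return true;   /* the real kernel would touch freed memory here */

	anchor->wakeups++;     /* stands in for usb_anchor_resume_wakeups() */
	return false;
}
```

With `model_last_urb_complete` as the handler, `model_giveback()` reports the stale access; with `model_plain_complete` it does not.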
I don't know what the best solution is. It may be necessary to refcount
anchors somehow.
Then we cannot embed them anymore. Many drivers would need a lot of changes.
xpad included.
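A rough userspace sketch of why refcounting rules out embedding: the anchor has to be its own allocation so that it can outlive the driver's private data. The names `rc_anchor`, `rc_anchor_alloc`, `rc_anchor_get` and `rc_anchor_put` are hypothetical, and C11 atomics stand in for what would presumably be a kref in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted anchor. It is allocated separately, so it
 * cannot be a member of the driver's private struct any more. */
struct rc_anchor {
	atomic_int refs;
};

/* Allocates an anchor holding one reference (the driver's). */
struct rc_anchor *rc_anchor_alloc(void)
{
	struct rc_anchor *a = malloc(sizeof(*a));

	if (a != NULL)
		atomic_init(&a->refs, 1);
	return a;
}

/* Takes an extra reference, e.g. for the giveback path. */
struct rc_anchor *rc_anchor_get(struct rc_anchor *a)
{
	assert(a != NULL);
	atomic_fetch_add(&a->refs, 1);
	return a;
}

/* Drops a reference. Returns 1 if this call freed the anchor, else 0. */
int rc_anchor_put(struct rc_anchor *a)
{
	assert(a != NULL);
	if (atomic_fetch_sub(&a->refs, 1) == 1) {
		free(a);
		return 1;
	}
	return 0;
}
```

Every driver that currently does `struct usb_anchor foo;` inside its device struct would have to switch to a pointer plus alloc/put calls, which is the churn I mean.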
As far as I can tell, the order in which we decrease use_count is correct. But:
6ec4147e7bdbd (Hans de Goede 2013-10-09 17:01:41 +0200 1674) usb_anchor_resume_wakeups(anchor);
94dfd7edfd5c9 (Ming Lei 2013-07-03 22:53:07 +0800 1675) atomic_dec(&urb->use_count);
Do we need to guarantee memory ordering here?
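One way to read the question, as a userspace sketch with C11 atomics: if a waiter frees memory once it sees use_count reach zero, the anchor access has to be ordered before the decrement. In the kernel, atomic_dec() on its own gives no ordering guarantee. The names `ord_urb`, `ord_giveback_tail` and `ord_urb_idle` are hypothetical, and the release/acquire pairing is an assumption about what would be needed, not a statement about what hcd.c does today.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two fields that matter here. */
struct ord_urb {
	atomic_int use_count;
	int anchor_touched;  /* plain data written before the count drops */
};

/* Giveback side: touch the anchor, then drop use_count with release
 * semantics so the earlier write cannot be reordered after the drop. */
void ord_giveback_tail(struct ord_urb *urb)
{
	assert(urb != NULL);
	urb->anchor_touched = 1;  /* stands in for usb_anchor_resume_wakeups() */
	atomic_fetch_sub_explicit(&urb->use_count, 1, memory_order_release);
}

/* Waiter side: the acquire load pairs with the release above, so a
 * waiter that sees zero also sees the anchor access as finished. */
bool ord_urb_idle(struct ord_urb *urb)
{
	assert(urb != NULL);
	return atomic_load_explicit(&urb->use_count, memory_order_acquire) == 0;
}
```

Only after `ord_urb_idle()` returns true would the waiter be allowed to free whatever contains the anchor.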
Regards
Oliver