On Wed, Oct 02, 2024 at 07:50:41AM GMT, Michael S. Tsirkin wrote:
On Wed, Oct 02, 2024 at 11:34:48AM +0200, Christian Brauner wrote:
Hey,
I'm seeing the following splat on v6.12-rc1 in case this hasn't been
reported yet:
@Christian, thanks for the report. Any suggestions on how to reproduce it?
[ 33.662603] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:330
[ 33.667624] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 772, name: incus-agent
[ 33.671430] preempt_count: 0, expected: 0
[ 33.674664] RCU nest depth: 1, expected: 0
[ 33.676600] 3 locks held by incus-agent/772:
[ 33.678184] #0: ffff8881d1b38258 (sk_lock-AF_VSOCK){+.+.}-{0:0}, at: vsock_connectible_sendmsg+0x142/0x11c0 [vsock]
[ 33.681712] #1: ffffffff9b6f3420 (rcu_read_lock){....}-{1:2}, at: virtio_transport_send_pkt+0x7c/0x520 [vmw_vsock_virtio_transport]
[ 33.685627] #2: ffff88818ca2d160 (&vsock->tx_lock){+.+.}-{3:3}, at: virtio_transport_send_pkt+0x43a/0x520 [vmw_vsock_virtio_transport]
[ 33.688906] CPU: 23 UID: 0 PID: 772 Comm: incus-agent Not tainted 6.12.0-rc1-g9852d85ec9d4 #23
[ 33.690849] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/Incus, BIOS unknown 2/2/2022
[ 33.692746] Call Trace:
[ 33.693337] <TASK>
[ 33.693846] dump_stack_lvl+0x6f/0xa0
[ 33.694740] __might_resched.cold+0x1ed/0x253
[ 33.696031] ? perf_trace_sched_switch+0x6f0/0x6f0
[ 33.697143] ? virtqueue_add_split+0x9b2/0x1f50
[ 33.698120] __kmalloc_noprof+0x313/0x440
[ 33.698942] ? __lock_acquire+0xcf6/0x4790
[ 33.699861] virtqueue_add_split+0x9b2/0x1f50
[ 33.700864] ? find_held_lock+0x2d/0x110
[ 33.701691] ? reacquire_held_locks+0x4d0/0x4d0
[ 33.702646] ? virtqueue_kick+0xe0/0xe0
[ 33.703462] ? __virt_addr_valid+0x155/0x440
[ 33.709313] ? lock_acquire+0x31/0xc0
[ 33.710078] ? __virt_addr_valid+0x155/0x440
[ 33.710981] virtqueue_add_sgs+0x10c/0x140
[ 33.711891] virtio_transport_send_skb+0x102/0x560 [vmw_vsock_virtio_transport]
This passes GFP_KERNEL to virtqueue_add_sgs().
[ 33.713480] virtio_transport_send_pkt+0x450/0x520 [vmw_vsock_virtio_transport]
[ 33.713499] virtio_transport_send_pkt_info+0x64f/0x1050 [vmw_vsock_virtio_transport_common]
This calls a mutex-taking function under the RCU read lock; see the annotated call chain after the trace.
Plus plus ungood.
[ 33.713525] virtio_transport_stream_enqueue+0xaa/0xe0 [vmw_vsock_virtio_transport_common]
[ 33.713534] ? virtio_transport_shutdown+0xf0/0xf0 [vmw_vsock_virtio_transport_common]
[ 33.713548] ? _raw_spin_unlock_irqrestore+0x53/0x80
[ 33.724011] ? lockdep_hardirqs_on+0x78/0x100
[ 33.724948] ? _raw_spin_unlock_irqrestore+0x40/0x80
[ 33.726954] vsock_connectible_sendmsg+0xc9d/0x11c0 [vsock]
[ 33.728983] ? vsock_ioctl+0x240/0x240 [vsock]
[ 33.730554] ? __init_waitqueue_head+0x150/0x150
[ 33.732409] ? __lock_acquire+0xcf6/0x4790
[ 33.733833] ? vsock_ioctl+0x240/0x240 [vsock]
[ 33.734976] __sock_sendmsg+0xc1/0x150
[ 33.736048] sock_write_iter+0x21e/0x380
[ 33.737055] ? __sock_sendmsg+0x150/0x150
[ 33.738179] ? lockdep_hardirqs_on_prepare+0x410/0x410
[ 33.742393] ? security_file_permission+0x24/0x70
[ 33.742437] ? rw_verify_area+0x70/0x5e0
[ 33.746265] vfs_write+0xa4e/0xe40
[ 33.746291] ? reacquire_held_locks+0x4d0/0x4d0
[ 33.746303] ? trace_lock_acquire+0x141/0x1c0
[ 33.746317] ? kernel_write+0x630/0x630
[ 33.746342] ? __fget_files+0x30/0x350
[ 33.757003] ksys_write+0x178/0x1d0
[ 33.757027] ? __x64_sys_read+0xb0/0xb0
[ 33.757044] ? syscall_trace_enter+0x118/0x1e0
[ 33.757091] do_syscall_64+0x74/0x190
[ 33.763371] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 33.763390] RIP: 0033:0x40720e
[ 33.763411] Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[ 33.763418] RSP: 002b:000000c00021f508 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 33.763441] RAX: ffffffffffffffda RBX: 000000000000000f RCX: 000000000040720e
[ 33.763446] RDX: 00000000000003b5 RSI: 000000c00014e000 RDI: 000000000000000f
[ 33.763451] RBP: 000000c00021f548 R08: 0000000000000000 R09: 0000000000000000
[ 33.763455] R10: 0000000000000000 R11: 0000000000000202 R12: 000000c00021f688
[ 33.763460] R13: 0000000924924924 R14: 000000c0001d0820 R15: 0000000000000011
[ 33.763566] </TASK>
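For reference, the offending path, reconstructed from the trace and the
held-locks list above (simplified, annotations mine), is roughly:

  vsock_connectible_sendmsg()
    virtio_transport_stream_enqueue()
      virtio_transport_send_pkt_info()
        virtio_transport_send_pkt()
          rcu_read_lock()
          mutex_lock(&vsock->tx_lock)            <- may sleep, under RCU
          virtio_transport_send_skb()
            virtqueue_add_sgs(..., GFP_KERNEL)   <- may sleep, under RCU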
Caused by this:
commit efcd71af38be403fa52223092f79ada446e121ba (tag: for_linus, tag: kernel.org/tags/for_linus)
Author: Luigi Leonardi <luigi.leonardi@xxxxxxxxxxx>
Date: Tue Jul 30 21:47:32 2024 +0200
vsock/virtio: avoid queuing packets when intermediate queue is empty
When the driver needs to send new packets to the device, it always
Yep, I agree, thanks for the analysis!
I guess this is exactly the reason we have an intermediate queue.
Not sure, IIRC we added the RCU later to fix some hotplug/unplug bug.
RCU is tricky, so I am inclined to revert, but if there is
a convincing patch addressing this, I might reconsider.
What about using GFP_ATOMIC when calling virtqueue_add_sgs()?
We already use it in several drivers, including virtio-fs
(fs/fuse/virtio_fs.c), where virtio_fs_send_req() does something
similar to what we did in that commit.
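A minimal sketch of what I mean, assuming the allocation in
virtio_transport_send_skb() still looks like it does in -rc1 (untested,
the exact context line may differ):

-	ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL);
+	/* This can now be reached from virtio_transport_send_pkt() inside
+	 * an RCU read-side critical section, so the allocation must not
+	 * sleep.
+	 */
+	ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_ATOMIC);

Of course GFP_ATOMIC can fail under memory pressure where GFP_KERNEL
would have slept, so the caller would need to handle that failure
(e.g. by falling back to the existing worker path).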
I checked other places where we can sleep, and the only other one could
be:
virtio_transport_deliver_tap_pkt()
  vsock_deliver_tap()
    virtio_transport_build_skb()
      alloc_skb()
      ^
      this is already called with GFP_ATOMIC, and vsock_deliver_tap()
      also calls it under another RCU section.
WDYT?
Thanks,
Stefano