On 2/17/22 3:48 AM, Stefano Garzarella wrote:
>
> On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>>
>> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
>>> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>>>>
>>>> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
>>>>> On Thu, Feb 17, 2022 at 10:01 AM syzbot
>>>>> <syzbot+1e3ea63db39f2b4440e0@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following issue on:
>>>>>>
>>>>>> HEAD commit:    c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
>>>>>> git tree:       upstream
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
>>>>>> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>>
>>>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>>>
>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>> Reported-by: syzbot+1e3ea63db39f2b4440e0@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>>>>
>>>>>> WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>> Modules linked in:
>>>>>> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>>> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>
>>>>> Probably a hint that we are missing a flush.
>>>>>
>>>>> Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
>>>>>
>>>>> static int vhost_vsock_stop(struct vhost_vsock *vsock)
>>>>> {
>>>>>         size_t i;
>>>>>         int ret;
>>>>>
>>>>>         mutex_lock(&vsock->dev.mutex);
>>>>>
>>>>>         ret = vhost_dev_check_owner(&vsock->dev);
>>>>>         if (ret)
>>>>>                 goto err;
>>>>>
>>>>> Where it could fail, so the device is not actually stopped.
>>>>>
>>>>> I wonder if this is something related.
>>>>>
>>>>> Thanks
>>>>
>>>>
>>>> But then if that is not the owner then no work should be running, right?
>>>
>>> Could it be a buggy user space that passes the fd to another process
>>> and changes the owner just before the mutex_lock() above?
>>>
>>> Thanks
>>
>> Maybe, but can you be a bit more explicit? What is the set of
>> conditions you see that can lead to this?
>
> I think the issue could be in vhost_vsock_stop() as Jason mentioned,
> but not related to fd passing; rather it's related to the do_exit()
> function.
>
> Looking at the stack trace, we are in exit_task_work(), which is called
> after exit_mm(), so vhost_dev_check_owner() can fail because
> current->mm should be NULL at that point.
>
> It seems the fput work is queued by fput_many() in a worker queue, and
> in some cases (maybe a lot of files opened?) the work is still queued
> when we enter do_exit().

It normally happens if userspace doesn't do a close() when the VM is
shut down and instead lets the kernel's reaper code clean up.
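For reference, the check that fails is just a compare against
current->mm (roughly the helper in drivers/vhost/vhost.c; exact
comments may differ by kernel version):

        /* Caller should have device mutex */
        long vhost_dev_check_owner(struct vhost_dev *dev)
        {
                /* Are you the owner? If not, I don't think you mean to do that */
                return dev->mm == current->mm ? 0 : -EPERM;
        }

Once exit_mm() has run, current->mm is NULL, so this returns -EPERM
even for the task that owns the device, and vhost_vsock_stop() bails
out before clearing the vq backends.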
The qemu vhost-scsi code doesn't do a close() during shutdown, so this
is our normal code path. It also happens when something like qemu is
not gracefully shut down, like during a crash. So fire up qemu, start
IO, then crash it or kill -9 it while IO is still running and you can
hit it.

>
> That said, I don't know if we can simply remove that check in
> vhost_vsock_stop(), or check if current->mm is NULL, to understand if
> the process is exiting.
>

Should the caller do the vhost_dev_check_owner(), or should we tell
vhost_vsock_stop() when to check?

- vhost_vsock_dev_ioctl() always wants to check for ownership, right?

- For vhost_vsock_dev_release() ownership doesn't matter, because we
  always want to clean up, or at worst it doesn't hurt much. For the
  case where we just do an open then a close with no ioctls, running
  vhost_vq_set_backend() in vhost_vsock_stop() is just a minor hit of
  extra work. If we've done ioctls but are now in
  vhost_vsock_dev_release(), then we know, for both the graceful and
  ungraceful cases, that nothing is going to access this device in the
  future and it's getting completely freed, so we must completely
  clean it up. One way to split this is sketched below.
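A minimal sketch of that split, assuming we add a check_owner flag
(the flag name and this exact shape are my guess, not a tested patch):

        /* Hypothetical sketch: the caller decides whether ownership must be
         * validated. ioctl callers would pass check_owner = true; the release
         * path, which can run from task_work after exit_mm() (so current->mm
         * is NULL), would pass false so the vqs always get stopped. */
        static int vhost_vsock_stop(struct vhost_vsock *vsock, bool check_owner)
        {
                size_t i;
                int ret = 0;

                mutex_lock(&vsock->dev.mutex);

                if (check_owner) {
                        ret = vhost_dev_check_owner(&vsock->dev);
                        if (ret)
                                goto err;
                }

                for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) {
                        struct vhost_virtqueue *vq = &vsock->vqs[i];

                        mutex_lock(&vq->mutex);
                        vhost_vq_set_backend(vq, NULL);
                        mutex_unlock(&vq->mutex);
                }

        err:
                mutex_unlock(&vsock->dev.mutex);
                return ret;
        }

vhost_vsock_dev_release() would then call vhost_vsock_stop(vsock,
false), while the VHOST_VSOCK_SET_RUNNING ioctl path keeps passing
true.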