On 2/18/22 11:53 AM, Mike Christie wrote: > On 2/17/22 3:48 AM, Stefano Garzarella wrote: >> >> On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: >>> >>> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote: >>>> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote: >>>>> >>>>> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote: >>>>>> On Thu, Feb 17, 2022 at 10:01 AM syzbot >>>>>> <syzbot+1e3ea63db39f2b4440e0@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> syzbot found the following issue on: >>>>>>> >>>>>>> HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org.. >>>>>>> git tree: upstream >>>>>>> console output: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=132e687c700000__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9YisMihn$ >>>>>>> kernel config: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9RjOhplp$ >>>>>>> dashboard link: https://urldefense.com/v3/__https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9bBf5tv0$ >>>>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 >>>>>>> >>>>>>> Unfortunately, I don't have any reproducer for this issue yet. >>>>>>> >>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>>>>>> Reported-by: syzbot+1e3ea63db39f2b4440e0@xxxxxxxxxxxxxxxxxxxxxxxxx >>>>>>> >>>>>>> WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715 >>>>>>> Modules linked in: >>>>>>> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0 >>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 >>>>>>> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715 >>>>>> >>>>>> Probably a hint that we are missing a flush. >>>>>> >>>>>> Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release(): >>>>>> >>>>>> static int vhost_vsock_stop(struct vhost_vsock *vsock) >>>>>> { >>>>>> size_t i; >>>>>> int ret; >>>>>> >>>>>> mutex_lock(&vsock->dev.mutex); >>>>>> >>>>>> ret = vhost_dev_check_owner(&vsock->dev); >>>>>> if (ret) >>>>>> goto err; >>>>>> >>>>>> Where it could fail so the device is not actually stopped. >>>>>> >>>>>> I wonder if this is something related. >>>>>> >>>>>> Thanks >>>>> >>>>> >>>>> But then if that is not the owner then no work should be running, right? >>>> >>>> Could it be a buggy user space that passes the fd to another process >>>> and changes the owner just before the mutex_lock() above? >>>> >>>> Thanks >>> >>> Maybe, but can you be a bit more explicit? what is the set of >>> conditions you see that can lead to this? >> >> I think the issue could be in the vhost_vsock_stop() as Jason mentioned, >> but not related to fd passing, but related to the do_exit() function. >> >> Looking the stack trace, we are in exit_task_work(), that is called >> after exit_mm(), so the vhost_dev_check_owner() can fail because >> current->mm should be NULL at that point. >> >> It seems the fput work is queued by fput_many() in a worker queue, and >> in some cases (maybe a lot of files opened?) the work is still queued >> when we enter in do_exit(). > It normally happens if userspace doesn't do a close() when the VM Just one clarification. I meant to say it "always" happens when userspace doesn't do a close. It doesn't have anything to do with lots of files or something like that. We are actually running the vhost device's release function from do_exit->task_work_run and so all those __fputs are done from something like qemu's context (current == that process). We are *not* hitting the case: do_exit->exit_files->put_files_struct->filp_close->fput->fput_many and then in there hitting the schedule_delayed_work path. For that the last __fput would be done from a workqueue thread and so the current pointer would point to a completely different thread. > is shutdown and instead let's the kernel's reaper code cleanup. The qemu > vhost-scsi code doesn't do a close() during shutdown and so this is our > normal code path. It also happens when something like qemu is not > gracefully shutdown like during a crash. > > So fire up qemu, start IO, then crash it or kill 9 it while IO is still > running and you can hit it. > >> >> That said, I don't know if we can simply remove that check in >> vhost_vsock_stop(), or check if current->mm is NULL, to understand if >> the process is exiting. >> > > Should the caller do the vhost_dev_check_owner or tell vhost_vsock_stop > when to check? > > - vhost_vsock_dev_ioctl always wants to check for ownership right? > > - For vhost_vsock_dev_release ownership doesn't matter because we > always want to clean up or it doesn't hurt too much. > > For the case where we just do open then close and no ioctls then > running vhost_vq_set_backend in vhost_vsock_stop is just a minor > hit of extra work. If we've done ioctls, but are now in > vhost_vsock_dev_release then we know for the graceful and ungraceful > case that nothing is going to be accessing this device in the future > and it's getting completely freed so we must completely clean it up. > > > > > > _______________________________________________ > Virtualization mailing list > Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx > https://lists.linuxfoundation.org/mailman/listinfo/virtualization