On Tue, Aug 6, 2024 at 10:45 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
>
> On 06.08.24 10:18, Dragos Tatulea wrote:
> > (Re-sending. I messed up the previous message, sorry about that.)
> >
> > On 06.08.24 04:57, Jason Wang wrote:
> >> On Mon, Aug 5, 2024 at 11:59 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> >>>
> >>> On 05.08.24 05:17, Jason Wang wrote:
> >>>> On Fri, Aug 2, 2024 at 2:51 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> >>>>>
> >>>>> On Fri, 2024-08-02 at 11:29 +0800, Jason Wang wrote:
> >>>>>> On Thu, Aug 1, 2024 at 11:38 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> The following workflow triggers the crash referenced below:
> >>>>>>>
> >>>>>>> 1) vhost_vdpa_unsetup_vq_irq() unregisters the irq bypass producer
> >>>>>>>    but the producer->token is still valid.
> >>>>>>> 2) vq context gets released and reassigned to another vq.
> >>>>>>
> >>>>>> Just to make sure I understand here, which structure is referred to as
> >>>>>> "vq context" here? I guess it's not call_ctx as it is a part of the vq
> >>>>>> itself.
> >>>>>>
> >>>>>>> 3) That other vq registers its producer with the same vq context
> >>>>>>>    pointer as token in vhost_vdpa_setup_vq_irq().
> >>>>>>
> >>>>>> Or did you mean when a single eventfd is shared among different vqs?
> >>>>>>
> >>>>> Yes, that's what I mean: vq->call_ctx.ctx, which is an eventfd_ctx.
> >>>>>
> >>>>> But I don't think it's shared in this case, only that the old eventfd_ctx value
> >>>>> is lingering in producer->token. And this old eventfd_ctx is now assigned to
> >>>>> another vq.
> >>>>
> >>>> Just to make sure I understand the issue: the eventfd_ctx should still be
> >>>> valid until a new VHOST_SET_VRING_CALL().
> >>>>
> >>> I think it's not about the validity of the eventfd_ctx. It's more about
> >>> the lingering ctx value of the producer after vhost_vdpa_unsetup_vq_irq().
> >>
> >> Probably, but
> >>
> >>> That value is the eventfd ctx, but it could be anything else really...
> >>
> >> I mean we hold a refcnt of the eventfd so it should be valid until the
> >> next set_vring_call() or vhost_dev_cleanup().
> >>
> >> But I do spot a possible issue:
> >>
> >> 1) We swap in and assign the new ctx in vhost_vring_ioctl():
> >>
> >>         swap(ctx, vq->call_ctx.ctx);
> >>
> >> 2) and the old ctx is put there as well:
> >>
> >>         if (!IS_ERR_OR_NULL(ctx))
> >>                 eventfd_ctx_put(ctx);
> >>
> >> 3) but in vdpa, we try to unregister the producer with the new token:
> >>
> >> static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> >>                                    void __user *argp)
> >> {
> >> ...
> >>         r = vhost_vring_ioctl(&v->vdev, cmd, argp);
> >> ...
> >>         switch (cmd) {
> >> ...
> >>         case VHOST_SET_VRING_CALL:
> >>                 if (vq->call_ctx.ctx) {
> >>                         cb.callback = vhost_vdpa_virtqueue_cb;
> >>                         cb.private = vq;
> >>                         cb.trigger = vq->call_ctx.ctx;
> >>                 } else {
> >>                         cb.callback = NULL;
> >>                         cb.private = NULL;
> >>                         cb.trigger = NULL;
> >>                 }
> >>                 ops->set_vq_cb(vdpa, idx, &cb);
> >>                 vhost_vdpa_setup_vq_irq(v, idx);
> >>
> >> In vhost_vdpa_setup_vq_irq() we have:
> >>
> >>         irq_bypass_unregister_producer(&vq->call_ctx.producer);
> >>
> >> and here producer->token still points to the old one...
> >>
> >> Is this what you have seen?
> > Yup. That is the issue. The unregister already happened at
> > vhost_vdpa_unsetup_vq_irq(). So this second unregister will
> > work on an already unregistered element due to the token still
> > being set.
> >
> >>
> >>>
> >>>> I may miss something, but the only way to assign exactly the same
> >>>> eventfd_ctx value to another vq is when the guest tries to share an
> >>>> MSI-X vector among virtqueues; then qemu will use a single eventfd as
> >>>> the callback for multiple virtqueues. If this is true:
> >>>>
> >>> I don't think this is the case. I see the issue happening when running qemu vdpa
> >>> live migration tests on the same host. From the vdpa device's perspective it's
> >>> basically a device starting in a VM over and over.
> >>>
> >>>> For bypass registering, only the first registering can succeed, as the
> >>>> following registering will fail because the irq bypass manager already
> >>>> has exactly the same producer token.
> >>>> For unregistering, all unregistering attempts can succeed:
> >>>>
> >>>> 1) the first unregistering does the real job of unregistering the token
> >>>> 2) the following ones do nothing: they iterate the producer token list
> >>>>    without finding a matching one
> >>>>
> >>>> Maybe you can show me the userspace behaviour (ioctls) when you see this?
> >>>>
> >>> Sure, what would you need? qemu traces?
> >>
> >> Yes, that would be helpful.
> >>
> > Will try to get them.
> As the traces are quite large (~5MB), I uploaded them to this location [0].
> I used the following qemu traces:
> --trace vhost_vdpa* --trace virtio_net_handle*
>
> [0] https://drive.google.com/file/d/1XyXYyockJ_O7zMgI7vot6AxYjze9Ljju/view?usp=sharing

Thanks for doing this.

So it does not look like a case of eventfd sharing:

"""
153@1722953531.918958:vhost_vdpa_iotlb_begin_batch vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 type: 5
153@1722953531.918959:vhost_vdpa_set_vring_base dev: 0x55573cc9ca70 index: 6 num: 0 svq 1
153@1722953531.918961:vhost_vdpa_set_vring_kick dev: 0x55573cc9ca70 index: 6 fd: 237
153@1722953531.918964:vhost_vdpa_set_vring_call dev: 0x55573cc9ca70 index: 6 fd: 238
153@1722953531.918978:vhost_vdpa_dma_map vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 asid: 1 iova: 0x13000 size: 0x2000 uaddr: 0x7f6f9da1a000 perm: 0x1 type: 2
153@1722953531.918984:vhost_vdpa_dma_map vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 asid: 1 iova: 0x15000 size: 0x1000 uaddr: 0x7f6f9da19000 perm: 0x3 type: 2
153@1722953531.918987:vhost_vdpa_set_vring_addr dev: 0x55573cc9ca70 index: 6 flags: 0x0 desc_user_addr: 0x13000 used_user_addr: 0x15000 avail_user_addr: 0x14000 log_guest_addr: 0x0
153@1722953531.918989:vhost_vdpa_set_vring_base dev: 0x55573cc9ca70 index: 7 num: 0 svq 1
153@1722953531.918991:vhost_vdpa_set_vring_kick dev: 0x55573cc9ca70 index: 7 fd: 239
153@1722953531.918993:vhost_vdpa_set_vring_call dev: 0x55573cc9ca70 index: 7 fd: 240
"""

Each vq gets its own call eventfd here (fd 238 for index 6, fd 240 for index 7),
so the vqs are not sharing an eventfd.

I think a more proper way is to unregister and clear the token before calling
vhost_vring_ioctl() in the case of VHOST_SET_VRING_CALL. Let me try to draft a
patch and see.

Thanks

>
> Thanks,
> Dragos
>
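[Editor's note] For reference, a minimal sketch of the approach described above: drop
the producer while producer->token still carries the old eventfd_ctx, before
vhost_vring_ioctl() swaps in the new one. This is only an illustration of the idea, not
the actual patch; the exact placement inside vhost_vdpa_vring_ioctl(), the vq/idx lookup
shown, and the omitted error handling are assumptions.

static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
                                   void __user *argp)
{
        ...
        /* idx is assumed to have been validated earlier in the function. */
        struct vhost_virtqueue *vq = &v->vqs[idx];

        if (cmd == VHOST_SET_VRING_CALL) {
                /*
                 * Unregister while producer->token still holds the old
                 * eventfd_ctx, then clear it so that the unregister done
                 * later in vhost_vdpa_setup_vq_irq() cannot act on a stale
                 * token that may meanwhile belong to another vq.
                 */
                irq_bypass_unregister_producer(&vq->call_ctx.producer);
                vq->call_ctx.producer.token = NULL;
        }

        /* vhost core swaps in the new eventfd_ctx here. */
        r = vhost_vring_ioctl(&v->vdev, cmd, argp);
        ...
}

With the token cleared before the swap, the later unregister in
vhost_vdpa_setup_vq_irq() no longer has a stale eventfd_ctx to match against.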