On Thu, Aug 8, 2024 at 10:56 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
> On Tue, Aug 6, 2024 at 10:45 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> >
> > On 06.08.24 10:18, Dragos Tatulea wrote:
> > > (Re-sending. I messed up the previous message, sorry about that.)
> > >
> > > On 06.08.24 04:57, Jason Wang wrote:
> > >> On Mon, Aug 5, 2024 at 11:59 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> > >>>
> > >>> On 05.08.24 05:17, Jason Wang wrote:
> > >>>> On Fri, Aug 2, 2024 at 2:51 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> > >>>>>
> > >>>>> On Fri, 2024-08-02 at 11:29 +0800, Jason Wang wrote:
> > >>>>>> On Thu, Aug 1, 2024 at 11:38 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> > >>>>>>>
> > >>>>>>> The following workflow triggers the crash referenced below:
> > >>>>>>>
> > >>>>>>> 1) vhost_vdpa_unsetup_vq_irq() unregisters the irq bypass producer
> > >>>>>>>    but the producer->token is still valid.
> > >>>>>>> 2) vq context gets released and reassigned to another vq.
> > >>>>>>
> > >>>>>> Just to make sure I understand here, which structure is referred to as
> > >>>>>> "vq context" here? I guess it's not call_ctx as it is a part of the vq
> > >>>>>> itself.
> > >>>>>>
> > >>>>>>> 3) That other vq registers its producer with the same vq context
> > >>>>>>>    pointer as token in vhost_vdpa_setup_vq_irq().
> > >>>>>>
> > >>>>>> Or did you mean when a single eventfd is shared among different vqs?
> > >>>>>>
> > >>>>> Yes, that's what I mean: vq->call_ctx.ctx, which is an eventfd_ctx.
> > >>>>>
> > >>>>> But I don't think it's shared in this case, only that the old eventfd_ctx
> > >>>>> value is lingering in producer->token. And this old eventfd_ctx is now
> > >>>>> assigned to another vq.
> > >>>>
> > >>>> Just to make sure I understand the issue: the eventfd_ctx should still be
> > >>>> valid until a new VHOST_SET_VRING_CALL().
> > >>>>
> > >>> I think it's not about the validity of the eventfd_ctx. It's more about
> > >>> the lingering ctx value of the producer after vhost_vdpa_unsetup_vq_irq().
> > >>
> > >> Probably, but
> > >>
> > >>> That value is the eventfd ctx, but it could be anything else really...
> > >>
> > >> I mean we hold a refcnt of the eventfd, so it should be valid until the
> > >> next set_vring_call() or vhost_dev_cleanup().
> > >>
> > >> But I do spot a possible issue:
> > >>
> > >> 1) We swap in the new ctx in vhost_vring_ioctl():
> > >>
> > >>         swap(ctx, vq->call_ctx.ctx);
> > >>
> > >> 2) and the old ctx will be put there as well:
> > >>
> > >>         if (!IS_ERR_OR_NULL(ctx))
> > >>                 eventfd_ctx_put(ctx);
> > >>
> > >> 3) but in vdpa, we try to unregister the producer with the new token:
> > >>
> > >> static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > >>                                    void __user *argp)
> > >> {
> > >>         ...
> > >>         r = vhost_vring_ioctl(&v->vdev, cmd, argp);
> > >>         ...
> > >>         switch (cmd) {
> > >>         ...
> > >>         case VHOST_SET_VRING_CALL:
> > >>                 if (vq->call_ctx.ctx) {
> > >>                         cb.callback = vhost_vdpa_virtqueue_cb;
> > >>                         cb.private = vq;
> > >>                         cb.trigger = vq->call_ctx.ctx;
> > >>                 } else {
> > >>                         cb.callback = NULL;
> > >>                         cb.private = NULL;
> > >>                         cb.trigger = NULL;
> > >>                 }
> > >>                 ops->set_vq_cb(vdpa, idx, &cb);
> > >>                 vhost_vdpa_setup_vq_irq(v, idx);
> > >>
> > >> In vhost_vdpa_setup_vq_irq() we have:
> > >>
> > >>         irq_bypass_unregister_producer(&vq->call_ctx.producer);
> > >>
> > >> and here producer->token still points to the old one...
> > >>
> > >> Is this what you have seen?
> > > Yup. That is the issue.
> > > The unregister already happened at vhost_vdpa_unsetup_vq_irq(). So this
> > > second unregister will work on an already unregistered element due to the
> > > token still being set.
> > >
> > >>>
> > >>>> I may be missing something, but the only way to assign exactly the same
> > >>>> eventfd_ctx value to another vq is when the guest tries to share the
> > >>>> MSI-X vector among virtqueues; then qemu will use a single eventfd as
> > >>>> the callback for multiple virtqueues. If this is true:
> > >>> I don't think this is the case. I see the issue happening when running
> > >>> qemu vdpa live migration tests on the same host. From a vdpa device's
> > >>> perspective it's basically a device starting on a VM over and over.
> > >>>
> > >>>> For bypass registering, only the first registering can succeed, as the
> > >>>> following registering will fail because the irq bypass manager already
> > >>>> has exactly the same producer token.
> > >>>> For unregistering, all unregister calls can succeed:
> > >>>>
> > >>>> 1) the first unregister call will do the real job and unregister the token
> > >>>> 2) the following unregister calls will do nothing: they iterate the
> > >>>>    producer token list without finding a matching one
> > >>>>
> > >>>> Maybe you can show me the userspace behaviour (ioctls) when you see this?
> > >>>>
> > >>> Sure, what would you need? qemu traces?
> > >>
> > >> Yes, that would be helpful.
> > >>
> > > Will try to get them.
> >
> > As the traces are quite large (~5MB), I uploaded them to this location [0].
> > I used the following qemu traces:
> >     --trace vhost_vdpa* --trace virtio_net_handle*
> >
> > [0] https://drive.google.com/file/d/1XyXYyockJ_O7zMgI7vot6AxYjze9Ljju/view?usp=sharing
>
> Thanks for doing this.
>
> So it doesn't look like a case of eventfd sharing:
>
> """
> 153@1722953531.918958:vhost_vdpa_iotlb_begin_batch vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 type: 5
> 153@1722953531.918959:vhost_vdpa_set_vring_base dev: 0x55573cc9ca70 index: 6 num: 0 svq 1
> 153@1722953531.918961:vhost_vdpa_set_vring_kick dev: 0x55573cc9ca70 index: 6 fd: 237
> 153@1722953531.918964:vhost_vdpa_set_vring_call dev: 0x55573cc9ca70 index: 6 fd: 238
> 153@1722953531.918978:vhost_vdpa_dma_map vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 asid: 1 iova: 0x13000 size: 0x2000 uaddr: 0x7f6f9da1a000 perm: 0x1 type: 2
> 153@1722953531.918984:vhost_vdpa_dma_map vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 asid: 1 iova: 0x15000 size: 0x1000 uaddr: 0x7f6f9da19000 perm: 0x3 type: 2
> 153@1722953531.918987:vhost_vdpa_set_vring_addr dev: 0x55573cc9ca70 index: 6 flags: 0x0 desc_user_addr: 0x13000 used_user_addr: 0x15000 avail_user_addr: 0x14000 log_guest_addr: 0x0
> 153@1722953531.918989:vhost_vdpa_set_vring_base dev: 0x55573cc9ca70 index: 7 num: 0 svq 1
> 153@1722953531.918991:vhost_vdpa_set_vring_kick dev: 0x55573cc9ca70 index: 7 fd: 239
> 153@1722953531.918993:vhost_vdpa_set_vring_call dev: 0x55573cc9ca70 index: 7 fd: 240
> """
>
> I think a more proper way is to unregister and clean the token before
> calling vhost_vring_ioctl() in the case of VHOST_SET_VRING_CALL. Let me
> try to draft a patch and see.

I've posted an RFC patch, please try it and see if it works.

Thanks

>
> Thanks
>
> > Thanks,
> > Dragos
> >
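
For readers following the thread, a rough sketch of the idea discussed above
(unregister and clear the producer token before vhost_vring_ioctl() swaps in
the new call eventfd) could look like the fragment below. This is only an
illustration of the approach under discussion, not the posted RFC patch: the
idx parsing, vq lookup and error handling of the real vhost_vdpa_vring_ioctl()
are elided, and the exact placement of the new hunk is an assumption.

static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
                                   void __user *argp)
{
        struct vhost_virtqueue *vq;
        u32 idx;
        long r;

        /* ... idx parsing and vq = &v->vqs[idx] lookup as in the existing
         * function ... */

        if (cmd == VHOST_SET_VRING_CALL && vq->call_ctx.ctx) {
                /*
                 * Unregister while producer->token still matches the old
                 * eventfd_ctx, i.e. before vhost_vring_ioctl() swaps in the
                 * new ctx and puts the old one.
                 */
                vhost_vdpa_unsetup_vq_irq(v, idx);
                /*
                 * Forget the old token so a later unregister cannot match a
                 * recycled eventfd_ctx pointer that meanwhile belongs to a
                 * different vq.
                 */
                vq->call_ctx.producer.token = NULL;
        }

        r = vhost_vring_ioctl(&v->vdev, cmd, argp);

        /*
         * The existing VHOST_SET_VRING_CALL handling after this point
         * (ops->set_vq_cb() followed by vhost_vdpa_setup_vq_irq()) then
         * re-registers the producer with the freshly swapped-in
         * vq->call_ctx.ctx as its token.
         */
        /* ... */
        return r;
}

With this ordering, the producer's token lifetime is tied to
VHOST_SET_VRING_CALL itself rather than to whenever vhost_vdpa_setup_vq_irq()
happens to run, which is the property the thread converges on.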