On Thu, Aug 8, 2024 at 10:56 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
> On Tue, Aug 6, 2024 at 10:45 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> >
> > On 06.08.24 10:18, Dragos Tatulea wrote:
> > > (Re-sending. I messed up the previous message, sorry about that.)
> > >
> > > On 06.08.24 04:57, Jason Wang wrote:
> > >> On Mon, Aug 5, 2024 at 11:59 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> > >>>
> > >>> On 05.08.24 05:17, Jason Wang wrote:
> > >>>> On Fri, Aug 2, 2024 at 2:51 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> > >>>>>
> > >>>>> On Fri, 2024-08-02 at 11:29 +0800, Jason Wang wrote:
> > >>>>>> On Thu, Aug 1, 2024 at 11:38 PM Dragos Tatulea <dtatulea@xxxxxxxxxx> wrote:
> > >>>>>>>
> > >>>>>>> The following workflow triggers the crash referenced below:
> > >>>>>>>
> > >>>>>>> 1) vhost_vdpa_unsetup_vq_irq() unregisters the irq bypass producer
> > >>>>>>>    but the producer->token is still valid.
> > >>>>>>> 2) vq context gets released and reassigned to another vq.
> > >>>>>>
> > >>>>>> Just to make sure I understand here, which structure is referred to as
> > >>>>>> "vq context" here? I guess it's not call_ctx as it is a part of the vq
> > >>>>>> itself.
> > >>>>>>
> > >>>>>>> 3) That other vq registers its producer with the same vq context
> > >>>>>>>    pointer as token in vhost_vdpa_setup_vq_irq().
> > >>>>>>
> > >>>>>> Or did you mean when a single eventfd is shared among different vqs?
> > >>>>>>
> > >>>>> Yes, that's what I mean: vq->call_ctx.ctx, which is an eventfd_ctx.
> > >>>>>
> > >>>>> But I don't think it's shared in this case, only that the old eventfd_ctx
> > >>>>> value is lingering in producer->token. And this old eventfd_ctx is now
> > >>>>> assigned to another vq.
> > >>>>
> > >>>> Just to make sure I understand the issue: the eventfd_ctx should still be
> > >>>> valid until a new VHOST_SET_VRING_CALL().
> > >>>>
> > >>> I think it's not about the validity of the eventfd_ctx. It's more about
> > >>> the lingering ctx value of the producer after vhost_vdpa_unsetup_vq_irq().
> > >>
> > >> Probably, but
> > >>
> > >>> That value is the eventfd ctx, but it could be anything else really...
> > >>
> > >> I mean we hold a refcnt of the eventfd, so it should be valid until the
> > >> next set_vring_call() or vhost_dev_cleanup().
> > >>
> > >> But I do spot a possible issue:
> > >>
> > >> 1) We swap in the new ctx in vhost_vring_ioctl():
> > >>
> > >>         swap(ctx, vq->call_ctx.ctx);
> > >>
> > >> 2) and the old ctx will be put there as well:
> > >>
> > >>         if (!IS_ERR_OR_NULL(ctx))
> > >>                 eventfd_ctx_put(ctx);
> > >>
> > >> 3) but in vdpa, we try to unregister the producer with the new token:
> > >>
> > >> static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > >>                                    void __user *argp)
> > >> {
> > >>         ...
> > >>         r = vhost_vring_ioctl(&v->vdev, cmd, argp);
> > >>         ...
> > >>         switch (cmd) {
> > >>         ...
> > >>         case VHOST_SET_VRING_CALL:
> > >>                 if (vq->call_ctx.ctx) {
> > >>                         cb.callback = vhost_vdpa_virtqueue_cb;
> > >>                         cb.private = vq;
> > >>                         cb.trigger = vq->call_ctx.ctx;
> > >>                 } else {
> > >>                         cb.callback = NULL;
> > >>                         cb.private = NULL;
> > >>                         cb.trigger = NULL;
> > >>                 }
> > >>                 ops->set_vq_cb(vdpa, idx, &cb);
> > >>                 vhost_vdpa_setup_vq_irq(v, idx);
> > >>
> > >> In vhost_vdpa_setup_vq_irq() we have:
> > >>
> > >>         irq_bypass_unregister_producer(&vq->call_ctx.producer);
> > >>
> > >> and here producer->token still points to the old one...
> > >>
> > >> Is this what you have seen?
> > > Yup. That is the issue.
> > > The unregister already happened at vhost_vdpa_unsetup_vq_irq(). So this
> > > second unregister will work on an already unregistered element due to the
> > > token still being set.
> > >
> > >>>
> > >>>> I may be missing something, but the only way to assign exactly the same
> > >>>> eventfd_ctx value to another vq is when the guest tries to share the
> > >>>> MSI-X vector among virtqueues; then qemu will use a single eventfd as
> > >>>> the callback for multiple virtqueues. If this is true:
> > >>> I don't think this is the case. I see the issue happening when running
> > >>> qemu vdpa live migration tests on the same host. From a vdpa device's
> > >>> perspective it's basically a device starting on a VM over and over.
> > >>>
> > >>>> For bypass registering, only the first registering can succeed, as the
> > >>>> following registering will fail because the irq bypass manager already
> > >>>> has exactly the same producer token.
> > >>>> For unregistering, all unregister calls can succeed:
> > >>>>
> > >>>> 1) the first unregister call will do the real job and unregister the token
> > >>>> 2) the following unregister calls will do nothing: they iterate the
> > >>>>    producer token list without finding a matching one
> > >>>>
> > >>>> Maybe you can show me the userspace behaviour (ioctls) when you see this?
> > >>>>
> > >>> Sure, what would you need? qemu traces?
> > >>
> > >> Yes, that would be helpful.
> > >>
> > > Will try to get them.
> >
> > As the traces are quite large (~5MB), I uploaded them to this location [0].
> > I used the following qemu traces:
> >     --trace vhost_vdpa* --trace virtio_net_handle*
> >
> > [0] https://drive.google.com/file/d/1XyXYyockJ_O7zMgI7vot6AxYjze9Ljju/view?usp=sharing
>
> Thanks for doing this.
>
> So it doesn't look like a case of eventfd sharing:
>
> """
> 153@1722953531.918958:vhost_vdpa_iotlb_begin_batch vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 type: 5
> 153@1722953531.918959:vhost_vdpa_set_vring_base dev: 0x55573cc9ca70 index: 6 num: 0 svq 1
> 153@1722953531.918961:vhost_vdpa_set_vring_kick dev: 0x55573cc9ca70 index: 6 fd: 237
> 153@1722953531.918964:vhost_vdpa_set_vring_call dev: 0x55573cc9ca70 index: 6 fd: 238
> 153@1722953531.918978:vhost_vdpa_dma_map vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 asid: 1 iova: 0x13000 size: 0x2000 uaddr: 0x7f6f9da1a000 perm: 0x1 type: 2
> 153@1722953531.918984:vhost_vdpa_dma_map vdpa:0x7f6f9cfb5190 fd: 17 msg_type: 2 asid: 1 iova: 0x15000 size: 0x1000 uaddr: 0x7f6f9da19000 perm: 0x3 type: 2
> 153@1722953531.918987:vhost_vdpa_set_vring_addr dev: 0x55573cc9ca70 index: 6 flags: 0x0 desc_user_addr: 0x13000 used_user_addr: 0x15000 avail_user_addr: 0x14000 log_guest_addr: 0x0
> 153@1722953531.918989:vhost_vdpa_set_vring_base dev: 0x55573cc9ca70 index: 7 num: 0 svq 1
> 153@1722953531.918991:vhost_vdpa_set_vring_kick dev: 0x55573cc9ca70 index: 7 fd: 239
> 153@1722953531.918993:vhost_vdpa_set_vring_call dev: 0x55573cc9ca70 index: 7 fd: 240
> """
>
> I think a more proper way is to unregister and clean the token before
> calling vhost_vring_ioctl() in the case of VHOST_SET_VRING_CALL. Let me
> try to draft a patch and see.

I've posted an RFC patch, please try it and see if it works.

Thanks

>
> Thanks
>
> > Thanks,
> > Dragos
> >
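
For readers following the thread, a rough sketch of the idea discussed above
(unregister and clear the producer token before vhost_vring_ioctl() swaps in
the new call eventfd) could look like the fragment below. This is only an
illustration of the approach under discussion, not the posted RFC patch: the
idx parsing, vq lookup and error handling of the real vhost_vdpa_vring_ioctl()
are elided, and the exact placement of the new hunk is an assumption.

static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
                                   void __user *argp)
{
        struct vhost_virtqueue *vq;
        u32 idx;
        long r;

        /* ... idx parsing and vq = &v->vqs[idx] lookup as in the existing
         * function ... */

        if (cmd == VHOST_SET_VRING_CALL && vq->call_ctx.ctx) {
                /*
                 * Unregister while producer->token still matches the old
                 * eventfd_ctx, i.e. before vhost_vring_ioctl() swaps in the
                 * new ctx and puts the old one.
                 */
                vhost_vdpa_unsetup_vq_irq(v, idx);
                /*
                 * Forget the old token so a later unregister cannot match a
                 * recycled eventfd_ctx pointer that meanwhile belongs to a
                 * different vq.
                 */
                vq->call_ctx.producer.token = NULL;
        }

        r = vhost_vring_ioctl(&v->vdev, cmd, argp);

        /*
         * The existing VHOST_SET_VRING_CALL handling after this point
         * (ops->set_vq_cb() followed by vhost_vdpa_setup_vq_irq()) then
         * re-registers the producer with the freshly swapped-in
         * vq->call_ctx.ctx as its token.
         */
        /* ... */
        return r;
}

With this ordering, the producer's token lifetime is tied to
VHOST_SET_VRING_CALL itself rather than to whenever vhost_vdpa_setup_vq_irq()
happens to run, which is the property the thread converges on.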