Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND

Jason Wang <jasowang@xxxxxxxxxx> · Fri, 24 Feb 2023 11:16:42 +0800

On Thu, Feb 23, 2023 at 7:07 PM Eugenio Perez Martin
<eperezma@xxxxxxxxxx> wrote:
>
> On Thu, Feb 23, 2023 at 3:38 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >
> >
> > 在 2023/2/22 22:25, Eugenio Perez Martin 写道:
> > > On Wed, Feb 22, 2023 at 5:05 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > >>
> > >> 在 2023/2/8 17:42, Eugenio Pérez 写道:
> > >>> Next patches enable devices to be migrated even if vdpa netdev has not
> > >>> been started with x-svq. However, not all devices are migratable, so we
> > >>> need to block migration if we detect that.
> > >>>
> > >>> Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
> > >>> has not been started with x-svq.
> > >>>
> > >>> Signed-off-by: Eugenio Pérez <eperezma@xxxxxxxxxx>
> > >>> ---
> > >>>    hw/virtio/vhost-vdpa.c | 21 +++++++++++++++++++++
> > >>>    1 file changed, 21 insertions(+)
> > >>>
> > >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >>> index 84a6b9690b..9d30cf9b3c 100644
> > >>> --- a/hw/virtio/vhost-vdpa.c
> > >>> +++ b/hw/virtio/vhost-vdpa.c
> > >>> @@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
> > >>>            return 0;
> > >>>        }
> > >>>
> > >>> +    /*
> > >>> +     * If dev->shadow_vqs_enabled at initialization that means the device has
> > >>> +     * been started with x-svq=on, so don't block migration
> > >>> +     */
> > >>> +    if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
> > >>> +        uint64_t backend_features;
> > >>> +
> > >>> +        /* We don't have dev->backend_features yet */
> > >>> +        ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
> > >>> +                              &backend_features);
> > >>> +        if (unlikely(ret)) {
> > >>> +            error_setg_errno(errp, -ret, "Could not get backend features");
> > >>> +            return ret;
> > >>> +        }
> > >>> +
> > >>> +        if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
> > >>> +            error_setg(&dev->migration_blocker,
> > >>> +                "vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
> > >>> +        }
> > >>
> > >> I wonder why not let the device to decide? For networking device, we can
> > >> live without suspend probably.
> > >>
> > > Right, but how can we know if this is a net device in init? I don't
> > > think a switch (vhost_vdpa_get_device_id(dev)) is elegant.
> >
> >
> > I meant the caller of vhost_vdpa_init() which is net_init_vhost_vdpa().
> >
>
> That's doable but I'm not sure if it is convenient.

So it's a question whether or not we try to let migration work without
suspending. If we don't, there's no need to bother. Looking at the
current vhost-net implementation, it tries to make migration work upon
the error of get_vring_base() so maybe it's worth a try if it doesn't
bother too much. But I'm fine to go either way.

>
> Since we're always offering _F_LOG I thought of the lack of _F_SUSPEND
> as the default migration blocker for other kinds of devices like blk.

Or we can have this by default and allow a specific type of device to clear?

> If we move this code to net_init_vhost_vdpa, all other devices are in
> charge of block migration by themselves.
>
> I guess the right action is to use a variable similar to
> vhost_vdpa->f_log_all. It defaults to false, and the device can choose
> if it should export it or not. This way, the device does not migrate
> by default, and the equivalent of net_init_vhost_vdpa could choose
> whether to offer _F_LOG with SVQ or not.

Looks similar to what I think above.

>
> OTOH I guess other kinds of devices already must place blockers beyond
> _F_LOG, so maybe it makes sense to always offer _F_LOG even if
> _F_SUSPEND is not offered?

I don't see any dependency between the two features. Technically,
there could be devices that have neither _F_LOG nor _F_SUSPEND.

Thanks

> Stefano G., would that break vhost-vdpa-blk
> support?
>
> Thanks!
>
> > Thanks
> >
> >
> > >
> > > If the parent device does not need to be suspended i'd go with
> > > exposing a suspend ioctl but do nothing in the parent device. After
> > > that, it could even choose to return an error for GET_VRING_BASE.
> > >
> > > If we want to implement it as a fallback in qemu, I'd go for
> > > implementing it on top of this series. There are a few operations we
> > > could move to a device-kind specific ops.
> > >
> > > Would it make sense to you?
> > >
> > > Thanks!
> > >
> > >
> > >> Thanks
> > >>
> > >>
> > >>> +    }
> > >>> +
> > >>>        /*
> > >>>         * Similar to VFIO, we end up pinning all guest memory and have to
> > >>>         * disable discarding of RAM.
> >
>

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization