On Thu, Apr 20, 2023 at 10:16 PM Maxime Coquelin <maxime.coquelin@xxxxxxxxxx> wrote: > > > > On 4/20/23 06:34, Jason Wang wrote: > > On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin > > <maxime.coquelin@xxxxxxxxxx> wrote: > >> > >> This small series enables virtio-net device type in VDUSE. > >> With it, basic operation have been tested, both with > >> virtio-vdpa and vhost-vdpa using DPDK Vhost library series > >> adding VDUSE support [0] using split rings layout. > >> > >> Control queue support (and so multiqueue) has also been > >> tested, but require a Kernel series from Jason Wang > >> relaxing control queue polling [1] to function reliably. > >> > >> Other than that, we have identified a few gaps: > >> > >> 1. Reconnection: > >> a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail > >> index, even after the virtqueue has already been > >> processed. Is that expected? I have tried instead to > >> get the driver's avail index directly from the avail > >> ring, but it does not seem reliable as I sometimes get > >> "id %u is not a head!\n" warnings. Also such solution > >> would not be possible with packed ring, as we need to > >> know the wrap counters values. > > > > Looking at the codes, it only returns the value that is set via > > set_vq_state(). I think it is expected to be called before the > > datapath runs. > > > > So when bound to virtio-vdpa, it is expected to return 0. But we need > > to fix the packed virtqueue case, I wonder if we need to call > > set_vq_state() explicitly in virtio-vdpa before starting the device. > > > > When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE which > > will end up a call to set_vq_state(). Unfortunately, it doesn't > > support packed ring which needs some extension. > > > >> > >> b. Missing IOCTLs: it would be handy to have new IOCTLs to > >> query Virtio device status, > > > > What's the use case of this ioctl? It looks to me userspace is > > notified on each status change now: > > > > static int vduse_dev_set_status(struct vduse_dev *dev, u8 status) > > { > > struct vduse_dev_msg msg = { 0 }; > > > > msg.req.type = VDUSE_SET_STATUS; > > msg.req.s.status = status; > > > > return vduse_dev_msg_sync(dev, &msg); > > } > > The idea was to be able to query the status at reconnect time, and > neither having to assume its value nor having to store its value in a > file (the status could change while the VDUSE application is stopped, > but maybe it would receive the notification at reconnect). I see. > > I will prototype using a tmpfs file to save needed information, and see > if it works. It might work but then the API is not self contained. Maybe it's better to have a dedicated ioctl. > > >> and retrieve the config > >> space set at VDUSE_CREATE_DEV time. > > > > In order to be safe, VDUSE avoids writable config space. Otherwise > > drivers could block on config writing forever. That's why we don't do > > it now. > > The idea was not to make the config space writable, but just to be able > to fetch what was filled at VDUSE_CREATE_DEV time. > > With the tmpfs file, we can avoid doing that and just save the config > space there. Same as the case for status. Thanks > > > We need to harden the config write before we can proceed to this I think. > > > >> > >> 2. VDUSE application as non-root: > >> We need to run the VDUSE application as non-root. There > >> is some race between the time the UDEV rule is applied > >> and the time the device starts being used. Discussing > >> with Jason, he suggested we may have a VDUSE daemon run > >> as root that would create the VDUSE device, manages its > >> rights and then pass its file descriptor to the VDUSE > >> app. However, with current IOCTLs, it means the VDUSE > >> daemon would need to know several information that > >> belongs to the VDUSE app implementing the device such > >> as supported Virtio features, config space, etc... > >> If we go that route, maybe we should have a control > >> IOCTL to create the device which would just pass the > >> device type. Then another device IOCTL to perform the > >> initialization. Would that make sense? > > > > I think so. We can hear from others. > > > >> > >> 3. Coredump: > >> In order to be able to perform post-mortem analysis, DPDK > >> Vhost library marks pages used for vrings and descriptors > >> buffers as MADV_DODUMP using madvise(). However with > >> VDUSE it fails with -EINVAL. My understanding is that we > >> set VM_DONTEXPAND flag to the VMAs and madvise's > >> MADV_DODUMP fails if it is present. I'm not sure to > >> understand why madvise would prevent MADV_DODUMP if > >> VM_DONTEXPAND is set. Any thoughts? > > > > Adding Peter who may know the answer. > > Thanks! > Maxime > > > Thanks > > > >> > >> [0]: https://patchwork.dpdk.org/project/dpdk/list/?series=27594&state=%2A&archive=both > >> [1]: https://lore.kernel.org/lkml/CACGkMEtgrxN3PPwsDo4oOsnsSLJfEmBEZ0WvjGRr3whU+QasUg@xxxxxxxxxxxxxx/T/ > >> > >> Maxime Coquelin (2): > >> vduse: validate block features only with block devices > >> vduse: enable Virtio-net device type > >> > >> drivers/vdpa/vdpa_user/vduse_dev.c | 11 +++++++---- > >> 1 file changed, 7 insertions(+), 4 deletions(-) > >> > >> -- > >> 2.39.2 > >> > > > _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization