On Wed, Jul 10, 2024 at 12:58:11PM -0700, Daniel Verkamp wrote:
> On Wed, Jul 10, 2024 at 11:39 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> >
> > On Wed, Jul 10, 2024 at 11:12:34AM -0700, Daniel Verkamp wrote:
> > > On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> > > >
> > > > virtio balloon communicates to the core that in some
> > > > configurations vq #s are non-contiguous by setting name
> > > > pointer to NULL.
> > > >
> > > > Unfortunately, core then turned around and just made them
> > > > contiguous again. Result is that driver is out of spec.
> > >
> > > Thanks for fixing this - I think the overall approach of the patch looks good.
> > >
> > > > Implement what the API was supposed to do
> > > > in the 1st place. Compatibility with buggy hypervisors
> > > > is handled inside virtio-balloon, which is the only driver
> > > > making use of this facility, so far.
> > >
> > > In addition to virtio-balloon, I believe the same problem also affects
> > > the virtio-fs device, since queue 1 is only supposed to be present if
> > > VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
> > > meant to be queue indexes 2 and up. From a look at the Linux driver
> > > (virtio_fs.c), it appears like it never acks VIRTIO_FS_F_NOTIFICATION
> > > and assumes that request queues start at index 1 rather than 2, which
> > > looks out of spec to me, but the current device implementations (that
> > > I am aware of, anyway) are also broken in the same way, so it ends up
> > > working today. Queue numbering in a spec-compliant device and the
> > > current Linux driver would mismatch; what the driver considers to be
> > > the first request queue (index 1) would be ignored by the device since
> > > queue index 1 has no function if F_NOTIFICATION isn't negotiated.
> >
> >
> > Oh, thanks a lot for pointing this out!
> >
> > I see so this patch is no good as is, we need to add a workaround for
> > virtio-fs first.
> >
> > QEMU workaround is simple - just add an extra queue. But I did not
> > research how this would interact with vhost-user.
> >
> > From driver POV, I guess we could just ignore queue # 1 - would that be
> > ok or does it have performance implications?
>
> As a driver workaround for non-compliant devices, I think ignoring the
> first request queue would be a reasonable approach if the device's
> config advertises num_request_queues > 1. Unfortunately, both
> virtiofsd and crosvm's virtio-fs device have hard-coded
> num_request_queues =1, so this won't help with those existing devices.

Do they care what the vq # is though?
We could do some magic to translate VQ #s in qemu.

> Maybe there are other devices that we would need to consider as well;
> commit 529395d2ae64 ("virtio-fs: add multi-queue support") quotes
> benchmarks that seem to be from a different virtio-fs implementation
> that does support multiple request queues, so the workaround could
> possibly be used there.
>
> > Or do what I did for balloon here: try with spec compliant #s first,
> > if that fails then assume it's the spec issue and shift by 1.
>
> If there is a way to "guess and check" without breaking spec-compliant
> devices, that sounds reasonable too; however, I'm not sure how this
> would work out in practice: an existing non-compliant device may fail
> to start if the driver tries to enable queue index 2 when it only
> supports one request queue,

You don't try to enable the queue - the driver starts by checking the
queue size. The way my patch works is that it assumes a queue that does
not exist reports size 0.

This was actually a documented way to check for PCI and MMIO:

    Read the virtqueue size from queue_size. This controls how big the
    virtqueue is (see 2.6 Virtqueues). If this field is 0, the virtqueue
    does not exist.

MMIO:

    If the returned value is zero (0x0) the queue is not available.

Unfortunately this is not documented for CCW, but I guess CCW
implementations outside of QEMU are uncommon enough that we can assume
it's the same?

To me the above is also a big hint that drivers are allowed to
query the size of queues that do not exist.

> and a spec-compliant device would probably
> balk if the driver tries to enable queue 1 but does not negotiate
> VIRTIO_FS_F_NOTIFICATION. If there's a way to reset and retry the
> whole virtio device initialization process if a device fails like
> this, then maybe it's feasible. (Or can the driver tweak the virtqueue
> configuration and try to set DRIVER_OK repeatedly until it works? It's
> not clear to me if this is allowed by the spec, or what device
> implementations actually do in practice in this scenario.)
>
> Thanks,
> -- Daniel

My patch starts with spec-compliant behaviour. If that fails, it tries
the non-compliant one as a fallback.

--
MST
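
To make the "probe the queue size, try spec-compliant numbering first,
then shift by 1" idea concrete for the virtio-fs case discussed above,
here is a minimal user-space sketch. It is not the code from the patch
and not the kernel API: struct fake_device, probe_queue_size(),
all_queues_exist() and pick_first_request_queue() are hypothetical names
invented for this illustration. It assumes the device reports size 0 for
a queue that does not exist (documented for PCI and MMIO, only assumed
for CCW), that indexes past the end of the device also read as 0, and
that the driver already knows num_request_queues from the device config.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model: queue_size[i] == 0 means queue i does not exist. */
struct fake_device {
    const uint16_t *queue_size;
    size_t num_slots;
};

/* Size probe; indexes beyond the device read as 0 (an assumption of this sketch). */
uint16_t probe_queue_size(const struct fake_device *dev, size_t index)
{
    assert(dev != NULL);
    if (index >= dev->num_slots)
        return 0;
    return dev->queue_size[index];
}

/* True if queues first .. first+count-1 all report a non-zero size. */
bool all_queues_exist(const struct fake_device *dev, size_t first, size_t count)
{
    if (count == 0)
        return false;
    for (size_t i = 0; i < count; i++) {
        if (probe_queue_size(dev, first + i) == 0)
            return false;
    }
    return true;
}

/*
 * Return the index of the first request queue, or -1 if no layout fits.
 * Spec-compliant numbering (request queues from index 2) is tried first;
 * only if that fails is the shifted-by-1 numbering (from index 1) tried.
 */
int pick_first_request_queue(const struct fake_device *dev,
                             size_t num_request_queues)
{
    if (all_queues_exist(dev, 2, num_request_queues))
        return 2;   /* spec-compliant device */
    if (all_queues_exist(dev, 1, num_request_queues))
        return 1;   /* fallback for non-compliant devices */
    return -1;
}
```

With one request queue, a device exposing sizes {128, 0, 128} is
detected as spec compliant and the function returns 2, while a device
exposing {128, 128} fails the first probe and falls back to 1. Nothing
is enabled during the probe, so a device that rejects the spec-compliant
layout is never asked to start a queue it does not have.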