On Tue, Mar 26, 2024 at 12:11 PM Jason Wang <jasowang@xxxxxxxxxx> wrote: > > On Mon, Mar 25, 2024 at 5:44 PM Cornelia Huck <cohuck@xxxxxxxxxx> wrote: > > > > On Mon, Mar 25 2024, Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote: > > > > > On Fri, 22 Mar 2024 22:02:27 +0100, David Hildenbrand <david@xxxxxxxxxx> wrote: > > >> On 22.03.24 20:16, Daniel Verkamp wrote: > > >> > On Thu, Mar 21, 2024 at 3:16 AM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote: > > >> >> > > >> >> Currently, the init_vqs function within the virtio_balloon driver relies > > >> >> on the condition that certain names array entries are null in order to > > >> >> skip the initialization of some virtual queues (vqs). This behavior is > > >> >> unique to this part of the codebase. In an upcoming commit, we plan to > > >> >> eliminate this dependency by removing the function entirely. Therefore, > > >> >> with this change, we are ensuring that the virtio_balloon no longer > > >> >> depends on the aforementioned function. > > >> > > > >> > This is a behavior change, and I believe means that the driver no > > >> > longer follows the spec [1]. > > >> > > > >> > For example, the spec says that virtqueue 4 is reporting_vq, and > > >> > reporting_vq only exists if VIRTIO_BALLOON_F_PAGE_REPORTING is set, > > >> > but there is no mention of its virtqueue number changing if other > > >> > features are not set. If a device/driver combination negotiates > > >> > VIRTIO_BALLOON_F_PAGE_REPORTING but not VIRTIO_BALLOON_F_STATS_VQ or > > >> > VIRTIO_BALLOON_F_FREE_PAGE_HINT, my reading of the specification is > > >> > that reporting_vq should still be vq number 4, and vq 2 and 3 should > > >> > be unused. This patch would make the reporting_vq use vq 2 instead in > > >> > this case. > > >> > > > >> > If the new behavior is truly intended, then the spec does not match > > >> > reality, and it would need to be changed first (IMO); however, > > >> > changing the spec would mean that any devices implemented correctly > > >> > per the previous spec would now be wrong, so some kind of mechanism > > >> > for detecting the new behavior would be warranted, e.g. a new > > >> > non-device-specific virtio feature flag. > > >> > > > >> > I have brought this up previously on the virtio-comment list [2], but > > >> > it did not receive any satisfying answers at that time. > > > > I had missed it back then, but now that I read it, I realize that we > > really have a bit of a mess here :/ > > > > >> > > >> Rings a bell, but staring at this patch, I thought that there would be > > >> no behavioral change. Maybe I missed it :/ > > >> > > >> I stared at virtio_ccw_find_vqs(), and it contains: > > >> > > >> for (i = 0; i < nvqs; ++i) { > > >> if (!names[i]) { > > >> vqs[i] = NULL; > > >> continue; > > >> } > > >> > > >> vqs[i] = virtio_ccw_setup_vq(vdev, queue_idx++, callbacks[i], > > >> names[i], ctx ? ctx[i] : false, > > >> ccw); > > >> if (IS_ERR(vqs[i])) { > > >> ret = PTR_ERR(vqs[i]); > > >> vqs[i] = NULL; > > >> goto out; > > >> } > > >> } > > >> > > >> We increment queue_idx only if an entry was not NULL. SO I thought no > > >> behavioral change? (at least on s390x :) ) > > > > The code for pci behaves in the same way. > > > > >> > > >> It's late here in Germany, so maybe I'm missing something. > > > > > > I think we've encountered a tricky issue. Currently, all transports handle queue > > > id by incrementing them in order, without skipping any queue id. So, I'm quite > > > surprised that my changes would affect the spec. The fact that the > > > 'names' value is null is just a small trick in the Linux kernel implementation > > > and should not have an impact on the queue id. > > > > > > I believe that my recent modification will not affect the spec. So, let's > > > consider the issues with this patch set separately for now. Regarding the Memory > > > Balloon Device, it has been operational for many years, and perhaps we should > > > add to the spec that if a certain vq does not exist, then subsequent vqs will > > > take over its id. > > > > The changes here do not really seem to affect the spec issue that Daniel > > had noted, unless I'm reading the code wrong. > > Spec seems to be wrong here: > > 5.5.2 Virtqueues > > 0 inflateq 1 deflateq 2 statsq 3 free_page_vq4 r eporting_vq > > And this is the Qemu implementation: > > 5.5.2 Virtqueues > > s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); > s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); > s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats); > > if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { > s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, > virtio_balloon_handle_free_page_vq); > precopy_add_notifier(&s->free_page_hint_notify); > > object_ref(OBJECT(s->iothread)); > s->free_page_bh = > aio_bh_new_guarded(iothread_get_aio_context(s->iothread), > > virtio_ballloon_get_free_page_hints, s, > &dev->mem_reentrancy_guard); > } > > if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) { > s->reporting_vq = virtio_add_queue(vdev, 32, > virtio_balloon_handle_report); > } > > We need to fix it. Another possible issue: FS device: 5.11.2 Virtqueues 0 hiprio 1 notification queue 2…n request queues The notification queue only exists if VIRTIO_FS_F_NOTIFICATION is set. Kernel driver had this: enum { VQ_HIPRIO, VQ_REQUEST }; Which means request starts from 1. Thanks > > > > > However, we should try to address the spec mess, where we have at least > > some of the most popular/important implementations behaving differently > > than the spec describes... I would suggest to discuss that on the virtio > > lists -- but they are still dead, and at this point I'm just hoping > > they'll come back eventually :/ > > > > Thanks