On Mon, Aug 05, 2024 at 04:27:43PM +0900, Akihiko Odaki wrote:
> On 2024/08/04 22:08, Peter Xu wrote:
> > On Sun, Aug 04, 2024 at 03:49:45PM +0900, Akihiko Odaki wrote:
> > > On 2024/08/03 1:26, Peter Xu wrote:
> > > > On Sat, Aug 03, 2024 at 12:54:51AM +0900, Akihiko Odaki wrote:
> > > > > > > > I'm not sure if I read it right. Perhaps you meant something more generic
> > > > > > > > than -platform but similar?
> > > > > > > >
> > > > > > > > For example, "-profile [PROFILE]" qemu cmdline, where PROFILE can be either
> > > > > > > > "perf" or "compat", while by default to "compat"?
> > > > > > >
> > > > > > > "perf" would cover 4) and "compat" will cover 1). However neither of them
> > > > > > > will cover 2) because an enum is not enough to know about all hosts. I
> > > > > > > presented a design that will cover 2) in:
> > > > > > > https://lore.kernel.org/r/2da4ebcd-2058-49c3-a4ec-8e60536e5cbb@xxxxxxxxxx
> > > > > >
> > > > > > "-merge-platform" shouldn't be a QEMU parameter, but should be something
> > > > > > separate.
> > > > >
> > > > > Do you mean merging platform dumps should be done with another command? I
> > > > > think we will want to know the QOM tree is in use when implementing
> > > > > -merge-platform. For example, you cannot define a "platform" when e.g., you
> > > > > don't know what netdev backend (e.g., user, vhost-net, vhost-vdpa) is
> > > > > connected to virtio-net devices. Of course we can include those information
> > > > > in dumps, but we don't do so for VMState.
> > > >
> > > > What I was thinking is the generated platform dump shouldn't care about
> > > > what is used as backend: it should try to probe whatever is specified in
> > > > the qemu cmdline, and it's the user's job to make sure the exact same qemu
> > > > cmdline is used in other hosts to dump this information.
> > > >
> > > > IOW, the dump will only contain the information that was based on the qemu
> > > > cmdline.
> > > > E.g., if it doesn't include virtio device at all, and if we only
> > > > support such dump for virtio, it should dump nothing.
> > > >
> > > > Then the -merge-platform will expect all dumps to look the same too,
> > > > merging them with AND on each field.
> > >
> > > I think we will still need the QOM tree in that case. I think the platform
> > > information will look somewhat similar to VMState, which requires the QOM
> > > tree to interpret.
> >
> > Ah yes, I assume you meant when multiple devices can report different thing
> > even if with the same frontend / device type. QOM should work, or anything
> > that can identify a device, e.g. with id / instance_id attached along with
> > the device class.
> >
> > One thing that I still don't know how it works is how it interacts with new
> > hosts being added.
> >
> > This idea is based on the fact that the cluster is known before starting
> > any VM. However in reality I think it can happen when VMs started with a
> > small cluster but then cluster extended, when the -merge-platform has been
> > done on the smaller set.
> >
> > >
> > > >
> > > > Said that, I actually am still not clear on how / whether it should work at
> > > > last. At least my previous concern (1) didn't has a good answer yet, on
> > > > what we do when profile collisions with qemu cmdlines. So far I actually
> > > > still think it more straightforward that in migration we handshake on these
> > > > capabilities if possible.
> > > >
> > > > And that's why I was thinking (where I totally agree with you on this) that
> > > > whether we should settle a short term plan first to be on the safe side
> > > > that we start with migration always being compatible, then we figure the
> > > > other approach. That seems easier to me, and it's also a matter of whether
> > > > we want to do something for 9.1, or leaving that for 9.2 for USO*.
> > >
> > > I suggest disabling all offload features of virtio-net with 9.2.
> > >
> > > I want to keep things consistent so I want to disable all at once. This
> > > change will be very uncomfortable for us, who are implementing offload
> > > features, but I hope it will motivate us to implement a proper solution.
> > >
> > > That said, it will be surely a breaking change so we should wait for 9.1
> > > before making such a change.
> >
> > Personally I don't worry too much on other offload bits besides USO* so far
> > if we have them ON for longer time. My wish was that they're old good
> > kernel features mostly supported everywhere who runs QEMU, then we're good.
>
> Unfortunately, we cannot expect everyone runs Linux, and the offload
> features are provided by Linux. However, QEMU can run on other platforms,
> and offload features may be provided by vhost-user or vhost-vdpa.

I see. I am not familiar with the status quo there, so I'll leave that to
you and other experts who know this better.

Personally I do care more about Linux, as that's what we ship within RH.

>
> >
> > And I definitely worry about future offload features, or any feature that
> > may probe host like this and auto-OFF: I hope we can do them on the safe
> > side starting from day1.
> >
> > So I don't know whether we should do that to USO* only or all. But I agree
> > with you that'll definitely be cleaner.
> >
> > On the details of how to turn them off properly.. Taking an example if we
> > want to turn off all the offload features by default (or simply we replace
> > that with USO-only)..
> >
> > Upstream machine type is flexible to all kinds of kernels, so we may not
> > want to regress anyone using an existing machine type even on perf,
> > especially if we want to turn off all.
> >
> > In that case we may need one more knob (I'm assuming this is virtio-net
> > specific issue, but maybe not; using it as an example) to make sure the old
> > machine types perfs as well, with:
> >
> >   - x-virtio-net-offload-enforce
> >
> >     When set, the offload features with value ON are enforced, so when
> >     the host doesn't support a offload feature it will fail to boot,
> >     showing the error that specific offload feature is not supported by the
> >     virtio backend.
> >
> >     When clear, the offload features with value ON are not enforced, so
> >     these features can be automatically turned OFF when it's detected the
> >     backend doesn't support them. This may bring best perf but has the
> >     risk of breaking migration.
>
> "[PATCH v3 0/5] virtio-net: Convert feature properties to OnOffAuto" adds
> "x-force-features-auto" compatibility property to virtio-net for this
> purpose:
> https://lore.kernel.org/r/20240714-auto-v3-0-e27401aabab3@xxxxxxxxxx

Ah ok. But note that there's still a slight difference: we need to avoid
AUTO being an option, at all, IMHO.

It's about making the qemu cmdline the ABI: with AUTO, the user can still
specify AUTO on both sides, and then the ABI may not be guaranteed.

AUTO would be fine if: (1) the property doesn't affect guest ABI, or (2)
the AUTO bit will always generate the same thing on both hosts. However,
USO* isn't such a case, so the AUTO option is IMHO not wanted.

The "x-virtio-net-offload-enforce" I mentioned above shouldn't add anything
new to "uso"; it can still only be ON/OFF. It should only select between
the "flip that to OFF automatically" and "fail the boot" behaviors on
missing features.
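
To make the two behaviors concrete, here is a rough Python sketch. It is
not QEMU code: resolve_offloads(), OffloadNotSupported and the feature
names in the usage lines are made up for illustration, and the "enforce"
argument only stands in for the proposed x-virtio-net-offload-enforce.
Feature values are plain ON/OFF booleans; there is no AUTO.

```python
class OffloadNotSupported(Exception):
    """Raised when an enforced offload feature is missing in the backend."""


def resolve_offloads(requested, backend_supported, enforce):
    """Return the offload features that end up enabled.

    requested:         dict of feature name -> bool (ON/OFF only, no AUTO)
    backend_supported: set of feature names the backend can provide
    enforce:           hypothetical stand-in for x-virtio-net-offload-enforce
    """
    enabled = {}
    for name, on in requested.items():
        if on and name not in backend_supported:
            if enforce:
                # Enforced: refuse to start and name the missing feature.
                raise OffloadNotSupported(
                    f"offload feature '{name}' is not supported by the backend")
            # Not enforced: flip the feature to OFF. The guest-visible
            # result now depends on the host, which is the migration risk.
            on = False
        enabled[name] = on
    return enabled


# Usage with illustrative feature names: the backend lacks "uso".
requested = {"uso": True, "other_offload": True}
print(resolve_offloads(requested, {"other_offload"}, enforce=False))
```

With enforce=True the same call raises instead of returning, so two hosts
that both start the VM from the same cmdline end up with the same feature
set.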

>
> >
> > With that,
> >
> >   - On old machine types (compat properties):
> >
> >     - set "x-virtio-net-offload-enforce" OFF
> >     - set all offload features ON
> >
> >   - On new machine types (the default values):
> >
> >     - set "x-virtio-net-offload-enforce" ON
> >     - set all offload features OFF
> >
> > And yes, we can do that until 9.2, but with above even 9.1 should be safe
> > to do. 9.2 might be still easier just to think everything through again,
> > after all at least USO was introduced in 8.2 so not a regress in 9.1.
> >
> > >
> > > By the way, I am wondering perhaps the "no-cross-migrate" scenario can be
> > > implemented relatively easy in a way similar to compatibility properties.
> > > The idea is to add the "no-cross-migrate" property to machines. If the
> > > property is set to "on", all offload features of virtio-net will be set to
> > > "auto". virtio-net will then probe the offload features and enable available
> > > offloading features.
> >
> > If it'll become a device property, there's still the trick / concern where
> > no-cross-migrate could conflict with the other offload feature that was
> > selected explicilty by an user (e.g. no-cross-migrate=ON + uso=OFF).
> With no-cross-migrate=ON + uso=OFF, no-cross-migrate will set uso=auto, but
> the user overrides with uso=off. As the consequence, USO will be disabled
> but all other available offload features will be enabled.

Basically you're saying that no-cross-migrate has lower priority than
specific feature bits. That's OK to me.

Thanks,

--
Peter Xu
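
For reference, the priority described in the last exchange (the machine
property only changes the default, and an explicit per-feature setting
wins) can be sketched in Python as below. This is only an illustration
under assumptions: effective_settings(), probe() and the feature names are
hypothetical, and the "off" default without no-cross-migrate is borrowed
from the new-machine-type proposal quoted above.

```python
def effective_settings(features, no_cross_migrate, user_overrides):
    """Combine the machine-level default with explicit per-feature values.

    features:         iterable of offload feature names (illustrative)
    no_cross_migrate: hypothetical machine property from the thread
    user_overrides:   dict of feature name -> "on" / "off" given by the user
    """
    # Assumption: without no-cross-migrate, offloads default to "off".
    base = "auto" if no_cross_migrate else "off"
    settings = {name: base for name in features}
    # Explicit feature bits have higher priority than the machine property.
    settings.update(user_overrides)
    return settings


def probe(settings, host_supported):
    """Turn on/off/auto settings into the final enabled/disabled state."""
    result = {}
    for name, value in settings.items():
        if value == "auto":
            # "auto" follows whatever this particular host can provide.
            result[name] = name in host_supported
        else:
            # A real device would also have to decide what to do with an
            # unsupported "on"; this sketch ignores that case.
            result[name] = (value == "on")
    return result


# no-cross-migrate=ON + uso=OFF on a host that supports both features.
settings = effective_settings(["uso", "other_offload"], True, {"uso": "off"})
print(probe(settings, {"uso", "other_offload"}))
```

Here "uso" stays disabled because of the explicit override, while the
other feature is enabled because the host supports it.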