On 2024/08/18 16:03, Michael S. Tsirkin wrote:
On Sun, Aug 18, 2024 at 02:04:29PM +0900, Akihiko Odaki wrote:
On 2024/08/09 21:50, Fabiano Rosas wrote:
Peter Xu <peterx@xxxxxxxxxx> writes:
On Thu, Aug 08, 2024 at 10:47:28AM -0400, Michael S. Tsirkin wrote:
On Thu, Aug 08, 2024 at 10:15:36AM -0400, Peter Xu wrote:
On Thu, Aug 08, 2024 at 07:12:14AM -0400, Michael S. Tsirkin wrote:
This is too big of a hammer. People already use what you call "cross
migrate" and have for years. We are not going to stop developing
features just because someone suddenly became aware of some such bit.
If you care, you will have to work to solve the problem properly -
nacking half-baked hacks is the only tool maintainers have to make
people work on hard problems.
IMHO this is a totally different thing. It's not about proposing a new
feature so far; it's about how we should fix a breakage first.
And that's why I think we should fix it in the simple way first, then
consider anything more beneficial from the perf side that doesn't break
anything, which should go on top of that.
Thanks,
As I said, once the quick hack is merged people stop caring.
IMHO it's not a hack. It's a proper fix to me to disable it by default for
now.
OTOH, having it ON always even knowing it can break migration is a hack to
me, when we don't have anything else to guard the migration.
Mixing different kernel versions in migration is esoteric enough for
this not to matter to most people. There's no rush I think, address
it properly.
Exactly, mixing kernel versions will be tricky for users to identify, but
that is, AFAICT, exactly what is happening everywhere. We can't urge users
to always use the exact same kernels when we're talking about a VM cluster.
That's why I think allowing migration to work across those kernels matters.
I also worry a bit about the scenario where the cluster changes slightly
and now all VMs are already restricted by some option that requires the
exact same kernel. Specifically, kernel changes in a cloud environment
also happen due to factors completely unrelated to migration. I'm not
sure the people managing the infra (who care about migration) will be
gating kernel changes just because QEMU has been configured in a
specific manner.
I wrote a bit about the expectations on the platform earlier[1], but let
me summarize them here.
1. I expect the user will not downgrade the platform of hosts after setting
up a VM. This is essential to enable any platform feature.
2. The user is allowed to upgrade the platform of hosts gradually. This
results in a situation with mixed platforms. The oldest platform is still
not older than the platform the VM is set up for. This enables the gradual
deployment strategy.
3. The user is allowed to downgrade the platform of hosts to the version
used when setting up the VM. This enables rollbacks in case of regression.
With these expectations, we can ensure migratability by a) enabling platform
features available on all hosts when setting up the VM and b) saving the
enabled features. This is covered with my
-dump-platform/-merge-platform/-use-platform proposal[2].
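To make a) and b) concrete, here is a minimal sketch. It assumes each host's
platform features can be represented as a set of strings. The function names
and feature names are hypothetical; they illustrate the idea only and are not
the actual options or file format from [2].

```python
import json


def merge_platforms(host_features):
    """Return the features available on every host (set intersection)."""
    host_features = [set(f) for f in host_features]
    if not host_features:
        return set()
    common = host_features[0]
    for features in host_features[1:]:
        common &= features
    return common


def save_enabled(features):
    """Serialize the enabled features so they travel with the VM config."""
    return json.dumps(sorted(features))


def can_migrate_to(saved, dest_features):
    """A destination is acceptable if it offers every saved feature."""
    return set(json.loads(saved)) <= set(dest_features)


# Hypothetical feature names, one set per host in the cluster.
hosts = [
    {"feat-a", "feat-b", "feat-c"},
    {"feat-a", "feat-b"},
    {"feat-a", "feat-b", "feat-d"},
]
enabled = merge_platforms(hosts)   # a) only what all hosts have
saved = save_enabled(enabled)      # b) record what was enabled
```

Under expectations 1-3, no host ever drops below the platform recorded at
setup time, so a check like can_migrate_to() keeps passing as hosts are
upgraded or rolled back.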
I really like [2]. Do you plan to work on it? Does anyone else?
No, but I want to move "[PATCH v3 0/5] virtio-net: Convert feature
properties to OnOffAuto" forward:
https://patchew.org/QEMU/20240714-auto-v3-0-e27401aabab3@xxxxxxxxxx/
This will clarify the existence of the "auto" semantics, which is to
enable a platform feature based on availability. [2] will be regarded as
a feature to improve the handling of the "auto" semantics once this
change lands.
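As a rough illustration of the "auto" semantics, the sketch below resolves an
on/off/auto setting against host availability. The resolve() helper is
hypothetical, and raising an error for "on" when the feature is unavailable
is an assumption made for the example, not something stated in this thread.

```python
from enum import Enum


class OnOffAuto(Enum):
    AUTO = "auto"
    ON = "on"
    OFF = "off"


def resolve(setting, available):
    """Decide whether a platform feature ends up enabled.

    setting:   the user-visible property value
    available: whether the host platform offers the feature
    """
    if setting is OnOffAuto.OFF:
        return False
    if setting is OnOffAuto.ON:
        # Assumption: an explicit "on" fails rather than silently
        # falling back when the platform lacks the feature.
        if not available:
            raise ValueError("feature requested but not available")
        return True
    # AUTO: enable the feature based on availability.
    return available
```

The point is that only "auto" makes the result depend on the host, which is
exactly the case that needs extra handling for migration.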
Regards,
Akihiko Odaki
[1]
https://lore.kernel.org/r/2b62780c-a6cb-4262-beb5-81d54c14f545@xxxxxxxxxx
[2]
https://lore.kernel.org/all/2da4ebcd-2058-49c3-a4ec-8e60536e5cbb@xxxxxxxxxx/