On Sun, Aug 18, 2024 at 02:04:29PM +0900, Akihiko Odaki wrote:
> On 2024/08/09 21:50, Fabiano Rosas wrote:
> > Peter Xu <peterx@xxxxxxxxxx> writes:
> >
> > > On Thu, Aug 08, 2024 at 10:47:28AM -0400, Michael S. Tsirkin wrote:
> > > > On Thu, Aug 08, 2024 at 10:15:36AM -0400, Peter Xu wrote:
> > > > > On Thu, Aug 08, 2024 at 07:12:14AM -0400, Michael S. Tsirkin wrote:
> > > > > > This is too big of a hammer. People already use what you call "cross
> > > > > > migrate" and have for years. We are not going to stop developing
> > > > > > features just because someone suddenly became aware of some such bit.
> > > > > > If you care, you will have to work to solve the problem properly -
> > > > > > nacking half baked hacks is the only tool maintainers have to make
> > > > > > people work on hard problems.
> > > > >
> > > > > IMHO this is totally different thing. It's not about proposing a new
> > > > > feature yet so far, it's about how we should fix a breakage first.
> > > > >
> > > > > And that's why I think we should fix it even in the simple way first, then
> > > > > we consider anything more beneficial from perf side without breaking
> > > > > anything, which should be on top of that.
> > > > >
> > > > > Thanks,
> > > >
> > > > As I said, once the quick hack is merged people stop caring.
> > >
> > > IMHO it's not a hack. It's a proper fix to me to disable it by default for
> > > now.
> > >
> > > OTOH, having it ON always even knowing it can break migration is a hack to
> > > me, when we don't have anything else to guard the migration.
> > >
> > > > Mixing different kernel versions in migration is esoteric enough for
> > > > this not to matter to most people. There's no rush I think, address
> > > > it properly.
> > >
> > > Exactly mixing kernel versions will be tricky to users to identify, but
> > > that's, AFAICT, exactly happening everywhere. We can't urge user to always
> > > use the exact same kernels when we're talking about a VM cluster. That's
> > > why I think allowing migration to work across those kernels matters.
> >
> > I also worry a bit about the scenario where the cluster changes slightly
> > and now all VMs are already restricted by some option that requires the
> > exact same kernel. Specifically, kernel changes in a cloud environment
> > also happen due to factors completely unrelated to migration. I'm not
> > sure the people managing the infra (who care about migration) will be
> > gating kernel changes just because QEMU has been configured in a
> > specific manner.
>
> I have written a bit about the expectation on the platform earlier[1], but let
> me summarize it here.
>
> 1. I expect the user will not downgrade the platform of hosts after setting
> up a VM. This is essential to enable any platform feature.
>
> 2. The user is allowed to upgrade the platform of hosts gradually. This
> results in a situation with mixed platforms. The oldest platform is still
> not older than the platform the VM is set up for. This enables the gradual
> deployment strategy.
>
> 3. The user is allowed to downgrade the platform of hosts to the version
> used when setting up the VM. This enables rollbacks in case of regression.
>
> With these expectations, we can ensure migratability by a) enabling platform
> features available on all hosts when setting up the VM and b) saving the
> enabled features. This is covered with my
> -dump-platform/-merge-platform/-use-platform proposal[2].

I really like [2]. Do you plan to work on it? Does anyone else?

> Regards,
> Akihiko Odaki
>
> [1]
> https://lore.kernel.org/r/2b62780c-a6cb-4262-beb5-81d54c14f545@xxxxxxxxxx
> [2]
> https://lore.kernel.org/all/2da4ebcd-2058-49c3-a4ec-8e60536e5cbb@xxxxxxxxxx/