On Thu, 2020-07-09 at 11:44 +0200, Gerd Hoffmann wrote:
>   Hi,
>
> > > (CCing libvir-list, and people who were included in the OVMF
> > > thread[1])
> > >
> > > [1]
> > > https://lore.kernel.org/qemu-devel/99779e9c-f05f-501b-b4be-ff719f140a88@xxxxxxxxxxxxx/
> > >
> > > Also, it's important that we work with libvirt and management
> > > software to ensure they have appropriate APIs to choose what to
> > > do when a cluster has hosts with different MAXPHYADDR.
> >
> > There's been so many complex discussions that it is hard to have any
> > understanding of what we should be doing going forward. There's enough
> > problems wrt phys bits, that I think we would benefit from a doc that
> > outlines the big picture expectation for how to handle this in the
> > virt stack.
>
> Well, the fundamental issue is not that hard actually.  We have three
> cases:
>
> (1) GUEST_MAXPHYADDR == HOST_MAXPHYADDR
>
>     Everything is fine ;)
>
> (2) GUEST_MAXPHYADDR < HOST_MAXPHYADDR
>
>     Mostly fine.  Some edge cases, like different page fault errors for
>     addresses above GUEST_MAXPHYADDR and below HOST_MAXPHYADDR.  Which I
>     think Mohammed fixed in the kernel recently.
>
> (3) GUEST_MAXPHYADDR > HOST_MAXPHYADDR
>
>     Broken.  If the guest uses addresses above HOST_MAXPHYADDR everything
>     goes south.
>
> The (2) case isn't much of a problem.  We only need to figure whenever
> we want qemu allow this unconditionally (current state) or only in case
> the kernel fixes are present (state with this patch applied if I read it
> correctly).
>
> The (3) case is the reason why guest firmware never ever uses
> GUEST_MAXPHYADDR and goes with very conservative heuristics instead,
> which in turn leads to the consequences discussed at length in the
> OVMF thread linked above.
>
> Ideally we would simply outlaw (3), but it's hard for backward
> compatibility reasons.  Second best solution is a flag somewhere
> (msr, cpuid, ...) telling the guest firmware "you can use
> GUEST_MAXPHYADDR, we guarantee it is <= HOST_MAXPHYADDR".

Problem is GUEST_MAXPHYADDR > HOST_MAXPHYADDR is actually a supported
configuration on some setups. Namely when memory encryption is enabled
on AMD CPUs[1].

> > As mentioned in the thread quoted above, using host_phys_bits is a
> > obvious thing to do when the user requested "-cpu host".
> >
> > The harder issue is how to handle other CPU models. I had suggested
> > we should try associating a phys bits value with them, which would
> > probably involve creating Client/Server variants for all our CPU
> > models which don't currently have it. I still think that's worth
> > exploring as a strategy and with versioned CPU models we should
> > be ok wrt back compatibility with that approach.
>
> Yep, better defaults for GUEST_MAXPHYADDR would be good too, but that
> is a separate (although related) discussion.
>
> take care,
>   Gerd
>

[1] - https://lkml.org/lkml/2020/6/19/2371
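For reference, the three cases quoted above can be written down as a small classification. This is only a rough Python sketch to pin down the terminology, not QEMU or kernel code: the names `PhysBitsCase`, `classify_phys_bits` and `firmware_may_use_guest_maxphyaddr` are made up for illustration, and the second helper only models the "guarantee" flag proposed in the quoted mail, which does not exist today.

```python
from enum import Enum


class PhysBitsCase(Enum):
    """The three cases from the quoted mail (names are hypothetical)."""
    EQUAL = 1          # (1) GUEST_MAXPHYADDR == HOST_MAXPHYADDR
    GUEST_SMALLER = 2  # (2) GUEST_MAXPHYADDR <  HOST_MAXPHYADDR
    GUEST_LARGER = 3   # (3) GUEST_MAXPHYADDR >  HOST_MAXPHYADDR


def classify_phys_bits(guest_bits: int, host_bits: int) -> PhysBitsCase:
    """Compare guest and host physical address widths, given in bits."""
    if guest_bits <= 0 or host_bits <= 0:
        raise ValueError("physical address widths must be positive")
    if guest_bits == host_bits:
        return PhysBitsCase.EQUAL
    if guest_bits < host_bits:
        return PhysBitsCase.GUEST_SMALLER
    return PhysBitsCase.GUEST_LARGER


def firmware_may_use_guest_maxphyaddr(case: PhysBitsCase) -> bool:
    """Model of the proposed flag: true only when the guest value is
    guaranteed to be <= the host value, i.e. cases (1) and (2)."""
    return case is not PhysBitsCase.GUEST_LARGER


if __name__ == "__main__":
    # Example widths are arbitrary, chosen only to hit each case once.
    for guest, host in [(46, 46), (40, 46), (46, 39)]:
        case = classify_phys_bits(guest, host)
        print(guest, host, case.name, firmware_may_use_guest_maxphyaddr(case))
```

Note that the sketch treats case (3) as "never safe to advertise", which is the firmware's point of view; it says nothing about whether a hypervisor should reject such a configuration, which is exactly the open question with the memory encryption setups mentioned above.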