On Wed, Jul 24, 2024 at 12:38 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > > On Wed, Jul 24, 2024, Jim Mattson wrote: > > On Wed, Jul 24, 2024 at 11:13 AM Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote: > > > What if we introduce a new KVM capability, say CAP_DISABLE_UNSUPPORTED_FEATURES, > > > and when enabled, outright crash the guest when it attempts things like > > > changing APIC base, APIC IDs, and other unsupported things like that? > > > > > > Then we can make qemu set it by default, and if users have to use an > > > unsupported feature, they could always add a qemu flag that will disable > > > this capability. > > > > Alternatively, why not devise a way to inform userspace that the guest > > has exercised an unsupported feature? Unless you're a hobbyist working on > > your desktop, kernel messages are a *terrible* mechanism for communicating > > with the end user. > > A per-vCPU/VM stat would suffice in most cases, e.g. similar to the proposed > auto-EOI stat[*]. But I honestly don't see the point for APIC base relocation > and changing x2APIC IDs. We _know_ these things don't work. Having a flag might > save a bit of triage when debugging a guest issue, but I fail to see what userspace > would do with the information outside of a debug scenario. I would argue that insider knowledge about what does and doesn't work isn't particularly helpful to the end user. A non-standard flag isn't particularly helpful either. Nor is a kernel log message. Perhaps these solutions are fine for hobbyists, but they are not useful in an enterprise environment If a guest OS tries to change the APIC base address, and KVM silently ignores it, the guest is unlikely to get very far. Imagine what would happen if the guest tried to change GS_BASE, and KVM silently ignored it. Maybe KVM should return KVM_INTERNAL_ERROR_EMULATION if the guest attempts to relocate the APIC base--even without a new "pedantic" flag. What is the point in trying to continue without relocation? > And for APIC base relocation, userspace already has the ability to detect this > unuspported behavior. Trap writes to MSR_IA32_APICBASE via MSR filtering, then > reflect the value back into KVM. Performance would suck, but writes to > MSR_IA32_APICBASE should be very rare, especially if the host forces x2APIC via > guest firmware. This "unsupported behavior" should at least be documented somewhere. > As for changing x2APIC IDs, that is the architectural behavior on Intel. If a > kernel is trying to change x2APIC IDs, it's going to have a bad day regardless. > > So I guess the question is, what motivated this patch? I don't recall, but I now withdraw the patch. We really should do better. BTW, there is a KUT that supposedly verifies that APIC base relocation works. See KUT commit b6bdb3f6ab6 ("apic: Add test case for relocation and writing reserved bits"). That's the test that Nadav Amit refers to in the commit message for commit db324fe6f20b ("KVM: x86: Warn on APIC base relocation").