Re: [PATCH v6 6/9] KVM: x86: lapic: don't allow to change APIC ID unconditionally

Chao Gao <chao.gao@xxxxxxxxx> · Mon, 14 Mar 2022 12:09:42 +0800

>> > > This won't work with nested AVIC - we can't just inhibit a nested guest using its own AVIC,
>> > > because migration happens.
>> > 
>> > I mean because host decided to change its apic id, which it can in theory do any time,
>> > even after the nested guest has started. Seriously, the only reason guest has to change apic id,
>> > is to try to exploit some security hole.
>> 
>> Hi
>> 
>> Thanks for the information.  
>> 
>> IIUC, you mean KVM applies APICv inhibition only to L1 VM, leaving APICv
>> enabled for L2 VM. Shouldn't KVM disable APICv for L2 VM in this case?
>> It looks like a generic issue in dynamically toggling APICv scheme,
>> e.g., qemu can set KVM_GUESTDBG_BLOCKIRQ after nested guest has started.
>> 
>
>That is the problem - you can't disable it for L2, unless you are willing to emulate it in software.
>Or in other words, when nested guest uses a hardware feature, you can't at some point say to it:
>sorry buddy - hardware feature disappeared.

Agreed. I missed this.

>
>It is *currently* not a problem for APICv because it doesn't do IPI virtualization,
>and even with these patches, it doesn't do this for nesting.
>It does become when you allow nested guest to use this which I did in the nested AVIC code.
>
>
>and writable apic ids do pose a large problem, since nested AVIC, will target L1's apic ids,
>and when they can change under you without any notice, and even worse be duplicate,
>it is just nightmare.

OK. So the problem of disabling APICv is if we choose to disable APICv instead
of making APIC ID read-only, although it can work perfectly for VMX IPIv, it
effectively makes future cleanup to AVIC difficult/impossible because nested
AVIC is practically to implement without assuming APIC IDs of L1 is immutable.

Sean & Maxim

How about go back to use a module parameter to opt in to read-only APIC ID.
Although migration in some cases may fail but it shouldn't be a big issue as
migration VMs from a KVM with nested=on to a KVM with nested=off may also fail.