Re: [PATCH 0/2] KVM: x86: Fix and cleanup for recent AVIC changes

On Fri, 2021-10-08 at 18:01 -0700, Sean Christopherson wrote:
> Belated "code review" for Maxim's recent series to rework the AVIC inhibit
> code.  Using the global APICv status in the page fault path is wrong as
> the correct status is always the vCPU's, since that status is accurate
> with respect to the time of the page fault.  In a similar vein, the code
> to change the inhibit can be cleaned up since KVM can't rely on ordering
> between the update and the request for anything except consumers of the
> request.
> 
> Sean Christopherson (2):
>   KVM: x86/mmu: Use vCPU's APICv status when handling APIC_ACCESS
>     memslot
>   KVM: x86: Simplify APICv update request logic
> 
>  arch/x86/kvm/mmu/mmu.c |  2 +-
>  arch/x86/kvm/x86.c     | 16 +++++++---------
>  2 files changed, 8 insertions(+), 10 deletions(-)
> 

Are you sure about it? Let me explain how the algorithm works:

- kvm_request_apicv_update:

	- take kvm->arch.apicv_update_lock

	- if the inhibition state doesn't really change (kvm->arch.apicv_inhibit_reasons stays zero or stays non-zero)
		- update kvm->arch.apicv_inhibit_reasons
		- release the lock and return

	- raise KVM_REQ_APICV_UPDATE
		* since kvm->arch.apicv_update_lock is taken, all vCPUs will be kicked out of guest
		  mode and will be either doing something in KVM (like handling a page fault) or stuck
		  trying to process that request. The important thing is that no vCPU will be able to
		  get back to guest mode.

	- update the kvm->arch.apicv_inhibit_reasons
		* since we hold kvm->arch.apicv_update_lock, vCPUs can't see the new value

	- update the SPTE that covers the APIC's mmio window:

		- if we enable AVIC, then do nothing.
			
			* First vCPU to access it will page fault and populate that SPTE

			* If we race with a page fault again, no problem: worst case, the page fault
			  doesn't populate the SPTE, and we will get another page fault later
			  that will.

			  -> SPTE not present + AVIC enabled is not a problem, it just causes
			  a spurious page fault, which is then retried, at which point AVIC is used.

			  It is nice to re-install the SPTE as fast as possible to avoid such
			  faults for performance reasons.

		- if we disable AVIC, then we zap the spte:

			* a page fault should not happen just before the zapping, as AVIC is still enabled on the vCPUs.
			  Even if it does happen, it doesn't matter whether it populates the SPTE, as we will zap it anyway.

			* during the zapping we take the mmu lock and use the mmu notifier counter hack
			  to avoid racing with a page fault that happens concurrently with it.

			* if a page fault on another vCPU happens after the zapping, it will see the correct
			  kvm->arch.apicv_inhibit_reasons (but likely a stale AVIC inhibit state of its own vCPU)
			  and will not re-populate the SPTE.

			  -> SPTE present + AVIC inhibited on this vCPU is the problem,
			  as this will cause writes to AVIC to disappear into the dummy page mapped by that SPTE.

			  That is why patch 1 IMHO is wrong.

	- release the kvm->arch.apicv_update_lock
		* at that point all vCPUs can re-enter the guest, but they will all process the
		  KVM_REQ_APICV_UPDATE prior to that, which will update their AVIC state.
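
To make the disable case above concrete, here is a toy user-space model of it. This is only a sketch, not kernel code: every name in it (toy_vm, toy_vcpu, toy_inhibit, toy_page_fault, toy_run and so on) is made up for illustration, locking is left out on purpose, and a single vCPU stands in for all of them. It only replays the ordering described above: the request is raised, the inhibit reasons are updated, the SPTE is zapped, and then a page fault arrives before the vCPU has processed the request.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these are not the real KVM structures. */
struct toy_vm {
	unsigned long inhibit_reasons;	/* stands in for kvm->arch.apicv_inhibit_reasons */
	bool spte_present;		/* SPTE covering the APIC MMIO window */
};

struct toy_vcpu {
	bool apicv_active;	/* per-vCPU state, refreshed only when the request is processed */
	bool update_pending;	/* stands in for a pending KVM_REQ_APICV_UPDATE */
};

/* Which state the page fault path consults. */
enum fault_policy { USE_VM_STATE, USE_VCPU_STATE };

static bool toy_vm_apicv_enabled(const struct toy_vm *vm)
{
	return vm->inhibit_reasons == 0;
}

/* Page fault on the APIC window: install the SPTE only if the chosen state says "enabled". */
static void toy_page_fault(struct toy_vm *vm, const struct toy_vcpu *vcpu,
			   enum fault_policy policy)
{
	bool enabled = (policy == USE_VM_STATE) ? toy_vm_apicv_enabled(vm)
						: vcpu->apicv_active;

	if (enabled)
		vm->spte_present = true;
}

/* Inhibit path, in the order listed above: raise the request, update reasons, zap. */
static void toy_inhibit(struct toy_vm *vm, struct toy_vcpu *vcpu, unsigned long reason)
{
	assert(reason != 0);
	vcpu->update_pending = true;
	vm->inhibit_reasons |= reason;
	vm->spte_present = false;
}

/* What the vCPU does before it may re-enter the guest. */
static void toy_process_request(const struct toy_vm *vm, struct toy_vcpu *vcpu)
{
	if (vcpu->update_pending) {
		vcpu->apicv_active = toy_vm_apicv_enabled(vm);
		vcpu->update_pending = false;
	}
}

/* Returns true if we end up with "SPTE present + AVIC inhibited on this vCPU". */
static bool toy_run(enum fault_policy policy)
{
	struct toy_vm vm = { .inhibit_reasons = 0, .spte_present = true };
	struct toy_vcpu vcpu = { .apicv_active = true, .update_pending = false };

	toy_inhibit(&vm, &vcpu, 1UL);
	toy_page_fault(&vm, &vcpu, policy);	/* fault after the zap, request not yet processed */
	toy_process_request(&vm, &vcpu);

	return vm.spte_present && !vcpu.apicv_active;
}
```

In this model toy_run(USE_VM_STATE) returns false, because the fault sees the updated reasons and leaves the SPTE zapped, while toy_run(USE_VCPU_STATE) returns true, because the fault trusts the stale per-vCPU state and re-installs the SPTE. The latter is the combination I am worried about.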


Best regards,
	Maxim Levitsky





