Re: [PATCH v2 4/4] KVM: x86/mmu: Improve TLB flush comment in kvm_mmu_slot_remove_write_access()

Sean Christopherson <seanjc@xxxxxxxxxx> · Fri, 14 Jan 2022 23:58:32 +0000

On Thu, Jan 13, 2022, David Matlack wrote:
> Rewrite the comment in kvm_mmu_slot_remove_write_access() that explains
> why it is safe to flush TLBs outside of the MMU lock after
> write-protecting SPTEs for dirty logging. The current comment is a long
> run-on sentence that was difficult to understand. In addition it was
> specific to the shadow MMU (mentioning mmu_spte_update()) when the TDP
> MMU has to handle this as well.
> 
> The new comment explains:
>  - Why the TLB flush is necessary at all.
>  - Why it is desirable to do the TLB flush outside of the MMU lock.
>  - Why it is safe to do the TLB flush outside of the MMU lock.
> 
> No functional change intended.
> 
> Signed-off-by: David Matlack <dmatlack@xxxxxxxxxx>

One nit below,

Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>

> ---
>  arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 1d275e9d76b5..8ed2b42a7aa3 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5756,6 +5756,7 @@ static bool __kvm_zap_rmaps(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
>  				continue;
>  
>  			flush = slot_handle_level_range(kvm, memslot, kvm_zap_rmapp,
> +
>  							PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
>  							start, end - 1, true, flush);
>  		}
> @@ -5825,15 +5826,27 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
>  	}
>  
>  	/*
> -	 * We can flush all the TLBs out of the mmu lock without TLB
> -	 * corruption since we just change the spte from writable to
> -	 * readonly so that we only need to care the case of changing
> -	 * spte from present to present (changing the spte from present
> -	 * to nonpresent will flush all the TLBs immediately), in other
> -	 * words, the only case we care is mmu_spte_update() where we
> -	 * have checked Host-writable | MMU-writable instead of
> -	 * PT_WRITABLE_MASK, that means it does not depend on PT_WRITABLE_MASK
> -	 * anymore.
> +	 * Flush TLBs if any SPTEs had to be write-protected to ensure that
> +	 * guest writes are reflected in the dirty bitmap before the memslot
> +	 * update completes, i.e. before enabling dirty logging is visible to
> +	 * userspace.
> +	 *
> +	 * Perform the TLB flush outside the mmu_lock to reduce the amount of
> +	 * time the lock is held. However, this does mean that another CPU can
> +	 * now grab the mmu_lock and encounter an SPTE that is write-protected
> +	 * while CPUs still have writable versions of that SPTE in their TLB.

Uber nit on "SPTE in their TLB".  Maybe this?

	 * now grab mmu_lock and encounter a write-protected SPTE while CPUs
	 * still have a writable mapping for the associated GFN in their TLB.

> +	 *
> +	 * This is safe but requires KVM to be careful when making decisions
> +	 * based on the write-protection status of an SPTE. Specifically, KVM
> +	 * also write-protects SPTEs to monitor changes to guest page tables
> +	 * during shadow paging, and must guarantee no CPUs can write to those
> +	 * pages before the lock is dropped. As mentioned in the previous
> +	 * paragraph, a write-protected SPTE is no guarantee that CPUs cannot
> +	 * perform writes. So to determine if a TLB flush is truly required, KVM
> +	 * will clear a separate software-only bit (MMU-writable) and skip the
> +	 * flush if-and-only-if this bit was already clear.
> +	 *
> +	 * See DEFAULT_SPTE_MMU_WRITEABLE for more details.
>  	 */
>  	if (flush)
>  		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
> -- 
> 2.34.1.703.g22d0c6ccf7-goog
>
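
For readers following the last paragraph of the new comment, here is a minimal,
self-contained sketch of the "flush if-and-only-if MMU-writable was set" idea.
It is not the kernel's code: the SKETCH_* names, the bit positions, and both
helper functions are hypothetical and exist only for illustration. The real
definitions live next to DEFAULT_SPTE_MMU_WRITEABLE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; assumptions, not kernel definitions. */
#define SKETCH_PT_WRITABLE	(1ULL << 1)	/* hardware-visible writable bit */
#define SKETCH_MMU_WRITABLE	(1ULL << 10)	/* software-only MMU-writable bit */

/*
 * Dirty-logging style write-protection: clear only the hardware writable
 * bit and leave MMU-writable set. Returns true if the SPTE changed, i.e. a
 * TLB flush is needed at some point, possibly after the lock is dropped.
 */
static bool sketch_wrprot_for_dirty_log(uint64_t *sptep)
{
	bool was_writable = (*sptep & SKETCH_PT_WRITABLE) != 0;

	*sptep &= ~SKETCH_PT_WRITABLE;
	return was_writable;
}

/*
 * Shadow-paging style write-protection: clear both bits. The flush decision
 * is keyed off MMU-writable rather than the hardware bit, because the
 * hardware bit may already be clear while stale writable TLB entries for
 * the same GFN still exist on other CPUs.
 */
static bool sketch_wrprot_for_shadow_paging(uint64_t *sptep)
{
	bool was_mmu_writable = (*sptep & SKETCH_MMU_WRITABLE) != 0;

	*sptep &= ~(SKETCH_PT_WRITABLE | SKETCH_MMU_WRITABLE);
	return was_mmu_writable;
}
```

In this sketch, an SPTE that was first write-protected for dirty logging still
reports "flush needed" when it is later write-protected for shadow paging,
since MMU-writable is still set. Only a second shadow-paging write-protect of
the same SPTE can skip the flush.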