Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
>> To allow flushing individual GVAs instead of always flushing the whole
>> VPID a per-vCPU structure to pass the requests is needed. Use standard
>> 'kfifo' to queue two types of entries: individual GVA (GFN + up to 4095
>> following GFNs in the lower 12 bits) and 'flush all'.
>>
>> The size of the fifo is arbitrary set to '16'.
>
> s/arbitrary/arbitrarily
>
>> +static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu)
>> +{
>> +        struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>> +        struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +        u64 flush_all_entry = KVM_HV_TLB_FLUSHALL_ENTRY;
>> +
>> +        if (!hv_vcpu)
>> +                return;
>> +
>> +        tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>> +
>> +        kfifo_in_spinlocked(&tlb_flush_fifo->entries, &flush_all_entry,
>> +                            1, &tlb_flush_fifo->write_lock);
>
> Unless I'm missing something, there's no need to disable IRQs, i.e. this can be
> kfifo_in_spinlocked_noirqsave() and the later patch can use spin_lock() instead
> of spin_lock_irqsave(). The only calls to hv_tlb_flush_enqueue() are from
> kvm_hv_hypercall(), i.e. it's always called from process context.
>

Yes, no IRQ/... contexts are expected; the intention was to take the
spinlock for the shortest time possible, not to protect against a
deadlock. This probably isn't worthwhile and just causes confusion, so
I'll remove it.
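Just to illustrate (untested, reusing the names from this patch), the
enqueue path would then become something like:

static void hv_tlb_flush_enqueue(struct kvm_vcpu *vcpu)
{
        struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
        struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
        u64 flush_all_entry = KVM_HV_TLB_FLUSHALL_ENTRY;

        if (!hv_vcpu)
                return;

        tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;

        /*
         * Only reachable from kvm_hv_hypercall(), i.e. process context,
         * so taking the spinlock without disabling IRQs is fine.
         */
        kfifo_in_spinlocked_noirqsave(&tlb_flush_fifo->entries,
                                      &flush_all_entry, 1,
                                      &tlb_flush_fifo->write_lock);
}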
>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index 1030b1b50552..ac30091ab346 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -151,4 +151,20 @@ int kvm_vm_ioctl_hv_eventfd(struct kvm *kvm, struct kvm_hyperv_eventfd *args);
>>  int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>                       struct kvm_cpuid_entry2 __user *entries);
>>
>> +
>
> Unnecessary newline.
>
>> +static inline void kvm_hv_vcpu_empty_flush_tlb(struct kvm_vcpu *vcpu)
>
> What about "reset" or "purge" instead of "empty"? "empty" is often used as a
> query, e.g. list_empty(); it took me a second to realize this is a command.
>

'purge' sounds good to me!

>> +{
>> +        struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>> +        struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +
>> +        if (!hv_vcpu || !kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu))
>> +                return;
>> +
>> +        tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>> +
>> +        kfifo_reset_out(&tlb_flush_fifo->entries);
>> +}
>
> Missing newline.
>
>> +void kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
>> +
>> +
>
> One too many newlines.
>
>>  #endif
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 86504a8bfd9a..45c35c5467f8 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3385,7 +3385,7 @@ static void kvm_vcpu_flush_tlb_all(struct kvm_vcpu *vcpu)
>>          static_call(kvm_x86_flush_tlb_all)(vcpu);
>>  }
>>
>> -static void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>> +void kvm_vcpu_flush_tlb_guest(struct kvm_vcpu *vcpu)
>>  {
>>          ++vcpu->stat.tlb_flush;
>>
>> @@ -3420,14 +3420,14 @@ void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu)
>>  {
>>          if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu)) {
>>                  kvm_vcpu_flush_tlb_current(vcpu);
>> -                kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>> +                kvm_hv_vcpu_empty_flush_tlb(vcpu);
>
> It might be worth adding a comment to call out that emptying the FIFO _after_ the
> TLB flush is ok, because it's impossible for the CPU to insert TLB entries for the
> guest while running in the host. At first glance, it looks like this (and the
> existing similar pattern in vcpu_enter_guest()) has a race where it could miss a
> TLB flush.
>
> Definitely not required, e.g. kvm_vcpu_flush_tlb_all() doesn't have a similar
> comment. I think it's just the existence of the FIFO that made me pause.
>

Np, will add something for future generations of readers)

>>          }
>>
>>          if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
>>                  kvm_vcpu_flush_tlb_guest(vcpu);
>> -                kvm_clear_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
>> +                kvm_hv_vcpu_empty_flush_tlb(vcpu);
>>          } else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
>> -                kvm_vcpu_flush_tlb_guest(vcpu);
>> +                kvm_hv_vcpu_flush_tlb(vcpu);
>
> Rather than expose kvm_vcpu_flush_tlb_guest() to Hyper-V, what about implementing
> this in a similar way to how KVM-on-HyperV implements remote TLB flushes? I.e.
> fall back to kvm_vcpu_flush_tlb_guest() if the precise flush "fails".
>
> I don't mind exposing kvm_vcpu_flush_tlb_guest(), but burying the calls inside
> Hyper-V code makes it difficult to see the relationship between KVM_REQ_HV_TLB_FLUSH
> and KVM_REQ_TLB_FLUSH_GUEST.
>
> And as a minor bonus, that also helps document that kvm_hv_vcpu_flush_tlb() doesn't
> yet support precise flushing.
>
> E.g.
>
>         if (kvm_check_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu)) {
>                 kvm_vcpu_flush_tlb_guest(vcpu);
>         } else if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu)) {
>                 /*
>                  * Fall back to a "full" guest flush if Hyper-V's precise
>                  * flushing fails.
>                  */
>                 if (kvm_hv_vcpu_flush_tlb(vcpu))
>                         kvm_vcpu_flush_tlb_guest(vcpu);
>         }
>
>         int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
>         {
>                 struct kvm_vcpu_hv_tlb_flush_fifo *tlb_flush_fifo;
>                 struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>
>                 if (!hv_vcpu)
>                         return -EINVAL;
>
>                 tlb_flush_fifo = &hv_vcpu->tlb_flush_fifo;
>
>                 kfifo_reset_out(&tlb_flush_fifo->entries);
>
>                 /* Precise flushing isn't implemented yet. */
>                 return -EOPNOTSUPP;
>         }
>

Oh, I see, it certainly can be done this way, even if just to improve
readability. Will change.

-- 
Vitaly