Re: [PATCH v3 11/34] KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv' instead of on-stack 'sparse_banks'

Sean Christopherson <seanjc@xxxxxxxxxx> · Tue, 17 May 2022 14:04:35 +0000



On Tue, May 17, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
> 
> > On Thu, Apr 14, 2022, Vitaly Kuznetsov wrote:
> >> To make kvm_hv_flush_tlb() ready to handle L2 TLB flush requests, KVM needs
> >> to allow for all 64 sparse vCPU banks regardless of KVM_MAX_VCPUS as L1
> >> may use vCPU overcommit for L2. To avoid growing on-stack allocation, make
> >> 'sparse_banks' part of per-vCPU 'struct kvm_vcpu_hv' which is allocated
> >> dynamically.
> >> 
> >> Note: sparse_set_to_vcpu_mask() keeps using on-stack allocation as it
> >> won't be used to handle L2 TLB flush requests.
> >
> > I think it's worth using stronger language; handling TLB flushes for L2 _can't_
> > use sparse_set_to_vcpu_mask() because KVM has no idea how to translate an L2
> > vCPU index to an L1 vCPU.  I found the above mildly confusing because it didn't
> > call out "vp_bitmap" and so I assumed the note referred to yet another sparse_banks
> > "allocation".  And while vp_bitmap is related to sparse_banks, it tracks something
> > entirely different.
> >
> > Something like?
> >
> > Note: sparse_set_to_vcpu_mask() can never be used to handle L2 requests as
> > KVM can't translate L2 vCPU indices to L1 vCPUs, i.e. its vp_bitmap array
> > is still bounded by the number of L1 vCPUs and so can remain an on-stack
> > allocation.
> 
> My brain is probably tainted by looking at all this for some time so I
> really appreciate such improvements, thanks :)
> 
> I wouldn't, however, say "never" ('never say never' :-)): KVM could've
> kept 2-level reverse mapping up-to-date:
> 
> KVM -> L2 VM list -> L2 vCPU ids -> L1 vCPUs which run them
> 
> making it possible for KVM to quickly translate between L2 VP IDs and L1
> vCPUs. I don't do this in the series and just record L2 VM_ID/VP_ID for
> each L1 vCPU so I have to go over them all for each request. The
> optimization is, however, possible and we may get to it if really big
> Windows VMs become a reality.

Out of curiosity, is L1 "required" to provide the L2 => L1 translation/map?
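
For reference, a rough userspace sketch of the per-request scan described
above: each L1 vCPU records the L2 VM_ID/VP_ID it is running, and every flush
request walks all L1 vCPUs and tests the recorded VP_ID against the sparse set.
The struct, field, and function names here are hypothetical and not taken from
the series; only the sparse set layout (up to 64 banks of 64 VPs, with just the
banks flagged in the valid bank mask present in the array) is assumed from the
Hyper-V sparse VP set format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-L1-vCPU record of the L2 vCPU it currently runs. */
struct l1_vcpu {
	uint64_t l2_vm_id;	/* L2 VM_ID recorded for this L1 vCPU */
	uint32_t l2_vp_id;	/* L2 VP_ID recorded for this L1 vCPU */
	int in_l2;		/* non-zero if the record above is valid */
};

/* Portable population count, to avoid relying on compiler builtins. */
static unsigned int popcount64(uint64_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Test whether 'vp_id' is in the sparse set. Only banks whose bit is set in
 * 'valid_bank_mask' are stored in 'sparse_banks', packed in ascending order.
 */
static int vp_in_sparse_set(uint32_t vp_id, uint64_t valid_bank_mask,
			    const uint64_t *sparse_banks)
{
	uint32_t bank = vp_id / 64;
	uint32_t bit = vp_id % 64;
	unsigned int idx;

	if (bank >= 64 || !(valid_bank_mask & (1ULL << bank)))
		return 0;

	/* Index into the packed array = number of valid banks below 'bank'. */
	idx = popcount64(valid_bank_mask & ((1ULL << bank) - 1));
	return !!(sparse_banks[idx] & (1ULL << bit));
}

/*
 * Linear scan: mark hit[i] for every L1 vCPU that runs a targeted L2 vCPU of
 * VM 'vm_id'. Returns the number of L1 vCPUs marked.
 */
static size_t collect_l1_targets(const struct l1_vcpu *vcpus, size_t nr,
				 uint64_t vm_id, uint64_t valid_bank_mask,
				 const uint64_t *sparse_banks, uint8_t *hit)
{
	size_t i, count = 0;

	assert(vcpus && sparse_banks && hit);

	for (i = 0; i < nr; i++) {
		hit[i] = vcpus[i].in_l2 && vcpus[i].l2_vm_id == vm_id &&
			 vp_in_sparse_set(vcpus[i].l2_vp_id, valid_bank_mask,
					  sparse_banks);
		count += hit[i];
	}
	return count;
}
```

The cost is O(number of L1 vCPUs) per request, which is what a maintained
reverse map would avoid.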