Re: [PATCH 8/9] KVM: x86: When guest set CR3, handle LAM bits semantics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 01, 2022 at 09:46:39AM +0800, Robert Hoo wrote:
> On Mon, 2022-10-31 at 05:59 +0300, Kirill A. Shutemov wrote:
> > On Mon, Oct 17, 2022 at 03:04:49PM +0800, Robert Hoo wrote:
> > > When only changes LAM bits, ask next vcpu run to load mmu pgd, so
> > > that it
> > > will build new CR3 with LAM bits updates. No TLB flush needed on
> > > this case.
> > > When changes on effective addresses, no matter LAM bits changes or
> > > not, go
> > > through normal pgd update process.
> > > 
> > > Signed-off-by: Robert Hoo <robert.hu@xxxxxxxxxxxxxxx>
> > > ---
> > >  arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++----
> > >  1 file changed, 22 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index e9b465bff8d3..fb779f88ae88 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -1228,9 +1228,9 @@ static bool kvm_is_valid_cr3(struct kvm_vcpu
> > > *vcpu, unsigned long cr3)
> > >  int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> > >  {
> > >  	bool skip_tlb_flush = false;
> > > -	unsigned long pcid = 0;
> > > +	unsigned long pcid = 0, old_cr3;
> > >  #ifdef CONFIG_X86_64
> > > -	bool pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
> > > +	bool pcid_enabled = !!kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
> > >  
> > >  	if (pcid_enabled) {
> > >  		skip_tlb_flush = cr3 & X86_CR3_PCID_NOFLUSH;
> > > @@ -1243,6 +1243,10 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > > unsigned long cr3)
> > >  	if (cr3 == kvm_read_cr3(vcpu) && !is_pae_paging(vcpu))
> > >  		goto handle_tlb_flush;
> > >  
> > > +	if (!guest_cpuid_has(vcpu, X86_FEATURE_LAM) &&
> > > +	    (cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)))
> > > +		return	1;
> > > +
> > >  	/*
> > >  	 * Do not condition the GPA check on long mode, this helper is
> > > used to
> > >  	 * stuff CR3, e.g. for RSM emulation, and there is no guarantee
> > > that
> > > @@ -1254,8 +1258,22 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu,
> > > unsigned long cr3)
> > >  	if (is_pae_paging(vcpu) && !load_pdptrs(vcpu, cr3))
> > >  		return 1;
> > >  
> > > -	if (cr3 != kvm_read_cr3(vcpu))
> > > -		kvm_mmu_new_pgd(vcpu, cr3);
> > > +	old_cr3 = kvm_read_cr3(vcpu);
> > > +	if (cr3 != old_cr3) {
> > > +		if ((cr3 ^ old_cr3) & CR3_ADDR_MASK) {
> > > +			kvm_mmu_new_pgd(vcpu, cr3 & ~(X86_CR3_LAM_U48 |
> > > +					X86_CR3_LAM_U57));
> > > +		} else {
> > > +			/* Only LAM conf changes, no tlb flush needed
> > > */
> > > +			skip_tlb_flush = true;
> > 
> > I'm not sure about this.
> > 
> > Consider case when LAM_U48 gets enabled on 5-level paging machines.
> > We may
> > have valid TLB entries for addresses above 47-bit. It's kinda broken
> > case,
> > but seems valid from architectural PoV, no?
> 
> You're right, thanks Kirill.
> 
> I noticed in your Kernel enabling, because of this LAM_U48 and LA_57
> overlapping, you enabled LAM_U57 only for simplicity at this moment. I
> thought at that time, that this trickiness will be contained in Kernel
> layer, but now it turns out at least non-EPT KVM MMU is not spared.
> > 
> > I guess after enabling LAM, these entries will never match. But if
> > LAM
> > gets disabled again they will become active. Hm?
> > 
> > Maybe just flush?
> 
> Now we have 2 options
> 1. as you suggested, just flush
> 2. more precisely identify the case Guest.LA57 && (CR3.bit[62:61] 00
> -->10 switching), flush. (LAM_U57 bit take precedence over LAM_U48,
> from spec.)
> 
> Considering CR3 change is relatively hot path, and tlb flush is heavy,
> I lean towards option 2. Your opinion? 

11 in bits [62:61] is also considered LAM_U57. So your option 2 is broken.

And I don't buy argument about hot path: the case we talking about is
about enabling/disabling LAM with constant PGD. It's not hot path by any
mean.

Let's not be fancy. Just flush TLB.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux