On Tue, Apr 30, 2019 at 10:03:59PM +0200, Paolo Bonzini wrote:
> On 30/04/19 19:36, Sean Christopherson wrote:
> > KVM's GPR caching logic is unconditionally emitted for all GPR accesses
> > (that go through the accessors), even when the register being accessed
> > is fixed and always available.  This bloats KVM due to the instructions
> > needed to test and set the available/dirty bitmaps, and to conditionally
> > invoke the .cache_reg() callback.  The latter is especially painful when
> > compiling with retpolines.
> >
> > Eliminate the unnecessary overhead by:
> >
> >   - Adding dedicated accessors for every GPR
> >   - Omitting the caching logic for GPRs that are always available
> >   - Preventing use of the unoptimized versions for fixed accesses
> >
> > The last patch is an opportunistic clean up of VMX, which has gradually
> > acquired a bad habit of sprinkling in direct access to GPRs.
>
> Another related cleanup is to replace these with the direct accessors:
>
> arch/x86/kvm/vmx/nested.c: vmcs12->guest_rsp = kvm_register_read(vcpu, VCPU_REGS_RSP);
> arch/x86/kvm/vmx/nested.c: vmcs12->guest_rip = kvm_register_read(vcpu, VCPU_REGS_RIP);
> arch/x86/kvm/x86.c: regs->rsp = kvm_register_read(vcpu, VCPU_REGS_RSP);
> arch/x86/kvm/svm.c: kvm_register_write(&svm->vcpu, VCPU_REGS_RSP, hsave->save.rsp);
> arch/x86/kvm/svm.c: kvm_register_write(&svm->vcpu, VCPU_REGS_RIP, hsave->save.rip);
> arch/x86/kvm/svm.c: kvm_register_write(&svm->vcpu, VCPU_REGS_RSP, nested_vmcb->save.rsp);
> arch/x86/kvm/svm.c: kvm_register_write(&svm->vcpu, VCPU_REGS_RIP, nested_vmcb->save.rip);
> arch/x86/kvm/vmx/nested.c: kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->guest_rsp);
> arch/x86/kvm/vmx/nested.c: kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->guest_rip);
> arch/x86/kvm/vmx/nested.c: kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->host_rsp);
> arch/x86/kvm/vmx/nested.c: kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->host_rip);
> arch/x86/kvm/x86.c: kvm_register_write(vcpu, VCPU_REGS_RSP, regs->rsp);
>
> I can take care of this.  I have applied patches 1 and 3.  I didn't apply
> patch 2 for the reasons I mentioned in my reply, and because I am not sure
> if it works properly---it should have flagged the above occurrences,
> shouldn't it?

I squeezed the cleanup into patch 2.  I should have called that out in the
changelog but didn't for whatever reason.  Probably would have been even
better to do the refactor in a separate patch.  Sorry :(