On 06.02.24 19:24, Sean Christopherson wrote: > On Tue, Feb 06, 2024, Mathias Krause wrote: >> On 05.02.24 20:53, Sean Christopherson wrote: >>> On Sat, Feb 03, 2024, Mathias Krause wrote: >>>> Take 'dr6' from the arch part directly as already done for 'dr7'. >>>> There's no need to take the clunky route via kvm_get_dr(). >>>> >>>> Signed-off-by: Mathias Krause <minipli@xxxxxxxxxxxxxx> >>>> --- >>>> arch/x86/kvm/x86.c | 5 +---- >>>> 1 file changed, 1 insertion(+), 4 deletions(-) >>>> >>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>>> index 13ec948f3241..0f958dcf8458 100644 >>>> --- a/arch/x86/kvm/x86.c >>>> +++ b/arch/x86/kvm/x86.c >>>> @@ -5504,12 +5504,9 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu, >>>> static void kvm_vcpu_ioctl_x86_get_debugregs(struct kvm_vcpu *vcpu, >>>> struct kvm_debugregs *dbgregs) >>>> { >>>> - unsigned long val; >>>> - >>>> memset(dbgregs, 0, sizeof(*dbgregs)); >>>> memcpy(dbgregs->db, vcpu->arch.db, sizeof(vcpu->arch.db)); >>>> - kvm_get_dr(vcpu, 6, &val); >>>> - dbgregs->dr6 = val; >>>> + dbgregs->dr6 = vcpu->arch.dr6; >>> >>> Blech, kvm_get_dr() is so dumb, it takes an out parameter despite have a void >>> return. >> >> Jepp, that's why I tried to get rid of it. >> >>> I would rather fix that wart and go the other direction, i.e. make dr7 >>> go through kvm_get_dr(). This obviously isn't a fast path, so the extra CALL+RET >>> is a non-issue. >> >> Okay. I thought, as this is an indirect call which is slightly more >> expensive under RETPOLINE, I'd go the other way and simply open-code the >> access, as done a few lines below in kvm_vcpu_ioctl_x86_set_debugregs(). > > It's not an indirect call. It's not even strictly guaranteed to be a function > call, e.g. within x86.c, kvm_get_dr() is in scope of kvm_vcpu_ioctl_x86_get_debugregs() > and so a very smart compiler could fully inline and optimize it to just > "xxx = vcpu->arch.dr6" through dead code elimination (I doubt gcc or clang actually > does this, but it's possible). Oh, snap! I got confused by your later patch which had all the indirect calls. But yes, for arch/x86/kvm/x86.c you're right, gcc is smart enough to inline it. I guess, clang can do that as well. Thanks, Mathias