"Linux regression tracking (Thorsten Leemhuis)" <regressions@xxxxxxxxxxxxx> writes: > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting > for once, to make this easily accessible to everyone. > > Was this regression ever resolved? Doesn't look like it, but maybe I > just missed something. I'm not sure how it ended up on the regression list. IMHO it's not really a regression. It was an attempt at a performance optimisation, which is no longer needed due to changes in (unreleased) firmware. I haven't merged it because Nick's reply contained several questions for Vaibhav, so I'm expecting either a reply to those or a new version of the patch. cheers > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) > -- > Everything you wanna know about Linux kernel regression tracking: > https://linux-regtracking.leemhuis.info/about/#tldr > If I did something stupid, please tell me, as explained on that page. > > #regzbot poke > > On 20.03.24 14:43, Nicholas Piggin wrote: >> On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote: >>> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC: >>> Book3S HV nestedv2: Do not cancel pending decrementer exception") which >>> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It >>> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC >>> expiry TB' register which was higher compared to handling extra decrementer >>> exceptions. >>> >>> This overhead of reading 'HDEC expiry TB' register has been mitigated >>> recently by the L0 hypervisor(PowerVM) by putting the value of this >>> register in L2 guest-state output buffer on trap to L1. From there the >>> value of this register is cached, made available in kvmhv_run_single_vcpu() >>> to compare it against host(L1) timebase and cancel the pending hypervisor >>> decrementer exception if needed. >> >> Ah, I figured out the problem here. Guest entry never clears the >> queued dec, because it's level triggered on the DEC MSB so it >> doesn't go away when it's delivered. So upstream code is indeed >> buggy and I think I take the blame for suggesting this nestedv2 >> workaround. >> >> I actually don't think that is necessary though, we could treat it >> like other interrupts. I think that would solve the problem without >> having to test dec here. >> >> I am wondering though, what workload slows down that this patch >> was needed in the first place. We'd only get here after a cede >> returns, then we'd dequeue the dec and stop having to GET_STATE >> it here. >> >> Thanks, >> Nick >> >>> >>> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception") >>> Signed-off-by: Vaibhav Jain <vaibhav@xxxxxxxxxxxxx> >>> --- >>> arch/powerpc/kvm/book3s_hv.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >>> index 0b921704da45..e47b954ce266 100644 >>> --- a/arch/powerpc/kvm/book3s_hv.c >>> +++ b/arch/powerpc/kvm/book3s_hv.c >>> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit, >>> * entering a nested guest in which case the decrementer is now owned >>> * by L2 and the L1 decrementer is provided in hdec_expires >>> */ >>> - if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) && >>> + if (kvmppc_core_pending_dec(vcpu) && >>> ((tb < kvmppc_dec_expires_host_tb(vcpu)) || >>> (trap == BOOK3S_INTERRUPT_SYSCALL && >>> kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED))) >>