Excerpts from Laurent Vivier's message of October 20, 2021 4:29 pm: > On 15/10/2021 04:23, Nicholas Piggin wrote: >> Excerpts from Laurent Vivier's message of October 13, 2021 7:30 pm: >>> On 13/10/2021 01:18, Michael Ellerman wrote: >>>> Laurent Vivier <lvivier@xxxxxxxxxx> writes: >>>>> Commit 112665286d08 moved guest_exit() in the interrupt protected >>>>> area to avoid wrong context warning (or worse), but the tick counter >>>>> cannot be updated and the guest time is accounted to the system time. >>>>> >>>>> To fix the problem port to POWER the x86 fix >>>>> 160457140187 ("Defer vtime accounting 'til after IRQ handling"): >>>>> >>>>> "Defer the call to account guest time until after servicing any IRQ(s) >>>>> that happened in the guest or immediately after VM-Exit. Tick-based >>>>> accounting of vCPU time relies on PF_VCPU being set when the tick IRQ >>>>> handler runs, and IRQs are blocked throughout the main sequence of >>>>> vcpu_enter_guest(), including the call into vendor code to actually >>>>> enter and exit the guest." >>>>> >>>>> Fixes: 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs") >>>>> Cc: npiggin@xxxxxxxxx >>>>> Cc: <stable@xxxxxxxxxxxxxxx> # 5.12 >>>>> Signed-off-by: Laurent Vivier <lvivier@xxxxxxxxxx> >>>>> --- >>>>> >>>>> Notes: >>>>> v2: remove reference to commit 61bd0f66ff92 >>>>> cc stable 5.12 >>>>> add the same comment in the code as for x86 >>>>> >>>>> arch/powerpc/kvm/book3s_hv.c | 24 ++++++++++++++++++++---- >>>>> 1 file changed, 20 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >>>>> index 2acb1c96cfaf..a694d1a8f6ce 100644 >>>>> --- a/arch/powerpc/kvm/book3s_hv.c >>>>> +++ b/arch/powerpc/kvm/book3s_hv.c >>>> ... >>>>> @@ -4506,13 +4514,21 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit, >>>>> >>>>> srcu_read_unlock(&kvm->srcu, srcu_idx); >>>>> >>>>> + context_tracking_guest_exit(); >>>>> + >>>>> set_irq_happened(trap); >>>>> >>>>> kvmppc_set_host_core(pcpu); >>>>> >>>>> - guest_exit_irqoff(); >>>>> - >>>>> local_irq_enable(); >>>>> + /* >>>>> + * Wait until after servicing IRQs to account guest time so that any >>>>> + * ticks that occurred while running the guest are properly accounted >>>>> + * to the guest. Waiting until IRQs are enabled degrades the accuracy >>>>> + * of accounting via context tracking, but the loss of accuracy is >>>>> + * acceptable for all known use cases. >>>>> + */ >>>>> + vtime_account_guest_exit(); >>>> >>>> This pops a warning for me, running guest(s) on Power8: >>>> >>>> [ 270.745303][T16661] ------------[ cut here ]------------ >>>> [ 270.745374][T16661] WARNING: CPU: 72 PID: 16661 at arch/powerpc/kernel/time.c:311 vtime_account_kernel+0xe0/0xf0 >>> >>> Thank you, I missed that... >>> >>> My patch is wrong, I have to add vtime_account_guest_exit() before the local_irq_enable(). >> >> I thought so because if we take an interrupt after exiting the guest that >> should be accounted to kernel not guest. >> >>> >>> arch/powerpc/kernel/time.c >>> >>> 305 static unsigned long vtime_delta(struct cpu_accounting_data *acct, >>> 306 unsigned long *stime_scaled, >>> 307 unsigned long *steal_time) >>> 308 { >>> 309 unsigned long now, stime; >>> 310 >>> 311 WARN_ON_ONCE(!irqs_disabled()); >>> ... >>> >>> But I don't understand how ticks can be accounted now if irqs are still disabled. >>> >>> Not sure it is as simple as expected... >> >> I don't know all the timer stuff too well. The >> !CONFIG_VIRT_CPU_ACCOUNTING case is relying on PF_VCPU to be set when >> the host timer interrupt runs irqtime_account_process_tick runs so it >> can accumulate that tick to the guest? >> >> That probably makes sense then, but it seems like we need that in a >> different place. Timer interrupts are not guaranteed to be the first one >> to occur when interrupts are enabled. >> >> Maybe a new tick_account_guest_exit() and move PF_VCPU clearing to that >> for tick based accounting. Call it after local_irq_enable and call the >> vtime accounting before it. Would that work? > > Hi Nick, > > I think I will not have the time to have a look to fix that? > > Could you try? Hey Laurent, sure I can take a look at it. Thanks, Nick