On 27/10/2021 16:21, Nicholas Piggin wrote:
From: Laurent Vivier <lvivier@xxxxxxxxxx>
Commit 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest
context before enabling irqs") moved guest_exit() into the interrupt
protected area to avoid wrong context warning (or worse). The problem is
that tick-based time accounting has not yet been updated at this point
(because it depends on the timer interrupt firing), so the guest time
gets incorrectly accounted to system time.
To fix the problem, follow the x86 fix in commit 160457140187 ("Defer
vtime accounting 'til after IRQ handling"), and allow host IRQs to run
before accounting the guest exit time.
In the case vtime accounting is enabled, this is not required because TB
is used directly for accounting.
Before this patch, with CONFIG_TICK_CPU_ACCOUNTING=y in the host and a
guest running a kernel compile, the 'guest' fields of /proc/stat are
stuck at zero. With the patch they can be observed increasing roughly as
expected.
Fixes: e233d54d4d97 ("KVM: booke: use __kvm_guest_exit")
Fixes: 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs")
Cc: <stable@xxxxxxxxxxxxxxx> # 5.12
Signed-off-by: Laurent Vivier <lvivier@xxxxxxxxxx>
[np: only required for tick accounting, add Book3E fix, tweak changelog]
Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
---
Since v2:
- I took over the patch with Laurent's blessing.
- Changed to avoid processing IRQs if we do have vtime accounting
enabled.
- Changed so in either case the accounting is called with irqs disabled.
- Added similar Book3E fix.
- Rebased on upstream, tested, observed bug and confirmed fix.
arch/powerpc/kvm/book3s_hv.c | 30 ++++++++++++++++++++++++++++--
arch/powerpc/kvm/booke.c | 16 +++++++++++++++-
2 files changed, 43 insertions(+), 3 deletions(-)
Tested-by: Laurent Vivier <lvivier@xxxxxxxxxx>
Checked with mpstat that time is accounted to %guest while a stress-ng test is running in
the guest. Checked there is no warning in the host kernellogs.
Thanks,
Laurent