On Fri, Feb 03, 2023, lirongqing@xxxxxxxxx wrote:
> From: Li RongQing <lirongqing@xxxxxxxxx>
>
> pit_shutdown() in drivers/clocksource/i8253.c doesn't work because
> setting the counter register to zero causes the PIT to start running
> again, negating the shutdown.

If this goes anywhere, the changelog needs to be rewritten to describe how KVM
is violating the 8253/8254 spec, not how code in Linux-as-a-guest breaks.

> fix it by stopping pit timer and zeroing channel count
>
> Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> ---
>  arch/x86/kvm/i8254.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index e0a7a0e..c8a51f5 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -358,13 +358,15 @@ static void create_pit_timer(struct kvm_pit *pit, u32 val, int is_period)
>  		}
>  	}
>
> -	hrtimer_start(&ps->timer, ktime_add_ns(ktime_get(), interval),
> +	if (interval)
> +		hrtimer_start(&ps->timer, ktime_add_ns(ktime_get(), interval),
>  		      HRTIMER_MODE_ABS);
>  }
>
>  static void pit_load_count(struct kvm_pit *pit, int channel, u32 val)
>  {
>  	struct kvm_kpit_state *ps = &pit->pit_state;
> +	u32 org = val;
>
>  	pr_debug("load_count val is %u, channel is %d\n", val, channel);
>
> @@ -386,6 +388,9 @@ static void pit_load_count(struct kvm_pit *pit, int channel, u32 val)
>  	 * mode 1 is one shot, mode 2 is period, otherwise del timer */
>  	switch (ps->channels[0].mode) {
>  	case 0:
> +		val = org;
> +		ps->channels[channel].count = val;
> +		fallthrough;

The existing behavior is KVM ABI, e.g. KVM_SET_PIT and KVM_SET_PIT2.

I'm also not convinced that KVM is in the wrong here.  From the 8254 spec:

  The largest possible initial count is 0; this is equivalent to 2^16 for
  binary counting and 10^4 for BCD counting.

  The Counter does not stop when it reaches zero. In Modes 0, 1, 4, and 5 the
  Counter "wraps around" to the highest count, either FFFF hex for binary
  counting or 9999 for BCD counting, and continues counting.

  Mode 0 is typically used for event counting. After the Control Word is
  written, OUT is initially low, and will remain low until the Counter
  reaches zero. OUT then goes high and remains high until a new count or a
  new Mode 0 Control Word is written into the Counter.

Maybe some actual hardware has a quirk where writing '0' disables the counter,
but per the spec, I think Hyper-V and KVM have it right.
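
To make the spec text quoted above concrete, here is a minimal userspace
sketch (not KVM code) of how a programmed initial count maps to the effective
number of clock ticks. The helper names pit_effective_count() and bcd_to_u32()
are hypothetical, and the sketch assumes a 16-bit count register whose BCD
form holds four packed decimal digits:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: decode four packed BCD digits into a binary value. */
static uint32_t bcd_to_u32(uint16_t reg)
{
	return ((reg >> 12) & 0xf) * 1000u +
	       ((reg >> 8) & 0xf) * 100u +
	       ((reg >> 4) & 0xf) * 10u +
	       (reg & 0xf);
}

/*
 * Hypothetical helper: effective number of ticks for a programmed count.
 * Per the 8254 text quoted above, 0 is the largest count, not "stopped".
 */
static uint32_t pit_effective_count(uint16_t reg, bool bcd)
{
	if (reg == 0)
		return bcd ? 10000u : 0x10000u;

	return bcd ? bcd_to_u32(reg) : reg;
}
```

Under this model, writing 0 in binary mode yields 65536 ticks rather than a
stopped counter, which is the reading of the spec argued for above.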