RE: hv_utils PTP support and hypervisor suspend/resume

From: Thor Simon <Thor.Simon@xxxxxxxxxxxx> Sent: Wednesday, February 24, 2021 10:00 AM
> 
> The TimeSync support in hv_utils presently has a "fail safe" limit of 600 seconds.  I'm sure in
> a datacenter server context, where the hypervisor's time is expected to be tightly
> controlled - and continuous - this is sensible.
> 
> Unfortunately, this causes linux VMs to lose time sync unrecoverably in the not-uncommon
> case where the hypervisor's running on a laptop or desktop system that is suspended (or
> hibernated) and resumed.
> 
> Does Hyper-V provide any interface by which we could detect this has occurred and
> override the test for time too far out of sync?  Or, if not, would adding a module option to
> suppress the test be acceptable?

There is a known bug with 5.8 and earlier kernel versions that can cause
Linux timesync with the Hyper-V host to get hung, so that the timesync stops
happening.  The problem can occur after the Hyper-V host is hibernated and
resumed, or if the guest is paused and resumed. The known problem is fixed
by this commit:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/hv/hv_util.c?id=b46b4a8a57c377b72a98c7930a9f6969d2d4784e

I've just reviewed the code again, and I don't think the 600 second "fail safe"
limit is coming into play in the scenario you describe.  With the above patch in
place, once Hyper-V resumes from hibernation, the first timesync packet sent
by Hyper-V will set the host_ts.ref_time value to a very current time.  The
ICTIMESYNCFLAG_SYNC flag will also be set, so hv_set_host_time() is called
via work_struct adj_time_work.  hv_set_host_time() will call
hv_get_adj_host_time(), which will find that host_ts.ref_time is very close to
the value from hv_read_reference_counter().  So the 600 second test won't
be triggered.
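
To illustrate the reasoning above, here is a minimal user-space sketch of the
staleness test. It is not the driver code: struct host_sample,
sample_is_stale(), adjusted_host_time() and STALE_THRESH_NS are hypothetical
names, and the sketch assumes that the reference counter ticks in 100 ns units
and that the limit is 600 seconds expressed in nanoseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the "fail safe" limit is 600 seconds, held in nanoseconds. */
#define STALE_THRESH_NS (600ULL * 1000ULL * 1000ULL * 1000ULL)

/* Hypothetical stand-in for the cached host sample (host_ts in the driver). */
struct host_sample {
    uint64_t host_time; /* host wall-clock time, assumed 100 ns units */
    uint64_t ref_time;  /* reference counter value when the sample arrived */
};

/*
 * Returns true when the cached sample is older than the threshold,
 * measured against the current reference counter value ref_now.
 */
static bool sample_is_stale(const struct host_sample *s, uint64_t ref_now)
{
    uint64_t elapsed_ticks = ref_now - s->ref_time;

    /* Convert 100 ns ticks to nanoseconds before comparing. */
    return elapsed_ticks * 100ULL > STALE_THRESH_NS;
}

/* Host time advanced by the ticks elapsed since the sample was taken. */
static uint64_t adjusted_host_time(const struct host_sample *s, uint64_t ref_now)
{
    return s->host_time + (ref_now - s->ref_time);
}
```

With a sample that has just been refreshed, ref_now - ref_time is tiny, so
sample_is_stale() returns false.  It would only return true if no new sample
had arrived for more than 600 seconds of reference counter time.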

So my guess is that you are experiencing the known bug that I initially described.
But let me know if I'm misunderstanding, or if you are seeing a failure path
that I'm missing.

Michael



