On Mon, 2022-03-21 at 00:38 +0000, Oliver Upton wrote:
> On Sun, Mar 20, 2022 at 09:46:35AM -0000, David Woodhouse wrote:
> > But coincidentally since then I have started having conversations with
> > people who really want the guest to have an immediate knowledge of the
> > adjtimex maxerror etc. on the new host immediately after the migration.
> > Maybe the "if the migration isn't fast enough then let the guest know it's
> > now unsynced" is OK, but I'll need to work out what "immediately" means
> > when we have a guest userspace component involved in it.
>
> This has also been an area of interest to me. I think we've all seen the
> many ways in which doing migrations behind the guest's back can put software
> in an extremely undesirable state on the other end. If those
> conversations are taking place on the mailing lists, could you please CC
> me?
>
> Our (Google) TSC adjustment clamping and userspace notification mechanism
> was a halfway kludge to keep things happy on the other end. And it
> generally has worked well, but misses a fundamental point.
>
> The hypervisor should tell the guest kernel about time travel and let it
> cascade that information throughout the guest system. Regardless of what
> we do to the TSC, we invariably destroy one of the two guest clocks along
> the way. If we told the guest "you time traveled X seconds", it could
> fold that into its own idea of real time. Guest kernel can then fire off
> events to inform software that wants to keep up with clock changes, and
> even a new event to let NTP know it's probably running on different
> hardware.
>
> Time sucks :-)

So, we already have PVCLOCK_GUEST_STOPPED which tells the guest that
its clock may have experienced a jump. Linux guests will use this to
kick various watchdogs to prevent them whining. Shouldn't we *also* be
driving the NTP reset from that same signal?
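
To make the question concrete, here is a rough userspace model of what I
mean, not kernel code. The flag value matches the pvclock ABI bit, but
struct guest_clock, its fields and guest_clock_check_stopped() are names
made up for illustration; the real guest path kicks the watchdogs from
the pvclock read side, and the "mark NTP unsynced" step is exactly the
part that is only being proposed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit value as the pvclock ABI flag in Linux. */
#define PVCLOCK_GUEST_STOPPED (1 << 1)

/* Hypothetical stand-in for the guest's view of its pvclock state. */
struct guest_clock {
    uint8_t flags;            /* pvclock flags byte shared by the host */
    unsigned watchdog_kicks;  /* models kicking the guest watchdogs */
    bool ntp_synced;          /* models the guest's NTP sync state */
};

/*
 * If the host has signalled that the guest was stopped, consume the
 * flag, kick the watchdogs (what Linux guests do today) and also drop
 * NTP sync (the proposed addition). Returns true if a stop was seen.
 */
static bool guest_clock_check_stopped(struct guest_clock *clk)
{
    assert(clk != NULL);

    if (!(clk->flags & PVCLOCK_GUEST_STOPPED))
        return false;

    /* Clear only the stopped bit so one stop is handled exactly once. */
    clk->flags &= (uint8_t)~PVCLOCK_GUEST_STOPPED;
    clk->watchdog_kicks++;
    clk->ntp_synced = false;
    return true;
}
```

The point of hanging both actions off the same check is that the guest
then has a single place where "my clock may have jumped" is consumed, and
userspace reading adjtimex() would see the unsynced state afterwards.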