On Sun, Mar 19, 2017 at 03:42:16PM +0100, Marc Haber wrote:
> Hi,
>
> I am running a bunch of test VMs on a host (with an AMD Phenom II X6
> 1090T Processor [I am afraid this matters]) using KVM. Host and guest
> OS is Debian unstable, and I'm running home-brewed kernel trying to
> stay close to Greg's stable releases. Disks are encrypted, so
> rebooting the machine from remote is a bit of a pain.
>
> Since those are just test VMs and the host is also my home desktop
> machine, I suspend the host at night without caring for the VMs.
> Usually, this works fine with the VMs just chugging away again after
> waking up the host.
>
> However, sometimes it happens that the clock in the VMs stays stopped
> after waking up the host. That means, date, wait 10 seconds, date,
> will yield the same output (the last datestamp of when the host was
> suspended), and a sleep call will never return to the shell.

Ok, so timekeeping in the guest is not functioning: either because the
host services the guest needs for timekeeping (such as timer
interrupts) are not working, or because of a bug in the guest
timekeeping code.

> In this case, the VMs run just normally until they encounter a sleep
> call. In this case, the affected process will just sit still and wait
> for the sleep to return which never happens. If the job is still in
> foreground of shell session, aborting with ctrl-C works.
>
> Of course this is not a desirable state of operation. The system is
> usually a candidate for the MagicSysRq BUSIER routine since a normal
> shutdown contains sleep calls...
>
> I tried reproducing this on a test box that is easier to reboot to be
> able to bisect, but I was not able to reproduce the issue there. The
> test box has a Sandy Bridge i5 processor, which is the reason that I
> suspect that the CPU type matters. Sadly, I do not have a second
> Phenom available.
>
> Has anybody ever encountered this situation? Any ideas how to debug
> this?

Never seen this before.

To debug, I would:

1) Enable the following tracepoint in the host:

   # echo kvm_inj_virq > /sys/kernel/debug/tracing/set_event

2) Enable function tracing for the following function in the guest
   (I don't remember off the top of my head how to do it; search for
   set_ftrace_filter in the ftrace documentation):

   update_wall_time

This should let you know whether the host or the guest is at fault.

What version of the host/guest is this again?
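
For reference, the two tracing steps above could be scripted roughly as
follows. This is only a sketch: it assumes debugfs is mounted at
/sys/kernel/debug, that it is run as root, and that update_wall_time
shows up in available_filter_functions on the guest kernel. The TRACING
variable and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only. TRACING and the helper names are illustrative, not part
# of any tool. Assumes debugfs is mounted and the caller is root.
TRACING="${TRACING:-/sys/kernel/debug/tracing}"

# Host side: enable the kvm_inj_virq tracepoint, which logs interrupt
# injections into guests.
trace_host_virq() {
    echo kvm_inj_virq > "$TRACING/set_event"
    echo 1 > "$TRACING/tracing_on"
}

# Guest side: restrict the function tracer to update_wall_time so the
# trace shows whether the timekeeping update is still being called.
trace_guest_wall_time() {
    echo update_wall_time > "$TRACING/set_ftrace_filter"
    echo function > "$TRACING/current_tracer"
    echo 1 > "$TRACING/tracing_on"
}

# Print the last N lines (default 20) of the trace buffer.
show_trace() {
    tail -n "${1:-20}" "$TRACING/trace"
}
```

Call trace_host_virq on the host and trace_guest_wall_time in the
guest, reproduce the stall, then run show_trace on both sides. My
reading is that if the host keeps logging kvm_inj_virq entries while
the guest logs no update_wall_time calls, the guest is the more likely
culprit, and if the host stops injecting interrupts, the problem is on
the host side.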