Jeremy Fitzhardinge wrote:
> The softlockup watchdog is currently a nuisance in a virtual machine,
> since the whole system could have the CPU stolen from it for a long
> period of time. While it would be unlikely for a guest domain to be
> denied timer interrupts for over 10s, it could happen and any softlockup
> message would be completely spurious.
>

No, it is not unlikely. 4-way SMP VMs idling exhibit this behavior with
NO_HZ or NO_IDLE_HZ because they get quiet enough to schedule nothing on
the APs. And that can happen on native hardware as well.

> Earlier I proposed that sched_clock() return time in unstolen
> nanoseconds, which is how Xen and VMI currently implement it. If the
> softlockup watchdog uses sched_clock() to measure time, it would
> automatically ignore stolen time, and therefore only report when the
> guest itself locked up. When running native, sched_clock() returns
> real-time nanoseconds, so the behaviour would be unchanged.
>
> Does this seem sound?
>
> Also, softlockup.c's use of jiffies seems archaic now. Should it be
> converted to use timers? Mightn't it report lockups just because there
> was no timer event?
>

This looks good to me, as a first order approximation. But on native
hardware, with NO_HZ, this is just broken to begin with. Perhaps we
should make SOFTLOCKUP depend on !NO_HZ.

Zach
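
For illustration, here is a rough user-space sketch of the idea quoted above: a watchdog that measures elapsed time on a clock that excludes stolen time. This is not kernel code. The names (struct vcpu_clock, unstolen_clock(), watchdog_expired() and the helpers) are hypothetical and only model the behaviour described; the 10s threshold is taken from the figure mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC            1000000000ULL
#define SOFTLOCKUP_THRESHOLD_NS (10 * NSEC_PER_SEC) /* assumed 10s limit */

/* Hypothetical model of one virtual CPU's time accounting. */
struct vcpu_clock {
    uint64_t wall_ns;   /* real time elapsed since boot */
    uint64_t stolen_ns; /* portion of wall_ns the hypervisor took away */
};

/* Hypothetical watchdog state: clock value at the last "touch". */
struct watchdog {
    uint64_t last_touch_ns;
};

/* Models a sched_clock() that returns unstolen nanoseconds. */
static uint64_t unstolen_clock(const struct vcpu_clock *c)
{
    assert(c->stolen_ns <= c->wall_ns);
    return c->wall_ns - c->stolen_ns;
}

/* The guest actually ran for ns nanoseconds. */
static void vcpu_run(struct vcpu_clock *c, uint64_t ns)
{
    c->wall_ns += ns;
}

/* The hypervisor descheduled the guest for ns nanoseconds. */
static void vcpu_steal(struct vcpu_clock *c, uint64_t ns)
{
    c->wall_ns += ns;
    c->stolen_ns += ns;
}

/* Record that the watchdog thread got to run at time now_ns. */
static void watchdog_touch(struct watchdog *w, uint64_t now_ns)
{
    w->last_touch_ns = now_ns;
}

/* True if more than the threshold has passed since the last touch. */
static bool watchdog_expired(const struct watchdog *w, uint64_t now_ns)
{
    return now_ns - w->last_touch_ns > SOFTLOCKUP_THRESHOLD_NS;
}
```

Feeding watchdog_expired() the value of unstolen_clock() means a 15s stretch of stolen time does not trip the check, while 11s of the guest actually running without a touch does. The sketch does not model the NO_HZ case raised in the reply: an idle CPU with no timer tick has not had its time stolen, so an unstolen clock alone would not suppress a report there.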