On 16/07/2014 11:52, Igor Mammedov wrote:
> There are buggy hosts in the wild that advertise invariant TSC and as result host uses TSC as clocksource, but TSC on such host sometimes sporadically jumps backwards.
>
> This causes kvmclock to go backwards if host advertises PVCLOCK_TSC_STABLE_BIT, which turns off aggregated clock accumulator and returns:
>   pvclock_vcpu_time_info.system_timestamp + offset
> where 'offset' is calculated using TSC. Since TSC is not virtualized in KVM, it makes guest see TSC jumped backwards and leads to kvmclock going backwards as well.
>
> This is defensive patch that keeps per CPU last clock value and ensures that clock will never go backwards even with using PVCLOCK_TSC_STABLE_BIT enabled path.
I'm not sure that a per-CPU value is enough; your patch can make the problem much less frequent of course, but I'm not sure that either detection or correction is 100% reliable. Your addition is basically a faster but less reliable version of the last_value logic.
It may be okay to have detection that is faster but not 100% reliable. However, once you find that the host is buggy I think the correct thing to do is to write last_value and kill PVCLOCK_TSC_STABLE_BIT from valid_flags.
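
To make the suggestion concrete, here is a rough userspace sketch of what I mean, not a patch against pvclock.c. All names (sketch_clock_read, clamp_to_last_value, percpu_last, SKETCH_TSC_STABLE_BIT) are made up for illustration, C11 atomics stand in for atomic64_t, and a thread-local variable stands in for a real per-CPU variable:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PVCLOCK_TSC_STABLE_BIT. */
#define SKETCH_TSC_STABLE_BIT (1u << 0)

static _Atomic uint64_t last_value;                      /* global monotonic floor */
static _Atomic uint8_t valid_flags = SKETCH_TSC_STABLE_BIT;
static _Thread_local uint64_t percpu_last;               /* assumption: models a per-CPU value */

/* Slow path: never return less than the largest value published so far. */
static uint64_t clamp_to_last_value(uint64_t now)
{
    uint64_t last = atomic_load(&last_value);

    do {
        if (now <= last)
            return last;
        /* On failure, 'last' is reloaded with the current global value. */
    } while (!atomic_compare_exchange_weak(&last_value, &last, now));

    return now;
}

/* 'raw' is the value computed from system_timestamp + TSC offset. */
static uint64_t sketch_clock_read(uint64_t raw, uint8_t host_flags)
{
    bool stable = atomic_load(&valid_flags) & host_flags & SKETCH_TSC_STABLE_BIT;

    if (stable) {
        if (raw >= percpu_last) {
            percpu_last = raw;          /* fast path: no shared cacheline touched */
            return raw;
        }
        /*
         * Backwards jump seen: the host is buggy. Write last_value with
         * the last good reading, then kill the stable bit so every later
         * read on every CPU takes the slow path.
         */
        clamp_to_last_value(percpu_last);
        atomic_fetch_and(&valid_flags, (uint8_t)~SKETCH_TSC_STABLE_BIT);
    }

    return clamp_to_last_value(raw);
}
```

Note that the value written to last_value comes from one CPU only, so another CPU may already have returned something larger; that is the same "not 100% reliable" window as above, but it closes once the stable bit is gone.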
Did you check that the affected host has the latest microcode? Alternatively, could we simply blacklist some CPU steppings? I'm not sure who we could ask at AMD :( but perhaps there is an erratum.
Paolo