2016-09-16 17:06+0200, Paolo Bonzini:
> On 16/09/2016 16:59, Radim Krčmář wrote:
>> KVM_MSR_DEADLINE would be an interface in kvmclock nanosecond values and
>> MSR_IA32_TSCDEADLINE in TSC values.  KVM_MSR_DEADLINE would follow
>> similar rules as MSR_IA32_TSCDEADLINE -- the interrupt fires when
>> kvmclock reaches the value, you read what you write, and 0 disarms it.
>>
>> If the TSC deadline timer was enabled, then the guest could write to
>> both MSR_IA32_TSCDEADLINE and KVM_MSR_DEADLINE, but only one could be
>> armed at any time (non-zero write to one will set the other to 0).
>>
>> The dual interface would allow unconditional addition of the PV feature
>> without regressing users that currently use MSR_IA32_TSCDEADLINE and
>> adapted their stack to handle KVM's TSC shortcomings ...
>
> So far so good.  My question is: what happens if you write to
> KVM_MSR_DEADLINE and read from MSR_IA32_TSCDEADLINE, or vice versa?

(The second paragraph covered it ;])

> The possibilities are:
>
> a) you read a 0

This one.

> b) you read the value converted to the other unit

Too much hassle. :)

> c) you read another value such as -1

Having a common "disarmed" value is nicer and MSR_IA32_TSCDEADLINE has 0.

> (a) and (c) are the simplest of course.  (c) may make sense when writing
> to MSR_IA32_TSCDEADLINE and reading from KVM_MSR_DEADLINE, since we can
> decide which values are valid or not; -1 is technically a valid TSC
> deadline.
>
> I'm not sure about whether to allow (b).  In the end KVM is going to
> convert a nsec deadline to a TSC value internally, and vice versa.

It is not necessary to convert the nsec deadline to guest-TSC, only to
host-TSC in case the VMX_PREEMPTION_TIMER is used.  I would only have the
host-TSC internal representation, which is not exportable to the guest
or migratable.

> On the other hand, if we do, userspace needs to figure out (on migration)
> whether the guest set up a TSC or a nanosecond deadline.

Yeah, I think the solution described below (writing 0 doesn't disarm the
other one) is not bad.

>>> this lets userspace decide whether to set a nsec-based
>>> deadline or a TSC-based deadline after migration.
>>
>> Hm, isn't switching to TSC-based deadline after migration pointless?
>
> Yes, but I didn't mean that.  I meant preserving which MSR was written
> to arm the timer, and redoing the same on the destination.

Ah, I see.  Both MSRs read back the deadline that was written to them (if
they are armed) and at most one can be non-zero.  KVM will add
MSR_IA32_TSCDEADLINE to the list of emulated MSRs, so userspace will
save/restore both deadline MSRs, and zero writes will not disarm the
other timer, so the correct timer will be armed.  No special logic is
needed to try to avoid TSC-related bugs.

>>>>> This still wouldn't handle old hosts of course.
>>>>
>>>> The question is whether we want to carry around 150 LOC because of old
>>>> hosts.  I'd just fix Linux to avoid deadline TSC without invariant TSC.
>>>> :)
>>>
>>> Yes, that would automatically blacklist it on KVM.  You'd also need to
>>> update the recent optimization to the TSC deadline timer, to also work
>>> on other APIC timer modes or at least in your new PV mode.
>>
>> All modes shouldn't be much harder than just the PV mode.
>
> The PV mode would still be a bit easier since it's still the TSC
> deadline timer just with a nicer interface that is not based on the TSC.
> Depends on how you code it though, I guess.

Yeah, we'll see.  I am planning to carry around the deadline value in
nanoseconds (to avoid needless conversions), so it would have similar
requirements as the APIC timer.
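
For illustration, the read/write rules discussed in this thread could be
modelled roughly as below.  This is only a userspace sketch, not KVM code:
the enum, struct, and function names are all made up for the example, the
indices are not real MSR numbers, and actually arming a timer is left out.
It only shows that a non-zero write clears the other MSR, a zero write
leaves the other one alone, and a blind save/restore of both MSRs therefore
re-arms the right one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical indices for the two deadline MSRs (not real MSR numbers). */
enum dl_msr {
    DL_MSR_TSC = 0, /* stands in for MSR_IA32_TSCDEADLINE, TSC units */
    DL_MSR_PV  = 1, /* stands in for the proposed KVM_MSR_DEADLINE, kvmclock ns */
};

/* Value last written to each MSR; 0 means "disarmed". */
struct dl_state {
    uint64_t val[2];
};

/*
 * Write to one deadline MSR.  A non-zero write arms this MSR and sets
 * the other one to 0; a zero write disarms only this MSR.
 */
static void dl_write(struct dl_state *s, enum dl_msr msr, uint64_t v)
{
    enum dl_msr other = (msr == DL_MSR_TSC) ? DL_MSR_PV : DL_MSR_TSC;

    s->val[msr] = v;
    if (v != 0)
        s->val[other] = 0;

    /* At most one of the two can be non-zero after any write. */
    assert(s->val[DL_MSR_TSC] == 0 || s->val[DL_MSR_PV] == 0);
}

/* You read what you wrote; the MSR that is not armed reads as 0. */
static uint64_t dl_read(const struct dl_state *s, enum dl_msr msr)
{
    return s->val[msr];
}

/*
 * Assumed migration flow: save both MSRs on the source and write both
 * on the destination, with no logic to pick the armed one.
 */
static void dl_migrate(const struct dl_state *src, struct dl_state *dst)
{
    dl_write(dst, DL_MSR_TSC, dl_read(src, DL_MSR_TSC));
    dl_write(dst, DL_MSR_PV, dl_read(src, DL_MSR_PV));
}
```

Because a zero write never touches the other MSR, the order of the two
writes in dl_migrate() should not matter in this model.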