On 07/16/2012 12:18 PM, Alexander Graf wrote: >> +/* >> + * Return the number of jiffies until the next timeout. If the > timeout is >> + * longer than the NEXT_TIMER_MAX_DELTA, that > > then? > >> return NEXT_TIMER_MAX_DELTA >> + * instead. > > I can read code. Come on, it's not exactly x++; /* add one to x */ It's faster to read code (as well as know the constraints within which you can modify it without having to spend a lot of time digesting all the callers' use cases) when you have a high level description of its interface contract, and can be selective about when to zoom in to the details. Linux kernel code tends to be bad about this. > The important piece of information in the comment is > missing: The reason. The reason for what? Why you want to know the next timeout? That's the caller's business. Or why we use NEXT_TIMER_MAX_DELTA as the limit? >> +void kvmppc_watchdog_func(unsigned long data) >> +{ >> + struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data; >> + u32 tsr, new_tsr; >> + int final; >> + >> + do { >> + new_tsr = tsr = vcpu->arch.tsr; >> + final = 0; >> + >> + /* Time out event */ >> + if (tsr & TSR_ENW) { >> + if (tsr & TSR_WIS) { >> + new_tsr = (tsr & ~TCR_WRC_MASK) | >> + (vcpu->arch.tcr & TCR_WRC_MASK); >> + vcpu->arch.tcr &= ~TCR_WRC_MASK; > > Can't we just poke the vcpu to exit the VM and do the above on its own? We've discussed this before. TSR updates are done via atomics, and we send a request for the vcpu to act on the result. This is how the decrementer works. http://www.spinics.net/lists/kvm-ppc/msg03169.html > This is the watdog expired case, right? Final expiration, yes. > I'd also prefer to have an > explicit event for the expiry than a special TSR check in the main loop. So check TSR[WRS] in update_timer_ints(), and have it queue a pseudoexception? That would eliminate the need to change the runnable function. > Also call me sceptic on the reset of tcr. If our user space watchdog > event is "write a message", then we essentially want to hide the fact > that the watchdog expired from the guest, right? In that case, the > second time-out wouldn't do anything guest visible. This was probably copied straight out of the hardware documentation, which explicitly says TCR[WRC] gets set to zero on final expiration (as part of reset). We should leave that part up to userspace. It definitely shouldn't be done inside the cmpxchg loop (or from interrupt context -- only TSR gets the atomic treatment). I don't think the read of TCR outside vcpu context is a problem, though. >> int kvm_arch_vcpu_runnable(struct kvm_vcpu *v) >> { >> - return !(v->arch.shared->msr & MSR_WE) || >> - !!(v->arch.pending_exceptions) || >> - v->requests; >> + bool ret = !(v->arch.shared->msr & MSR_WE) || >> + !!(v->arch.pending_exceptions) || >> + v->requests; >> + >> + ret = ret || kvmppc_get_tsr_wrc(v); > > Why do you need to declare the cpu as non-runnable when a watchdog event > occured? It's the other way around -- it's always runnable when a watchdog exit is pending. It's like a pending exception. >> diff --git a/include/linux/kvm.h b/include/linux/kvm.h >> index 2ce09aa..b9fdb52 100644 >> --- a/include/linux/kvm.h >> +++ b/include/linux/kvm.h >> @@ -163,6 +163,7 @@ struct kvm_pit_config { >> #define KVM_EXIT_OSI 18 >> #define KVM_EXIT_PAPR_HCALL 19 >> #define KVM_EXIT_S390_UCONTROL 20 >> +#define KVM_EXIT_WDT 21 > > Please make this a more generic KVM_EXIT_WATCHDOG so that other archs > may have the chance to reuse it. WDT is generic (85 of 115 files in drivers/watchdog/ contain "wdt" in their names), just overly abbreviated. KVM_EXIT_WATCHDOG is more readable. -Scott -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html