On 03.01.2014, at 03:51, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote: > On Sun, 2013-12-29 at 16:43 +0100, Alexander Graf wrote: >> On 11.07.2013, at 00:47, Scott Wood <scottwood@xxxxxxxxxxxxx> wrote: >>> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c >>> index ddfaf56..c13caa0 100644 >>> --- a/arch/powerpc/kvm/book3s_pr.c >>> +++ b/arch/powerpc/kvm/book3s_pr.c >>> @@ -884,14 +884,11 @@ program_interrupt: >>> * and if we really did time things so badly, then we just exit >>> * again due to a host external interrupt. >>> */ >>> - local_irq_disable(); >>> s = kvmppc_prepare_to_enter(vcpu); >>> - if (s <= 0) { >>> - local_irq_enable(); >>> + if (s <= 0) >>> r = s; >>> - } else { >>> + else >>> kvmppc_fix_ee_before_entry(); >>> - } >>> } >> >> Please add a comment here stating that interrupts are hard disabled at this point. > > OK. > >>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c >>> index 4e05f8c..2f7a221 100644 >>> --- a/arch/powerpc/kvm/powerpc.c >>> +++ b/arch/powerpc/kvm/powerpc.c >>> @@ -64,12 +64,14 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu) >>> { >>> int r = 1; >>> >>> - WARN_ON_ONCE(!irqs_disabled()); >>> + WARN_ON(irqs_disabled()); >>> + hard_irq_disable(); >>> + >>> while (true) { >>> if (need_resched()) { >>> local_irq_enable(); >> >> One thing I find reasonably tricky in this approach is that we run >> local_irq_enable, but hard_irq_disable. I did dig through the code and >> it should work just fine, but I think we should make sure to note that >> this is intended and doesn't just work by accident. >> >> Just add a comment above the first call to hard_irq_disable() that >> explains that local_irq_enable() will properly unroll hard_irq_disable. >> That way the next person reading this doesn't have to dig as deeply. > > There is no hard_irq_enable() -- only __hard_irq_enable() that doesn't > update local_paca->irq_happened. > > This is normal use of the API. If there does need to be a comment, it > should go in hw_irq.h. :-) Yeah, it's always confusing to me in other places too :). But there are only so many places that have to deal with really hard disabled interrupts. > >>> #ifdef CONFIG_PPC64 >>> - /* lazy EE magic */ >>> - hard_irq_disable(); >>> - if (lazy_irq_pending()) { >>> - /* Got an interrupt in between, try again */ >>> - local_irq_enable(); >>> - local_irq_disable(); >>> - kvm_guest_exit(); >>> - continue; >>> - } >>> + WARN_ON(lazy_irq_pending()); >>> #endif >>> + /* Can't use irqs_disabled() because we want hard irq state */ >>> + WARN_ON(mfmsr() & MSR_EE); >> >> The reason for lazy EE is that mfmsr() is supposed to be slow. > > Not just mtmsr? > > And when I complained about wrtee not being all that slow on our > hardware, Ben said it was also for perf coverage. :-) > >> Couldn't we check for the internal hard disable flag instead? Just create a new >> helper in hw_irq.h that tells us >> >> local_paca->irq_happened & PACA_IRQ_HARD_DIS >> >> on PPC64 and >> >> !(mfmsr() & MSR_EE) >> >> on PPC32. We can then just use that helper here and not run "expensive" mfmsr() calls. > >> I'm not sure whether it's actually measurable overhead in the context >> of the whole world switch, but why diverge from the rest of the system? >> If you think a new helper is overkill, I'd be fine with a simple #ifdef >> here just as well. > > I'd hope a mfmsr wouldn't be so slow on any hardware as to be a big deal > here, though it'd also be nice if there were a Linux-wide way of > specifying whether a particular WARN should be always checked, or only > if the kernel is built with some debug option... > > The "rest of the system" is irqs_disabled() and there's already a > comment explaining why we diverge from that. > > Maybe we should just remove that check; we'd still at least have the > 64-bit check in kvmppc_fix_ee_before_entry. Well, as I mentioned we're already so deep down in the guts of how interrupt handling works that it's perfectly fine to check paca->irq_happened directly here if we care. > >>> kvm_guest_enter(); >>> - break; >>> + return r; >> >> This must be 0 at this point, right? > > No, it'll be 1. > >> Either way, I prefer to keep the code easily understandable. How about >> you keep the break and just add an if (r) around the local_irq_enable() >> below? Then add a comment stating that the return value tells us >> whether entry is ok to proceed and interrupts are disabled (0) or >> something didn't work out and interrupts are enabled (!0). > > How is it more understandable to overload what looks like a single code > path with divergent cases that have little to nothing in common? Would > it help to add a comment on the return-to-host code path indicating that > it is only used for returning to the host? Yup :) > I'd be fine with replacing "return r" with "return 1" (and getting rid > of the initialization of r at the top of the function, unless GCC > complains inappropriately). Works fine for me. Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html