Re: [PATCH 2/2] KVM: PPC: Book3S: Call into C interrupt handlers

Alexander Graf <agraf@xxxxxxx> · Fri, 27 Apr 2012 13:23:24 +0200

On 27.04.2012, at 07:48, Paul Mackerras wrote:

> On Thu, Apr 26, 2012 at 12:19:03PM +0200, Alexander Graf wrote:
> 
>> So switch the code over to call into the Linux C handlers from C code,
>> speeding up everything along the way.
> 
> I have to say this patch makes me pretty uneasy.  There are a few
> things that look wrong to me, but more than that, it seems to me that
> there would be a lot of careful thought needed to make sure that the
> approach is bullet-proof.

Yay, finally some review on it :). This method is currently used identically in booke hv, so everything we find broken here also applies there!

> The first thing is that you are filling in the registers, and in
> particular r1, in a subroutine, so you are potentially making regs.r1
> point to a stack frame that no longer exists by the time we look at it
> inside do_IRQ or timer_interrupt.  So, for example, a stack trace
> could go completely off the rails at that point.  Quite possibly gcc
> will inline the kvmppc_fill_pt_regs function, which would probably
> save you, but I don't think you should rely on that.

Ugh. Right.
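
To make the concern concrete, here is a minimal userspace sketch, not the patch code. It assumes GCC or Clang, and uses `__builtin_frame_address(0)` as a stand-in for reading r1. The names `struct sketch_regs`, `fill_regs_subroutine`, `FILL_REGS_HERE` and `compare_variants` are hypothetical and exist only for illustration. A non-inlined subroutine records its own frame, which is dead once it returns, while a macro expanded in the caller records the frame that is still live when the handler later looks at it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs; only the field we need here. */
struct sketch_regs {
    uintptr_t frame;    /* plays the role of regs.r1 */
};

/*
 * Variant A: a real (non-inlined) subroutine. It records its OWN frame,
 * which no longer exists once the function has returned.
 */
__attribute__((noinline))
void fill_regs_subroutine(struct sketch_regs *regs)
{
    memset(regs, 0, sizeof(*regs));
    regs->frame = (uintptr_t)__builtin_frame_address(0);
}

/*
 * Variant B: a macro. It expands inside the caller, so the recorded frame
 * is the caller's, which stays valid for as long as the caller runs.
 */
#define FILL_REGS_HERE(regs) do {                                   \
        memset((regs), 0, sizeof(*(regs)));                         \
        (regs)->frame = (uintptr_t)__builtin_frame_address(0);      \
    } while (0)

/*
 * Returns 1 if the macro recorded this function's live frame and the
 * subroutine recorded some other (already dead) frame.
 */
__attribute__((noinline))
int compare_variants(void)
{
    struct sketch_regs from_call, from_macro;
    uintptr_t live_frame = (uintptr_t)__builtin_frame_address(0);

    fill_regs_subroutine(&from_call);
    FILL_REGS_HERE(&from_macro);

    return from_macro.frame == live_frame && from_call.frame != live_frame;
}
```

Whether the real code should use a macro, a forced-inline helper, or fill the registers directly in the exit path is a separate question; the sketch only shows why relying on gcc's inlining decision is fragile.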

> The second thing is, why do you save just r1, ip, msr, and lr?  Why
> those ones and no others?  A performance monitor interrupt might well
> decide to save all the registers away plus a stack trace, and to see
> all the GPRs as 0 could be very confusing.

Well, any other state at that point is pretty useless, since we have long since moved away from the IP the interrupt originally arrived at. So if a perfmon module reads out other GPRs there, their contents are basically guaranteed to be meaningless anyway, no?

> Thirdly, if preemption is enabled, it could well be that the interrupt
> will wake up a higher priority task which should run before we
> continue.  On 64-bit you are probably saved by the soft_irq_enable
> calls, which will (I think) call schedule() if a reschedule is
> pending, but on 32-bit soft_irq_enable does nothing.

At that point preemption is disabled.

> Fourthly, as Ben said, you should be setting regs->trap.

Yup :). Very good catch that one.

> Have you measured a performance improvement with this patch?  If so
> how big was it?

Yeah, I tried things on a 970 in an mfsprg/mtsprg busy loop. I measured 3 different variants:

C irq handling:      1004944 exits/sec
asm irq handling:    1001774 exits/sec
asm + hsrr patch:     994719 exits/sec

So as you can see, that code change does have quite an impact. But maybe the added complexity isn't worth it? Either way, we should try and find a solution that works the same way for booke and book3s - I don't want such an integral part to differ all that much.
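
For reference, the relative differences between those three numbers can be worked out with a small helper. This is only arithmetic on the figures above; `gain_percent` and `print_gains` are hypothetical names. The C path comes out roughly 0.3% ahead of the asm path and roughly 1.0% ahead of asm + hsrr.

```c
#include <stdio.h>

/* Relative gain of `value` over `baseline`, in percent. */
double gain_percent(double value, double baseline)
{
    return (value - baseline) / baseline * 100.0;
}

/* Print the gains of the C irq path over the two asm variants. */
void print_gains(void)
{
    const double c_irq    = 1004944.0;  /* C irq handling, exits/sec   */
    const double asm_irq  = 1001774.0;  /* asm irq handling, exits/sec */
    const double asm_hsrr =  994719.0;  /* asm + hsrr patch, exits/sec */

    printf("C vs asm:        %.2f%%\n", gain_percent(c_irq, asm_irq));
    printf("C vs asm + hsrr: %.2f%%\n", gain_percent(c_irq, asm_hsrr));
}
```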

Alex
