On 27.04.2012, at 07:48, Paul Mackerras wrote:

> On Thu, Apr 26, 2012 at 12:19:03PM +0200, Alexander Graf wrote:
>
>> So switch the code over to call into the Linux C handlers from C code,
>> speeding up everything along the way.
>
> I have to say this patch makes me pretty uneasy. There are a few
> things that look wrong to me, but more than that, it seems to me that
> there would be a lot of careful thought needed to make sure that the
> approach is bullet-proof.

Yay, finally some review on it :). This method is currently used identically in booke hv, so everything we find broken here also applies there!

> The first thing is that you are filling in the registers, and in
> particular r1, in a subroutine, so you are potentially making regs.r1
> point to a stack frame that no longer exists by the time we look at it
> inside do_IRQ or timer_interrupt. So, for example, a stack trace
> could go completely off the rails at that point. Quite possibly gcc
> will inline the kvmppc_fill_pt_regs function, which would probably
> save you, but I don't think you should rely on that.

Ugh. Right.

> The second thing is, why do you save just r1, ip, msr, and lr? Why
> those ones and no others? A performance monitor interrupt might well
> decide to save all the registers away plus a stack trace, and to see
> all the GPRs as 0 could be very confusing.

Well, any other state at that point is pretty useless, since we've long deferred from the original IP the interrupt arrived at. So if a perfmon module reads out other GPRs there, they are basically guaranteed to be useless anyway, no?

> Thirdly, if preemption is enabled, it could well be that the interrupt
> will wake up a higher priority task which should run before we
> continue. On 64-bit you are probably saved by the soft_irq_enable
> calls, which will (I think) call schedule() if a reschedule is
> pending, but on 32-bit soft_irq_enable does nothing.

At that point preemption is disabled.

> Fourthly, as Ben said, you should be setting regs->trap.

Yup :). Very good catch that one.

> Have you measured a performance improvement with this patch? If so
> how big was it?

Yeah, I tried things on 970 in an mfsprg/mtsprg busy loop. I measured 3 different variants:

  C irq handling:     1004944 exits/sec
  asm irq handling:   1001774 exits/sec
  asm + hsrr patch:    994719 exits/sec

So as you can see, that code change does have quite an impact. But maybe the added complexity isn't worth it? Either way, we should try and find a solution that works the same way for booke and book3s - I don't want such an integral part to differ all that much.


Alex
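
To make the r1 and regs->trap points concrete, here is a rough userspace sketch, not the patch itself. Everything prefixed mock_ or MOCK_ is hypothetical: the struct is only a stand-in for pt_regs, 0x500 is assumed as the external interrupt vector, and the GCC/Clang builtins stand in for reading r1 and lr directly. The idea is to fill the regs from a macro, so the recorded stack pointer belongs to the caller's frame, which is still live while the handler runs, instead of to a helper's frame that is gone by then.

```c
/* Userspace illustration only; assumes GCC or Clang for the builtins. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct mock_pt_regs {
    uintptr_t gpr1;  /* stack pointer (r1) */
    uintptr_t nip;   /* instruction pointer, left 0 in this sketch */
    uintptr_t msr;   /* machine state register, left 0 in this sketch */
    uintptr_t link;  /* link register */
    uintptr_t trap;  /* interrupt vector the handler is told about */
};

/* Assumed external interrupt vector. */
#define MOCK_TRAP_EXTERNAL ((uintptr_t)0x500)

/*
 * A macro expands inside the caller, so the frame address recorded here
 * is the caller's own frame. A non-inlined helper function would instead
 * record its own frame, which no longer exists once it has returned.
 */
#define MOCK_FILL_PT_REGS(regs, trapnr)                                \
    do {                                                               \
        memset((regs), 0, sizeof(*(regs)));                            \
        (regs)->gpr1 = (uintptr_t)__builtin_frame_address(0);          \
        (regs)->link = (uintptr_t)__builtin_return_address(0);         \
        (regs)->trap = (trapnr);                                       \
    } while (0)

/* Values the mock handler observed, kept for inspection. */
static uintptr_t mock_seen_trap;
static uintptr_t mock_seen_sp;

/* Mock of a C interrupt handler that expects a trap number to be set. */
static void mock_do_IRQ(const struct mock_pt_regs *regs)
{
    assert(regs->trap != 0);
    mock_seen_trap = regs->trap;
    mock_seen_sp = regs->gpr1;
}

/* Mock exit path: fill the regs in this frame, then call the handler. */
uintptr_t mock_handle_exit(void)
{
    struct mock_pt_regs regs;

    MOCK_FILL_PT_REGS(&regs, MOCK_TRAP_EXTERNAL);
    mock_do_IRQ(&regs);
    return regs.trap;
}
```

Calling mock_handle_exit() runs the mock handler with a stack pointer taken from a frame that is still on the stack and with trap set. Whether a macro or a forced-inline function is the nicer way to get that property in the real code is a separate question.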