On Thu, 2015-04-23 at 15:31 +0300, Purcareata Bogdan wrote:
> On 23.04.2015 03:30, Scott Wood wrote:
> > On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote:
> >> On 21.04.2015 03:52, Scott Wood wrote:
> >>> On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote:
> >>>> There was a weird situation for .kvmppc_mpic_set_epr - its corresponding inner
> >>>> function is kvmppc_set_epr, which is a static inline. Removing the static inline
> >>>> yields a compiler crash (Segmentation fault (core dumped) -
> >>>> scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o' failed),
> >>>> but that's a different story, so I just let it be for now. The point is that the
> >>>> measured time may include other work done after the lock has been released, but
> >>>> before the function actually returned. I noticed this was the case for
> >>>> .kvm_set_msi, which could account for up to 90 ms, not actually under the lock.
> >>>> This made me change what I'm looking at.
> >>>
> >>> kvm_set_msi does pretty much nothing outside the lock -- I suspect
> >>> you're measuring an interrupt that happened as soon as the lock was
> >>> released.
> >>
> >> That's exactly right. I've seen things like a timer interrupt occurring right
> >> after the spin_unlock_irqrestore, but before kvm_set_msi actually returned.
> >>
> >> [...]
> >>
> >>>> Or perhaps a different stress scenario involving a lot of VCPUs
> >>>> and external interrupts?
> >>>
> >>> You could instrument the MPIC code to find out how many loop iterations
> >>> you maxed out on, and compare that to the theoretical maximum.
> >>
> >> Numbers are pretty low, and I'll try to explain based on my observations.
> >>
> >> The problematic section in openpic_update_irq is this [1], since it loops
> >> through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops
> >> through all pending interrupts for a VCPU [2].
> >>
> >> The guest interfaces are virtio-vhostnet, which are based on MSI
> >> (/proc/interrupts in the guest shows they are MSI). For external interrupts to
> >> the guest, the irq_source destmask is currently 0, and last_cpu is 0
> >> (uninitialized), so [1] will go on and deliver the interrupt directly and
> >> unicast (no VCPU loop).
> >>
> >> I activated the pr_debugs in arch/powerpc/kvm/mpic.c to see how many interrupts
> >> are actually pending for the destination VCPU. At most, there were 3 interrupts
> >> - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I understand that
> >> guest virtio interrupts are cascaded over 1 or a couple of shared MSI interrupts.
> >>
> >> So the worst case, in this scenario, was checking the priorities of 3 pending
> >> interrupts for 1 VCPU. Something like this (some of my prints included):
> >>
> >> [61010.582033] openpic_update_irq: destmask 1 last_cpu 0
> >> [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ
> >> [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1
> >> [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1
> >> [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1
> >> [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1
> >>
> >> It would be really helpful to get your comments on whether these are
> >> realistic numbers for everyday use, or whether they are relevant only to this
> >> particular scenario.
> >
> > RT isn't about "realistic numbers for everyday use". It's about worst
> > cases.
> >
> >> - Can these interrupts be used in directed delivery, so that the destination
> >> mask can include multiple VCPUs?
> >
> > The Freescale MPIC does not support multiple destinations for most
> > interrupts, but the (non-FSL-specific) emulation code appears to allow
> > it.
> >
> >> The MPIC manual states that timer and IPI
> >> interrupts are supported for directed delivery, although I'm not sure how much
> >> of this is used in the emulation. I know that kvmppc uses the decrementer
> >> outside of the MPIC.
> >>
> >> - How are virtio interrupts cascaded over the shared MSI interrupts?
> >> /proc/device-tree/soc@e0000000/msi@41600/interrupts in the guest shows 8 values
> >> - 224 - 231 - so at most there might be 8 pending interrupts in IRQ_check, is
> >> that correct?
> >
> > It looks like that's currently the case, but actual hardware supports
> > more than that, so it's possible (albeit unlikely any time soon) that
> > the emulation eventually does as well.
> >
> > But it's possible to have interrupts other than MSIs...
>
> Right.
>
> So given that the raw spinlock conversion is not suitable for all the scenarios
> supported by the OpenPIC emulation, is it OK if my next step is to send a patch
> containing both the raw spinlock conversion and a mandatory disable of the
> in-kernel MPIC? This is actually the last conclusion we came up with some time
> ago, but I guess it was good to get some more insight into how things actually
> work (at least for me).

Fine with me.
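To make the worst case concrete, here is a rough stand-alone model of the
code path referred to as [1] and [2] above. It is a hand-written sketch based
only on the description in this thread, not the actual arch/powerpc/kvm/mpic.c
code, and the names, types and bounds in it are purely illustrative:

/*
 * Stand-alone model of the pattern under discussion: for a multicast
 * destination mask, every vCPU is visited, and for each visited vCPU every
 * pending source is rescanned.  In the real emulation this work happens
 * with the openpic lock held and interrupts disabled, which is what makes
 * the worst case interesting for RT.
 */
#include <stdint.h>
#include <stdio.h>

#define MAX_CPU 32              /* illustrative bound on emulated vCPUs    */
#define MAX_IRQ 256             /* illustrative bound on interrupt sources */

struct openpic_model {
        int     nb_cpus;
        uint8_t pending[MAX_CPU][MAX_IRQ];      /* 1 = source pending */
        int8_t  priority[MAX_IRQ];              /* IVPR priority      */
};

/* [2]: scan every pending source for one vCPU -- O(MAX_IRQ) */
static int irq_check(const struct openpic_model *opp, int cpu)
{
        int irq, best = -1, best_pr = -1;

        for (irq = 0; irq < MAX_IRQ; irq++) {
                if (opp->pending[cpu][irq] && opp->priority[irq] > best_pr) {
                        best = irq;
                        best_pr = opp->priority[irq];
                }
        }
        return best;
}

/*
 * [1]: walk every vCPU named in the destination mask -- O(nb_cpus * MAX_IRQ)
 * in total for a multicast interrupt, all inside one critical section.
 */
static void openpic_update_irq(const struct openpic_model *opp,
                               uint32_t destmask)
{
        int cpu;

        for (cpu = 0; cpu < opp->nb_cpus; cpu++) {
                if (!(destmask & (1u << cpu)))
                        continue;
                printf("vcpu %d: highest-priority pending irq = %d\n",
                       cpu, irq_check(opp, cpu));
        }
}

int main(void)
{
        static struct openpic_model opp = { .nb_cpus = 4 };

        /* the scenario observed above: irqs 224-226 pending, priority 8 */
        opp.pending[0][224] = opp.pending[0][225] = opp.pending[0][226] = 1;
        opp.priority[224] = opp.priority[225] = opp.priority[226] = 8;

        openpic_update_irq(&opp, 0x1);  /* unicast to vCPU 0 */
        return 0;
}

The unicast case above stays cheap; the point is that nothing in the structure
bounds the multicast case to anything smaller than the product of the two loops.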
Have you given any thought to ways to restructure the code to eliminate the
problem?

-Scott