On 2014/11/25 19:55, Marc Zyngier wrote:
> On 25/11/14 11:49, Christoffer Dall wrote:
>> On Tue, Nov 25, 2014 at 11:24:43AM +0000, Marc Zyngier wrote:
>>> On 25/11/14 11:11, Christoffer Dall wrote:
>>>> On Tue, Nov 25, 2014 at 10:54:18AM +0800, Shannon Zhao wrote:
>>>>> On 2014/11/24 18:53, Christoffer Dall wrote:
>>>>>> On Mon, Nov 24, 2014 at 03:53:16PM +0800, Shannon Zhao wrote:
>>>>>>> Hi Marc, Christoffer,
>>>>>>>
>>>>>>> On 2014/11/23 4:04, Christoffer Dall wrote:
>>>>>>>> On Wed, Nov 19, 2014 at 06:11:25PM +0800, Shannon Zhao wrote:
>>>>>>>>> When calling kvm_vgic_inject_irq to inject an interrupt, we can know
>>>>>>>>> which vcpu the interrupt is for from the irq_num and the cpuid, so we
>>>>>>>>> should just kick that vcpu instead of iterating through all of them.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Shannon Zhao <zhaoshenglong@xxxxxxxxxx>
>>>>>>>>
>>>>>>>> This looks reasonable to me:
>>>>>>>>
>>>>>>>> Reviewed-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx>
>>>>>>>>
>>>>>>>> But as Marc said, we have to consider the churn of introducing more
>>>>>>>> changes to the vgic (that file is being hammered pretty intensely
>>>>>>>> these days), so if you feel this is an urgent optimization, it would
>>>>>>>> be useful to see some data backing this up.
>>>>>>>>
>>>>>>>
>>>>>>> Today I ran a test which measures the cycles spent in kvm_vgic_inject_irq
>>>>>>> using the PMU. I only tested the cycles for an SPI, using virtio-net.
>>>>>>> Test steps:
>>>>>>> 1) Start a VM with 8 VCPUs.
>>>>>>> 2) In the guest, bind the virtio irq to CPU8; ping the VM from the host
>>>>>>>    and collect the cycles.
>>>>>>>
>>>>>>> The test shows:
>>>>>>> Without this patch the cycle count is about 3700 (3300-5000); with this
>>>>>>> patch it is about 3000 (2500-3200).
>>>>>>> From this test, I think this patch can bring some improvement.
>>>>>>
>>>>>> Are these averaged numbers?
>>>>>>
>>>>>
>>>>> Yes :-)
>>>>>
>>>>>>>
>>>>>>> The test code is below. As there is almost no difference in
>>>>>>> vgic_update_irq_pending with and without this patch, I only measure the
>>>>>>> cycles of the kick.
>>>>>>>
>>>>>>> int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>                         bool level)
>>>>>>> {
>>>>>>>         unsigned long cycles_1, cycles_2;
>>>>>>>
>>>>>>>         if (likely(vgic_initialized(kvm)) &&
>>>>>>>             vgic_update_irq_pending(kvm, cpuid, irq_num, level)) {
>>>>>>>                 start_pmu();
>>>>>>>                 __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_1));
>>>>>>>                 vgic_kick_vcpus(kvm);
>>>>>>>                 __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_2));
>>>>>>>         }
>>>>>>>
>>>>>>>         return 0;
>>>>>>> }
>>>>>>>
>>>>>>> int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
>>>>>>>                         bool level)
>>>>>>> {
>>>>>>>         int vcpu_id;
>>>>>>>         unsigned long cycles_a, cycles_b;
>>>>>>>
>>>>>>>         if (likely(vgic_initialized(kvm))) {
>>>>>>>                 vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
>>>>>>>                 if (vcpu_id >= 0) {
>>>>>>>                         start_pmu();
>>>>>>>                         __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_a));
>>>>>>>                         /* kick the specified vcpu */
>>>>>>>                         kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
>>>>>>>                         __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_b));
>>>>>>>                 }
>>>>>>>         }
>>>>>>>
>>>>>>>         return 0;
>>>>>>> }
>>>>>>>
>>>>>>
>>>>>> Can you run some IPI-intensive benchmark in your guest and let us know
>>>>>> if you see improvements on that level?
>>>>>>
>>>>>
>>>>> Cool, I'll try to find some benchmarks and run them. Are there any
>>>>> IPI-intensive benchmarks you would suggest?
>>>>>
>>>>
>>>> Hackbench with processes sure seems to like IPIs.
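[As a side note for readers trying to reproduce the measurement above: the
start_pmu() helper is not shown anywhere in the thread. Below is a
hypothetical sketch of what such a helper could look like on ARMv8,
assuming it simply resets and enables the cycle counter so that the
PMCCNTR_EL0 reads return meaningful deltas; it is an assumption, not code
from the original patch.]

/*
 * Hypothetical start_pmu(): enable the ARMv8 cycle counter. Not part of
 * the original thread; the real helper may differ.
 */
static inline void start_pmu(void)
{
        unsigned long pmcr;

        /* PMCR_EL0: E (bit 0) enables the counters, C (bit 2) resets PMCCNTR_EL0 */
        __asm__ __volatile__("MRS %0, PMCR_EL0" : "=r"(pmcr));
        pmcr |= (1UL << 0) | (1UL << 2);
        __asm__ __volatile__("MSR PMCR_EL0, %0" :: "r"(pmcr));

        /* Bit 31 of PMCNTENSET_EL0 switches the cycle counter on */
        __asm__ __volatile__("MSR PMCNTENSET_EL0, %0" :: "r"(1UL << 31));
        __asm__ __volatile__("ISB");
}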
>>>
>>> But that'd mostly be IPIs in the guest, and we're hoping for that patch
>>> to result in a reduction in the number of IPIs on the host when
>>> interrupts are injected.
>>
>> Ah right, I remembered the SGI handling register calling
>> kvm_vgic_inject_irq(), but it doesn't.
>>
>>>
>>> I guess that having a workload that generates many interrupts on an SMP
>>> guest should result in a reduction of the number of IPIs on the host.
>>>
>>> What do you think?
>>>
>> That's sort of what Shannon did already, only we need to measure a drop
>> in overall cpu utilization on the host instead or look at iperf numbers
>> or something like that. Right?
>
> Yes. I guess that vmstat running in the background on the host should
> give a good indication of what is going on.
>

I measured the overall cpu utilization on the host using vmstat and iperf.
I started a VM with 8 vcpus and used iperf to send packets from host to
guest, binding the virtio interrupt to cpu0 and then to cpu7. The results
are the following:

Without this patch:

Bind to cpu0:
Bandwidth: 6.60 Gbits/sec
vmstat data:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free  buff  cache   si   so    bi    bo    in    cs us sy id wa
 2  0      0 7795456     0 120568    0    0     0     0  8967 11405  2  2 96  0

Bind to cpu7:
Bandwidth: 6.13 Gbits/sec
vmstat data:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free  buff  cache   si   so    bi    bo    in    cs us sy id wa
 1  0      0 7795016     0 120572    0    0     0     0 14633 20710  2  3 95  0

With this patch:

Bind to cpu0:
Bandwidth: 6.99 Gbits/sec
vmstat data:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free  buff  cache   si   so    bi    bo    in    cs us sy id wa
 1  0      0 7788048     0 124836    0    0     0     0 10149 11593  2  2 96  0

Bind to cpu7:
Bandwidth: 6.53 Gbits/sec
vmstat data:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free  buff  cache   si   so    bi    bo    in    cs us sy id wa
 1  0      0 7791044     0 124832    0    0     0     0 11408 14179  2  2 96  0

From the data, it shows some improvement :-)

Thanks,
Shannon

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
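[For readers without the surrounding source: the two code paths being
compared in this thread are the broadcast kick in vgic_kick_vcpus() versus a
targeted kvm_vcpu_kick(). A minimal sketch follows, based on a simplified
reading of virt/kvm/arm/vgic.c from that era (around Linux 3.18); it is not
the exact patch, and the helper name vgic_kick_one_vcpu() is invented here
purely for illustration.]

/*
 * Broadcast path: after an interrupt has been made pending, walk every
 * vcpu of the VM and kick each one that has a pending interrupt. With an
 * 8-vcpu guest this means 8 pending checks and potentially several host
 * IPIs per injected interrupt.
 */
static void vgic_kick_vcpus(struct kvm *kvm)
{
        struct kvm_vcpu *vcpu;
        int c;

        kvm_for_each_vcpu(c, vcpu, kvm) {
                if (kvm_vgic_vcpu_pending_irq(vcpu))
                        kvm_vcpu_kick(vcpu);
        }
}

/*
 * Targeted path (the idea of the patch): vgic_update_irq_pending() already
 * knows which vcpu the interrupt is routed to, so it can return that vcpu's
 * index and only that vcpu is kicked. The helper name is hypothetical; the
 * patch calls kvm_vcpu_kick() directly from kvm_vgic_inject_irq(), as in
 * the quoted test code.
 */
static void vgic_kick_one_vcpu(struct kvm *kvm, int vcpu_id)
{
        if (vcpu_id >= 0)
                kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
}

[The vmstat rows above line up with that picture: with the patch, the
cpu7-bound run shows fewer host interrupts and context switches (in/cs
dropping from 14633/20710 to 11408/14179) together with slightly higher
iperf bandwidth.]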