On Tue, Nov 25, 2014 at 10:54:18AM +0800, Shannon Zhao wrote:
> On 2014/11/24 18:53, Christoffer Dall wrote:
> > On Mon, Nov 24, 2014 at 03:53:16PM +0800, Shannon Zhao wrote:
> >> Hi Marc, Christoffer,
> >>
> >> On 2014/11/23 4:04, Christoffer Dall wrote:
> >>> On Wed, Nov 19, 2014 at 06:11:25PM +0800, Shannon Zhao wrote:
> >>>> When calling kvm_vgic_inject_irq to inject an interrupt, we can
> >>>> know which vcpu the interrupt is for from the irq_num and the
> >>>> cpuid, so we should just kick that vcpu instead of iterating
> >>>> through all of them.
> >>>>
> >>>> Signed-off-by: Shannon Zhao <zhaoshenglong@xxxxxxxxxx>
> >>>
> >>> This looks reasonable to me:
> >>>
> >>> Reviewed-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx>
> >>>
> >>> But as Marc said, we have to consider the churn introduced by more
> >>> changes to the vgic (that file is being hammered pretty intensely
> >>> these days), so if you feel this is an urgent optimization, it would
> >>> be useful to see some data backing this up.
> >>>
> >>
> >> Today I ran a test that measures the cycles spent in
> >> kvm_vgic_inject_irq using the PMU. I only measured the cycles for an
> >> SPI, using virtio-net.
> >> Test steps:
> >> 1) Start a VM with 8 VCPUs.
> >> 2) In the guest, bind the virtio irq to CPU8; ping the VM from the
> >>    host and record the cycles.
> >>
> >> The test shows:
> >> Without this patch, the cost is about 3700 cycles (range 3300-5000);
> >> with this patch, it is about 3000 cycles (range 2500-3200).
> >> From this test, I think this patch brings some improvement.
> >
> > Are these averaged numbers?
> >
>
> Yes :-)
>
> >> The test code is shown below. As there is almost no difference in
> >> vgic_update_irq_pending between with and without this patch, only
> >> the kick's cycles are measured.
> >>
> >> int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>                         bool level)
> >> {
> >>         unsigned long cycles_1, cycles_2;
> >>
> >>         if (likely(vgic_initialized(kvm)) &&
> >>             vgic_update_irq_pending(kvm, cpuid, irq_num, level)) {
> >>                 start_pmu();
> >>                 __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_1));
> >>                 vgic_kick_vcpus(kvm);
> >>                 __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_2));
> >>         }
> >>
> >>         return 0;
> >> }
> >>
> >> int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
> >>                         bool level)
> >> {
> >>         int vcpu_id;
> >>         unsigned long cycles_a, cycles_b;
> >>
> >>         if (likely(vgic_initialized(kvm))) {
> >>                 vcpu_id = vgic_update_irq_pending(kvm, cpuid, irq_num, level);
> >>                 if (vcpu_id >= 0) {
> >>                         start_pmu();
> >>                         __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_a));
> >>                         /* kick the specified vcpu */
> >>                         kvm_vcpu_kick(kvm_get_vcpu(kvm, vcpu_id));
> >>                         __asm__ __volatile__("MRS %0, PMCCNTR_EL0" : "=r"(cycles_b));
> >>                 }
> >>         }
> >>
> >>         return 0;
> >> }
> >>
> >
> > Can you run some IPI-intensive benchmark in your guest and let us know
> > if you see improvements on that level?
> >
>
> Cool, I'll try to find some benchmarks and run them. Are there some
> IPI-intensive benchmarks you suggest?
>

Hackbench with processes sure seems to like IPIs.

> > Not trying to be overly-pedantic here (I think your numbers suggest we
> > should merge this), but if the case you're optimizing doesn't happen
> > very often, we may not see this on a guest level or overall CPU
> > utilization level, and it would be very interesting to know.
> >
>
> Yeah, it would be interesting.
>

Thanks,
-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm