Hi Christoffer, I discussed with marc before ,current implementation of vgic doesn't fully comply with ARM gic specs ,SPI distribution is one among them. In the spec it is allowed to configure one SPI to several target cpu interfaces , and let the OS to leverage the cpu workload. But in the source code , one SPI only bind to one specific cpu interface via LSB method (see function vgic_set_target_reg). for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) { int shift = i * GICD_CPUTARGETS_BITS; target = ffs((val >> shift) & 0xffU); target = target ? (target - 1) : 0; dist->irq_spi_cpu[irq + i] = target; kvm_for_each_vcpu(c, vcpu, kvm) { bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); if (c == target) set_bit(irq + i, bmap); else clear_bit(irq + i, bmap); } } That is why I create this patch for vgic_get_target_reg to get the only one cpu number we set before . I don't know whether it is a issue for SMP mode linux. But it do simplify the implementation of OS layer. Regards, Bob CSI Virtualization Architect HUAWEI TECHNOLOGIES DUESSELDORF GmbH Munich Office, European Research Center Riesstraße 25 80992 München Tel: +49 (0)89-158834 4145 Mobile: +49 (0)1622106317 Fax: +49 (0)89-158834 4447 E-mail: zhaobo@xxxxxxxxxx www.huawei.com HUAWEI TECHNOLOGIES Duesseldorf GmbH Am Seestern 24, 40547 Düsseldorf, Germany, www.huawei.com Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing Director: Jingwen TAO, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063, Geschäftsführer: Jingwen TAO, Wanzhou MENG, Lifang CHEN -----Original Message----- From: Christoffer Dall [mailto:christoffer.dall@xxxxxxxxxx] Sent: Wednesday, June 05, 2013 8:56 AM To: Marc Zyngier Cc: Zhaobo (Bob, ERC); cdall@xxxxxxxxxxxxxxx; kvmarm@xxxxxxxxxxxxxxxxxxxxx Subject: Re: [PATCH] KVM: ARM: improve vgic_get_target_reg On 4 June 2013 10:00, Marc Zyngier <marc.zyngier@xxxxxxx> wrote: > On 04/06/13 15:48, Bob(ERC) wrote: >> >> improve loop performance of vgic_get_target_reg and make it >> consistent with vgic_set_target_reg. In current implementation, >> vgic only sets and distributes interrupts to specific target cpu >> interface. >> >> Signed-off-by: Bo Zhao <zhaobo@xxxxxxxxxx> >> --- >> arch/arm/kvm/vgic.c | 11 +++++++---- >> 1 file changed, 7 insertions(+), 4 deletions(-) >> >> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c >> index 17c5ac7..e6eb36c 100644 >> --- a/arch/arm/kvm/vgic.c >> +++ b/arch/arm/kvm/vgic.c >> @@ -439,11 +439,14 @@ static u32 vgic_get_target_reg(struct kvm *kvm, int irq) >> >> irq -= VGIC_NR_PRIVATE_IRQS; >> >> - kvm_for_each_vcpu(c, vcpu, kvm) { >> - bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >> - for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) >> - if (test_bit(irq + i, bmap)) >> + for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) { >> + kvm_for_each_vcpu(c, vcpu, kvm) { >> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >> + if (test_bit(irq + i, bmap)) { >> val |= 1 << (c + i * 8); >> + break; >> + } >> + } >> } >> >> return val; >> > > I'm not sure that's much of an improvement. > How about this (fully untested) patch instead: > > diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > index 9f5817d..5d9cf1a 100644 > --- a/virt/kvm/arm/vgic.c > +++ b/virt/kvm/arm/vgic.c > @@ -432,19 +432,13 @@ static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu, > static u32 vgic_get_target_reg(struct kvm *kvm, int irq) > { > struct vgic_dist *dist = &kvm->arch.vgic; > - struct kvm_vcpu *vcpu; > - int i, c; > - unsigned long *bmap; > + int i; > u32 val = 0; > > irq -= VGIC_NR_PRIVATE_IRQS; > > - kvm_for_each_vcpu(c, vcpu, kvm) { > - bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); > - for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) > - if (test_bit(irq + i, bmap)) > - val |= 1 << (c + i * 8); > - } > + for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) > + val |= 1 << (dist->irq_spi_cpu[irq + i] + i * 8); > > return val; > } > > Cheers, > Why do we want to optimize this? It's not in the critical path of anything, code readability should be prioritized over performance in this case. (Did anybody measure the performance difference?) But this did get me thinking, Marc, can you point me to the place in the docs again where it says we can just pick the LSB for the write to the ITARGETSR ? Also, in the documentation we write that the irq_spi_target register is a formatted version of GICD_ICFGR, but shouldn't it be GICD_ITARGETSR? Oh no, the vgic code confused me again... -Christoffer _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm