On 4 June 2013 10:00, Marc Zyngier <marc.zyngier@xxxxxxx> wrote: > On 04/06/13 15:48, Bob(ERC) wrote: >> >> improve loop performance of vgic_get_target_reg and make it >> consistent with vgic_set_target_reg. In current implementation, >> vgic only sets and distributes interrupts to specific target cpu >> interface. >> >> Signed-off-by: Bo Zhao <zhaobo@xxxxxxxxxx> >> --- >> arch/arm/kvm/vgic.c | 11 +++++++---- >> 1 file changed, 7 insertions(+), 4 deletions(-) >> >> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c >> index 17c5ac7..e6eb36c 100644 >> --- a/arch/arm/kvm/vgic.c >> +++ b/arch/arm/kvm/vgic.c >> @@ -439,11 +439,14 @@ static u32 vgic_get_target_reg(struct kvm *kvm, int irq) >> >> irq -= VGIC_NR_PRIVATE_IRQS; >> >> - kvm_for_each_vcpu(c, vcpu, kvm) { >> - bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >> - for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) >> - if (test_bit(irq + i, bmap)) >> + for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) { >> + kvm_for_each_vcpu(c, vcpu, kvm) { >> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >> + if (test_bit(irq + i, bmap)) { >> val |= 1 << (c + i * 8); >> + break; >> + } >> + } >> } >> >> return val; >> > > I'm not sure that's much of an improvement. > How about this (fully untested) patch instead: > > diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c > index 9f5817d..5d9cf1a 100644 > --- a/virt/kvm/arm/vgic.c > +++ b/virt/kvm/arm/vgic.c > @@ -432,19 +432,13 @@ static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu, > static u32 vgic_get_target_reg(struct kvm *kvm, int irq) > { > struct vgic_dist *dist = &kvm->arch.vgic; > - struct kvm_vcpu *vcpu; > - int i, c; > - unsigned long *bmap; > + int i; > u32 val = 0; > > irq -= VGIC_NR_PRIVATE_IRQS; > > - kvm_for_each_vcpu(c, vcpu, kvm) { > - bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); > - for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) > - if (test_bit(irq + i, bmap)) > - val |= 1 << (c + i * 8); > - } > + for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) > + val |= 1 << (dist->irq_spi_cpu[irq + i] + i * 8); > > return val; > } > > Cheers, > Why do we want to optimize this? It's not in the critical path of anything, code readability should be prioritized over performance in this case. (Did anybody measure the performance difference?) But this did get me thinking, Marc, can you point me to the place in the docs again where it says we can just pick the LSB for the write to the ITARGETSR ? Also, in the documentation we write that the irq_spi_target register is a formatted version of GICD_ICFGR, but shouldn't it be GICD_ITARGETSR? Oh no, the vgic code confused me again... -Christoffer _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm