On 05/06/13 07:55, Christoffer Dall wrote: > On 4 June 2013 10:00, Marc Zyngier <marc.zyngier@xxxxxxx> wrote: >> On 04/06/13 15:48, Bob(ERC) wrote: >>> >>> improve loop performance of vgic_get_target_reg and make it >>> consistent with vgic_set_target_reg. In current implementation, >>> vgic only sets and distributes interrupts to specific target cpu >>> interface. >>> >>> Signed-off-by: Bo Zhao <zhaobo@xxxxxxxxxx> >>> --- >>> arch/arm/kvm/vgic.c | 11 +++++++---- >>> 1 file changed, 7 insertions(+), 4 deletions(-) >>> >>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c >>> index 17c5ac7..e6eb36c 100644 >>> --- a/arch/arm/kvm/vgic.c >>> +++ b/arch/arm/kvm/vgic.c >>> @@ -439,11 +439,14 @@ static u32 vgic_get_target_reg(struct kvm *kvm, int irq) >>> >>> irq -= VGIC_NR_PRIVATE_IRQS; >>> >>> - kvm_for_each_vcpu(c, vcpu, kvm) { >>> - bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >>> - for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) >>> - if (test_bit(irq + i, bmap)) >>> + for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) { >>> + kvm_for_each_vcpu(c, vcpu, kvm) { >>> + bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >>> + if (test_bit(irq + i, bmap)) { >>> val |= 1 << (c + i * 8); >>> + break; >>> + } >>> + } >>> } >>> >>> return val; >>> >> >> I'm not sure that's much of an improvement. >> How about this (fully untested) patch instead: >> >> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c >> index 9f5817d..5d9cf1a 100644 >> --- a/virt/kvm/arm/vgic.c >> +++ b/virt/kvm/arm/vgic.c >> @@ -432,19 +432,13 @@ static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu, >> static u32 vgic_get_target_reg(struct kvm *kvm, int irq) >> { >> struct vgic_dist *dist = &kvm->arch.vgic; >> - struct kvm_vcpu *vcpu; >> - int i, c; >> - unsigned long *bmap; >> + int i; >> u32 val = 0; >> >> irq -= VGIC_NR_PRIVATE_IRQS; >> >> - kvm_for_each_vcpu(c, vcpu, kvm) { >> - bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]); >> - for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) >> - if (test_bit(irq + i, bmap)) >> - val |= 1 << (c + i * 8); >> - } >> + for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) >> + val |= 1 << (dist->irq_spi_cpu[irq + i] + i * 8); >> >> return val; >> } >> >> Cheers, >> > Why do we want to optimize this? It's not in the critical path of > anything, code readability should be prioritized over performance in > this case. (Did anybody measure the performance difference?) Hmmm. Having the words "performance" and "critical path" in a sentence about the GIC distributor register emulation is an interesting concept. Touching the distributor is *always* going to be incredibly inefficient, so optimizing it really doesn't matter. Now, I think rewriting this function the way I just did makes it actually more readable. > But this did get me thinking, Marc, can you point me to the place in > the docs again where it says we can just pick the LSB for the write to > the ITARGETSR ? The documentation doesn't say we can. This is one of numerous shortcuts taken in the distributor emulation to make it a bit simpler and more efficient on the injection path. Also, the lack of guests using these features makes supporting all the GICv2 features a bit of a moot point. So, for everyone to realize what we actually don't support, here's a non-exhaustive list: - Only one target CPU per interrupt (the above) - No priority support - No split deactivation/priority drop - No SGI Clear-Pending Registers In short, anything that isn't used by Linux doesn't exist in this implementation. > Also, in the documentation we write that the irq_spi_target register > is a formatted version of GICD_ICFGR, but shouldn't it be > GICD_ITARGETSR? It should. > Oh no, the vgic code confused me again... The worse is yet to come. M. -- Jazz is not dead. It just smells funny... _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm