RE: [PATCH] KVM: ARM: improve vgic_get_target_reg

"Zhaobo (Bob, ERC)" <zhaobo@xxxxxxxxxx> · Wed, 5 Jun 2013 07:31:11 +0000

Hi Christoffer,
I discussed with marc before ,current implementation of  vgic doesn't fully comply with ARM gic specs ,SPI distribution is one among them.
In the spec it is allowed to configure one SPI to several target cpu interfaces , and let the OS to leverage the cpu workload. 
But in the source code , one SPI only bind to one specific cpu interface via LSB method (see function vgic_set_target_reg).
	for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) {
		int shift = i * GICD_CPUTARGETS_BITS;
		target = ffs((val >> shift) & 0xffU);
		target = target ? (target - 1) : 0;
		dist->irq_spi_cpu[irq + i] = target;
		kvm_for_each_vcpu(c, vcpu, kvm) {
			bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
			if (c == target)
				set_bit(irq + i, bmap);
			else
				clear_bit(irq + i, bmap);
		}
	}
That is why I create this patch for vgic_get_target_reg to get the only one cpu number we set before .

I don't know whether it is a issue for SMP mode linux. But it do simplify the implementation of OS layer.

Regards,
Bob
CSI Virtualization Architect
HUAWEI TECHNOLOGIES DUESSELDORF GmbH
Munich Office, European Research Center

Riesstraße 25
80992 München
Tel: +49 (0)89-158834 4145
Mobile: +49 (0)1622106317
Fax: +49 (0)89-158834 4447
E-mail: zhaobo@xxxxxxxxxx 
www.huawei.com 

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Am Seestern 24, 40547 Düsseldorf, Germany, www.huawei.com
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Jingwen TAO, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Jingwen TAO, Wanzhou MENG, Lifang CHEN

-----Original Message-----
From: Christoffer Dall [mailto:christoffer.dall@xxxxxxxxxx] 
Sent: Wednesday, June 05, 2013 8:56 AM
To: Marc Zyngier
Cc: Zhaobo (Bob, ERC); cdall@xxxxxxxxxxxxxxx; kvmarm@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] KVM: ARM: improve vgic_get_target_reg

On 4 June 2013 10:00, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> On 04/06/13 15:48, Bob(ERC) wrote:
>>
>> improve loop performance of vgic_get_target_reg and make it
>> consistent with vgic_set_target_reg. In current implementation,
>> vgic only sets and distributes interrupts to specific target cpu
>> interface.
>>
>> Signed-off-by: Bo Zhao <zhaobo@xxxxxxxxxx>
>> ---
>>  arch/arm/kvm/vgic.c |   11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm/kvm/vgic.c b/arch/arm/kvm/vgic.c
>> index 17c5ac7..e6eb36c 100644
>> --- a/arch/arm/kvm/vgic.c
>> +++ b/arch/arm/kvm/vgic.c
>> @@ -439,11 +439,14 @@ static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
>>
>>       irq -= VGIC_NR_PRIVATE_IRQS;
>>
>> -     kvm_for_each_vcpu(c, vcpu, kvm) {
>> -             bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
>> -             for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++)
>> -                     if (test_bit(irq + i, bmap))
>> +     for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++) {
>> +             kvm_for_each_vcpu(c, vcpu, kvm) {
>> +                     bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
>> +                     if (test_bit(irq + i, bmap)) {
>>                               val |= 1 << (c + i * 8);
>> +                             break;
>> +                     }
>> +             }
>>       }
>>
>>       return val;
>>
>
> I'm not sure that's much of an improvement.
> How about this (fully untested) patch instead:
>
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 9f5817d..5d9cf1a 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -432,19 +432,13 @@ static bool handle_mmio_priority_reg(struct kvm_vcpu *vcpu,
>  static u32 vgic_get_target_reg(struct kvm *kvm, int irq)
>  {
>         struct vgic_dist *dist = &kvm->arch.vgic;
> -       struct kvm_vcpu *vcpu;
> -       int i, c;
> -       unsigned long *bmap;
> +       int i;
>         u32 val = 0;
>
>         irq -= VGIC_NR_PRIVATE_IRQS;
>
> -       kvm_for_each_vcpu(c, vcpu, kvm) {
> -               bmap = vgic_bitmap_get_shared_map(&dist->irq_spi_target[c]);
> -               for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++)
> -                       if (test_bit(irq + i, bmap))
> -                               val |= 1 << (c + i * 8);
> -       }
> +       for (i = 0; i < GICD_IRQS_PER_ITARGETSR; i++)
> +               val |= 1 << (dist->irq_spi_cpu[irq + i] + i * 8);
>
>         return val;
>  }
>
> Cheers,
>
Why do we want to optimize this? It's not in the critical path of
anything, code readability should be prioritized over performance in
this case. (Did anybody measure the performance difference?)

But this did get me thinking, Marc, can you point me to the place in
the docs again where it says we can just pick the LSB for the write to
the ITARGETSR ?

Also, in the documentation we write that the irq_spi_target register
is a formatted version of GICD_ICFGR, but shouldn't it be
GICD_ITARGETSR?

Oh no, the vgic code confused me again...

-Christoffer

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm