On Wed, Nov 15, 2017 at 05:50:07PM +0000, Andre Przywara wrote:
> Hi,
>
> those last few patches are actually helpful for the Xen port ...

[...]

> > +static void save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
> > +{
> > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > +	int nr_lr = kvm_vgic_global_state.nr_lr;
> > +	u32 elrsr0, elrsr1;
> > +
> > +	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> > +	if (unlikely(nr_lr > 32))
> > +		elrsr1 = readl_relaxed(base + GICH_ELRSR1);
> > +	else
> > +		elrsr1 = 0;
> > +
> > +#ifdef CONFIG_CPU_BIG_ENDIAN
> > +	cpu_if->vgic_elrsr = ((u64)elrsr0 << 32) | elrsr1;
> > +#else
> > +	cpu_if->vgic_elrsr = ((u64)elrsr1 << 32) | elrsr0;
> > +#endif
>
> I have some gut feeling that this is really broken, since we mix up
> endian *byte* ordering with *bit* ordering here, don't we?

Good feeling indeed. :)

We have bitmap_{from,to}_u32array for things like this. But it was
considered badly designed, and I proposed the new
bitmap_{from,to}_arr32():

https://lkml.org/lkml/2017/11/15/592

What I also have in mind is to introduce something like
bitmap_{from,to}_pair_32(), as most current users of
bitmap_{from,to}_u32array() (and those who should use it but don't,
like this one) have only 2 32-bit halfwords to be copied from/to a
bitmap. It would also be complementary to bitmap_from_u64().

More reading about bitmap/array conversion is in the comment to the
BITMAP_FROM_U64 macro.

> I understand it's just copied and gets removed later on, so I was
> wondering if you could actually move patch 35/37 ("Get rid of
> vgic_elrsr") before this patch here, to avoid copying bogus code
> around? Or does 35/37 depend on 34/37 to be correct?
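To illustrate the byte- vs bit-ordering point: kernel bitmaps are
arrays of unsigned long, and bit n always lives in word n / BITS_PER_LONG
at bit position n % BITS_PER_LONG, regardless of CPU byte endianness.
So ELRSR0 (status of LRs 0..31) must always be the low half of the
u64, on big- and little-endian alike. Here is a hedged userspace
sketch of the bitmap_from_arr32() idea (names with a _sketch suffix
are illustrative, not the kernel API):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: copy an array of 32-bit words into a 64-bit-word bitmap.
 * Bit n of the source ends up as bit n of the bitmap by *arithmetic*
 * position, so the result is independent of byte endianness.
 */
static void bitmap_from_arr32_sketch(uint64_t *bitmap, const uint32_t *arr32,
				     unsigned int nbits)
{
	unsigned int i;

	for (i = 0; i < (nbits + 63) / 64; i++)
		bitmap[i] = 0;
	for (i = 0; i < nbits; i++)
		if (arr32[i / 32] & (UINT32_C(1) << (i % 32)))
			bitmap[i / 64] |= UINT64_C(1) << (i % 64);
}

/* Sketch of test_bit(): read bit nr by arithmetic position. */
static int test_bit_sketch(unsigned int nr, const uint64_t *bitmap)
{
	return (bitmap[nr / 64] >> (nr % 64)) & 1;
}
```

With this, ELRSR0/ELRSR1 would always combine as
((u64)elrsr1 << 32) | elrsr0, and the CONFIG_CPU_BIG_ENDIAN branch
above (which swaps the halves based on *byte* order) is seen to be
wrong.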
> > +}
> > +
> > +static void save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
> > +{
> > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > +	int i;
> > +	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
> > +
> > +	for (i = 0; i < used_lrs; i++) {
> > +		if (cpu_if->vgic_elrsr & (1UL << i))

So vgic_elrsr is naturally a bitmap, and the bitmap API is preferred
if there are no other considerations:

		if (test_bit(i, &cpu_if->vgic_elrsr))

> > +			cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;
> > +		else
> > +			cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));
> > +
> > +		writel_relaxed(0, base + GICH_LR0 + (i * 4));
> > +	}
> > +}

I'd also scratch my head about using for_each_clear_bit() here:

	/*
	 * Set up default vgic_lr values somewhere earlier.
	 * Not needed at all if you take my suggestion for
	 * vgic_v2_restore_state() below.
	 */
	for (i = 0; i < used_lrs; i++)
		cpu_if->vgic_lr[i] &= ~GICH_LR_STATE;

static void save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
{
	[...]

	for_each_clear_bit(i, &cpu_if->vgic_elrsr, used_lrs)
		cpu_if->vgic_lr[i] = readl_relaxed(base + GICH_LR0 + (i * 4));

	for (i = 0; i < used_lrs; i++)
		writel_relaxed(0, base + GICH_LR0 + (i * 4));
}

Not sure how performance-critical this path is, but sometimes things
get really faster with bitmaps.

[...]
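For reference, for_each_clear_bit() just walks every zero bit below a
size limit. A minimal userspace sketch of the idea, assuming a
single-word bitmap (the _sketch names are mine, not the kernel's
find_next_zero_bit()):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: find the next clear bit at or after 'offset', or 'size'. */
static unsigned int find_next_zero_bit_sketch(uint64_t word,
					      unsigned int size,
					      unsigned int offset)
{
	while (offset < size && ((word >> offset) & 1))
		offset++;
	return offset;
}

/* Sketch of for_each_clear_bit(): iterate all clear bits below size. */
#define for_each_clear_bit_sketch(bit, word, size)			\
	for ((bit) = find_next_zero_bit_sketch((word), (size), 0);	\
	     (bit) < (size);						\
	     (bit) = find_next_zero_bit_sketch((word), (size), (bit) + 1))
```

The kernel version is word-at-a-time and skips runs of set bits
cheaply, which is where the speedup for sparse ELRSR values would
come from.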
> > +void vgic_v2_restore_state(struct kvm_vcpu *vcpu)
> > +{
> > +	struct kvm *kvm = vcpu->kvm;
> > +	struct vgic_dist *vgic = &kvm->arch.vgic;
> > +	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> > +	void __iomem *base = vgic->vctrl_base;
> > +	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
> > +	int i;
> > +
> > +	if (!base)
> > +		return;
> > +
> > +	if (used_lrs) {
> > +		writel_relaxed(cpu_if->vgic_hcr, base + GICH_HCR);
> > +		writel_relaxed(cpu_if->vgic_apr, base + GICH_APR);
> > +		for (i = 0; i < used_lrs; i++) {
> > +			writel_relaxed(cpu_if->vgic_lr[i],
> > +				       base + GICH_LR0 + (i * 4));
> > +		}
> > +	}
> > +}

An alternative approach would be:

	for (i = 0; i < used_lrs; i++) {
		if (test_bit(i, &cpu_if->vgic_elrsr))
			writel_relaxed(cpu_if->vgic_lr[i] & ~GICH_LR_STATE,
				       base + GICH_LR0 + (i * 4));
		else
			writel_relaxed(cpu_if->vgic_lr[i],
				       base + GICH_LR0 + (i * 4));
	}

If cpu_if->vgic_elrsr is untouched in between, of course. It would
make save_lrs() simpler and this function more verbose.

Thanks,
Yury
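To sanity-check that the two approaches end up with the same register
contents, here is a toy userspace model, with a plain array standing
in for the GICH_LR registers and illustrative names (GICH_LR_STATE_SK
models the LR state field; none of this is kernel code):

```c
#include <assert.h>
#include <stdint.h>

#define GICH_LR_STATE_SK (UINT32_C(3) << 28)

/* Strategy A (patch as posted): mask the state bits at save time,
 * restore verbatim. */
static void save_then_restore_a(uint32_t *lr_regs, uint32_t *saved,
				uint64_t elrsr, unsigned int used_lrs)
{
	unsigned int i;

	for (i = 0; i < used_lrs; i++) {
		if ((elrsr >> i) & 1)
			saved[i] &= ~GICH_LR_STATE_SK;
		else
			saved[i] = lr_regs[i];
		lr_regs[i] = 0;
	}
	for (i = 0; i < used_lrs; i++)
		lr_regs[i] = saved[i];
}

/* Strategy B (suggested above): leave saved[] untouched for empty
 * LRs at save time, mask the state bits at restore time instead. */
static void save_then_restore_b(uint32_t *lr_regs, uint32_t *saved,
				uint64_t elrsr, unsigned int used_lrs)
{
	unsigned int i;

	for (i = 0; i < used_lrs; i++) {
		if (!((elrsr >> i) & 1))
			saved[i] = lr_regs[i];
		lr_regs[i] = 0;
	}
	for (i = 0; i < used_lrs; i++) {
		if ((elrsr >> i) & 1)
			lr_regs[i] = saved[i] & ~GICH_LR_STATE_SK;
		else
			lr_regs[i] = saved[i];
	}
}
```

Both leave the registers identical as long as the elrsr snapshot is
not modified between save and restore, which is the precondition
stated above.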