On 21 June 2012 21:58, Christoffer Dall <c.dall at virtualopensystems.com> wrote: > [snip] > >> OK, point taken. I'll have a look at reworking the macro mess, but will >> focus on the rest of the review for the time being. > > I didn't review the accessor functions as I assume you'll respin the > patch with adjustments, and I'll look at it then. > > [snip] > >>>> ?struct vgic_dist { >>>> +#ifdef CONFIG_KVM_ARM_VGIC >>>> + ? ? ? spinlock_t ? ? ? ? ? ? ?lock; >>>> + >>>> + ? ? ? void __iomem ? ? ? ? ? ?*vctrl_base; >>>> + >>>> + ? ? ? u32 ? ? ? ? ? ? ? ? ? ? enabled; >>>> + >>>> + ? ? ? struct vgic_bitmap ? ? ?irq_enabled; >>>> + ? ? ? struct vgic_bitmap ? ? ?irq_pending; >>>> + ? ? ? struct vgic_bitmap ? ? ?irq_active; /* Not used yet. Useful? */ >>> >>> are these not only used to preserve the GIC state for >>> migration/powerloss and therefore not relevant when already held in >>> main memory? If set to pending, figuring out if the interrupt is >>> active, I guess we can just read the List registers, right? >> >> We could indeed remove it, and just ignores the writes to this registers. >> > > I think we would want to support migration as a general concept, but > probably not between non-kvm accelerated qemu environments and > accelerated ones. I think conceptually it is supposed to work to migrate between KVM and TCG (emulated) QEMU. Basically the kernel should provide the ABI[*] for reading/writing the GIC state, and QEMU then marshalls that into a state struct that is shared between its in-kernel-GIC and emulated-GIC models. [*] some variation on the read-write-many-regs stuff that I think Rusty said he was going to look into, I would suggest. > So, we might need a way for userspace to tell KVM > that an interrupt is active, but getting the information out we can > query the list registers right? Yes, you need to allow userspace to save/restore the active status of an interrupt (it's part of the GIC's internal state after all) but how you represent it in kernel space is an implementation detail. >>>> + >>>> + ? ? ? struct vgic_bytemap ? ? irq_priority;/* Not used yet. Useful? */ >>> >>> if we don't use these fields and have no way of testing them, I >>> suggest removing them and if we need to remember that they could go >>> here, there could be a comment on the structure itself. >> >> It could be used to restore a KVM-based VM to a QEMU-based system. If we >> don't plan to support this, I'll just nuke the field. >> > > not sure why this field is necessary for that kind of migration, but I > don't think we will support that case. The out-of-kernel GIC model does implement interrupt priorities, and the priority registers are part of the state. But I think that the way we'd handle that is that save/restore would determine that the kernel didn't provide the priority registers and would just accept that it couldn't set them. Or if the kernel provided registers that read-as-written but don't have any effect, we could just save and restore the state into those. >>>> +/* Temporary hacks, need to probe DT instead */ >>>> +#define VGIC_DIST_BASE ? ? ? ? 0x2c001000 >>>> +#define VGIC_DIST_SIZE ? ? ? ? 0x1000 >>>> + >>> >>> what's involved in fixing this now? I assume this should come from >>> userspace who may read it off the supplied DT or? >> >> It is still something we have to work out with Peter. Probably involve >> playing with PERIPHBASE instead of the DT. Will have to come from >> userspace anyway. > > ok, so I read on in some of your patches and saw that you're actually > modifying it somewhat. There should be an ioctl that QEMU calls to set the base address of the per-CPU peripherals (for A15 this happens to just be one thing, the GIC). This is the equivalent of the hardware using the value of the PERIPHBASE signals. (There's a couple of reasons for not saying "use the value that QEMU sets the cp15 CBAR register to": * on some cores, eg A9, CBAR is actually writable by the guest so might not be the same as PERIPHBASE * IIRC we need the base address of the GIC in the kernel earlier than the point where QEMU would be setting register values ) As it happens QEMU doesn't actually have a particularly convenient place in the code for issuing this ioctl, but I can probably sort something out... -- PMM