On Tue, Oct 15, 2013 at 12:15:03PM +0100, Peter Maydell wrote: > On 26 September 2013 22:03, Christoffer Dall > <christoffer.dall@xxxxxxxxxx> wrote: > > Save and restore the ARM KVM VGIC state from the kernel. We rely on > > QEMU to marshal the GICState data structure and therefore simply > > synchronize the kernel state with the QEMU emulated state in both > > directions. > > > > We take some care on the restore path to check the VGIC has been > > configured with enough IRQs and CPU interfaces that we can properly > > restore the state, and for separate set/clear registers we first fully > > clear the registers and then set the required bits. > > > > Signed-off-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx> > > > > Changelog [v2]: > > - Remove num_irq from GIC VMstate structure > > - Increment GIC VMstate version number > > - Use extract32/deposit32 for bit-field modifications > > - Address other smaller review comments > > - Renames kvm_arm_gic_dist_[readr/writer] functions to > > kvm_dist_[get/put] and shortened other function names > > - Use concrete format for APRn > > --- > > hw/intc/arm_gic_common.c | 5 +- > > hw/intc/arm_gic_kvm.c | 424 +++++++++++++++++++++++++++++++++++++++++++++- > > hw/intc/gic_internal.h | 8 + > > 3 files changed, 433 insertions(+), 4 deletions(-) > > > > diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c > > index 5449d77..1d3b738 100644 > > --- a/hw/intc/arm_gic_common.c > > +++ b/hw/intc/arm_gic_common.c > > @@ -58,8 +58,8 @@ static const VMStateDescription vmstate_gic_irq_state = { > > > > static const VMStateDescription vmstate_gic = { > > .name = "arm_gic", > > - .version_id = 6, > > - .minimum_version_id = 6, > > + .version_id = 7, > > + .minimum_version_id = 7, > > .pre_save = gic_pre_save, > > .post_load = gic_post_load, > > .fields = (VMStateField[]) { > > @@ -78,6 +78,7 @@ static const VMStateDescription vmstate_gic = { > > VMSTATE_UINT16_ARRAY(current_pending, GICState, NCPU), > > VMSTATE_UINT8_ARRAY(bpr, GICState, NCPU), > > VMSTATE_UINT8_ARRAY(abpr, GICState, NCPU), > > + VMSTATE_UINT32_2DARRAY(apr, GICState, 4, NCPU), > > I feel like we should add this new apr state (plus some > documentation and at least the TCG read/write interface > to the state) in one patch and then put the save/load > in its own patch. > Sounds reasonable, I've added a separate patch for the next series. > > + err = kvm_device_ioctl(s->dev_fd, type, &attr); > > + if (err < 0) { > > + fprintf(stderr, "KVM_{SET/GET}_DEVICE_ATTR failed: %s\n", > > + strerror(-err)); > > + abort(); > > Bad indent. > indeed, sorry about that. > > + } > > +} > > static void kvm_arm_gic_put(GICState *s) > > { > > - /* TODO: there isn't currently a kernel interface to set the GIC state */ > > + uint32_t reg; > > + int i; > > + int cpu; > > + int num_cpu; > > + int num_irq; > > + > > + if (!kvm_arm_gic_can_save_restore(s)) { > > + DPRINTF("Cannot put kernel gic state, no kernel interface"); > > + return; > > + } > > + > > + /* Note: We do the restore in a slightly different order than the save > > + * (where the order doesn't matter and is simply ordered according to the > > + * register offset values */ > > + > > + /***************************************************************** > > + * Distributor State > > + */ > > + > > + /* s->enabled -> GICD_CTLR */ > > + reg = s->enabled; > > + kvm_gicd_access(s, 0x0, 0, ®, true); > > + > > + /* Sanity checking on GICD_TYPER and s->num_irq, s->num_cpu */ > > + kvm_gicd_access(s, 0x4, 0, ®, false); > > + num_irq = ((reg & 0x1f) + 1) * 32; > > + num_cpu = ((reg & 0xe0) >> 5) + 1; > > + > > + if (num_irq < s->num_irq) { > > + fprintf(stderr, "Restoring %u IRQs, but kernel supports max %d\n", > > + s->num_irq, num_irq); > > + abort(); > > + } else if (num_cpu != s->num_cpu ) { > > + fprintf(stderr, "Restoring %u CPU interfaces, kernel only has %d\n", > > + s->num_cpu, num_cpu); > > + /* Did we not create the VCPUs in the kernel yet? */ > > + abort(); > > + } > > + > > + /* TODO: Consider checking compatibility with the IIDR ? */ > > + > > + /* irq_state[n].enabled -> GICD_ISENABLERn */ > > + kvm_dist_put(s, 0x180, 1, s->num_irq, translate_clear); > > + kvm_dist_put(s, 0x100, 1, s->num_irq, translate_enabled); > > + > > + /* s->irq_target[irq] -> GICD_ITARGETSRn > > + * (restore targets before pending to ensure the pending state is set on > > + * the appropriate CPU interfaces in the kernel) */ > > + kvm_dist_put(s, 0x800, 8, s->num_irq, translate_targets); > > + > > + /* irq_state[n].pending + irq_state[n].level -> GICD_ISPENDRn */ > > + kvm_dist_put(s, 0x280, 1, s->num_irq, translate_clear); > > + kvm_dist_put(s, 0x200, 1, s->num_irq, translate_pending); > > + > > + /* irq_state[n].active -> GICD_ISACTIVERn */ > > + kvm_dist_put(s, 0x380, 1, s->num_irq, translate_clear); > > + kvm_dist_put(s, 0x300, 1, s->num_irq, translate_active); > > + > > + /* irq_state[n].trigger -> GICD_ICFRn */ > > + kvm_dist_put(s, 0xc00, 2, s->num_irq, translate_trigger); > > + > > + /* s->priorityX[irq] -> ICD_IPRIORITYRn */ > > + kvm_dist_put(s, 0x400, 8, s->num_irq, translate_priority); > > + > > + /* s->sgi_source -> ICD_CPENDSGIRn */ > > + kvm_dist_put(s, 0xf10, 8, GIC_NR_SGIS, translate_clear); > > + kvm_dist_put(s, 0xf20, 8, GIC_NR_SGIS, translate_sgisource); > > + > > + > > + /***************************************************************** > > + * CPU Interface(s) State > > + */ > > + > > + for (cpu = 0; cpu < s->num_cpu; cpu++) { > > + /* s->cpu_enabled[cpu] -> GICC_CTLR */ > > + reg = s->cpu_enabled[cpu]; > > + kvm_gicc_access(s, 0x00, cpu, ®, true); > > + > > + /* s->priority_mask[cpu] -> GICC_PMR */ > > + reg = (s->priority_mask[cpu] & 0xff); > > + kvm_gicc_access(s, 0x04, cpu, ®, true); > > + > > + /* s->bpr[cpu] -> GICC_BPR */ > > + reg = (s->bpr[cpu] & 0x7); > > + kvm_gicc_access(s, 0x08, cpu, ®, true); > > + > > + /* s->abpr[cpu] -> GICC_ABPR */ > > + reg = (s->abpr[cpu] & 0x7); > > + kvm_gicc_access(s, 0x1c, cpu, ®, true); > > + > > + /* s->apr[n][cpu] -> GICC_APRn */ > > + for (i = 0; i < 4; i++) { > > + reg = s->apr[i][cpu]; > > + kvm_gicc_access(s, 0xd0 + i * 4, cpu, ®, true); > > + } > > + } > > } > > So an interesting question here is "is there state we currently > save/restore for the TCG GIC which we're not passing to the > kernel, and if so why?". I think the fields we don't do anything > with are: > gic_irq_state::model the model field specifies if the interrup uses the 1:N or the N:N model. In reasonably recent versions of the GIC all SGIs are N:N and all peripheral interrupts are 1:N and it is therefore a static property for a GIC v2.0 that we don't need to save and restore. QEMU's model does seem to emulate an older version and a newer version at the same time, and this logic is slightly broken for the ICFGRn handling, but this would require a discussion of exactly what QEMU should be supporting and what we may break if we decide to support one thing vs. another. > GICState::last_active > GICState::running_irq > GICState::running_priority As far as I understand the code, these three are used to implement the running priority support on the QEMU emulated CPU interface side. This state is implemented in hardware for the virtual CPU interface for the VGIC based on the information in the list registers and the GICD_ISACTIVER and registers. I think the QEMU code code be changed to look at the active state of the registers and with a coherent CPU affinity model should be able to reproduce this state on the fly without keeping these extra state variables around, but I'm not 100% confident. > GICState::current_pending Again, the virtual CPU register generates this state based on the LRs present on the interface. The in-kernel distributor emulation needs to choose the right interrupts to queue based on priority if prorities were to be supported in the kernel. The necessary state, however, is preserved through the registers that we do save an restore. > > I'm guessing these are a mix of "kernel doesn't implement > something" and "TCG model is wrong and this state should > show up somewhere else"; but which is which? So I think the answer to all of the above is that this is implementation state that QEMU carries because it makes it more efficient/easier or just because that was the way it was written, and the kernel does maintain this state via the current registers being saved and restored. This of course implies that the QEMU state is somewhat redundant, but this may be on purpose. Do you feel it is necessary for this sort of documentation to go into the source somewhere, or would you just like to clear it up for the review? > > diff --git a/hw/intc/gic_internal.h b/hw/intc/gic_internal.h > > index 758b85a..f9133b9 100644 > > --- a/hw/intc/gic_internal.h > > +++ b/hw/intc/gic_internal.h > > @@ -99,6 +99,14 @@ typedef struct GICState { > > uint8_t bpr[NCPU]; > > uint8_t abpr[NCPU]; > > > > + /* The APR is implementation defined, so we choose a layout identical to > > + * the KVM ABI layout for QEMU's implementation of the gic: > > + * If one or more interrupts for preemption level X are active, then > > + * APRn[X mod 32] == 0b1, where n = X / 32 > > + * otherwise the bit is clear. > > + */ > > + uint32_t apr[4][NCPU]; > > You can delete the "or more" -- there can onyl be one > interrupt active at each group priority (GICv2 spec 3.3.3). > > Also, where does the hardcoded 4 come from? Answer: it's > the worst-case number of GICC_APRn registers, which happens > if you have 7 bits of group priority, which means you have > 128 preemption levels, which at one bit per level fits into > 4 32 bit APR regs. I think a GIC_NR_APRS #define might be nice. > Fair enough, I added one in the separate patch based on a new define for max number of preemption levels. > > + > > uint32_t num_cpu; > > > > MemoryRegion iomem; /* Distributor */ > > -- > > 1.7.10.4 Thanks for the review! -Christoffer _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm