On Thu, Jun 21, 2012 at 6:35 PM, Peter Maydell <peter.maydell at linaro.org> wrote:
> On 21 June 2012 23:25, Christoffer Dall <c.dall at virtualopensystems.com> wrote:
>> On Thu, Jun 21, 2012 at 5:29 PM, Peter Maydell <peter.maydell at linaro.org> wrote:
>>> On 21 June 2012 21:58, Christoffer Dall <c.dall at virtualopensystems.com> wrote:
>>>> I think we would want to support migration as a general concept, but
>>>> probably not between non-kvm accelerated qemu environments and
>>>> accelerated ones.
>>>
>>> I think conceptually it is supposed to work to migrate between
>>> KVM and TCG (emulated) QEMU. Basically the kernel should provide
>>> the ABI[*] for reading/writing the GIC state, and QEMU then marshalls
>>> that into a state struct that is shared between its in-kernel-GIC
>>> and emulated-GIC models.
>>
>> does anybody use this? are anyone going to? is it even tested?
>
> Well, the QEMU code shares a state structure already (it was the
> obvious way to implement it, matching x86), and you need to
> provide a load/save state function in the kernel anyhow. My
> point is really "don't design things to rule it out".
>
>>> [*] some variation on the read-write-many-regs stuff that I think
>>> Rusty said he was going to look into, I would suggest.
>>>
>>
>> how would that work? represent the GIC registers as pseudo registers
>> as part of the CP15 registers, or...?
>
> The idea is that the API includes a (subsystem,register-number)
> tuple, so the copro registers live in one subsystem, and the
> GIC registers would all be in another. This means we have one
> consistent API for "does the kernel know about these registers?",
> "read them", "write them", and we don't have to export lots of
> structures and manage adding new fields to them.
>

ok, that sounds nice, in which case we might as well implement the
set-active/clear-active registers and simply call these read/write
functions from the read/write of the above API.
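
To make sure we mean the same thing, here is a rough sketch of how a
(subsystem, register-number) keyed interface could look. Every name in
it (reg_subsys, reg_id, reg_known, reg_read, reg_write, the toy table)
is made up for illustration; this is not an existing kernel or QEMU
interface, just the three operations from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subsystem ids, invented for this sketch. */
enum reg_subsys {
    SUBSYS_COPRO = 0,   /* coprocessor (CP15) registers */
    SUBSYS_GIC   = 1,   /* interrupt controller registers */
};

/* A register is named by a (subsystem, register-number) tuple. */
struct reg_id {
    enum reg_subsys subsys;
    uint32_t regno;
};

struct reg_slot {
    struct reg_id id;
    uint32_t value;
};

/* Toy backing store standing in for whatever state the kernel keeps. */
static struct reg_slot reg_table[] = {
    { { SUBSYS_COPRO, 0 }, 0 },
    { { SUBSYS_GIC,   0 }, 0 },
    { { SUBSYS_GIC,   1 }, 0 },
};

static struct reg_slot *reg_find(struct reg_id id)
{
    size_t i;

    for (i = 0; i < sizeof(reg_table) / sizeof(reg_table[0]); i++) {
        if (reg_table[i].id.subsys == id.subsys &&
            reg_table[i].id.regno == id.regno)
            return &reg_table[i];
    }
    return NULL;
}

/* "Does the kernel know about this register?" */
int reg_known(struct reg_id id)
{
    return reg_find(id) != NULL;
}

/* "Read it": returns 0 on success, -1 if the register is unknown. */
int reg_read(struct reg_id id, uint32_t *out)
{
    struct reg_slot *slot = reg_find(id);

    assert(out != NULL);
    if (!slot)
        return -1;
    *out = slot->value;
    return 0;
}

/* "Write it": returns 0 on success, -1 if the register is unknown. */
int reg_write(struct reg_id id, uint32_t value)
{
    struct reg_slot *slot = reg_find(id);

    if (!slot)
        return -1;
    slot->value = value;
    return 0;
}
```

With something like this, user space can probe for a register before
saving or restoring it, and adding a new register means adding a table
entry rather than growing an exported structure.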

>>> The out-of-kernel GIC model does implement interrupt priorities,
>>> and the priority registers are part of the state. But I think that
>>> the way we'd handle that is that save/restore would determine that
>>> the kernel didn't provide the priority registers and would just
>>> accept that it couldn't set them. Or if the kernel provided
>>> registers that read-as-written but don't have any effect, we could
>>> just save and restore the state into those.
>>
>> I just don't think we should keep this state around if we don't use
>> it, but then it all of the sudden may have an effect if migrated to
>> QEMU.
>>
>> If I understand correctly, the reason we don't have to deal with it is
>> the fact that the guests we run (so far) sets all the priorities to
>> the same value, so we can just ignore the fields, and return that same
>> value to user space.
>>
>> On the other hand, if we were to return real values as written, I
>> think we should actually respect these values when deciding whether or
>> not to forward the interrupts to the vcpu interface through the list
>> registers.
>
> reads-as-written is just as valid a dummy implementation as
> writes-ignored. As soon as you get into not behaving the same way
> the hardware does, you're in the world of "does it happen to work
> OK for the guests I happen to care about running, is it convenient
> to implement".
>

I think this is OK if it runs 99% of kernels compiled for the types of
systems we wish to support.

>> (What I really want to avoid is that some things seem to work
>> correctly, but then happens to work differently on QEMU all of the
>> sudden, because the guest kernel was updated to use interrupt
>> priorities, but KVM never complains...)
>
> If you want to avoid that you need to actually implement priorities.
> Or throw an undef into the guest if it tries to write the priority
> registers to something other than the defaults.
> Choosing to only
> partially implement the functionality of a device is inherently
> choosing that things might explode if a guest attempts to use
> the things you left out.
>

I think throwing an undef if the priority registers are written to
something other than the assumed value, and making that assumption
explicit, is the way to go for now. I don't want to be the one debugging
why the hell interrupts aren't firing in the way we expect them to on
some 3.8 kernel :)
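
Roughly what I have in mind, as a sketch only: reads return the one
value we assume every guest programs, and a write of anything else is
reported back so the caller can inject an undef instead of silently
dropping it. The names (prio_reg_read, prio_reg_write, MMIO_INJECT_UNDEF,
PRIO_WORDS) and the constant 0xa0a0a0a0 are placeholders I picked for
illustration, not existing code and not a claim about what any given
guest actually writes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of every 32-bit priority word; a placeholder. */
#define ASSUMED_PRIORITY_WORD 0xa0a0a0a0u

/* Hypothetical number of priority words exposed to the guest. */
#define PRIO_WORDS 8u

enum mmio_result {
    MMIO_OK,            /* access handled */
    MMIO_INJECT_UNDEF,  /* caller should raise an undef in the guest */
};

/* Reads always return the single value we assume the guest uses. */
uint32_t prio_reg_read(unsigned int word)
{
    assert(word < PRIO_WORDS);
    return ASSUMED_PRIORITY_WORD;
}

/*
 * Writes are accepted only if they match the assumption. Anything
 * else is flagged rather than ignored, so a guest that starts using
 * priorities fails loudly instead of misbehaving quietly.
 */
enum mmio_result prio_reg_write(unsigned int word, uint32_t value)
{
    assert(word < PRIO_WORDS);
    if (value != ASSUMED_PRIORITY_WORD)
        return MMIO_INJECT_UNDEF;
    return MMIO_OK;
}
```

That keeps the assumption in exactly one place, so it is easy to find
and remove once priorities are actually implemented.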