[Android-virt] [PATCH 04/15] ARM: KVM: VGIC distributor handling

c.dall at virtualopensystems.com (Christoffer Dall) · Thu, 21 Jun 2012 18:41:24 -0400

On Thu, Jun 21, 2012 at 6:35 PM, Peter Maydell <peter.maydell at linaro.org> wrote:
> On 21 June 2012 23:25, Christoffer Dall <c.dall at virtualopensystems.com> wrote:
>> On Thu, Jun 21, 2012 at 5:29 PM, Peter Maydell <peter.maydell at linaro.org> wrote:
>>> On 21 June 2012 21:58, Christoffer Dall <c.dall at virtualopensystems.com> wrote:
>>>> I think we would want to support migration as a general concept, but
>>>> probably not between non-kvm accelerated qemu environments and
>>>> accelerated ones.
>>>
>>> I think conceptually it is supposed to work to migrate between
>>> KVM and TCG (emulated) QEMU. Basically the kernel should provide
>>> the ABI[*] for reading/writing the GIC state, and QEMU then marshalls
>>> that into a state struct that is shared between its in-kernel-GIC
>>> and emulated-GIC models.
>>
>> Does anybody use this? Is anyone going to? Is it even tested?
>
> Well, the QEMU code shares a state structure already (it was the
> obvious way to implement it, matching x86), and you need to
> provide a load/save state function in the kernel anyhow. My
> point is really "don't design things to rule it out".
>
>>> [*] some variation on the read-write-many-regs stuff that I think
>>> Rusty said he was going to look into, I would suggest.
>>>
>>
>> how would that work? represent the GIC registers as pseudo registers
>> as part of the CP15 registers, or...?
>
> The idea is that the API includes a (subsystem,register-number)
> tuple, so the copro registers live in one subsystem, and the
> GIC registers would all be in another. This means we have one
> consistent API for "does the kernel know about these registers?",
> "read them", "write them", and we don't have to export lots of
> structures and manage adding new fields to them.
>

OK, that sounds nice. In that case we might as well implement the
set-active/clear-active registers and simply call their read/write
handlers from the read/write paths of the above API.
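
A minimal sketch of that idea, under loud assumptions: every name here
(reg_subsys, reg_id, reg_known, reg_read, reg_write, the GIC_REG_*
numbers) is hypothetical and not taken from any kernel header, and the
distributor is reduced to a single word of active bits. It only shows
the (subsystem, register-number) tuple routing to the same handlers
that MMIO emulation would call.

```c
#include <stdint.h>

/* Hypothetical subsystem IDs; these names are illustrative only. */
enum reg_subsys {
    REG_SUBSYS_COPRO = 0,   /* coprocessor (CP15) registers */
    REG_SUBSYS_GIC   = 1    /* interrupt controller registers */
};

/* Hypothetical GIC register numbers within REG_SUBSYS_GIC. */
enum gic_reg {
    GIC_REG_SET_ACTIVE0   = 0,  /* write 1 to mark an interrupt active */
    GIC_REG_CLEAR_ACTIVE0 = 1   /* write 1 to mark an interrupt inactive */
};

/* The (subsystem, register-number) tuple that names one register. */
struct reg_id {
    enum reg_subsys subsys;
    uint32_t num;
};

/* Toy distributor state: one active bit for each of 32 interrupts. */
struct gic_state {
    uint32_t active;
};

/* Handlers that guest MMIO emulation would also use in this sketch. */
static uint32_t gic_read_active(const struct gic_state *gic)
{
    return gic->active;
}

static void gic_write_set_active(struct gic_state *gic, uint32_t val)
{
    gic->active |= val;
}

static void gic_write_clear_active(struct gic_state *gic, uint32_t val)
{
    gic->active &= ~val;
}

/* "Does the kernel know about this register?" */
static int reg_known(struct reg_id id)
{
    return id.subsys == REG_SUBSYS_GIC &&
           (id.num == GIC_REG_SET_ACTIVE0 || id.num == GIC_REG_CLEAR_ACTIVE0);
}

/* Returns 0 on success, -1 for a register this sketch does not know. */
static int reg_read(const struct gic_state *gic, struct reg_id id,
                    uint32_t *val)
{
    if (!reg_known(id))
        return -1;
    *val = gic_read_active(gic);  /* both registers read back the active bits */
    return 0;
}

static int reg_write(struct gic_state *gic, struct reg_id id, uint32_t val)
{
    if (!reg_known(id))
        return -1;
    if (id.num == GIC_REG_SET_ACTIVE0)
        gic_write_set_active(gic, val);
    else
        gic_write_clear_active(gic, val);
    return 0;
}
```

With this shape, user space would probe a register with reg_known()
and skip the ones the kernel does not provide, so no state structure
has to be exported or extended when registers are added.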

>>> The out-of-kernel GIC model does implement interrupt priorities,
>>> and the priority registers are part of the state. But I think that
>>> the way we'd handle that is that save/restore would determine that
>>> the kernel didn't provide the priority registers and would just
>>> accept that it couldn't set them. Or if the kernel provided
>>> registers that read-as-written but don't have any effect, we could
>>> just save and restore the state into those.
>>
>> I just don't think we should keep this state around if we don't use
>> it, only for it to suddenly have an effect when migrated to
>> QEMU.
>>
>> If I understand correctly, the reason we don't have to deal with it is
>> the fact that the guests we run (so far) set all the priorities to
>> the same value, so we can just ignore the fields, and return that same
>> value to user space.
>>
>> On the other hand, if we were to return real values as written, I
>> think we should actually respect these values when deciding whether or
>> not to forward the interrupts to the vcpu interface through the list
>> registers.
>
> reads-as-written is just as valid a dummy implementation as
> writes-ignored. As soon as you get into not behaving the same way
> the hardware does, you're in the world of "does it happen to work
> OK for the guests I happen to care about running, is it convenient
> to implement".
>

I think this is OK if it runs 99% of kernels compiled for the types of
systems we wish to support.
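
For reference, the two dummy behaviours contrasted above
(writes-ignored and reads-as-written) could be sketched as follows.
This is only an illustration: prio_reg, FIXED_PRIORITY and the wi_*/raw_*
helpers are hypothetical names, and the fixed value is an assumption
about what guests program, not something stated in this thread.

```c
#include <stdint.h>

/* Assumption: the one priority pattern every supported guest programs. */
#define FIXED_PRIORITY 0xa0a0a0a0u

/* Backing storage for one 32-bit priority register. */
struct prio_reg {
    uint32_t stored;
};

/* Writes-ignored: the write is dropped, reads return the fixed value. */
static void wi_write(struct prio_reg *r, uint32_t val)
{
    (void)r;
    (void)val;
}

static uint32_t wi_read(const struct prio_reg *r)
{
    (void)r;
    return FIXED_PRIORITY;
}

/* Reads-as-written: the value is stored and returned, but never used
 * when deciding which interrupt to forward to the guest. */
static void raw_write(struct prio_reg *r, uint32_t val)
{
    r->stored = val;
}

static uint32_t raw_read(const struct prio_reg *r)
{
    return r->stored;
}
```

Neither variant changes which interrupts get delivered; they differ
only in what a later read, or a save/restore, observes.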

>> (What I really want to avoid is that some things seem to work
>> correctly, but then happen to work differently on QEMU all of a
>> sudden, because the guest kernel was updated to use interrupt
>> priorities, but KVM never complains...)
>
> If you want to avoid that you need to actually implement priorities.
> Or throw an undef into the guest if it tries to write the priority
> registers to something other than the defaults. Choosing to only
> partially implement the functionality of a device is inherently
> choosing that things might explode if a guest attempts to use
> the things you left out.
>

I think throwing an undef if the priority registers are written with
anything other than the assumed value, and making that assumption
explicit, is the way to go for now. I don't want to be the one debugging
why the hell interrupts aren't firing the way we expect them to on some
3.8 kernel :)
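
A rough sketch of what "making the assumption explicit" could look
like, with everything here being hypothetical: ASSUMED_PRIORITY is a
guess at the value guests program, mmio_result and both function names
are invented for illustration, and only full 32-bit writes are
considered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: every guest programs each 8-bit priority field to this. */
#define ASSUMED_PRIORITY 0xa0u

enum mmio_result {
    MMIO_HANDLED,       /* write matched the assumption; nothing to do */
    MMIO_INJECT_UNDEF   /* caller should raise an undefined exception */
};

/* A 32-bit priority register packs four 8-bit priority fields. */
static bool priority_matches_assumption(uint32_t val)
{
    for (int i = 0; i < 4; i++) {
        if (((val >> (8 * i)) & 0xffu) != ASSUMED_PRIORITY)
            return false;
    }
    return true;
}

/* Accept only the assumed value; anything else is reported loudly. */
static enum mmio_result handle_priority_write(uint32_t val)
{
    return priority_matches_assumption(val) ? MMIO_HANDLED
                                            : MMIO_INJECT_UNDEF;
}
```

The point of the design is that a guest which starts using real
priorities fails immediately and visibly, instead of silently behaving
differently from the emulated GIC.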