Re: in-kernel interrupt controller steering

Scott Wood <scottwood@xxxxxxxxxxxxx> · Mon, 4 Mar 2013 18:59:16 -0600

On 03/04/2013 04:20:47 PM, Alexander Graf wrote:
Howdy,

We just sat down to discuss the proposed XICS and MPIC interfaces and
how we can take bits of each and create an interface that works for
everyone. It feels like we came to some conclusions, some of which we
had already reached earlier but forgotten in between :).

I hope I didn't forget too many pieces. Scott, Paul and Stuart,  
please add whatever you find missing in here.

It looks about right.

1) We need to set the generic interrupt type of the system before we  
create vcpus.

This is a new ioctl that sets the overall system interrupt controller  
type to a specific model. This is used so that when we create vcpus, we
can create the appended "local interrupt controller" state without  
the actual interrupt controller device available yet. It is also used  
later to switch between interrupt controller implementations.

This interrupt type is write-once and is frozen once the first vcpu has
been created.

Who is going to write up this patch?
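
To make the write-once rule in 1) concrete, here is a minimal userspace
sketch, not the actual patch. Every name in it (toy_vm, toy_set_irq_model,
the IRQ_MODEL_* values) is hypothetical, and the error codes are my
assumption, as is the choice to reject a second write even before any
vcpu exists.

```c
#include <errno.h>

/* Hypothetical model identifiers; the thread does not name them. */
enum irq_model { IRQ_MODEL_NONE = 0, IRQ_MODEL_XICS, IRQ_MODEL_MPIC };

/* Toy stand-in for per-VM state; this is not the real struct kvm. */
struct toy_vm {
    enum irq_model irq_model;
    int model_set;  /* nonzero once the type has been written */
    int nr_vcpus;
};

/* Models the proposed "set system interrupt type" ioctl. */
static int toy_set_irq_model(struct toy_vm *vm, enum irq_model model)
{
    if (model != IRQ_MODEL_XICS && model != IRQ_MODEL_MPIC)
        return -EINVAL;
    /* Assumption: write once, and frozen after the first vcpu exists. */
    if (vm->model_set || vm->nr_vcpus > 0)
        return -EBUSY;
    vm->irq_model = model;
    vm->model_set = 1;
    return 0;
}

/* Returns the new vcpu's index. The per-vcpu "local interrupt
 * controller" state would be sized from vm->irq_model at this point. */
static int toy_create_vcpu(struct toy_vm *vm)
{
    return vm->nr_vcpus++;
}
```

With this model, setting the type and then creating a vcpu succeeds, and
any later attempt to change the type returns -EBUSY.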

2) Interrupt controllers (XICS / MPIC) get created by the device  
create api

Getting and setting state of an interrupt controller also happens  
through this. Getting and setting state from vcpus happens through  
ONE_REG. Injecting interrupts happens through the normal irqchip ioctl
(we probably need to encode the target device id in there somehow).

This fits in nicely with a model where the interrupt controller is a  
proper QOM device in QEMU, since we can create it long after vcpus  
have been created.

3) We open code interrupt controller distinction

There is no need for function pointers. We just switch() based on the  
type that gets set in the initial ioctl to determine which code to  
call. The retrieval of the irq type happens through a static inline  
function in a header that can return a constant number for  
configurations that don't support multiple in-kernel irqchips.
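
The open-coded dispatch in 3) might look roughly like the sketch below.
The config macros, the toy_* helpers and the return values are all made
up for illustration; only the shape (a static inline accessor that
collapses to a constant when one irqchip is built in, plus a switch())
comes from the discussion.

```c
/* Hypothetical build-time switches; the names are illustrative only. */
#define TOY_HAVE_XICS 1
#define TOY_HAVE_MPIC 1

enum irq_model { IRQ_MODEL_NONE = 0, IRQ_MODEL_XICS, IRQ_MODEL_MPIC };

struct toy_vm {
    enum irq_model irq_model;  /* set once by the initial ioctl */
};

/* Header-style accessor: returns a compile-time constant when only one
 * in-kernel irqchip is configured, so the switch() below folds away. */
static inline enum irq_model toy_irq_model(const struct toy_vm *vm)
{
#if TOY_HAVE_XICS && TOY_HAVE_MPIC
    return vm->irq_model;
#elif TOY_HAVE_MPIC
    (void)vm;
    return IRQ_MODEL_MPIC;
#else
    (void)vm;
    return IRQ_MODEL_XICS;
#endif
}

/* Placeholder backends; they only tag the result so callers can tell
 * which path was taken. */
static int toy_xics_deliver(int irq) { return 1000 + irq; }
static int toy_mpic_deliver(int irq) { return 2000 + irq; }

/* No function pointers: dispatch on the type with a plain switch(). */
static int toy_deliver_irq(const struct toy_vm *vm, int irq)
{
    switch (toy_irq_model(vm)) {
    case IRQ_MODEL_XICS:
        return toy_xics_deliver(irq);
    case IRQ_MODEL_MPIC:
        return toy_mpic_deliver(irq);
    default:
        return -1;  /* no in-kernel irqchip model selected */
    }
}
```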

4) The device attribute API has separate groups that target different  
use cases

Paul needs live migration, so he will implement device attributes  
that enable him to do live migration.
Scott doesn't implement live migration, so his MPIC attribute groups  
are solely for debugging purposes today.

5) There is no need for atomic device control accessors today.

Live migration happens with vcpus stopped, so we don't need to be  
atomic in the kernel <-> user space interface.

6) The device attribute API will keep read and write (get / set)  
accessors.

There is no specific need for a generic "command" ioctl.

Gleb, is this OK?  A bidirectional command accessor could be added  
later if a need arises.

Will attributes still be renamed to "commands", even if the get/set  
approach is retained?
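
For reference, the get/set model from 4) and 6) boils down to something
like the following userspace sketch. The struct layout, the group
numbers and the -ENXIO return are assumptions for illustration and are
not copied from the patch; the point is just that each (group, attr)
pair names one piece of state with a read and a write accessor, and no
bidirectional "command" is needed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute groups, e.g. one for setup, one for raw state. */
enum toy_attr_group { TOY_GRP_MISC = 0, TOY_GRP_REGISTER = 1 };

/* Hypothetical request layout: which attribute, and where the value is. */
struct toy_device_attr {
    uint32_t group;
    uint32_t attr;   /* index within the group */
    uint64_t *addr;  /* caller's buffer (a user pointer in a real ioctl) */
};

#define TOY_NR_REGS 4

/* Toy irqchip state. */
struct toy_irqchip {
    uint64_t base;
    uint64_t regs[TOY_NR_REGS];
};

/* Map (group, attr) to backing storage, or NULL if it does not exist. */
static uint64_t *toy_attr_slot(struct toy_irqchip *dev,
                               const struct toy_device_attr *a)
{
    switch (a->group) {
    case TOY_GRP_MISC:
        return a->attr == 0 ? &dev->base : NULL;
    case TOY_GRP_REGISTER:
        return a->attr < TOY_NR_REGS ? &dev->regs[a->attr] : NULL;
    default:
        return NULL;
    }
}

/* "set" accessor: copy the caller's value into the device. */
static int toy_set_attr(struct toy_irqchip *dev,
                        const struct toy_device_attr *a)
{
    uint64_t *slot = toy_attr_slot(dev, a);

    if (!slot)
        return -ENXIO;
    *slot = *a->addr;
    return 0;
}

/* "get" accessor: copy the device value out to the caller. */
static int toy_get_attr(struct toy_irqchip *dev,
                        const struct toy_device_attr *a)
{
    uint64_t *slot = toy_attr_slot(dev, a);

    if (!slot)
        return -ENXIO;
    *a->addr = *slot;
    return 0;
}
```

Neither accessor takes a lock, which matches 5): if state is only read
or written while vcpus are stopped, the interface itself does not have
to be atomic.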

7) Interrupt line connections to vcpus are implicit

We don't explicitly mark which in-kernel irqchip interrupt line goes  
to which vcpu. This is done implicitly. If we see a need for it, we  
create a new irqchip device type that allows us to explicitly  
configure vcpu connections.

Are there any changes needed to the device control api patch (just  
patch 1/6, not the rest of the patchset), besides Christoffer's request  
to tone down one of the comments, and whatever the response is to the  
questions in #6?

Should we add a "size" field in kvm_device, both for error checking and  
to assist tools such as strace?
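
To illustrate what I mean by a "size" field, here is a rough sketch. The
struct below is a hypothetical layout, not the one in the patch, and the
helper name is made up. The kernel side would compare the caller's size
with what the attribute expects before touching the buffer, and a tool
like strace would know how many bytes to dump without knowing the
attribute.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout with an explicit size. */
struct toy_attr_req {
    uint32_t group;
    uint32_t attr;
    uint32_t size;     /* number of bytes at addr */
    const void *addr;  /* caller's buffer */
};

/* Set a 32-bit attribute, rejecting buffers of any other size. */
static int toy_set_u32_attr(uint32_t *dst, const struct toy_attr_req *req)
{
    if (req->size != sizeof(*dst))
        return -EINVAL;  /* mismatch caught before any copy happens */
    memcpy(dst, req->addr, sizeof(*dst));
    return 0;
}
```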

-Scott