Re: [RFC][PATCH] KVM: Introduce direct MSI message injection for in-kernel irqchips

Jan Kiszka <jan.kiszka@xxxxxxxxxxx> · Fri, 21 Oct 2011 15:00:12 +0200

On 2011-10-21 14:04, Michael S. Tsirkin wrote:
> On Fri, Oct 21, 2011 at 01:51:15PM +0200, Jan Kiszka wrote:
>> On 2011-10-21 13:06, Michael S. Tsirkin wrote:
>>> On Fri, Oct 21, 2011 at 11:19:19AM +0200, Jan Kiszka wrote:
>>>> Currently, MSI messages can only be injected to in-kernel irqchips by
>>>> defining a corresponding IRQ route for each message. This is not only
>>>> unhandy if the MSI messages are generated "on the fly" by user space,
>>>> IRQ routes are a limited resource that user space has to manage
>>>> carefully.
>>>>
>>>> By providing a direct injection path, we can both avoid using up limited
>>>> resources and simplify the necessary steps for user land. The API
>>>> already provides a channel (flags) to revoke an injected but not yet
>>>> delivered message which will become important for in-kernel MSI-X vector
>>>> masking support.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>>>
>>> I would love to see how you envision extending this to add the masking
>>> support at least at the API level, not necessarily the supporting code.
>>>
>>> It would seem hard to use the flags field for that since the MSI-X mask is per
>>> device per vector, not per message.
>>> Which gets us back to resource per vector which userspace has to manage
>>> ...
>>>
>>> interrupt remapping is also per device, so it isn't any easier
>>> with this API.
>>
>> Yes, we will need an additional field to associate the message with its
>> source device. Could be a PCI address or a handle (like the one assigned
>> devices get) returned on MSI-X kernel region setup. We will need a flag
>> to declare that address/handle valid, also to tell apart platform MSI
>> messages (e.g. coming from HPET on x86).
> 
> I have not thought about remapping a lot yet:
> HPET interrupts are not subject to remapping?

Looks like they are, at least on VT-d: The related VT-d document defines two
non-PCI source IDs, namely legacy pin interrupts and "other MSIs". So we
may want a more generic source ID that, for MSI-X in-kernel masking, can
then be associated with a device vector for which we accelerate mask
management.

> 
>> I see no obstacles ATM that
>> prevent doing that on top of this API, do you?
>>
>> Jan
> 
> For masking, I think I do. We need to maintain the pending bit
> and the io notifiers in kernel, per vector.
> An MSI injected with just an address/data pair, without
> vector/device info, can't be masked properly.
> 
> We get back to maintaining some handle per vector, right?

First of all, the common case for in-kernel MSI-X mask management will
be MSI sources that are _not_ injected as address/data pairs from user
space but come from in-kernel sources (irqfd or host IRQs, i.e. assigned
devices). In contrast, this API is targeting MSI messages generated
in the hypervisor process (i.e. current QEMU device emulation).

Still, the new interface should allow for injecting the other vectors as
well without requiring additional coordination of an in-kernel MSI-X
page vs. user space's view of it. For that reason we need a per-vector
handle for that special case. But that will naturally derive from
defining a generic MSI-X in-kernel mask management API. You will have to
specify which device shall be accelerated and how many vectors it has
(at maximum). So a directly injected MSI message for those devices will
have to specify that source tuple (device, vector), but only in that
special case.
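
To make the (device, vector) idea a bit more concrete, here is a minimal
sketch in C. Everything in it is an assumption for illustration only: the
names msi_msg_sketch, accel_device, MSI_MSG_F_SOURCE_VALID and
msi_check_source are hypothetical and come neither from the patch nor from
any existing KVM header. It only shows the rule that a plain address/data
message stays valid as is, while a message that declares a source must name
a registered device and a vector below that device's maximum.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: the (device, vector) source tuple is valid. */
#define MSI_MSG_F_SOURCE_VALID (1u << 0)

/* Hypothetical message layout, not the structure from the patch. */
struct msi_msg_sketch {
    uint64_t address;
    uint32_t data;
    uint32_t flags;
    uint32_t device;  /* handle returned when the device was registered */
    uint32_t vector;  /* index into that device's MSI-X table */
};

/* Hypothetical record of a device registered for mask acceleration. */
struct accel_device {
    uint32_t handle;
    uint32_t nr_vectors;  /* maximum number of vectors declared at setup */
};

enum msi_check {
    MSI_OK_PLAIN,    /* no source given, e.g. a platform MSI */
    MSI_OK_SOURCED,  /* valid (device, vector) tuple */
    MSI_BAD_DEVICE,  /* source declared, but handle is unknown */
    MSI_BAD_VECTOR   /* source declared, but vector is out of range */
};

/* Classify a message according to the rule described above. */
static enum msi_check msi_check_source(const struct msi_msg_sketch *msg,
                                       const struct accel_device *devs,
                                       size_t nr_devs)
{
    size_t i;

    /* Without the flag, the device and vector fields are ignored. */
    if (!(msg->flags & MSI_MSG_F_SOURCE_VALID))
        return MSI_OK_PLAIN;

    for (i = 0; i < nr_devs; i++) {
        if (devs[i].handle != msg->device)
            continue;
        return msg->vector < devs[i].nr_vectors ? MSI_OK_SOURCED
                                                : MSI_BAD_VECTOR;
    }
    return MSI_BAD_DEVICE;
}
```

With one registered device (handle 7, 4 vectors), a message without the flag
is accepted as a plain MSI, vector 3 of device 7 is accepted as sourced, and
vector 4 or an unknown handle is rejected. Whether the source should be such
a handle or a PCI address is left open here, as in the discussion above.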

Maybe I will sit down now and create a draft for an MSI-X mask
acceleration API. That may help everyone feel better about this proposal. :)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux