On Sun, 2012-08-12 at 12:33 +0300, Michael S. Tsirkin wrote: > On Thu, Aug 09, 2012 at 01:26:15PM -0600, Alex Williamson wrote: > > On Mon, 2012-08-06 at 13:40 +0300, Avi Kivity wrote: > > > On 08/06/2012 01:38 PM, Avi Kivity wrote: > > > > > > > Regarding the implementation, instead of a linked list, would an array > > > > of counters parallel to the bitmap make it simpler? > > > > > > Or even, replace the bitmap with an array of counters. > > > > I'm not sure a counter array is what we're really after. That gives us > > reference counting for the irq source IDs, but not the key->gsi lookup. > > It also highlights another issue, that we have a limited set of source > > IDs. Looks like we have BITS_PER_LONG IDs, with two already used, one > > for the shared userspace ID and another for the PIT. How happy are we > > going to be with a limit of 62 level interrupts in use at one time? > > > > It's arguably a reasonable number since the most virtualization friendly > > devices (sr-iov VFs) don't even support this kind of interrupt. It's > > also very wasteful allocating an entire source ID for a single GSI > > within that source ID. PCI supports interrupts A, B, C, and D, which, > > in the most optimal config, each go to different GSIs. So we could > > theoretically be more efficient in our use and allocation of irq source > > IDs if we tracked use by the source ID, gsi pair. > > > > That probably makes it less practical to replace anything at the top > > level with a counter array. The key that we pass back is currently the > > actual source ID, but we don't specify what it is, so we could split it > > and have it encode a 16bit source ID plus 16 bit GSI. It could also be > > an idr entry. > > > > Michael, would the interface be more acceptable to you if we added > > separate ioctls to allocate and free some representation of an irq > > source ID, gsi pair? For instance, an ioctl might return an idr entry > > for an irq source ID/gsi object which would then be passed as a > > parameter in struct kvm_irqfd and struct kvm_eoifd so that the object > > representing the source id/gsi isn't magically freed on it's own. This > > would also allow us to deassign/close one end and reconfigure it later. > > Thanks, > > > > Alex > > It's acceptable to me either way. I was only pointing out that as > designed, the interface looks simple at first but then you find out some > subtle limitations which are implementation driven. This gives > an overall feeling the abstraction is too low level. > > If we compare to the existing irqfd, isn't the difference > simply that irqfd deasserts immediately ATM, while we > want to delay this until later? > > If yes, then along the lines that you proposed, and combining with my > idea of tracking deasserts, how do you like the following: > > /* Keep line asserted until guest has handled the interrupt. */ > #define KVM_IRQFD_FLAG_DEASSERT_ON_ACK (1 << 1) > /* Notify after line is deasserted. */ > #define KVM_IRQFD_FLAG_DEASSERT_EVENTFD (2 << 1) > > struct kvm_irqfd { > __u32 fd; > __u32 gsi; > __u32 flags; > /* eventfd to notify when line is deasserted */ > __u32 deassert_eventfd; > __u8 pad[16]; > }; > > now the only limitation is that KVM_IRQFD_FLAG_DEASSERT_ON_ACK is only > effective for level interrupts. > > Notes about lifetime of objects: > - closing deassert_eventfd does nothing (we can keep > reference to it from irqfd so no need for > complex polling/flushing scheme) > - closing irqfd or deasserting dis-associates > deassert_eventfd automatically > - source id is internal to irqfd and goes away with it > > it looks harder to misuse and fits what we want to do nicely, > and needs less code to implement. This is effectively what I meant when I suggested we either need to a) pull eoifd into irqfd or b) implement them as modular components. I chose to implement b) because I think that non-irqfd related ack notification to userspace will be useful and a) does not provide that. So this interface enables exactly the use case for device assignment and no more. I feel like this is the start of an ioctl that will be quickly deprecated, but if that's the direction we want to go, I'll write the code. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html