Avi Kivity wrote:
> Gregory Haskins wrote:
>> One thing I was thinking here was that I could create a flag for the
>> kvm_irqfd() function for something like "KVM_IRQFD_MODE_CLEAR".  This
>> flag when specified at creation time will cause the event to execute a
>> clear operation instead of a set when triggered.  That way, the default
>> mode is an edge-triggered set.  The non-default mode is to trigger a
>> clear.  Level-triggered ints could therefore create two irqfds, one for
>> raising, the other for clearing.
>>
>
> That's my second choice option.
>
>> An alternative is to abandon the use of eventfd, and allow the irqfd to
>> be a first-class anon-fd.  The parameters passed to the write/signal()
>> function could then indicate the desired level.  The disadvantage would
>> be that it would not be compatible with eventfd, so we would need to
>> decide if the tradeoff is worth it.
>>
>
> I would really like to keep using eventfd.  Which is why I asked
> Davide about the prospects of direct callbacks (vs wakeups).

I saw that request.  That would be ideal.

>
>> OTOH, I suspect level triggered interrupts will be primarily in the
>> legacy domain, so perhaps we do not need to worry about it too much.
>> Therefore, another option is that we *could* simply set the stake in the
>> ground that legacy/level cannot use irqfd.
>>
>
> This is my preferred option.  For a virtio-net-server in the kernel,
> we'd service its eventfd in qemu, raising and lowering the pci
> interrupt in the traditional way.
>
> But we'd still need to know when to lower the interrupt.  How?

IIUC, isn't that usually device/subsystem specific, and out of scope of
the GSI delivery vehicle?  For instance, most devices I have seen with
level ints have a register in their device register namespace for acking
the int.  As an aside, this is what causes some of the grief in dealing
with shared interrupts like KVM pass-through and/or threaded-isrs: There
isn't a standardized way to ACK them.

You may also see some generalization of masking/acking in things like
the MSI-X table.  But again, this would be out of scope of the general
GSI delivery path IIUC.

I understand that there is a feedback mechanism in the ioapic model for
calling back on acknowledgment of the interrupt.  But I am not sure that
is how the real hardware normally works, and therefore I am not
convinced that is something we need to feed all the way back (i.e. via
irqfd or whatever).  In the interest of full disclosure, it's been a few
years since I studied the xAPIC docs, so I might be out to lunch on that
assertion. ;)

-Greg