On Thu, 2012-07-19 at 14:57 +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 19, 2012 at 02:25:29PM +0300, Gleb Natapov wrote:
> > On Thu, Jul 19, 2012 at 02:12:13PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Jul 19, 2012 at 01:54:53PM +0300, Gleb Natapov wrote:
> > > > On Thu, Jul 19, 2012 at 01:26:48PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Jul 19, 2012 at 12:41:24PM +0300, Gleb Natapov wrote:
> > > > > > On Thu, Jul 19, 2012 at 12:33:29PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Jul 19, 2012 at 12:21:07PM +0300, Gleb Natapov wrote:
> > > > > > > > On Thu, Jul 19, 2012 at 12:17:19PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > > On Thu, Jul 19, 2012 at 10:53:37AM +0300, Gleb Natapov wrote:
> > > > > > > > > > On Thu, Jul 19, 2012 at 01:11:53AM +0300, Michael S. Tsirkin wrote:
> > > > > > > > > > > This creates a way to detect when kvm_set_irq(...,0) was run
> > > > > > > > > > > twice with the same source id by returning 0 in this case.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > > > > > > > > > > ---
> > > > > > > > > > >
> > > > > > > > > > > This is on top of my bugfix patch. Uncompiled and untested. Alex, I
> > > > > > > > > > > think something like this patch will make it possible for you to simply
> > > > > > > > > > > do
> > > > > > > > > > >     if (kvm_set_irq(...., 0))
> > > > > > > > > > >         eventfd_signal()
> > > > > > > > > > >
> > > > > > > > > > Why caller can't track line state?
> > > > > > > > >
> > > > > > > > > Why duplicate information? As we are finding it's not trivial to keep
> > > > > > > > > the two in sync. Think about migration etc ...
> > > > > > > > >
> > > > > > > > We do not migrate irq_states. The caller already have to have enough
> > > > > > > > information to recreate its state and it should migrate the info, so why
> > > > > > > > should we go all the way down the call chain to find something that is
> > > > > > > > already known?
> > > > > > >
> > > > > > > Hmm it's an interesting point. Looks like irqfds for level lose state
> > > > > > > across migration. Of course Alex wants to use them for assignment which
> > > > > > > currently disables migration, but we are talking about a generic API,
> > > > > > > so it's a problem that there's no way to retrieve the state.
> > > > > > >
> > > > > > There is no any problem. Source knows what the line status is.
> > > > >
> > > > > With EOIFD and level IRQFD, it does not.
> > > > >
> > > > So this is again eventfd and level interrupts incompatibility problem?
> > >
> > > At some level, yes.
> > >
> > So may be we shouldn't do that especially since you claim migration will
> > not work.
> > > > > > Furthermore this is a (benign) bug if device calls irq_set with
> > > > > > the same level since it results in needless system calls. Qemu guilty
> > > > > > of it and _that_ should be fixed.
> > > > >
> > > > > Fine but we are arguably returning a wrong result in that case:
> > > > > set_irq twice to 0 return 1 each time. I would expect 0 the
> > > > > second time.
> > > > It returns 0 if interrupt was coalesced. It was not.
> > >
> > > Not really, if you call it with level 0 you always get 1 back.
> > > Look at kvm_ioapic_set_irq, see what happens if level is 0.
> > > It looks like a bug though a harmless one.
> > >
> > May be. What kvm_set_irq() return in case of level=0 was never
> > important.
>
> Absolutely. Now it'll be helpful to fix this for the EOI thing
> so that we can avoid signalling userspace in that case.
>
> > > > > > >
> > > > > > > Also migration is only one example. Duplicated state is generally
> > > > > > > nasty. We would need extra locking too which is not nice.
> > > > > > >
> > > > > > I don't know what extra locking you are talking about, but calling
> > > > > > kvm_set_irq() repeatedly with the same level will do a lot of unnecessary
> > > > > > locking in ioapic.
> > > > >
> > > > > I am talking about Alex's EOIFD.
> > > > > This is what this patch is trying
> > > > > to help.
> > > > >
> > > > Can you point me to exact problem in Alex's patch?
> > >
> > > It's very simple. Alex adds an interface to clear the level
> > > automatically from guest on EOI. So the caller has no way to know the
> > > current state for a given source ID and can not restore it after
> > > migration.
> > >
> > Yes, but caller (read device emulation) knows what real state is. The
> > fact that EOI was called does not mean the line is at 0. Device should
> > reevaluate its state and re-trigger the line again if needed.
>
> Sounds reasonable, but let's document this property of level IRQFD.

Yes, the problem isn't the state. The original patch works just fine to
mask and assert the interrupt every time the device signals, and to
de-assert and unmask on every EOI. KVM doesn't need to track this for
migration (not that we support migration, of course); we can always just
send an unmask to the device to retrigger an interrupt if needed.

The thing Michael is trying to avoid is spurious assertions and
de-assertions by tracking the state machine. Spurious assertions are not
really a problem, at least for vfio, where the interrupt is masked until
kvm/qemu tells us to unmask it. So at any point in time we can reset the
state machine with an unmask. Spurious unmasks are theoretically a
problem: if an IRQ is shared among multiple devices, we can trigger
unmasks for devices that haven't asserted it. vfio handles this pretty
well though; it recognizes the device isn't masked and does nothing.

Something I note out of this discussion is that while the spinlock I use
to maintain the state machine is ugly, the lock has no contention. I
don't think that's necessarily the case with pic_lock. Anyway, I think
we can do w/o the spinlock altogether. Lock contention and spurious EOIs
over level-triggered interrupts are probably not worth worrying about.
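
For reference, the return-value semantics debated in this thread can be
sketched as a small userspace model in C. This is only an illustration
under stated assumptions, not the KVM or vfio code: struct irq_line,
model_set_irq(), model_eoi() and unmask_signals are hypothetical names,
and the per-source bitmap is an assumed way of tracking state per source
ID. It shows a set call that returns 0 when the level did not change, so
an EOI handler can skip the userspace notification for a redundant
de-assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: one level-triggered line shared by
 * several sources. Bit n of state is set while source n asserts the line. */
struct irq_line {
    unsigned long state;
};

/* Counts notifications that would be sent to userspace (an assumed stand-in
 * for signalling an eventfd). */
static int unmask_signals;

/* Returns 1 if this call changed the source's level, 0 if it was redundant.
 * This models the semantics proposed in the thread, not what kvm_set_irq()
 * returned at the time. */
static int model_set_irq(struct irq_line *line, int source_id, int level)
{
    assert(source_id >= 0 && source_id < (int)(sizeof(unsigned long) * 8));

    unsigned long bit = 1UL << source_id;
    bool was_set = (line->state & bit) != 0;

    if (level)
        line->state |= bit;
    else
        line->state &= ~bit;

    return was_set != (level != 0);
}

/* On EOI, de-assert the source and notify only if it was really asserted. */
static void model_eoi(struct irq_line *line, int source_id)
{
    if (model_set_irq(line, source_id, 0))
        unmask_signals++;
}
```

With this model, asserting a source and then delivering two EOIs in a row
produces one notification; the second EOI finds the line already
de-asserted and is dropped. Whether that bookkeeping is worth the extra
state is exactly what the thread questions.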