On 07/12/2012 08:38 PM, Alex Williamson wrote:
> On Thu, 2012-07-12 at 10:19 -0600, Alex Williamson wrote:
>> On Thu, 2012-07-12 at 12:35 +0300, Avi Kivity wrote:
>> > On 07/11/2012 10:57 PM, Alex Williamson wrote:
>> > >>
>> > >> > We still have classic KVM device assignment to provide fast-path INTx.
>> > >> > But if we want to replace it midterm, I think it's necessary for VFIO to
>> > >> > be able to provide such a path as well.
>> > >>
>> > >> I would like VFIO to have no regressions vs. kvm device assignment,
>> > >> except perhaps in uncommon corner cases.  So I agree.
>> > >
>> > > I ran a few TCP_RR netperf tests forcing a 1Gb tg3 nic to use INTx.
>> > > Without irqchip support vfio gets a bit more than 60% of KVM device
>> > > assignment.  That's a little bit of an unfair comparison since it's more
>> > > than just the I/O path.  With the proposed interfaces here, enabling
>> > > irqchip, vfio is within 10% of KVM device assignment for INTx.  For MSI,
>> > > I can actually make vfio come out more than 30% better than KVM device
>> > > assignment if I send the eventfd from the hard irq handler.  Using a
>> > > threaded handler as the code currently does, vfio is still behind KVM.
>> > > It's hard to beat a direct call chain.
>> >
>> > We can have a direct call chain with vfio too, using a custom eventfd
>> > poll function, no?  Assuming we set up a fast path for unicast msi.
>>
>> You'll have to help me out a little, eventfd_signal walks the wait_queue
>> and calls each function.  On the injection path that includes
>> irqfd_wakeup.  For an MSI that seems to already provide direct
>> injection.  For level we'll schedule_work, so that explains the overhead
>> in that path, but it's not too dissimilar to a threaded irq.  vfio
>> does something very similar, so there's a schedule_work both on inject
>> and on eoi.  I'll have to check whether anything prevents the unmask
>> from the wait_queue function in vfio, that could be a significant chunk
>> of the gap.
>
> Yep, the schedule_work in the eoi is the culprit.  A direct unmask from
> the wait queue function gives me better results than kvm for INTx.
> We'll have to see how the leapfrogging goes once KVM switches to
> injection from the hard handler.  I'm still curious what this custom
> poll function would give us though.  Thanks,
>

btw, why is the overhead so large?  A context switch should be on the
order of 1 microsecond or less.  Given that, every 5000 context switches
per second cost a 1% cpu load on one core.  You would need a very heavy
interrupt load to see a large degradation.  Or is the extra latency the
problem?

-- 
error compiling committee.c: too many arguments to function
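
For reference, the context-switch estimate above is just rate times
per-switch cost.  A minimal sketch of that arithmetic follows; the helper
name and the flat per-switch cost are assumptions made for illustration,
and it ignores secondary costs such as cache and TLB effects.  At exactly
1 microsecond per switch, 5000 switches per second works out to 0.5% of
one core; the 1% figure corresponds to roughly 2 microseconds per switch,
which would fit, for instance, if each event costs a switch to the worker
and a switch back.

```c
/*
 * Hypothetical helper (not from any kernel or library API): percentage of
 * one core consumed by context switches, assuming a flat cost per switch.
 */
static double switch_load_percent(double switches_per_sec, double cost_us)
{
    /* Microseconds spent switching in each second of wall time. */
    double busy_us_per_sec = switches_per_sec * cost_us;

    /* One second is 1e6 microseconds; scale the fraction to a percentage. */
    return busy_us_per_sec / 1e6 * 100.0;
}
```

With these assumptions, switch_load_percent(5000, 1.0) comes to about 0.5
and switch_load_percent(5000, 2.0) to about 1.0, so a large throughput gap
would need either a much higher switch rate or a latency effect.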