On Sun, Jan 02, 2011 at 12:58:50PM +0200, Avi Kivity wrote:
> On 01/02/2011 12:39 PM, Michael S. Tsirkin wrote:
> >> >
> >> >I agree. At least it's not a regression. And in fact we haven't seen any device
> >> >driver use this. I've checked Linux kernel code, found no one used PCI_MSIX_PBA or
> >> >msix_pba_offset_reg().
> >> >
> >> >I guess it's fine to get MSI-X mask part in first, then deal with PBA part if
> >> >necessary - though we haven't seen any driver use it so far. It won't be worse
> >> >with this patch anyway...
> >>
> >> In a way it is worse because before, the fix would belong in user
> >> space, which is easier to test and distribute. Now we have to fix
> >> it in the kernel.
> >>
> >> However I recognize that drivers which rely on the pending bit are
> >> rare/nonexistent (likely on in preboot environments where interrupts
> >> are hard), so even if we do code it, it will likely be incorrect
> >> (certainly without a test).
> >>
> >> So I'll accept the patch without PBA. Michael, what about
> >> supporting virtio? Can we base something on this patch?
> >
> >I don't see how userspace can send interrupts with this
> >interface unfortunately. We also need irqfd support ...
>
> Sure we'll need additions to that interface.

What I suggested is
1. an ioctl to map phy address + size to table id
2. a new gsi type with a table id + entry number.

If we have that, assigned devices, virtio and vhost-net can work
mostly as is, with just the mask bits accelerated.

> What about vhost-net and vfio? I thought that they could emulate
> the mask bits:
>
> - KVM_MMIOFD(vmfd, mmio_range, fd1, fd2) associates an mmio range with an fd
> - writel(mmio_range) or readl(mmio_range) from the guest causes a
> command to be written to fd1
> - for readl(), read from fd2 to see the result (works nicely for
> "pci read flushes posted writes")
>
> this allows interesting stuff to be implemented in separate
> processes, threads, or kernel modules.

This could work.
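
To make the fd1/fd2 flow quoted above a bit more concrete, here is a
rough userspace-only sketch that uses ordinary pipes in place of the
real fds. Everything in it is an assumption for illustration:
KVM_MMIOFD is only a proposal in this thread, and struct mmiofd_cmd,
the mmiofd_* helpers and the command layout are made-up names. Only the
MSI-X table layout (16-byte entries, vector control dword at offset 12,
mask in bit 0) is taken from the PCI spec.

```c
#include <stdint.h>
#include <unistd.h>

#define MSIX_ENTRIES     4     /* arbitrary table size for the sketch */
#define MSIX_ENTRY_SIZE  16    /* bytes per MSI-X table entry */
#define MSIX_VECTOR_CTRL 12    /* offset of the vector control dword */
#define MSIX_CTRL_MASK   0x1u  /* bit 0: per-vector mask */

/* Hypothetical command format written to fd1 on each guest access. */
enum mmiofd_op { MMIOFD_READ, MMIOFD_WRITE };

struct mmiofd_cmd {
    uint32_t op;      /* MMIOFD_READ or MMIOFD_WRITE */
    uint32_t data;    /* value for writes, ignored for reads */
    uint64_t offset;  /* byte offset inside the mmio range */
};

/* Transfer exactly n bytes or fail; good enough for small pipe messages. */
static int put(int fd, const void *buf, size_t n)
{
    return write(fd, buf, n) == (ssize_t)n ? 0 : -1;
}

static int get(int fd, void *buf, size_t n)
{
    return read(fd, buf, n) == (ssize_t)n ? 0 : -1;
}

/* Guest writel(): post a command on fd1 and return without waiting. */
static int mmiofd_writel(int fd1, uint64_t offset, uint32_t val)
{
    struct mmiofd_cmd cmd = { .op = MMIOFD_WRITE, .data = val, .offset = offset };
    return put(fd1, &cmd, sizeof(cmd));
}

/* Guest readl(), first half: post the read command on fd1. */
static int mmiofd_post_readl(int fd1, uint64_t offset)
{
    struct mmiofd_cmd cmd = { .op = MMIOFD_READ, .data = 0, .offset = offset };
    return put(fd1, &cmd, sizeof(cmd));
}

/* Guest readl(), second half: collect the result from fd2. */
static int mmiofd_finish_readl(int fd2, uint32_t *val)
{
    return get(fd2, val, sizeof(*val));
}

/* Backend: consume one command from fd1, answer reads on fd2. */
static int mmiofd_service_one(int cmd_fd, int res_fd, uint32_t *table)
{
    struct mmiofd_cmd cmd;
    uint32_t val;

    if (get(cmd_fd, &cmd, sizeof(cmd)) < 0)
        return -1;
    /* Only aligned dword accesses inside the emulated table are accepted. */
    if (cmd.offset % 4 || cmd.offset >= MSIX_ENTRIES * MSIX_ENTRY_SIZE)
        return -1;
    if (cmd.op == MMIOFD_WRITE) {
        table[cmd.offset / 4] = cmd.data;
        return 0;
    }
    if (cmd.op == MMIOFD_READ) {
        val = table[cmd.offset / 4];
        return put(res_fd, &val, sizeof(val));
    }
    return -1;
}
```

Since commands are ordered on fd1, a read posted after a write is
serviced after it, which is where the "read flushes posted writes"
property would come from. For example, posting a write of
MSIX_CTRL_MASK to entry 1's vector control (offset
1 * MSIX_ENTRY_SIZE + MSIX_VECTOR_CTRL), then a read of the same
offset, and calling mmiofd_service_one() twice should hand the mask bit
back through mmiofd_finish_readl().
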
Some thought needs to be given to how we make sure that an
appropriate type of file is passed in. Maybe using a netlink-based
connector for this is a good idea?

OTOH if we have MSI-X mask bit emulation in kvm anyway, using it makes
sense ...

> --
> error compiling committee.c: too many arguments to function