Hi Alex, On 05/03/2018 06:29 PM, Alex Williamson wrote: > On Thu, 3 May 2018 12:56:03 +0800 > Peter Xu <peterx@xxxxxxxxxx> wrote: > >> On Tue, May 01, 2018 at 10:43:46AM -0600, Alex Williamson wrote: >> >> [...] >> >>> -static void vfio_ioeventfd_exit(VFIOIOEventFD *ioeventfd) >>> +static void vfio_ioeventfd_exit(VFIOPCIDevice *vdev, VFIOIOEventFD *ioeventfd) >>> { >>> QLIST_REMOVE(ioeventfd, next); >>> + >>> memory_region_del_eventfd(ioeventfd->mr, ioeventfd->addr, ioeventfd->size, >>> ioeventfd->match_data, ioeventfd->data, >>> &ioeventfd->e); >>> - qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e), NULL, NULL, NULL); >>> + >>> + if (ioeventfd->vfio) { >>> + struct vfio_device_ioeventfd vfio_ioeventfd; >>> + >>> + vfio_ioeventfd.argsz = sizeof(vfio_ioeventfd); >>> + vfio_ioeventfd.flags = ioeventfd->size; >>> + vfio_ioeventfd.data = ioeventfd->data; >>> + vfio_ioeventfd.offset = ioeventfd->region->fd_offset + >>> + ioeventfd->region_addr; >>> + vfio_ioeventfd.fd = -1; >>> + >>> + ioctl(vdev->vbasedev.fd, VFIO_DEVICE_IOEVENTFD, &vfio_ioeventfd); >> >> (If the series is going to respin, I am thinking whether it would >> worth it to capture this error to dump something. But it's for sure >> optional since even error happened we should have something in dmesg >> so it only matters on whether we also want something to be dumped >> from QEMU side too... After all there aren't much we can do) > > I'm torn whether to use QEMU standard error handling here, ie. > abort(). If we failed to remove the KVM ioeventfd, we'd abort before > we get here, so there's no chance that the vfio ioeventfd will continue > to be triggered. Obviously leaving a vfio ioeventfd that we can't > trigger and might interfere with future ioeventfds is not good, but do > we really want to kill the VM because we possibly can't add an > accelerator here later? I'm inclined to say no, so I think I'll just > error_report() unless there are objections. > >>> + >>> + } else { >>> + qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e), >>> + NULL, NULL, NULL); >>> + } >>> + >>> event_notifier_cleanup(&ioeventfd->e); >>> trace_vfio_ioeventfd_exit(memory_region_name(ioeventfd->mr), >>> (uint64_t)ioeventfd->addr, ioeventfd->size, >> >> [...] >> >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>> index ba1239551115..84e27c7bb2d1 100644 >>> --- a/hw/vfio/pci.c >>> +++ b/hw/vfio/pci.c >>> @@ -3177,6 +3177,8 @@ static Property vfio_pci_dev_properties[] = { >>> no_geforce_quirks, false), >>> DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd, >>> false), >>> + DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd, >>> + false), >> >> Here it looks more like a tri-state for me, so we can either: >> >> - disable the acceleration in general, or >> - enable QEMU-side acceleration only, or >> - enable kernel-side acceleration > > So you're looking for a Auto/Off/KVM-only option? Do you really think > it's worth defining a new tristate property for this sort of debugging > option... > >> In other words, IIUC x-no-vfio-ioeventfd won't matter much if >> x-no-kvm-ioeventfd is already set. So not sure whether a single >> parameter would be nicer. > > That's correct, but who do we expect to be using this option and why? > I added enum OffAutoPCIBAR and the property to enable it for MSI-x > relocation because it is an option that a normal user might reasonably > need to use, given the right hardware on the right host, but it's an > unsupported option because we cannot programatically validate it. > Support rests with the individual user, if it doesn't work, don't use > it, if it helps, great. Here we have options that are really only for > debugging, to test whether something has gone wrong in this code, > disable this bypass to make all device interactions visible through > QEMU, or specifically to evaluate the performance of this path. Is it > reasonable to impose yet another property type on the common code for > this use case when a couple bools work just fine, if perhaps not > absolutely ideal? Am I overlooking an existing tri-state that might be > a reasonable match? > > Tying Eric's thread so we can discuss this in one place: > > On Thu, 3 May 2018 17:20:18 +0200 > Auger Eric <eric.auger@xxxxxxxxxx> wrote: >> I tend to agree with Peter about the 2 options. Only the KVM >> acceleration brings benefit here? > > Not at all, KVM acceleration does half the job (perhaps more than > half), but if we want to completely bypass the userspace exit then we > need both KVM triggering the ioeventfd and vfio consuming it. It's > useful for me at least to be able to compare completely disabled vs > just KVM vs also adding vfio. As above, I don't expect these to be > used for anything other than debugging and performance evaluation, so I > don't think it's worth making non-trivial code changes to enable the > vfio-only path, nor do I think it's worthwhile to define a new > tri-state property to resolve the slight mismatch we have vs two > bools. Again, let me know if you think I'm missing an existing > property or if there are trivial changes we could make to make this map > to an existing property. Thanks, Yes sorry I was not clear enough. I understood actual acceleration was due to the fact the ioeventfd is handled on kernel side and not on user side. So I thought we generally wanted both or nothing, hence a single option. Nevertheless I am fine as well to leave the 2 debug options. Thanks Eric > > Alex >