Re: [PATCH v2 4/4] vfio/quirks: Enable ioeventfd quirks to be handled by vfio directly

Auger Eric <eric.auger@xxxxxxxxxx> · Fri, 4 May 2018 09:38:40 +0200

Hi Alex,

On 05/03/2018 06:29 PM, Alex Williamson wrote:
> On Thu, 3 May 2018 12:56:03 +0800
> Peter Xu <peterx@xxxxxxxxxx> wrote:
> 
>> On Tue, May 01, 2018 at 10:43:46AM -0600, Alex Williamson wrote:
>>
>> [...]
>>
>>> -static void vfio_ioeventfd_exit(VFIOIOEventFD *ioeventfd)
>>> +static void vfio_ioeventfd_exit(VFIOPCIDevice *vdev, VFIOIOEventFD *ioeventfd)
>>>  {
>>>      QLIST_REMOVE(ioeventfd, next);
>>> +
>>>      memory_region_del_eventfd(ioeventfd->mr, ioeventfd->addr, ioeventfd->size,
>>>                                ioeventfd->match_data, ioeventfd->data,
>>>                                &ioeventfd->e);
>>> -    qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e), NULL, NULL, NULL);
>>> +
>>> +    if (ioeventfd->vfio) {
>>> +        struct vfio_device_ioeventfd vfio_ioeventfd;
>>> +
>>> +        vfio_ioeventfd.argsz = sizeof(vfio_ioeventfd);
>>> +        vfio_ioeventfd.flags = ioeventfd->size;
>>> +        vfio_ioeventfd.data = ioeventfd->data;
>>> +        vfio_ioeventfd.offset = ioeventfd->region->fd_offset +
>>> +                                ioeventfd->region_addr;
>>> +        vfio_ioeventfd.fd = -1;
>>> +
>>> +        ioctl(vdev->vbasedev.fd, VFIO_DEVICE_IOEVENTFD, &vfio_ioeventfd);  
>>
>> (If the series is going to respin, I am thinking whether it would
>>  worth it to capture this error to dump something.  But it's for sure
>>  optional since even error happened we should have something in dmesg
>>  so it only matters on whether we also want something to be dumped
>>  from QEMU side too... After all there aren't much we can do)
> 
> I'm torn whether to use QEMU standard error handling here, ie.
> abort().  If we failed to remove the KVM ioeventfd, we'd abort before
> we get here, so there's no chance that the vfio ioeventfd will continue
> to be triggered.  Obviously leaving a vfio ioeventfd that we can't
> trigger and might interfere with future ioeventfds is not good, but do
> we really want to kill the VM because we possibly can't add an
> accelerator here later?  I'm inclined to say no, so I think I'll just
> error_report() unless there are objections.
> 
>>> +
>>> +    } else {
>>> +        qemu_set_fd_handler(event_notifier_get_fd(&ioeventfd->e),
>>> +                            NULL, NULL, NULL);
>>> +    }
>>> +
>>>      event_notifier_cleanup(&ioeventfd->e);
>>>      trace_vfio_ioeventfd_exit(memory_region_name(ioeventfd->mr),
>>>                                (uint64_t)ioeventfd->addr, ioeventfd->size,  
>>
>> [...]
>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index ba1239551115..84e27c7bb2d1 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -3177,6 +3177,8 @@ static Property vfio_pci_dev_properties[] = {
>>>                       no_geforce_quirks, false),
>>>      DEFINE_PROP_BOOL("x-no-kvm-ioeventfd", VFIOPCIDevice, no_kvm_ioeventfd,
>>>                       false),
>>> +    DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd,
>>> +                     false),  
>>
>> Here it looks more like a tri-state for me, so we can either:
>>
>> - disable the acceleration in general, or
>> - enable QEMU-side acceleration only, or
>> - enable kernel-side acceleration
> 
> So you're looking for a Auto/Off/KVM-only option?  Do you really think
> it's worth defining a new tristate property for this sort of debugging
> option...
> 
>> In other words, IIUC x-no-vfio-ioeventfd won't matter much if
>> x-no-kvm-ioeventfd is already set.  So not sure whether a single
>> parameter would be nicer.
> 
> That's correct, but who do we expect to be using this option and why?
> I added enum OffAutoPCIBAR and the property to enable it for MSI-x
> relocation because it is an option that a normal user might reasonably
> need to use, given the right hardware on the right host, but it's an
> unsupported option because we cannot programatically validate it.
> Support rests with the individual user, if it doesn't work, don't use
> it, if it helps, great.  Here we have options that are really only for
> debugging, to test whether something has gone wrong in this code,
> disable this bypass to make all device interactions visible through
> QEMU, or specifically to evaluate the performance of this path.  Is it
> reasonable to impose yet another property type on the common code for
> this use case when a couple bools work just fine, if perhaps not
> absolutely ideal?  Am I overlooking an existing tri-state that might be
> a reasonable match?
> 
> Tying Eric's thread so we can discuss this in one place:
> 
> On Thu, 3 May 2018 17:20:18 +0200
> Auger Eric <eric.auger@xxxxxxxxxx> wrote:
>> I tend to agree with Peter about the 2 options. Only the KVM
>> acceleration brings benefit here?
> 
> Not at all, KVM acceleration does half the job (perhaps more than
> half), but if we want to completely bypass the userspace exit then we
> need both KVM triggering the ioeventfd and vfio consuming it.  It's
> useful for me at least to be able to compare completely disabled vs
> just KVM vs also adding vfio.  As above, I don't expect these to be
> used for anything other than debugging and performance evaluation, so I
> don't think it's worth making non-trivial code changes to enable the
> vfio-only path, nor do I think it's worthwhile to define a new
> tri-state property to resolve the slight mismatch we have vs two
> bools.  Again, let me know if you think I'm missing an existing
> property or if there are trivial changes we could make to make this map
> to an existing property.  Thanks,

Yes sorry I was not clear enough. I understood actual acceleration was
due to the fact the ioeventfd is handled on kernel side and not on user
side. So I thought we generally wanted both or nothing, hence a single
option. Nevertheless I am fine as well to leave the 2 debug options.

Thanks

Eric
> 
> Alex
>