On 11/17/2016 03:45 AM, Alex Williamson wrote: > On Thu, 17 Nov 2016 00:42:48 +0530 > Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote: >> On 11/16/2016 11:18 PM, Alex Williamson wrote: >>> On Wed, 16 Nov 2016 16:14:24 +0530 >>> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote: >>>> On 11/16/2016 3:07 PM, Jike Song wrote: >>>>> On 11/16/2016 05:14 PM, Kirti Wankhede wrote: >>>>>> On 11/16/2016 9:13 AM, Alex Williamson wrote: >>>>>>> On Wed, 16 Nov 2016 11:01:37 +0800 >>>>>>> Jike Song <jike.song@xxxxxxxxx> wrote: >>>>>>>> On 11/16/2016 07:11 AM, Alex Williamson wrote: >>>>>>>>> On Tue, 15 Nov 2016 19:35:46 +0800 >>>>>>>>> Jike Song <jike.song@xxxxxxxxx> wrote: >>>>>>>>> >>>>>>>>>> The user of vfio_register_notifier might care about not only >>>>>>>>>> iommu events but also vfio_group events, so also register the >>>>>>>>>> notifier_block on vfio_group. >>>>>>>>>> >>>>>>>>>> Cc: Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> >>>>>>>>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx> >>>>>>>>>> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx> >>>>>>>>>> Signed-off-by: Jike Song <jike.song@xxxxxxxxx> >>>>>>>>>> --- >>>>>>>>>> drivers/vfio/vfio.c | 4 ++++ >>>>>>>>>> 1 file changed, 4 insertions(+) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c >>>>>>>>>> index b149ced..2c0eedb 100644 >>>>>>>>>> --- a/drivers/vfio/vfio.c >>>>>>>>>> +++ b/drivers/vfio/vfio.c >>>>>>>>>> @@ -2065,6 +2065,8 @@ int vfio_register_notifier(struct device *dev, struct notifier_block *nb) >>>>>>>>>> else >>>>>>>>>> ret = -ENOTTY; >>>>>>>>>> >>>>>>>>>> + vfio_group_register_notifier(group, nb); >>>>>>>>>> + >>>>>>>>>> up_read(&container->group_lock); >>>>>>>>>> vfio_group_try_dissolve_container(group); >>>>>>>>>> >>>>>>>>>> @@ -2102,6 +2104,8 @@ int vfio_unregister_notifier(struct device *dev, struct notifier_block *nb) >>>>>>>>>> else >>>>>>>>>> ret = -ENOTTY; >>>>>>>>>> >>>>>>>>>> + vfio_group_unregister_notifier(group, nb); >>>>>>>>>> + >>>>>>>>>> up_read(&container->group_lock); >>>>>>>>>> vfio_group_try_dissolve_container(group); >>>>>>>>>> >>>>>>>>> >>>>>>>>> You haven't addressed the error paths, if the iommu driver returns >>>>>>>>> error and therefore the {un}register returns error, what is the caller >>>>>>>>> to expect about the group registration? >>>>>>>>> >>>>>>>> >>>>>>>> Will change to: >>>>>>>> >>>>>>>> driver = container->iommu_driver; >>>>>>>> if (likely(driver && driver->ops->register_notifier)) >>>>>>>> ret = driver->ops->register_notifier(container->iommu_data, nb); >>>>>>>> else >>>>>>>> ret = -ENOTTY; >>>>>>>> if (ret) >>>>>>>> goto err_register_iommu; >>>>>>>> >>>>>>>> ret = vfio_group_register_notifier(group, nb); >>>>>>>> if (ret) >>>>>>>> driver->ops->unregister_notifier(container->iommu_data, nb); >>>>>>>> >>>>>>>> err_register_iommu: >>>>>>>> up_read(&container->group_lock); >>>>>>>> vfio_group_try_dissolve_container(group); >>>>>>>> >>>>>>>> err_register_nb: >>>>>>>> vfio_group_put(group); >>>>>>>> return ret; >>>>>>> >>>>>>> >>>>>>> >>>>>>> What if a vendor driver only cares about the kvm state and doesn't pin >>>>>>> memory (ie. no DMA) or only cares about iommu and not group notifies? >>>>>>> If we handled notifier_fn_t 'action' as a bitmask then we could have the >>>>>>> registrar specify which notification they wanted (a mask/filter), so if >>>>>>> they only want KVM, we only send that notify, if they only want UNMAPs, >>>>>>> etc. Then we know whether iommu registration is required. As a bonus, >>>>>>> we could add a pr_info() indicating vendors that ask for KVM >>>>>>> notification so that we can interrogate why they think they need it. >>>>>>> The downside is that handling action as a bitmask means that we limit >>>>>>> the number of actions we have available (32 or 64 bits worth). That >>>>>>> limit is hopefully far enough off to be ok though. Thoughts? Thanks, >>>>>>> >>>>>> >>>>>> As per my understanding, this bitmask is input to >>>>>> vfio_register_notifier() and vfio_unregister_notifier(), right? >>>>>> >>>>>> These functions are not called from vendor driver directly, these are >>>>>> called from vfio_mdev. Then should this bitmask be part of parent_ops >>>>>> that vendor driver can specify? >>>>> >>>>> I think so, there should be a 'notifiler_filter' in parent_ops to indicate >>>>> that. A draft patch to show Alex's proposal: >>>>> >>>> >>>> In that case how can we use bitmask to register multiple actions from >>>> backend iommu module (now we only have VFIO_IOMMU_NOTIFY_DMA_UNMAP >>>> action)? Even if we pass bitmask to backend iommu modules >>>> register_notifier(), backend module would have to create separate >>>> notifier for each action? >>>> >>>> Instead of taking bitmask as input from vendor driver, vendor driver's >>>> callback can send return depending on action if they don't want to get >>>> future notification: >>>> >>>> #define NOTIFY_DONE 0x0000 /* Don't care */ >>>> #define NOTIFY_OK 0x0001 /* Suits me */ >>>> #define NOTIFY_STOP_MASK 0x8000 /* Don't call further */ >>>> #define NOTIFY_BAD (NOTIFY_STOP_MASK|0x0002) >>>> /* Bad/Veto action */ >>>> /* >>>> * Clean way to return from the notifier and stop further calls. >>>> */ >>>> #define NOTIFY_STOP (NOTIFY_OK|NOTIFY_STOP_MASK) >>> >>> I don't think that's a correct interpretation of NOTIFY_STOP_MASK, it >>> intends to say "this notification was for me, I've handled it, no need >>> to continue through the call chain". It does not imply "don't tell me >>> about this event in the future", entries in the call chain don't have >>> the ability to suppress certain events. >>> >> >> OK, thanks for correcting me. >> >>> Also note that NOTIFY_STOP_MASK is returned by >>> blocking_notifier_call_chain(), so we can pr_warn if a call chain >>> member is potentially preventing other notify-ees from receiving the >>> event. >>> >>> Should we be revisiting the question of whether notifiers are registered >>> as part of the mdev-core? In the current proposals >>> vfio_{un}register_notifier() is exported, so it's really just a >>> convenience that the mdev core registers it on behalf of the vendor >>> driver. Vendor drivers could register on open and unregister on >>> release (vfio could additionally do housekeeping on release). We >>> already expect vendor drivers to call vfio_{un}pin_pages() directly. >>> Then the vendor driver would provide filter flags directly and we'd >>> make that part of the mdev-vendor driver API a little more flexible. >> >> Ok. But that still doesn't solve the problem if more actions need to be >> added to backend iommu module. >> >> If vendor driver notifier callback is called, vendor driver can return >> NOTIFY_DONE or NOTIFY_OK for the action they are not interested in. > > Perhaps calling it a filter is not correct, I was thinking that a > vendor driver would register the notifier with a set of required > events. The driver would need to handle/ignore additional events > outside of the required mask. There are certainly some complications > to this model though that I hadn't thought all the way through until > now. For instance what if we add event XYZ in the future and the > vendor driver adds this to their required mask. If we run that on an > older kernel where the vfio infrastructure doesn't know about that > event, the vendor driver needs to fail, or at least know that event is > not supported and retry with a set of supported events. > > There's another problem with my proposal too, we can't put a single > notifier_block on multiple notifier_block heads, that just doesn't > work. So we probably need to separate a group notifier from an iommu > notifier, the vendor driver will need to register for each. > > Maybe we end up with something like: > > int vfio_register_notifier(struct device *dev, > vfio_notify_type_t type, > unsigned long *required_events, > struct notifier_block *nb); > > typedef unsigned short vfio_notify_type_t; > enum vfio_notify_type { > VFIO_IOMMU_NOTIFY = (__force vfio_notify_type_t)0, > VFIO_GROUP_NOTIFY = (__force vfio_notify_type_t)1, > }; > > (stealing this from pci_dev_flags_t, hope it works) > > A VFIO_GROUP_NOTIFY would add the notifier_block to the vfio_group, a > VFIO_IOMMU_NOTIFY would add it to vfio_iommu. Each would have their > own unique set of events and each would compare their supported events > to the required events by the caller. Supported events would be > cleared from the callers required_events arg. If required_events still > has bits set, the notifier_block is not registered, an error is > returned, and the caller can identify the unsupported events by the > remaining bits in the required_events arg. Can that work? Thanks, Let me summarize the discussion: - There should be 2 notifier heads, one in vfio_iommu another in vfio_group; - vfio_{un}register_notifier() has the type specified in parameter - In vfio_register_notifier, maybe: static vfio_iommu_register_notifier() {..} static vfio_group_register_notifier() {..} int vfio_register_notififer(type.. { if (type == VFIO_IOMMU_NOTIFY) return vfio_iommu_register_notifier(); if (type == VFIO_GROUP_NOTIFY) return vfio_group_register_notifier(); } What's more, if we still want registration to be done in mdev framework, we should change parent_ops: - rename 'notifier' to 'iommu_notifier' - add "group_notifier" - Add a group_events and a iommu_events to indicate what events vendor is interested in, respectively or otherwise don't touch it from mdev framework at all? -- Thanks, Jike -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html