Hi Alex, > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > Sent: Thursday, April 16, 2020 10:41 PM > To: Liu, Yi L <yi.l.liu@xxxxxxxxx> > Subject: Re: [PATCH v1 7/8] vfio/type1: Add VFIO_IOMMU_CACHE_INVALIDATE > > On Thu, 16 Apr 2020 10:40:03 +0000 > "Liu, Yi L" <yi.l.liu@xxxxxxxxx> wrote: > > > Hi Alex, > > Still have a direction question with you. Better get agreement with you > > before heading forward. > > > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx> > > > Sent: Friday, April 3, 2020 11:35 PM > > [...] > > > > > > + * > > > > > > + * returns: 0 on success, -errno on failure. > > > > > > + */ > > > > > > +struct vfio_iommu_type1_cache_invalidate { > > > > > > + __u32 argsz; > > > > > > + __u32 flags; > > > > > > + struct iommu_cache_invalidate_info cache_info; > > > > > > +}; > > > > > > +#define VFIO_IOMMU_CACHE_INVALIDATE _IO(VFIO_TYPE, > > > VFIO_BASE > > > > > + 24) > > > > > > > > > > The future extension capabilities of this ioctl worry me, I wonder if > > > > > we should do another data[] with flag defining that data as CACHE_INFO. > > > > > > > > Can you elaborate? Does it mean with this way we don't rely on iommu > > > > driver to provide version_to_size conversion and instead we just pass > > > > data[] to iommu driver for further audit? > > > > > > No, my concern is that this ioctl has a single function, strictly tied > > > to the iommu uapi. If we replace cache_info with data[] then we can > > > define a flag to specify that data[] is struct > > > iommu_cache_invalidate_info, and if we need to, a different flag to > > > identify data[] as something else. For example if we get stuck > > > expanding cache_info to meet new demands and develop a new uapi to > > > solve that, how would we expand this ioctl to support it rather than > > > also create a new ioctl? There's also a trade-off in making the ioctl > > > usage more difficult for the user. I'd still expect the vfio layer to > > > check the flag and interpret data[] as indicated by the flag rather > > > than just passing a blob of opaque data to the iommu layer though. > > > Thanks, > > > > Based on your comments about defining a single ioctl and a unified > > vfio structure (with a @data[] field) for pasid_alloc/free, bind/ > > unbind_gpasid, cache_inv. After some offline trying, I think it would > > be good for bind/unbind_gpasid and cache_inv as both of them use the > > iommu uapi definition. While the pasid alloc/free operation doesn't. > > It would be weird to put all of them together. So pasid alloc/free > > may have a separate ioctl. It would look as below. Does this direction > > look good per your opinion? > > > > ioctl #22: VFIO_IOMMU_PASID_REQUEST > > /** > > * @pasid: used to return the pasid alloc result when flags == ALLOC_PASID > > * specify a pasid to be freed when flags == FREE_PASID > > * @range: specify the allocation range when flags == ALLOC_PASID > > */ > > struct vfio_iommu_pasid_request { > > __u32 argsz; > > #define VFIO_IOMMU_ALLOC_PASID (1 << 0) > > #define VFIO_IOMMU_FREE_PASID (1 << 1) > > __u32 flags; > > __u32 pasid; > > struct { > > __u32 min; > > __u32 max; > > } range; > > }; > > Can't the ioctl return the pasid valid on alloc (like GET_DEVICE_FD)? Yep, I think you mentioned before. At that time, I believed it would be better to return the result via a __u32 buffer so that make full use of the 32 bits. But looks like it doesn't make much difference. I'll follow your suggestion. > Would it be useful to support freeing a range of pasids? If so then we > could simply use range for both, ie. allocate a pasid from this range > and return it, or free all pasids in this range? vfio already needs to > track pasids to free them on release, so presumably this is something > we could support easily. yes, I think it is a nice thing. then I can remove the @pasid field. will do it. > > ioctl #23: VFIO_IOMMU_NESTING_OP > > struct vfio_iommu_type1_nesting_op { > > __u32 argsz; > > __u32 flags; > > __u32 op; > > __u8 data[]; > > }; > > data only has 4-byte alignment, I think we really want it at an 8-byte > alignment. This is why I embedded the "op" into the flag for > DEVICE_FEATURE. Thanks, got it. I may also merge the op into flags (maybe the lower 16 bits for op). Thanks, Yi Liu > Alex > > > > > /* Nesting Ops */ > > #define VFIO_IOMMU_NESTING_OP_BIND_PGTBL 0 > > #define VFIO_IOMMU_NESTING_OP_UNBIND_PGTBL 1 > > #define VFIO_IOMMU_NESTING_OP_CACHE_INVLD 2 > > > > Thanks, > > Yi Liu > >