On 5/11/22 02:10, Tian, Kevin wrote: >> From: Jason Gunthorpe <jgg@xxxxxxxxxx> >> Sent: Tuesday, May 10, 2022 9:47 PM >> >> On Tue, May 10, 2022 at 01:38:26AM +0000, Tian, Kevin wrote: >> >>>> However, tt costs nothing to have dirty tracking as long as all iommus >>>> support it in the system - which seems to be the normal case today. >>>> >>>> We should just always turn it on at this point. >>> >>> Then still need a way to report " all iommus support it in the system" >>> to userspace since many old systems don't support it at all. >> >> Userspace can query the iommu_domain directly, or 'try and fail' to >> turn on tracking. >> >> A device capability flag is useless without a control knob to request >> a domain is created with tracking, and we don't have that, or a reason >> to add that. >> > > I'm getting confused on your last comment. A capability flag has to > accompany with a control knob which iiuc is what you advocated > in earlier discussion i.e. specifying the tracking property when creating > the domain. In this case the flag assists the userspace in deciding > whether to set the property. > > Not sure whether we argued pass each other but here is another attempt. > > In general I saw three options here: > > a) 'try and fail' when creating the domain. It succeeds only when > all iommus support tracking; > > b) capability reported on iommu domain. The capability is reported true > only when all iommus support tracking. This allows domain property > to be set after domain is created. But there is no much gain of doing > so when comparing to a). > > c) capability reported on device. future compatible for heterogenous > platform. domain property is specified at domain creation and domains > can have different properties based on tracking capability of attached > devices. > > I'm inclined to c) as it is more aligned to Robin's cleanup effort on > iommu_capable() and iommu_present() in the iommu layer which > moves away from global manner to per-device style. Along with > that direction I guess we want to discourage adding more APIs > assuming 'all iommus supporting certain capability' thing? > Not sure where we are left off on this one, so hopefully just for my own clarification on what we see is the path forward. I have a tiny inclination towards option b) because VMMs with IOMMU dirty tracking only care about what an IOMMU domain (its set of devices) can do. Like migration shouldn't even be attempted if one of the devices in the IOMMU domain don't support it. Albeit, it seems we will need something like c) for other usecases that depend on the PCIe endpoint support (like PRS) a) is what we have in the RFC and has the same context as b) with b) having an explicit query support API rather than implicit failure if one of the devices in the iommu_domain doesn't support it. Here's an interface sketch for b) and c). Kevin seems to be inclined into c); how about you Jason? For b): + +/** + * enum iommufd_dirty_status_flags - Flags for dirty tracking status + */ +enum iommufd_dirty_status_flags { + IOMMU_DIRTY_TRACKING_DISABLED = 0, + IOMMU_DIRTY_TRACKING_ENABLED = 1 << 0, + IOMMU_DIRTY_TRACKING_SUPPORTED = 1 << 1, + IOMMU_DIRTY_TRACKING_UNSUPPORTED = 1 << 2, +}; + +/** + * struct iommu_hwpt_get_dirty - ioctl(IOMMU_HWPT_GET_DIRTY) + * @size: sizeof(struct iommu_hwptgset_dirty) + * @hwpt_id: HW pagetable ID that represents the IOMMU domain. + * @out_status: status of dirty tracking support (see iommu_dirty_status_flags) + * + * Get dirty tracking status on an HW pagetable. + */ +struct iommu_hwpt_get_dirty { + __u32 size; + __u32 hwpt_id; + __u16 out_status; + __u16 __reserved; +}; +#define IOMMU_HWPT_GET_DIRTY _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_GET_DIRTY) The IOMMU implementation tells if it's enabled/disabled or supported/unsupported for the set of devices in the iommu domain. After we set dirty we are supposed to fail device attach for any potential IOMMU not supporting dirty tracking. This is anyways supposed to happen regardless of any of the approaches. For c): + +/** + * enum iommufd_device_caps + * @IOMMU_CAP_DIRTY_TRACKING: IOMMU device support for dirty tracking + */ +enum iommufd_device_caps { + IOMMUFD_CAP_DIRTY_TRACKING = 1 << 0, +}; + +/* + * struct iommu_device_caps - ioctl(IOMMU_DEVICE_GET_CAPS) + * @size: sizeof(struct iommu_device_caps) + * @dev_id: the device to query + * @caps: IOMMU capabilities of the device + */ +struct iommu_device_caps { + __u32 size; + __u32 dev_id; + __aligned_u64 caps; +}; +#define IOMMU_DEVICE_GET_CAPS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_GET_CAPS) Returns a hardware-agnostic view of IOMMU 'capabilities' of the device. @dev_id is supposed to be an iommufd_device object id. VMM is supposed to store and iterate dev_id and check every one of them for dirty tracking support prior to set_dirty.