Hi Jean,

On 2021/1/15 0:41, Jean-Philippe Brucker wrote:
> I guess detailing what's needed for nested IOPF can help the discussion, although I haven't seen any concrete plan about implementing it, and it still seems a couple of years away. There are two important steps with nested IOPF:
>
> (1) Figuring out whether a fault comes from L1 or L2. A SMMU stall event comes with this information, but a PRI page request doesn't. The IOMMU driver has to first translate the IOVA to a GPA, injecting the fault into the guest if this translation fails by using the usual iommu_report_device_fault().
>
> (2) Translating the faulting GPA to a HVA that can be fed to handle_mm_fault(). That requires help from KVM, so another interface - either KVM registering GPA->HVA translation tables or IOMMU driver querying each translation. Either way it should be reusable by device drivers that implement IOPF themselves.
>
> (1) could be enabled with iommu_dev_enable_feature(). (2) requires a more complex interface. (2) alone might also be desirable - demand-paging for level 2 only, no SVA for level 1.
>
> Anyway, back to this patch. What I'm trying to convey is "can the IOMMU receive incoming I/O page faults for this device and, when SVA is enabled, feed them to the mm subsystem? Enable that or return an error." I'm stuck on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1 is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2) above. IOMMU_DEV_FEAT_IOPF_FLAT?
>
> That leaves space for the nested extensions. (1) above could be IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with KVM (or just an external fault handler) and could be used with either IOPF_FLAT or IOPF_NESTED. We can figure out the details later. What do you think?
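To make sure I read steps (1) and (2) the same way, here is a rough userspace sketch of the decision flow. It is only a model under my own assumptions, not kernel code: the types, the page-frame lookup tables and classify_fault() are all hypothetical, and in a real driver the first branch would end in iommu_report_device_fault() while the last would end in handle_mm_fault().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes for one nested I/O page fault (not a kernel API). */
enum fault_outcome {
	FAULT_INJECT_GUEST,	/* step (1): IOVA->GPA failed, guest must handle it */
	FAULT_HANDLE_HOST,	/* step (2): GPA->HVA resolved, host can page it in */
	FAULT_NO_HOST_MAPPING,	/* GPA is not backed by any host mapping */
};

/* Hypothetical one-page translation entry, keyed by page frame number. */
struct map_entry {
	uint64_t from_pfn;
	uint64_t to_pfn;
};

/* Linear lookup in a toy translation table; returns true on a hit. */
static bool lookup(const struct map_entry *tbl, size_t n,
		   uint64_t pfn, uint64_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].from_pfn == pfn) {
			*out = tbl[i].to_pfn;
			return true;
		}
	}
	return false;
}

/*
 * Model of the PRI case, where the request does not say whether the
 * fault came from L1 or L2: walk L1 first, then ask for GPA->HVA.
 */
static enum fault_outcome classify_fault(const struct map_entry *l1, size_t n1,
					 const struct map_entry *l2, size_t n2,
					 uint64_t iova_pfn, uint64_t *hva_pfn)
{
	uint64_t gpa_pfn;

	assert(hva_pfn != NULL);

	/* Step (1): IOVA -> GPA. A miss means the fault belongs to the guest. */
	if (!lookup(l1, n1, iova_pfn, &gpa_pfn))
		return FAULT_INJECT_GUEST;

	/* Step (2): GPA -> HVA, standing in for the KVM-provided translation. */
	if (!lookup(l2, n2, gpa_pfn, hva_pfn))
		return FAULT_NO_HOST_MAPPING;

	return FAULT_HANDLE_HOST;
}
```

With an L1 table mapping IOVA frame 0x10 to GPA frame 0x100 and an L2 table mapping 0x100 to HVA frame 0x7000, a fault on 0x10 ends up on the host side, while a fault on an IOVA frame that L1 does not map goes back to the guest.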
I agree that we can define IOPF_ for current usage and leave space for future extensions.

IOPF_FLAT represents IOPF on first-level translation. Currently, first-level translation can be used in the following cases:

1) FL w/ internal page table: kernel IOVA;
2) FL w/ external page table: VFIO passthrough;
3) FL w/ shared CPU page table: SVA.

We don't need to support IOPF for case 1), so let's put it aside. IOPF handling for 2) and 3) is different. Do we need to define different names to distinguish these two cases?

Best regards,
baolu