Re: [RFC PATCHES 00/17] IOMMUFD: Deliver IO page faults to user space

Nicolin Chen <nicolinc@xxxxxxxxxx> · Sun, 25 Jun 2023 12:21:28 -0700

On Sun, Jun 25, 2023 at 02:30:46PM +0800, Baolu Lu wrote:
> On 2023/5/31 2:50, Nicolin Chen wrote:
> > Hi Baolu,
> > 
> > On Tue, May 30, 2023 at 01:37:07PM +0800, Lu Baolu wrote:
> > 
> > > This series implements the functionality of delivering IO page faults to
> > > user space through the IOMMUFD framework. The use case is nested
> > > translation, where modern IOMMU hardware supports two-stage translation
> > > tables. The second-stage translation table is managed by the host VMM
> > > while the first-stage translation table is owned by the user space.
> > > Hence, any IO page fault that occurs on the first-stage page table
> > > should be delivered to the user space and handled there. The user space
> > > > should return the page fault handling result to the device top-down
> > > > through the IOMMUFD response uAPI.
> > > 
> > > > User space indicates its capability of handling IO page faults by setting
> > > > a user HWPT allocation flag IOMMU_HWPT_ALLOC_FLAGS_IOPF_CAPABLE. IOMMUFD
> > > > will then set up its infrastructure for page fault delivery. Together
> > > with the iopf-capable flag, user space should also provide an eventfd
> > > where it will listen on any down-top page fault messages.
> > > 
> > > On a successful return of the allocation of iopf-capable HWPT, a fault
> > > fd will be returned. User space can open and read fault messages from it
> > > once the eventfd is signaled.
> > 
> > I think that, whether the guest has an IOPF capability or not,
> > the host should always forward any stage-1 fault/error back to
> > the guest. Yet, the implementation of this series builds with
> > the IOPF framework that doesn't report IOMMU_FAULT_DMA_UNRECOV.
> > 
> > > And I have my doubts about using the IOPF framework with that
> > IOMMU_PAGE_RESP_ASYNC flag: using the IOPF framework is for
> > its bottom half workqueue, because a page response could take
> > a long cycle. But adding that flag feels like we don't really
> > need the bottom half workqueue, i.e. losing the point of using
> > the IOPF framework, IMHO.
> > 
> > Combining the two facts above, I wonder if we really need to
> > go through the IOPF framework; can't we just register a user
> > fault handler in the iommufd directly upon a valid event_fd?
> 
> Agreed. We should avoid workqueue in sva iopf framework. Perhaps we
> could go ahead with below code? It will be registered to device with
> iommu_register_device_fault_handler() in IOMMU_DEV_FEAT_IOPF enabling
> path. Un-registering in the disable path of course.

Well, for a virtualization use case, I still think it should
be registered in iommufd. Even with a device that has no IOPF/PRI
capability, a guest OS should still receive faults if that
device causes a translation failure.
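
To illustrate the idea, here is a rough user-space mock-up, not
kernel code. Every mock_* name below is hypothetical and only
stands in for the real iommufd/IOMMU types. The point it tries to
show is that a handler registered per HWPT on a valid event_fd
could forward both unrecoverable faults and page requests, without
depending on the device having IOPF enabled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel fault types (not the real uAPI). */
enum mock_fault_type {
        MOCK_FAULT_DMA_UNRECOV = 1,
        MOCK_FAULT_PAGE_REQ    = 2,
};

struct mock_fault {
        enum mock_fault_type type;
        unsigned int pasid;
};

/* Hypothetical per-HWPT state; event_fd < 0 means user space gave none. */
struct mock_hwpt {
        int event_fd;
        unsigned int nr_unrecov;   /* unrecoverable faults forwarded */
        unsigned int nr_page_req;  /* page requests forwarded */
};

/*
 * Assumed shape of a fault handler owned by iommufd: it only checks
 * that user space supplied an eventfd, then forwards every stage-1
 * fault type instead of filtering on page requests alone.
 */
static int mock_iommufd_fault_handler(const struct mock_fault *fault,
                                      void *cookie)
{
        struct mock_hwpt *hwpt = cookie;

        if (!fault || !hwpt || hwpt->event_fd < 0)
                return -ENODEV;

        switch (fault->type) {
        case MOCK_FAULT_DMA_UNRECOV:
                hwpt->nr_unrecov++;     /* would be queued and signaled */
                return 0;
        case MOCK_FAULT_PAGE_REQ:
                hwpt->nr_page_req++;    /* would be queued and signaled */
                return 0;
        default:
                return -EOPNOTSUPP;
        }
}
```

In this sketch a device without PRI still gets its translation
failures reported, since nothing in the handler depends on the
page-request path.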

And for a vSVA use case, the IOMMU_DEV_FEAT_IOPF feature only
gets enabled in the guest VM, right? How could the host enable
IOMMU_DEV_FEAT_IOPF to trigger this handler?

Thanks
Nic

> static int io_pgfault_handler(struct iommu_fault *fault, void *cookie)
> {
>         ioasid_t pasid = fault->prm.pasid;
>         struct device *dev = cookie;
>         struct iommu_domain *domain;
> 
>         if (fault->type != IOMMU_FAULT_PAGE_REQ)
>                 return -EOPNOTSUPP;
> 
>         if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
>                 domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
>         else
>                 domain = iommu_get_domain_for_dev(dev);
> 
>         if (!domain || !domain->iopf_handler)
>                 return -ENODEV;
> 
>         if (domain->type == IOMMU_DOMAIN_SVA)
>                 return iommu_queue_iopf(fault, cookie);
> 
>         return domain->iopf_handler(fault, dev, domain->fault_data);
> }
> 
> Best regards,
> baolu