On 12/11/23 11:24 PM, Jason Gunthorpe wrote:
Also iopf_queue_remove_device() is messed up - it returns an error code but nothing ever does anything with it :( Remove functions like this should never fail.
Yes, agreed.
Removal should be like I explained earlier:
- Disable new PRI reception
This could be done by rcu_assign_pointer(param->fault_param, NULL); ?
- Ack all outstanding PRQ to the device
All outstanding page requests are responded to with IOMMU_PAGE_RESP_INVALID, indicating that the device should not attempt any retry.
- Disable PRI on the device
- Tear down the iopf infrastructure

So under this model if the iopf_queue_remove_device() has been called it should be sort of a 'disassociate' action where fault_param is still floating out there but iommu_page_response() does nothing.
Yes. All pending requests have been auto-responded.
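
To make the ordering concrete, the removal steps discussed above can be sketched as a small userspace model. This is not the kernel implementation: struct device_model, struct fault_param_model, report_fault() and remove_device() are hypothetical names invented for illustration, a plain pointer store stands in for rcu_assign_pointer(), and RESP_INVALID stands in for IOMMU_PAGE_RESP_INVALID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical stand-in for the IOMMU_PAGE_RESP_* codes. */
enum resp_code { RESP_SUCCESS, RESP_INVALID };

/* Hypothetical model of the per-device fault state. */
struct fault_param_model {
	int pending[MAX_PENDING];   /* IDs of page requests not yet answered */
	size_t npending;
	int invalid_responses;      /* requests answered with RESP_INVALID */
	enum resp_code last_resp;
};

struct device_model {
	struct fault_param_model *fault_param; /* NULL: new faults are refused */
	bool pri_enabled;
};

/* Report path: refuses the fault once fault_param has been unpublished. */
static bool report_fault(struct device_model *dev, int req_id)
{
	struct fault_param_model *fp = dev->fault_param;

	if (!fp || fp->npending == MAX_PENDING)
		return false;
	fp->pending[fp->npending++] = req_id;
	return true;
}

/* Returns void on purpose: in this model, removal has no failure path. */
static void remove_device(struct device_model *dev)
{
	struct fault_param_model *fp = dev->fault_param;

	if (!fp)
		return;

	/* 1. Disable new PRI reception (the thread suggests
	 *    rcu_assign_pointer(param->fault_param, NULL) for this). */
	dev->fault_param = NULL;

	/* 2. Ack every outstanding request so the device does not retry. */
	while (fp->npending > 0) {
		fp->npending--;
		fp->last_resp = RESP_INVALID;
		fp->invalid_responses++;
	}

	/* 3. Disable PRI on the device. */
	dev->pri_enabled = false;

	/* 4. Teardown of the iopf infrastructure would follow here; in this
	 *    model the caller owns the storage, so nothing is freed. */
}
```

In this sketch a fault reported after remove_device() is simply refused, and every request that was pending at removal time has already been answered with the invalid code.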
IOW pass the refcount from the iommu_report_device_fault() down into the fault handler, into the work and then into iommu_page_response() which will ultimately put it back.
Yes.

Best regards,
baolu
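
For reference, the refcount hand-off proposed above (report path takes a reference, the work item carries it, the response path puts it back) might look roughly like the following userspace model. All names here are hypothetical, a plain int replaces the kernel's refcount primitives, and there is no locking, so this only illustrates the ownership flow and the "response does nothing after disassociate" behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of fault_param with a reference count. */
struct fault_param_model {
	int refs;            /* plain counter; not the kernel's refcount_t */
	bool detached;       /* set once the device has been removed */
	int responses_sent;  /* responses actually forwarded to the device */
};

/* Hypothetical work item that carries the reference between stages. */
struct fault_work {
	struct fault_param_model *fp;
	int req_id;
};

static void fp_put(struct fault_param_model *fp)
{
	assert(fp->refs > 0);
	fp->refs--;          /* a real implementation would free at zero */
}

/* Report path: take a reference and hand it to the work item. */
static bool report_fault(struct fault_param_model *fp, int req_id,
			 struct fault_work *work)
{
	if (fp->detached)
		return false;
	fp->refs++;
	work->fp = fp;
	work->req_id = req_id;
	return true;
}

/* Removal: mark as disassociated and drop the queue's own reference.
 * fault_param stays alive while any work item still holds one. */
static void detach(struct fault_param_model *fp)
{
	fp->detached = true;
	fp_put(fp);
}

/* Response path: does nothing towards the device after detach, but
 * always puts back the reference taken in report_fault(). */
static void page_response(struct fault_work *work)
{
	struct fault_param_model *fp = work->fp;

	if (!fp->detached)
		fp->responses_sent++;
	work->fp = NULL;
	fp_put(fp);
}
```

The point of the model is that page_response() can still safely touch fault_param after removal, because the reference taken at report time keeps it alive until the response path releases it.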