Re: [PATCH v8 06/10] iommufd: Add iommufd fault object

On 7/4/24 7:06 AM, Nicolin Chen wrote:
Hi Baolu,

Hi Nicolin,

On Tue, Jul 02, 2024 at 02:34:40PM +0800, Lu Baolu wrote:

An iommufd fault object provides an interface for delivering I/O page
faults to user space. These objects are created and destroyed by user
space, and they can be associated with or dissociated from hardware page
table objects during page table allocation or destruction.

User space interacts with the fault object through a file interface. This
interface offers a straightforward and efficient way for user space to
handle page faults. It allows user space to read fault messages
sequentially and respond to them by writing to the same file. The file
interface supports reading messages in poll mode, so it's recommended that
user space applications use io_uring to enhance read and write efficiency.

A fault object can be associated with any iopf-capable iommufd_hw_pgtable
during the pgtable's allocation. All I/O page faults triggered by devices
when accessing the I/O addresses of an iommufd_hw_pgtable are routed
through the fault object to user space. Similarly, user space's responses
to these page faults are routed back to the iommu device driver through
the same fault object.
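
For illustration only (not part of the patch): a minimal user-space sketch
of the read-then-respond pattern described above. It uses plain poll(2)
rather than io_uring to stay short. The record layouts (demo_fault,
demo_response) and the helper handle_one_fault() are hypothetical
stand-ins, not the uAPI structures defined by this series.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/socket.h>	/* only for the socketpair() stand-in mentioned below */
#include <sys/types.h>
#include <unistd.h>

/* Simplified stand-ins for the fault and response records (assumption). */
struct demo_fault {
	uint32_t cookie;	/* identifies the fault when responding */
	uint64_t addr;		/* faulting I/O address */
};

struct demo_response {
	uint32_t cookie;	/* copied from the fault being answered */
	uint32_t code;		/* 0 means "handled" in this sketch */
};

/*
 * Wait until fd is readable, read one fault record, then write one
 * response to the same fd. Returns 0 on success, -1 on timeout or error.
 */
static int handle_one_fault(int fd, int timeout_ms, struct demo_fault *out)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	struct demo_fault f;
	struct demo_response r;

	if (poll(&pfd, 1, timeout_ms) != 1 || !(pfd.revents & POLLIN))
		return -1;
	if (read(fd, &f, sizeof(f)) != (ssize_t)sizeof(f))
		return -1;

	r.cookie = f.cookie;
	r.code = 0;
	if (write(fd, &r, sizeof(r)) != (ssize_t)sizeof(r))
		return -1;

	*out = f;
	return 0;
}
```

Without real hardware, one end of a socketpair(AF_UNIX, SOCK_STREAM) can
stand in for the fault fd: write a demo_fault into the other end, call
handle_one_fault(), and read the demo_response back.
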

There is a need for a VIOMMU object to report HW faults to the VMM. For
example, a HW-accelerated VCMDQ may encounter HW errors. HW will
raise an IRQ to the host kernel and the host kernel will forward
it to the guest. I think we can have a viommu->fault, similar to
the hwpt->fault introduced by this series. This viommu->fault can
also benefit nested IOMMU for reporting translation errors.

I learned that this hwpt->fault is exclusively for IOPF/PRI, and
Jason suggested that I add a different one for VIOMMU. Yet, after
taking a closer look, I found the fault object in this series is
seemingly quite generic at the uAPI level: its naming/structure,
and the way it's allocated and passed to hwpt, despite being
highly correlated with IOPF in its fops code. So, I feel that we
might have a chance of reusing it for different fault types:

+enum iommu_fault_type {
+	IOMMU_FAULT_TYPE_HWPT_IOPF,
+	IOMMU_FAULT_TYPE_VIOMMU_IRQ,
+};

  struct iommu_fault_alloc {
  	__u32 size;
  	__u32 flags;
+	__u32 type;  /* enum iommu_fault_type */
  	__u32 out_fault_id;
  	__u32 out_fault_fd;
  };

I understand that this is already v8. So, maybe we can, for now,
apply the small diff above with an IOMMU_FAULT_TYPE_HWPT_IOPF type
check in the ioctl handler. And a decoupling for the iopf fops in
the ioctl handler can come later in the viommu series:
	switch (type) {
	case IOMMU_FAULT_TYPE_HWPT_IOPF:
		filep = anon_inode_getfile("[iommufd-pgfault]",
					   &iommufd_fault_fops_iopf,
					   fault, O_RDWR);
		break;
	case IOMMU_FAULT_TYPE_VIOMMU_IRQ:
		filep = anon_inode_getfile("[iommufd-viommu-irq]",
					   &iommufd_fault_fops_viommu,
					   fault, O_RDWR);
		break;
	default:
		return -EOPNOTSUPP;
	}
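
To make the intent concrete, here is a user-space mock of the dispatch
above. It is a sketch only: struct mock_fops stands in for the kernel's
struct file_operations, and pick_fault_fops() is a hypothetical helper
that is not part of this series. Each case returns directly, so there is
no fall-through between types, and unknown types get -EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Fault types as proposed in the diff above. */
enum iommu_fault_type {
	IOMMU_FAULT_TYPE_HWPT_IOPF,
	IOMMU_FAULT_TYPE_VIOMMU_IRQ,
};

/* Stand-in for the kernel's struct file_operations (assumption). */
struct mock_fops {
	const char *name;
};

static const struct mock_fops iommufd_fault_fops_iopf = {
	.name = "[iommufd-pgfault]",
};

static const struct mock_fops iommufd_fault_fops_viommu = {
	.name = "[iommufd-viommu-irq]",
};

/*
 * Hypothetical helper: map the uAPI fault type to a fops table.
 * Returns 0 and sets *fops on success; returns -EOPNOTSUPP and sets
 * *fops to NULL for an unknown type.
 */
static int pick_fault_fops(unsigned int type, const struct mock_fops **fops)
{
	assert(fops != NULL);

	switch (type) {
	case IOMMU_FAULT_TYPE_HWPT_IOPF:
		*fops = &iommufd_fault_fops_iopf;
		return 0;
	case IOMMU_FAULT_TYPE_VIOMMU_IRQ:
		*fops = &iommufd_fault_fops_viommu;
		return 0;
	default:
		*fops = NULL;
		return -EOPNOTSUPP;
	}
}
```

In the kernel, the selected table would then be handed to
anon_inode_getfile(); the "for now" variant would simply reject anything
other than IOMMU_FAULT_TYPE_HWPT_IOPF.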

Since you are the designer here, I think you have a better 10,000-foot
view -- maybe I am missing something here implying that the
fault object can't really be reused by viommu.

Would you mind sharing some thoughts here?

I think this is a choice between "two different objects" vs. "same
object with different FD interfaces". If I understand it correctly, your
proposal of unrecoverable fault delivery is not limited to vcmdq, but
generic to all unrecoverable events that userspace should be aware of
when the passed-through device is affected.

From a hardware architecture perspective, the interfaces for
unrecoverable events don't always match those for page faults. For
example, the VT-d architecture defines a PR queue for page faults, but
uses a register set to report unrecoverable events. The 'reason',
'request id' and 'pasid' fields of the register set indicate what
happened on the hardware. New unrecoverable events will not be reported
until the previous one has been fetched.
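
The contrast can be modeled in a few lines of user-space C. This is a toy
model, not VT-d register programming: pr_queue, fault_latch, their helpers
and the queue depth are all hypothetical. The queue buffers several page
requests, while the latch holds a single event and refuses a new one until
the previous one has been fetched. What real hardware does with the
refused event is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRQ_DEPTH 4	/* arbitrary depth for this model */

/* Multi-entry ring, standing in for a page request queue. */
struct pr_queue {
	uint64_t addr[PRQ_DEPTH];
	unsigned int head, tail;	/* free-running indices */
};

/* Append one request; fails only when the ring is full. */
static bool prq_push(struct pr_queue *q, uint64_t addr)
{
	if (q->tail - q->head == PRQ_DEPTH)
		return false;
	q->addr[q->tail++ % PRQ_DEPTH] = addr;
	return true;
}

/* Remove the oldest request; fails when the ring is empty. */
static bool prq_pop(struct pr_queue *q, uint64_t *addr)
{
	if (q->head == q->tail)
		return false;
	*addr = q->addr[q->head++ % PRQ_DEPTH];
	return true;
}

/* Single-slot record carrying the fields named above. */
struct fault_latch {
	bool pending;
	uint8_t reason;
	uint16_t request_id;
	uint32_t pasid;
};

/* Record an event only if the previous one has been fetched. */
static bool latch_report(struct fault_latch *l, uint8_t reason,
			 uint16_t request_id, uint32_t pasid)
{
	if (l->pending)
		return false;
	l->pending = true;
	l->reason = reason;
	l->request_id = request_id;
	l->pasid = pasid;
	return true;
}

/* Copy out the pending event and free the slot for the next one. */
static bool latch_fetch(struct fault_latch *l, struct fault_latch *out)
{
	if (!l->pending)
		return false;
	*out = *l;
	l->pending = false;
	return true;
}
```

A fault FD built around the first shape maps naturally onto sequential
reads; the second shape behaves more like a level-triggered status that
must be acknowledged before the next event can appear.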

That said, I have no strong opinion between these two choices. Jason
and Kevin should have more insights.

Thanks,
baolu



