Re: [PATCH v8 06/10] iommufd: Add iommufd fault object

Baolu Lu <baolu.lu@xxxxxxxxxxxxxxx> · Thu, 4 Jul 2024 15:32:32 +0800

On 2024/7/4 14:37, Tian, Kevin wrote:
From: Nicolin Chen<nicolinc@xxxxxxxxxx>
Sent: Thursday, July 4, 2024 1:36 PM

On Thu, Jul 04, 2024 at 10:59:45AM +0800, Baolu Lu wrote:
On Tue, Jul 02, 2024 at 02:34:40PM +0800, Lu Baolu wrote:

+enum iommu_fault_type {
+     IOMMU_FAULT_TYPE_HWPT_IOPF,
+     IOMMU_FAULT_TYPE_VIOMMU_IRQ,
+};

   struct iommu_fault_alloc {
       __u32 size;
       __u32 flags;
+     __u32 type;  /* enum iommu_fault_type */
       __u32 out_fault_id;
       __u32 out_fault_fd;
and need a new reserved field for alignment.

   };

I understand that this is already v8. So, maybe we can, for now,
apply the small diff above with an IOMMU_FAULT_TYPE_HWPT_IOPF
type
check in the ioctl handler. And a decoupling for the iopf fops in
the ioctl handler can come later in the viommu series:
       switch (type) {
       case IOMMU_FAULT_TYPE_HWPT_IOPF:
               filep = anon_inode_getfile("[iommufd-pgfault]",
                                          &iommufd_fault_fops_iopf);
       case IOMMU_FAULT_TYPE_VIOMMU_IRQ:
               filep = anon_inode_getfile("[iommufd-viommu-irq]",
                                          &iommufd_fault_fops_viommu);
       default:
               return -EOPNOSUPP;
       }

Since you are the designer here, I think you have a better 10000
foot view -- maybe I am missing something here implying that the
fault object can't be really reused by viommu.

Would you mind sharing some thoughts here?
I think this is a choice between "two different objects" vs. "same
object with different FD interfaces". If I understand it correctly, your
proposal of unrecoverable fault delivery is not limited to vcmdq, but
generic to all unrecoverable events that userspace should be aware of
when the passed-through device is affected.
It's basically IRQ forwarding, not confined to unrecoverable
faults. For example, a VCMDQ used by the guest kernel would
raise an HW IRQ if the guest kernel issues an illegal command
to the HW Queue assigned to it. The host kernel will receive
the IRQ, so it needs a way to forward it to the VM for guest
kernel to recover the HW queue.

The way that we define the structure can follow what we have
for hwpt_alloc/invalidate uAPIs, i.e. driver data/event. And
such an event can carry unrecoverable translation faults too.
SMMU at least reports DMA translation faults using an eventQ
in its own native language.

 From a hardware architecture perspective, the interfaces for
unrecoverable events don't always match the page faults. For example,
VT-d architecture defines a PR queue for page faults, but uses a
register set to report unrecoverable events. The 'reason', 'request id'
and 'pasid' fields of the register set indicate what happened on the
hardware. New unrecoverable events will not be reported until the
previous one has been fetched.
Understood. I don't think we can share the majority pieces in
the fault.c. Just the "IOMMU_FAULT_QUEUE_ALLOC" ioctl itself
looks way too general to be limited to page-fault usage only.
So, I feel we can share, for example:
     IOMMU_FAULT_QUEUE_ALLOC (type=hwpt_iopf) -> fault_id=1
     IOMMU_HWPT_ALLOC (fault_id=1) -> hwpt_id=2
     IOMMU_FAULT_QUEUE_ALLOC (type=viommu_irq) -> fault_id=3
     IOMMU_VIOMMU_ALLOC (fault_id=2) -> viommu_id=4
The handler will direct to different fops as I drafted in my
previous mail.

With the above being said, I have no strong opinions between these two
choices. Jason and Kevin should have more insights.
Thanks. Jason is out of office this week, so hopefully Kevin
may shed some light. I personally feel that we don't need to
largely update this series until we add VIOMMU. Yet, it would
be convenient if we add a "type" in the uAPI with this series.

This is ok to me.

So

Nicolin, perhaps can you please cook an additional patch on top of this
series and post it for further review?

Thanks,
baolu