On 2024/11/5 23:10, Jason Gunthorpe wrote:
> On Tue, Nov 05, 2024 at 04:10:59PM +0800, Yi Liu wrote:
>>>> I don't quite get why this handle is related to the iommu driver
>>>> flushing PRs. Before __iommu_set_group_pasid(), the pasid is still
>>>> attached to the old domain, and so is the hw configuration.
>>> I meant that in the path of __iommu_set_group_pasid(), the iommu drivers
>>> have the opportunity to flush the PRs pending in the hardware queue. If
>>> the attach_handle is switched (by calling xa_store()) before
>>> __iommu_set_group_pasid(), the pending PRs will be routed to the iopf
>>> handler of the new domain, which is not desirable.
>> I see. You mean the handling of PRQs. I thought you were talking about
>> PRQ draining.
> I don't think we need to worry about this race, and you certainly
> shouldn't make the domain replacement path non-hitless just to
> fence the page requests.
> If a page request comes in during the race window of the domain change,
> there are only three outcomes:
>
> 1) The old domain handles it and it translates on the old domain
> 2) The new domain handles it and it translates on the new domain
> 3) The old domain handles it and it translates on the new domain:
>    a) The page request is ack'd, and the device retries and loads the
>       new domain - OK - at best it will use the new translation, at
>       worst it will retry.
>    b) The page request fails and the device sees the failure. This
>       is the same as #1 - OK
>
> All are correct. We don't need to do more here than just let the race
> resolve itself.
> Once the domains are switched in HW, we do have to flush everything
> queued, due to the fault path locking scheme on the domain.
Agreed. To my understanding, the worst case is that the device retries
the transaction, which might result in another page fault that then sets
up the translation in the new domain.
--
baolu