On 22/03/17 15:44, Joerg Roedel wrote:
> On Mon, Feb 27, 2017 at 07:54:35PM +0000, Jean-Philippe Brucker wrote:
>> It is an important distinction because, if the IOMMU driver reassigns a
>> PASID while the IOMMU still holds pending PPR targeting that PASID
>> internally, the PPR will trigger a fault in the wrong address space.
>
> The IOMMU driver also controls a device's ability to issue PPR requests
> (at least on PCI), so it already knows whether a device has still
> requests pending or if it even can create new ones.

Apart from resetting the PRI capability, the SMMU doesn't have any control
over the device's PPR requests, so we simply mandate that the caller has
done the required work to stop issuing them before calling iommu_unbind.

> Further, the IOMMU driver can already wait for all pending faults to be
> processed before it shuts down a PASID. So it is not clear to me why the
> device driver needs to be involved here.

The problem might be too tied to the specifics of the SMMU. As implemented
in this series, the normal flow for a PPR with the SMMU is the following:

(1) PCI device issues a PPR for PASID 1
(2) The PPR is queued by the SMMU in the (hardware) PRI queue
(3) The SMMU driver receives an interrupt, dequeues the PPR and moves it
    to a software work queue.
(4) The PPR is finally handled and a PRI response is sent to the device.

The case that worries me is if someone unbinds PASID 1 between (2) and
(3), while the PPR is still in the hardware queue, and immediately binds
it to a new address space. Then (3) and (4) happen, the PPR is handled and
the fault is for the new address space. It's certainly undesirable, but I
don't know if it could be exploited. We don't kill the task for an
unhandled fault at the moment, simply report a failed PPR to the device,
so I might be worrying for nothing.

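To make the window between (2) and (3) concrete, here is a toy user-space
model of that flow in C. Everything in it is made up for illustration
(struct model, priq_push, priq_handle_one, bind, unbind, and the
address-space ids 100 and 200); it is not code from the series and says
nothing about how the SMMU driver is actually structured. It only shows
that a PPR queued before the unbind is resolved against whatever happens
to be bound to the PASID at dequeue time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not SMMU driver code. */
#define MAX_PASID 4
#define PRIQ_LEN  8

struct ppr {
	int pasid;
	unsigned long addr;
};

struct model {
	int bound_mm[MAX_PASID];	/* address-space id per PASID, 0 = unbound */
	struct ppr priq[PRIQ_LEN];	/* stand-in for the hardware PRI queue */
	size_t prod, cons;
};

static void bind(struct model *m, int pasid, int mm)
{
	m->bound_mm[pasid] = mm;
}

static void unbind(struct model *m, int pasid)
{
	m->bound_mm[pasid] = 0;
}

/* Steps (1) and (2): the device issues a PPR and it lands in the queue. */
static int priq_push(struct model *m, int pasid, unsigned long addr)
{
	if (pasid < 0 || pasid >= MAX_PASID)
		return -1;
	if (m->prod - m->cons == PRIQ_LEN)
		return -1;	/* queue overflow, PPR is lost */
	m->priq[m->prod++ % PRIQ_LEN] = (struct ppr){ pasid, addr };
	return 0;
}

/*
 * Steps (3) and (4): dequeue one PPR and return the address space it
 * would be handled against, or 0 if the queue is empty or the PASID is
 * currently unbound.
 */
static int priq_handle_one(struct model *m)
{
	struct ppr p;

	if (m->cons == m->prod)
		return 0;
	p = m->priq[m->cons++ % PRIQ_LEN];
	return m->bound_mm[p.pasid];
}

/* Unbind and rebind PASID 1 while its PPR still sits in the queue. */
static int demo_stale_ppr(void)
{
	struct model m = {0};

	bind(&m, 1, 100);		/* PASID 1 -> address space 100 */
	priq_push(&m, 1, 0x1000);	/* (1) + (2) */
	unbind(&m, 1);			/* the race window */
	bind(&m, 1, 200);		/* PASID 1 -> address space 200 */
	return priq_handle_one(&m);	/* (3) + (4) */
}
```

In this model demo_stale_ppr() returns 200: the fault raised for address
space 100 is handled in address space 200.
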
Having the caller tell us if PPRs might still be pending in the hardware
PRI queue ensures that the SMMU driver waits until it's entirely safe:

* If the device has no outstanding PPR, the PASID can be reallocated
* If the device has outstanding PPRs, wait for a Stop Marker, or drain the
  PRI queue after a while (if the Stop Marker was lost in a PRI queue
  overflow).

Draining the PRI queue is very costly: we need to block the PRI thread to
inspect the queue, risking an overflow. With these PASID state flags we
avoid flushing any queue.

But since the problem seems too centered around the SMMU, I might just
drop this patch along with the CLEAN/FLUSHED flags in my next version, and
go with the full-drain solution. After all, unbind should be a fairly rare
event.

Thanks,
Jean-Philippe

> When the device driver issues a PASID-unbind call the iommu driver
> just waits until all pending faults are processed, answers new faults
> with INVALID, then switch off the device's capability to issue new
> faults, and then release the PASID.
>
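
For reference, the two cases listed in the reply above (reallocate at once
when nothing is outstanding, otherwise wait for a Stop Marker and fall
back to a drain) could be sketched roughly as the decision function below.
This is only an illustration under assumptions: the enum, the struct, and
the 100 ms timeout are hypothetical names and values, not the
CLEAN/FLUSHED flags or any other interface from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; not the flags from the patch series. */
enum unbind_action {
	REALLOC_NOW,		/* PASID can be reallocated immediately */
	WAIT_STOP_MARKER,	/* keep the PASID reserved for now */
	DRAIN_PRIQ,		/* Stop Marker presumed lost, drain the queue */
};

struct pasid_unbind {
	bool ppr_outstanding;	/* caller says PPRs may still be queued */
	bool stop_marker_seen;	/* Stop Marker for this PASID was dequeued */
	unsigned int waited_ms;	/* time spent waiting for the Stop Marker */
};

/* Arbitrary value, chosen only for the example. */
#define STOP_MARKER_TIMEOUT_MS 100u

static enum unbind_action unbind_next_action(const struct pasid_unbind *u)
{
	/* Nothing in flight, or the device has confirmed it is done. */
	if (!u->ppr_outstanding || u->stop_marker_seen)
		return REALLOC_NOW;

	/* Cheap path: the Stop Marker may still show up. */
	if (u->waited_ms < STOP_MARKER_TIMEOUT_MS)
		return WAIT_STOP_MARKER;

	/* Costly path: the marker may have been lost in a queue overflow. */
	return DRAIN_PRIQ;
}
```

The drain only appears as the last resort here, which matches the cost
argument above; the full-drain alternative would amount to returning
DRAIN_PRIQ whenever PPRs may be outstanding.
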