On Wed, Apr 08, 2020 at 07:32:18PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 08, 2020 at 02:35:52PM -0700, Jacob Pan wrote:
> > > On Wed, Apr 08, 2020 at 11:35:52AM -0700, Jacob Pan wrote:
> > > > Hi Jean,
> > > >
> > > > On Wed, 8 Apr 2020 16:04:25 +0200
> > > > Jean-Philippe Brucker <jean-philippe@xxxxxxxxxx> wrote:
> > > >
> > > > > The IOMMU SVA API currently requires device drivers to implement
> > > > > an mm_exit() callback, which stops device jobs that do DMA. This
> > > > > function is called in the release() MMU notifier, when an address
> > > > > space that is shared with a device exits.
> > > > >
> > > > > It has been noted several time during discussions about SVA that
> > > > > cancelling DMA jobs can be slow and complex, and doing it in the
> > > > > release() notifier might cause synchronization issues (patch 2 has
> > > > > more background). Device drivers must in any case call unbind() to
> > > > > remove their bond, after stopping DMA from a more favorable
> > > > > context (release of a file descriptor).
> > > > >
> > > > > So after mm exits, rather than notifying device drivers, we can
> > > > > hold on to the PASID until unbind(), ask IOMMU drivers to
> > > > > silently abort DMA and Page Requests in the meantime. This change
> > > > > should relieve the mmput() path.
> > > >
> > > > I assume mm is destroyed after all the FDs are closed
> > >
> > > FDs do not hold a mmget(), but they may hold a mmgrab(), ie anything
> > > using mmu_notifiers has to hold a grab until the notifier is
> > > destroyed, which is often triggered by FD close.
> > >
> > Sorry, I don't get this. Are you saying we have to hold a mmgrab()
> > between svm_bind/mmu_notifier_register and
> > svm_unbind/mmu_notifier_unregister?
>
> Yes. This is done automatically for the caller inside the mmu_notifier
> implementation. We now even store the mm_struct pointer inside the
> notifier.
>
> Once a notifier is registered the mm_struct remains valid memory until
> the notifier is unregistered.
>
> > Isn't the idea of mmu_notifier is to avoid holding the mm reference and
> > rely on the notifier to tell us when mm is going away?
>
> The notifier only holds a mmgrab(), not a mmget() - this allows
> exit_mmap to proceed, but the mm_struct memory remains.
>
> This is also probably why it is a bad idea to tie the lifetime of
> something like a pasid to the mmdrop as a evil user could cause a
> large number of mm structs to be released but not freed, probably
> defeating cgroup limits and so forth (not sure)

The maximum number of processes can be limited per user. A PASID is per
address space, so the maximum number of PASIDs a user can hold is
limited as well. So a user cannot exhaust PASIDs that easily, right?

The Intel ENQCMD instruction uses the PASID MSR to store the PASID. Each
software thread can keep the PASID in its own MSR/FPU state. If we free
the PASID in unbind_mm(), the threads' PASID MSRs need to be cleared as
well: tracking which threads have the MSR set up, searching the threads,
sending IPIs to those threads to clear the MSR, locking, etc. It's
doable, but complex and slow.

Binding the PASID to the mm and freeing the PASID in __mmdrop() gets rid
of that complexity.

Thanks.

-Fenghua