On Mon, Oct 21, 2019 at 02:40:41PM -0400, Jerome Glisse wrote: > On Tue, Oct 15, 2019 at 03:12:27PM -0300, Jason Gunthorpe wrote: > > From: Jason Gunthorpe <jgg@xxxxxxxxxxxx> > > > > 8 of the mmu_notifier using drivers (i915_gem, radeon_mn, umem_odp, hfi1, > > scif_dma, vhost, gntdev, hmm) drivers are using a common pattern where > > they only use invalidate_range_start/end and immediately check the > > invalidating range against some driver data structure to tell if the > > driver is interested. Half of them use an interval_tree, the others are > > simple linear search lists. > > > > Of the ones I checked they largely seem to have various kinds of races, > > bugs and poor implementation. This is a result of the complexity in how > > the notifier interacts with get_user_pages(). It is extremely difficult to > > use it correctly. > > > > Consolidate all of this code together into the core mmu_notifier and > > provide a locking scheme similar to hmm_mirror that allows the user to > > safely use get_user_pages() and reliably know if the page list still > > matches the mm. > > > > This new arrangment plays nicely with the !blockable mode for > > OOM. Scanning the interval tree is done such that the intersection test > > will always succeed, and since there is no invalidate_range_end exposed to > > drivers the scheme safely allows multiple drivers to be subscribed. > > > > Four places are converted as an example of how the new API is used. > > Four are left for future patches: > > - i915_gem has complex locking around destruction of a registration, > > needs more study > > - hfi1 (2nd user) needs access to the rbtree > > - scif_dma has a complicated logic flow > > - vhost's mmu notifiers are already being rewritten > > > > This is still being tested, but I figured to send it to start getting help > > from the xen, amd and hfi drivers which I cannot test here. > > It might be a good oportunity to also switch those users to > hmm_range_fault() instead of GUP as GUP is pointless for those > users. In fact the GUP is an impediment to normal mm operations. I think vhost can use hmm_range_fault hfi1 does actually need to have the page pin, it doesn't fence DMA during invalidate. i915_gem feels alot like amdgpu, so probably it would benefit No idea about scif_dma > I will test on nouveau. Thanks, hopefully it still works, I think Ralph was able to do some basic checks. But it is a pretty complicated series, I probably made some mistakes. FWIW, I know that nouveau gets a lockdep splat now from Daniel Vetter's recent changes, it tries to do GFP_KERENEL allocations under a lock also held by the invalidate_range_start path. Thanks for looking at it! Regards, Jason