On Wed, Jun 24, 2020 at 01:47:23PM -0700, John Hubbard wrote: > On 2020-06-24 12:21, Jason Gunthorpe wrote: > > On Wed, Jun 24, 2020 at 08:14:17PM +0100, Chris Wilson wrote: > > > A general rule of thumb is that shrinkers should be fast and effective. > > > They are called from direct reclaim at the most incovenient of times when > > > the caller is waiting for a page. If we attempt to reclaim a page being > > > pinned for active dma [pin_user_pages()], we will incur far greater > > > latency than a normal anonymous page mapped multiple times. Worse the > > > page may be in use indefinitely by the HW and unable to be reclaimed > > > in a timely manner. > > > > A pinned page can't be migrated, discarded or swapped by definition - > > it would cause data corruption. > > > > So, how do things even get here and/or work today at all? I think the > > explanation is missing something important. > > > > Well, those activities generally try to unmap the page, and > have to be prepared to deal with failure to unmap. From my reading, > it seemed very clear. I think Yang explained it - the page is removed from the mappings but freeing it does not happen because page_ref_freeze() does not succeed due to the pin. Presumably the mappings can reconnect to the same physical page if it is re-faulted to avoid any data corruption. So, the issue here is the mappings are trashed while the page remains - and trashing the mapping triggers a mmu notifier which upsets i915. > What's less clear is why the comment and the commit description > only talk about reclaim, when there are additional things that call > try_to_unmap(), including: > > migrate_vma_unmap() > split_huge_page_to_list() --> unmap_page() It looks like the same unmap first then abort if the refcount is still elevated design as shrink_page_list() ? > I do like this code change, though. And I *think* it's actually safe to > do this, as it stays away from writeback or other filesystem activity. > But let me double check that, in case I'm forgetting something. It would be nice to have an explanation why it is OK now to change it.. I don't know, but could it be that try_to_unmap() has to be done before checking the refcount as each mapping is included in the refcount? ie we couldn't know a DMA pin was active in advance? Now that we have your pin stuff we can detect a DMA pin without doing all the unmaps? Jason