On Tue, Feb 09, 2021 at 09:35:20AM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 09, 2021 at 11:57:28PM +1100, Alistair Popple wrote:
> > On Tuesday, 9 February 2021 9:27:05 PM AEDT Daniel Vetter wrote:
> > > >
> > > > Recent changes to pin_user_pages() prevent the creation of pinned pages in
> > > > ZONE_MOVABLE. This series allows pinned pages to be created in ZONE_MOVABLE
> > > > as attempts to migrate may fail which would be fatal to userspace.
> > > >
> > > > In this case migration of the pinned page is unnecessary as the page can be
> > > > unpinned at anytime by having the driver revoke atomic permission as it
> > > > does for the migrate_to_ram() callback. However a method of calling this
> > > > when memory needs to be moved has yet to be resolved so any discussion is
> > > > welcome.
> > >
> > > Why do we need to pin for gpu atomics? You still have the callback for
> > > cpu faults, so you can move the page as needed, and hence a long-term pin
> > > sounds like the wrong approach.
> >
> > Technically a real long term unmoveable pin isn't required, because as you say
> > the page can be moved as needed at any time. However I needed some way of
> > stopping the CPU page from being freed once the userspace mappings for it had
> > been removed.
>
> The issue is you took the page out of the PTE it belongs to, which
> makes it orphaned and unlocatable by the rest of the mm?
>
> Ideally this would leave the PTE in place so everything continues to
> work, just disable CPU access to it.
>
> Maybe some kind of special swap entry?
>
> I also don't much like the use of ZONE_DEVICE here, that should only
> be used for actual device memory, not as a temporary proxy for CPU
> pages.. Having two struct pages refer to the same physical memory is
> pretty ugly.
>
> > The normal solution of registering an MMU notifier to unpin the page when it
> > needs to be moved also doesn't work as the CPU page tables now point to the
> > device-private page and hence the migration code won't call any invalidate
> > notifiers for the CPU page.
>
> The fact the page is lost from the MM seems to be the main issue here.
>
> > Yes, I would like to avoid the long term pin constraints as well if possible I
> > just haven't found a solution yet. Are you suggesting it might be possible to
> > add a callback in the page migration logic to specially deal with moving these
> > pages?
>
> How would migration even find the page?

Migration can scan memory from physical address (isolate_migratepages_range()).
So the CPU mapping is not the only path to get to a page.

Cheers,
Jérôme