Jason Gunthorpe wrote:
> On Fri, Sep 30, 2022 at 10:56:27AM -0700, Dan Williams wrote:
> > Jan Kara wrote:
> > [..]
> > > I agree this is doable but there's the nasty side effect that inode reclaim
> > > may block for arbitrary time waiting for page pinning. If the application
> > > that has pinned the page requires __GFP_FS memory allocation to get to a
> > > point where it releases the page, we even have a deadlock possibility.
> > > So it's better than the UAF issue but still not ideal.
> >
> > I expect VMA pinning would have similar deadlock exposure if pinning a
> > VMA keeps the inode allocated. Anything that puts a page-pin release
> > dependency in the inode freeing path can potentially deadlock a reclaim
> > event that depends on that inode being freed.
>
> I think the desire would be to go from the VMA to an inode_get and
> hold the inode reference from the pin_user_pages() to the
> unpin_user_page(), ie prevent it from being freed in the first place.
>
> It is a fine idea, the trouble is just the high complexity to get
> there.
>
> However, I wonder if the truncate/hole punch paths have the same
> deadlock problem?

If the deadlock is waiting for inode reclaim to complete then I can see
why the VMA pin proposal and the current truncate paths do not trigger
that deadlock because the inode is kept out of the reclaim path.

> I agree with you though, given the limited options we should convert
> the UAF into an unlikely deadlock.

I think this approach makes the implementation incrementally better, and
that the need to plumb VMA pinning can await evidence that a driver
actually does this *and* the driver cannot be fixed.
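
To make the "kept out of the reclaim path" point concrete, here is a rough
userspace toy model, not kernel code. Every name in it (model_inode,
model_pin_pages(), model_unpin_page(), model_try_reclaim()) is hypothetical
and only stands in for the real inode reference, pin_user_pages(),
unpin_user_page() and inode reclaim. It assumes the pin takes an inode
reference that is held until unpin, so reclaim skips the inode rather than
waiting on the pin holder:

```c
/* Userspace toy model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_inode {
    int refcount;   /* references held, including those taken by pins */
    bool freed;     /* set once the modeled reclaim has freed the inode */
};

struct model_pin {
    struct model_inode *inode;  /* inode kept alive for the pin's lifetime */
};

/* Stand-in for pin_user_pages(): take an inode reference with the pin. */
struct model_pin model_pin_pages(struct model_inode *inode)
{
    struct model_pin pin = { .inode = inode };

    assert(!inode->freed);
    inode->refcount++;
    return pin;
}

/* Stand-in for unpin_user_page(): drop the reference taken at pin time. */
void model_unpin_page(struct model_pin *pin)
{
    assert(pin->inode != NULL && pin->inode->refcount > 0);
    pin->inode->refcount--;
    pin->inode = NULL;
}

/* Stand-in for inode reclaim: skip referenced inodes instead of waiting. */
bool model_try_reclaim(struct model_inode *inode)
{
    if (inode->refcount > 0)
        return false;   /* still pinned: not a candidate, nothing blocks */
    inode->freed = true;
    return true;
}
```

In this model reclaim never sleeps on a pin, so there is no pin-release
dependency in the freeing path; the cost is that a long-lived pin keeps the
inode allocated for as long as it is held.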