On Fri, Oct 20, 2017 at 09:47:50AM +0200, Christoph Hellwig wrote:
> I'd like to brainstorm how we can do something better.
>
> How about:
>
> If we hit a page with an elevated refcount in truncate / hole punch
> etc for a DAX file system we do not free the blocks in the file system,
> but add it to the extent busy list.  We mark the page as delayed
> free (e.g. page flag?) so that when it finally hits refcount zero we
> call back into the file system to remove it from the busy list.

Brainstorming some more:

Given that on a DAX file there shouldn't be any long-term page
references after we unmap it from the page table and don't allow
get_user_pages calls, why not wait for the references for all
DAX pages to go away first?  E.g. if we find a DAX page in
truncate_inode_pages_range that has an elevated refcount we set
a new flag to prevent new references from showing up, and then
simply wait for it to go away.  Instead of a busy wait we can
do this through a few hashed waitqueues in dev_pagemap.  And in
fact put_zone_device_page already gets called when putting the
last page reference, so we can handle the wakeup from there.

In fact if we can't find a page flag for the "stop new callers"
thing we could probably come up with a way to do that through
dev_pagemap somehow, but I'm not sure how efficient that would be.
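
To make the idea a bit more concrete, here is a rough userspace model of
the scheme, not kernel code.  Everything in it is an assumption made for
illustration: struct page_model and struct pagemap_model are hypothetical
stand-ins for struct page and struct dev_pagemap, pthread condition
variables stand in for kernel waitqueues, the refcount and the
"no new references" flag sit under the bucket lock rather than being
atomic page state, WAIT_BUCKETS is an arbitrary size, and "idle" is
modelled as a count of zero extra references.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WAIT_BUCKETS 8 /* hypothetical size; must be a power of two */

/* One hashed wait bucket; models a kernel waitqueue head. */
struct wait_bucket {
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

/* Hypothetical stand-in for struct dev_pagemap. */
struct pagemap_model {
	struct wait_bucket buckets[WAIT_BUCKETS];
};

/* Hypothetical stand-in for a DAX struct page. */
struct page_model {
	int refcount;          /* extra references, protected by bucket lock */
	bool no_new_refs;      /* models the proposed "stop new callers" flag */
	struct pagemap_model *pgmap;
};

static void pagemap_model_init(struct pagemap_model *pgmap)
{
	for (int i = 0; i < WAIT_BUCKETS; i++) {
		pthread_mutex_init(&pgmap->buckets[i].lock, NULL);
		pthread_cond_init(&pgmap->buckets[i].cond, NULL);
	}
}

/* Hash the page address to pick one of the shared buckets. */
static struct wait_bucket *page_bucket(struct page_model *page)
{
	uintptr_t hash = (uintptr_t)page >> 4;

	return &page->pgmap->buckets[hash & (WAIT_BUCKETS - 1)];
}

/* Take a reference unless truncate has blocked new ones. */
static bool page_try_get(struct page_model *page)
{
	struct wait_bucket *b = page_bucket(page);
	bool ok;

	pthread_mutex_lock(&b->lock);
	ok = !page->no_new_refs;
	if (ok)
		page->refcount++;
	pthread_mutex_unlock(&b->lock);
	return ok;
}

/* Drop a reference; the final put wakes waiters on this bucket. */
static void page_put(struct page_model *page)
{
	struct wait_bucket *b = page_bucket(page);

	pthread_mutex_lock(&b->lock);
	assert(page->refcount > 0);
	if (--page->refcount == 0)
		pthread_cond_broadcast(&b->cond);
	pthread_mutex_unlock(&b->lock);
}

/* Truncate side: block new references, then sleep until all are gone. */
static void page_wait_idle(struct page_model *page)
{
	struct wait_bucket *b = page_bucket(page);

	pthread_mutex_lock(&b->lock);
	page->no_new_refs = true;
	/* Buckets are shared between pages, so recheck after every wakeup. */
	while (page->refcount > 0)
		pthread_cond_wait(&b->cond, &b->lock);
	pthread_mutex_unlock(&b->lock);
}
```

Because several pages hash to the same bucket, a wakeup may be meant for
a different page, which is why the waiter rechecks its own count in a
loop.  In this model page_put() plays the role that
put_zone_device_page would play for the wakeup.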