On Wed, Sep 16, 2020 at 07:04:46PM -0700, Hugh Dickins wrote: > On Thu, 17 Sep 2020, Dave Chinner wrote: > > > > So.... > > > > P0 p1 > > > > hole punch starts > > takes XFS_MMAPLOCK_EXCL > > truncate_pagecache_range() > > unmap_mapping_range(start, end) > > <clears ptes> > > <read fault> > > do_fault_around() > > ->map_pages > > filemap_map_pages() > > page mapping valid, > > page is up to date > > maps PTEs > > <fault done> > > truncate_inode_pages_range() > > truncate_cleanup_page(page) > > invalidates page > > delete_from_page_cache_batch(page) > > frees page > > <pte now points to a freed page> > > No. filemap_map_pages() checks page->mapping after trylock_page(), > before setting up the pte; and truncate_cleanup_page() does a one-page > unmap_mapping_range() if page_mapped(), while holding page lock. Ok, fair, I missed that. So why does truncate_pagecache() talk about fault races and require a second unmap range after the invalidation "for correctness" if this sort of race cannot happen? Why is that different to truncate_pagecache_range() which -doesn't-i do that second removal? It's called for more than just hole_punch - from the filesystem's persepective holepunch should do exactly the same as truncate to the page cache, and for things like COLLAPSE_RANGE it is absolutely essential because the data in that range is -not zero- and will be stale if the mappings are not invalidated completely.... Also, if page->mapping == NULL is sufficient to detect an invalidated page in all cases, then why does page_cache_delete() explicitly leave page->index intact: page->mapping = NULL; /* Leave page->index set: truncation lookup relies upon it */ Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx