On Tue, Dec 19, 2017 at 05:11:38PM -0800, Dan Williams wrote:
> On Fri, Nov 10, 2017 at 1:08 AM, Christoph Hellwig <hch@xxxxxx> wrote:
> >> +       struct {
> >> +               /*
> >> +                * ZONE_DEVICE pages are never on an lru or handled by
> >> +                * a slab allocator, this points to the hosting device
> >> +                * page map.
> >> +                */
> >> +               struct dev_pagemap *pgmap;
> >> +               /*
> >> +                * inode association for MEMORY_DEVICE_FS_DAX page-idle
> >> +                * callbacks. Note that we don't use ->mapping since
> >> +                * that has hard coded page-cache assumptions in
> >> +                * several paths.
> >> +                */
> >
> > What assumptions?  I'd much rather fix those up than having two fields
> > that have the same functionality.
>
> [ Reviving this old thread where you asked why I introduce page->inode
> instead of reusing page->mapping ]
>
> For example, xfs_vm_set_page_dirty() assumes that page->mapping being
> non-NULL indicates a typical page cache page, this is a false
> assumption for DAX.

That means every single filesystem has an incorrect assumption for
DAX pages. xfs_vm_set_page_dirty() is derived directly from
__set_page_dirty_buffers(), which is the default function that
set_page_dirty() calls to do its work. Indeed, ext4 also calls
__set_page_dirty_buffers(), so whatever problem XFS has here with
DAX and racing truncates is going to manifest in ext4 as well.

> My guess at a fix for this is to add
> pagecache_page() checks to locations like this, but I worry about how
> to find them all. Where pagecache_page() is:
>
> bool pagecache_page(struct page *page)
> {
>         if (!page->mapping)
>                 return false;
>         if (!IS_DAX(page->mapping->host))
>                 return false;
>         return true;
> }

This is likely to be a problem in lots more places if we have to
treat "has page been truncated away" race checks on dax mappings
differently to page cache mappings. This smells of a whack-a-mole
style bandaid to me....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx