Re: [PATCH v3 07/25] fsdax: Hold dax lock over mapping insertion

Dan Williams <dan.j.williams@xxxxxxxxx> · Mon, 17 Oct 2022 13:17:23 -0700

Jason Gunthorpe wrote:
> On Fri, Oct 14, 2022 at 04:57:37PM -0700, Dan Williams wrote:
> > In preparation for dax_insert_entry() to start taking page and pgmap
> > references ensure that page->pgmap is valid by holding the
> > dax_read_lock() over both dax_direct_access() and dax_insert_entry().
> > 
> > I.e. the code that wants to elevate the reference count of a pgmap page
> > from 0 -> 1 must ensure that the pgmap is not exiting and will not start
> > exiting until the proper references have been taken.
> 
> I'm surprised we can have a vmfault while the pgmap is exiting?
> 
> Shouldn't the FS have torn down all the inodes before it starts
> killing the pgmap?

Historically, no. The block-device is allowed to disappear while inodes
are still live. For example, the filesystem's calls to blk_queue_enter()
will start failing, but otherwise the filesystem tries to hobble along
after the device-driver has finished ->remove(). In the typical
page-cache case this makes sense since there is still some residual
usability of cached data even after the backing device is gone.

Recently Ruan plumbed support for failure-notification callbacks into
the filesystem, or at least XFS. With that in place the driver can
theoretically notify failures like "device gone" and the FS can take
actions like tearing down inodes. However, that is FS-specific
enabling / behaviour, not something the pgmap code can rely upon. At least, not
without some layering violations.