On Wed, Sep 21, 2022 at 02:38:56PM -0700, Dan Williams wrote: > Dan Williams wrote: > > Jason Gunthorpe wrote: > > > On Thu, Sep 15, 2022 at 08:36:07PM -0700, Dan Williams wrote: > > > > The percpu_ref in 'struct dev_pagemap' is used to coordinate active > > > > mappings of device-memory with the device-removal / unbind path. It > > > > enables the semantic that initiating device-removal (or > > > > device-driver-unbind) blocks new mapping and DMA attempts, and waits for > > > > mapping revocation or inflight DMA to complete. > > > > > > This seems strange to me > > > > > > The pagemap should be ref'd as long as the filesystem is mounted over > > > the dax. The ref should be incrd when the filesystem is mounted and > > > decrd when it is unmounted. > > > > > > When the filesystem unmounts it should zap all the mappings (actually > > > I don't think you can even unmount a filesystem while mappings are > > > open) and wait for all page references to go to zero, then put the > > > final pagemap back. > > > > > > The rule is nothing can touch page->pgmap while page->refcount == 0, > > > and if page->refcount != 0 then page->pgmap must be valid, without any > > > refcounting on the page map itself. > > > > > > So, why do we need pgmap refcounting all over the place? It seems like > > > it only existed before because of the abuse of the page->refcount? > > > > Recall that this percpu_ref is mirroring the same function as > > blk_queue_enter() whereby every new request is checking to make sure the > > device is still alive, or whether it has started exiting. > > > > So pgmap 'live' reference taking in fs/dax.c allows the core to start > > failing fault requests once device teardown has started. It is a 'block > > new, and drain old' semantic. It is weird this email never arrived for me.. I think that is all fine, but it would be much more logically expressed as a simple 'is pgmap alive' call before doing a new mapping than mucking with the refcount logic. Such a test could simply READ_ONCE a bool value in the pgmap struct. Indeed, you could reasonably put such a liveness test at the moment every driver takes a 0 refcount struct page and turns it into a 1 refcount struct page. Jason