On Wed, Nov 13, 2019 at 2:49 PM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>
> On 11/13/19 2:00 PM, Dan Williams wrote:
> ...
> >> Ugh, when did all this HMM specific manipulation sneak into the
> >> generic ZONE_DEVICE path? It used to be gated by pgmap type with its
> >> own put_zone_device_private_page(). For example it's certainly
> >> unnecessary and might be broken (would need to check) to call
> >> mem_cgroup_uncharge() on a DAX page. ZONE_DEVICE users are not a
> >> monolith and the HMM use case leaks pages into code paths that DAX
> >> explicitly avoids.
> >
> > It's been this way for a while and I did not react previously,
> > apologies for that. I think __ClearPageActive, __ClearPageWaiters, and
> > mem_cgroup_uncharge, belong behind a device-private conditional. The
> > history here is:
> >
> > Move some, but not all HMM specifics to hmm_devmem_free():
> > 2fa147bdbf67 mm, dev_pagemap: Do not clear ->mapping on final put
> >
> > Remove the clearing of mapping since no upstream consumers needed it:
> > b7a523109fb5 mm: don't clear ->mapping in hmm_devmem_free
> >
> > Add it back in once an upstream consumer arrived:
> > 7ab0ad0e74f8 mm/hmm: fix ZONE_DEVICE anon page mapping reuse
> >
> > We're now almost entirely free of ->page_free callbacks except for
> > that weird nouveau case, can that FIXME in nouveau_dmem_page_free()
> > also result in killing the ->page_free() callback altogether? In the
> > meantime I'm proposing a cleanup like this:
>
>
> OK, assuming this is acceptable (no obvious problems jump out at me,
> and we can also test it with HMM), then how would you like to proceed, as
> far as patches go: add such a patch as part of this series here, or as a
> stand-alone patch either before or after this series? Or something else?
> And did you plan on sending it out as such?

I think it makes sense to include it in your series since you're
looking to refactor the implementation.
I can send you one based on current linux-next as a lead-in cleanup
before the refactor. Does that work for you?

>
> Also, the diffs didn't quite make it through intact to my "git apply", so
> I'm re-posting the diff in hopes that this time it survives:

Apologies for that. For quick "how about this" patch examples, I just
copy and paste into gmail and it sometimes clobbers it.

>
> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> index f9f76f6ba07b..21db1ce8c0ae 100644
> --- a/drivers/nvdimm/pmem.c
> +++ b/drivers/nvdimm/pmem.c
> @@ -338,13 +338,7 @@ static void pmem_release_disk(void *__pmem)
>          put_disk(pmem->disk);
>  }
>
> -static void pmem_pagemap_page_free(struct page *page)
> -{
> -        wake_up_var(&page->_refcount);
> -}
> -
>  static const struct dev_pagemap_ops fsdax_pagemap_ops = {
> -        .page_free = pmem_pagemap_page_free,
>          .kill = pmem_pagemap_kill,
>          .cleanup = pmem_pagemap_cleanup,
>  };
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 03ccbdfeb697..157edb8f7cf8 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -419,12 +419,6 @@ void __put_devmap_managed_page(struct page *page)
>           * holds a reference on the page.
>           */
>          if (count == 1) {
> -                /* Clear Active bit in case of parallel mark_page_accessed */
> -                __ClearPageActive(page);
> -                __ClearPageWaiters(page);
> -
> -                mem_cgroup_uncharge(page);
> -
>                  /*
>                   * When a device_private page is freed, the page->mapping field
>                   * may still contain a (stale) mapping value. For example, the
> @@ -446,10 +440,17 @@ void __put_devmap_managed_page(struct page *page)
>                   * handled differently or not done at all, so there is no need
>                   * to clear page->mapping.
>                   */
> -                if (is_device_private_page(page))
> -                        page->mapping = NULL;
> +                if (is_device_private_page(page)) {
> +                        /* Clear Active bit in case of parallel mark_page_accessed */
> +                        __ClearPageActive(page);
> +                        __ClearPageWaiters(page);
>
> -                page->pgmap->ops->page_free(page);
> +                        mem_cgroup_uncharge(page);
> +
> +                        page->mapping = NULL;
> +                        page->pgmap->ops->page_free(page);
> +                } else
> +                        wake_up_var(&page->_refcount);
>          } else if (!count)
>                  __put_page(page);
>  }
> --
> 2.24.0
>
>
> thanks,
> --
> John Hubbard
> NVIDIA
>
>
>
> > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> > index ad8e4df1282b..4eae441f86c9 100644
> > --- a/drivers/nvdimm/pmem.c
> > +++ b/drivers/nvdimm/pmem.c
> > @@ -337,13 +337,7 @@ static void pmem_release_disk(void *__pmem)
> >          put_disk(pmem->disk);
> >  }
> >
> > -static void pmem_pagemap_page_free(struct page *page)
> > -{
> > -        wake_up_var(&page->_refcount);
> > -}
> > -
> >  static const struct dev_pagemap_ops fsdax_pagemap_ops = {
> > -        .page_free = pmem_pagemap_page_free,
> >          .kill = pmem_pagemap_kill,
> >          .cleanup = pmem_pagemap_cleanup,
> >  };
> > diff --git a/mm/memremap.c b/mm/memremap.c
> > index 03ccbdfeb697..157edb8f7cf8 100644
> > --- a/mm/memremap.c
> > +++ b/mm/memremap.c
> > @@ -419,12 +419,6 @@ void __put_devmap_managed_page(struct page *page)
> >           * holds a reference on the page.
> >           */
> >          if (count == 1) {
> > -                /* Clear Active bit in case of parallel mark_page_accessed */
> > -                __ClearPageActive(page);
> > -                __ClearPageWaiters(page);
> > -
> > -                mem_cgroup_uncharge(page);
> > -
> >                  /*
> >                   * When a device_private page is freed, the page->mapping field
> >                   * may still contain a (stale) mapping value. For example, the
> > @@ -446,10 +440,17 @@ void __put_devmap_managed_page(struct page *page)
> >                   * handled differently or not done at all, so there is no need
> >                   * to clear page->mapping.
> >                   */
> > -                if (is_device_private_page(page))
> > -                        page->mapping = NULL;
> > +                if (is_device_private_page(page)) {
> > +                        /* Clear Active bit in case of parallel
> > mark_page_accessed */
> > +                        __ClearPageActive(page);
> > +                        __ClearPageWaiters(page);
> >
> > -                page->pgmap->ops->page_free(page);
> > +                        mem_cgroup_uncharge(page);
> > +
> > +                        page->mapping = NULL;
> > +                        page->pgmap->ops->page_free(page);
> > +                } else
> > +                        wake_up_var(&page->_refcount);
> >          } else if (!count)
> >                  __put_page(page);
> >  }
> >
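
For anyone following along without a kernel tree handy, the control flow the quoted diff proposes can be modeled in user space roughly as below. This is only a sketch under stated assumptions, not kernel code: struct model_page, the MODEL_* enum values, and model_put_devmap_managed_page() are hypothetical stand-ins invented for illustration, and the boolean fields merely imitate the effects of __ClearPageActive(), __ClearPageWaiters(), mem_cgroup_uncharge(), ->page_free(), and wake_up_var(). The count == 0 path (__put_page() in the kernel) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; these are NOT the real definitions. */
enum model_pgmap_type { MODEL_DEVICE_PRIVATE, MODEL_FS_DAX };

struct model_page {
    enum model_pgmap_type type;
    int refcount;
    bool active;          /* imitates the PG_active flag */
    bool waiters;         /* imitates the PG_waiters flag */
    bool charged;         /* imitates an outstanding memcg charge */
    void *mapping;        /* imitates page->mapping */
    int page_free_calls;  /* counts modeled ->page_free() invocations */
    int wakeups;          /* counts modeled wake_up_var(&page->_refcount) calls */
};

/*
 * Model of the proposed logic: when the reference count drops to 1, the
 * HMM-specific cleanup runs only for device-private pages. Every other
 * page type gets a plain wakeup on the refcount and is otherwise untouched.
 */
void model_put_devmap_managed_page(struct model_page *page)
{
    assert(page != NULL);

    int count = --page->refcount;

    if (count == 1) {
        if (page->type == MODEL_DEVICE_PRIVATE) {
            page->active = false;     /* __ClearPageActive() */
            page->waiters = false;    /* __ClearPageWaiters() */
            page->charged = false;    /* mem_cgroup_uncharge() */
            page->mapping = NULL;     /* drop the stale mapping */
            page->page_free_calls++;  /* pgmap->ops->page_free() */
        } else {
            page->wakeups++;          /* wake_up_var(&page->_refcount) */
        }
    }
    /* count == 0 would reach __put_page() in the kernel; not modeled here. */
}
```

Dropping one reference from a MODEL_FS_DAX page that starts at refcount 2 leaves its flags, charge, and mapping alone and records a single wakeup. The same call on a MODEL_DEVICE_PRIVATE page clears all of them and records one page_free call, which is the gating the thread asks for.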