On Mon, Dec 4, 2017 at 4:34 PM, Christoph Hellwig <hch@xxxxxx> wrote:
> Both callers of get_dev_pagemap that pass in a pgmap don't actually hold a
> reference to the pgmap they pass in, contrary to the comment in the function.
>
> Change the calling convention so that get_dev_pagemap always consumes the
> previous reference instead of doing this using an explicit earlier call to
> put_dev_pagemap in the callers.
>
> The callers will still need to put the final reference after finishing the
> loop over the pages.

I don't think we need this change, but perhaps the reasoning should be added
to the code as a comment... details below.

>
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  kernel/memremap.c | 17 +++++++++--------
>  mm/gup.c          |  7 +++++--
>  2 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/memremap.c b/kernel/memremap.c
> index f0b54eca85b0..502fa107a585 100644
> --- a/kernel/memremap.c
> +++ b/kernel/memremap.c
> @@ -506,22 +506,23 @@ struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
>   * @pfn: page frame number to lookup page_map
>   * @pgmap: optional known pgmap that already has a reference
>   *
> - * @pgmap allows the overhead of a lookup to be bypassed when @pfn lands in the
> - * same mapping.
> + * If @pgmap is non-NULL and covers @pfn it will be returned as-is.  If @pgmap
> + * is non-NULL but does not cover @pfn the reference to it while be released.
>   */
>  struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
>  		struct dev_pagemap *pgmap)
>  {
> -	const struct resource *res = pgmap ? pgmap->res : NULL;
>  	resource_size_t phys = PFN_PHYS(pfn);
>
>  	/*
> -	 * In the cached case we're already holding a live reference so
> -	 * we can simply do a blind increment
> +	 * In the cached case we're already holding a live reference.
>  	 */
> -	if (res && phys >= res->start && phys <= res->end) {
> -		percpu_ref_get(pgmap->ref);
> -		return pgmap;
> +	if (pgmap) {
> +		const struct resource *res = pgmap ? pgmap->res : NULL;
> +
> +		if (res && phys >= res->start && phys <= res->end)
> +			return pgmap;
> +		put_dev_pagemap(pgmap);
>  	}
>
>  	/* fall back to slow path lookup */
> diff --git a/mm/gup.c b/mm/gup.c
> index d3fb60e5bfac..9d142eb9e2e9 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1410,7 +1410,6 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>
>  		VM_BUG_ON_PAGE(compound_head(page) != head, page);
>
> -		put_dev_pagemap(pgmap);
>  		SetPageReferenced(page);
>  		pages[*nr] = page;
>  		(*nr)++;
> @@ -1420,6 +1419,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>  	ret = 1;
>
>  pte_unmap:
> +	if (pgmap)
> +		put_dev_pagemap(pgmap);
>  	pte_unmap(ptem);
>  	return ret;
>  }
> @@ -1459,10 +1460,12 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
>  		SetPageReferenced(page);
>  		pages[*nr] = page;
>  		get_page(page);
> -		put_dev_pagemap(pgmap);

It's safe to do the put_dev_pagemap() here because the pgmap cannot be
released until the put_page() that corresponds to the get_page() we just did.
So we're only holding the pgmap reference long enough to take the individual
page references.

We used to take and put individual pgmap references inside get_page() /
put_page(), but that was simplified to a single pgmap reference taken and
dropped at devm_memremap_pages() setup / teardown time by this commit:

    71389703839e mm, zone_device: Replace {get, put}_zone_device_page()
        with a single reference to fix pmem crash
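
For illustration only (this is my own sketch, not code from the patch, and
walk_pfns() is a made-up name), the caller pattern under the proposed
consumed-reference convention would look roughly like this -- the loop never
does its own put_dev_pagemap(), it just keeps handing the cached pgmap back
in and drops the final reference once the loop is done:

/* Illustrative sketch, not from the patch: get_dev_pagemap() consumes
 * the pgmap reference passed in, either reusing it when @pfn falls in
 * the same range or dropping it before doing a new lookup. */
static void walk_pfns(unsigned long pfn, unsigned long nr_pages)
{
	struct dev_pagemap *pgmap = NULL;
	unsigned long i;

	for (i = 0; i < nr_pages; i++, pfn++) {
		pgmap = get_dev_pagemap(pfn, pgmap);
		if (!pgmap)
			break;		/* @pfn is not device memory */
		/* the page reference pins the pgmap beyond this loop */
		get_page(pfn_to_page(pfn));
	}
	if (pgmap)
		put_dev_pagemap(pgmap);	/* drop the final reference */
}

That mirrors what the patch does in gup.c: drop the per-page put and keep
only the final one after the loop.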