+ memory-failure-fetch-compound_head-after-pgmap_pfn_valid.patch added to -mm tree

The patch titled
     Subject: memory-failure: fetch compound_head after pgmap_pfn_valid()
has been added to the -mm tree.  Its filename is
     memory-failure-fetch-compound_head-after-pgmap_pfn_valid.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/memory-failure-fetch-compound_head-after-pgmap_pfn_valid.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/memory-failure-fetch-compound_head-after-pgmap_pfn_valid.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Joao Martins <joao.m.martins@xxxxxxxxxx>
Subject: memory-failure: fetch compound_head after pgmap_pfn_valid()

Patch series "mm, device-dax: Introduce compound pages in devmap", v6.

This series converts device-dax to use compound pages, and moves away from
the 'struct page per basepage on PMD/PUD' that is done today.  Doing so
1) unlocks a few noticeable improvements on unpin_user_pages() and makes
the device-dax+altmap case 4x faster in pinning (numbers below and in the
last patch), and 2) as mentioned in various other threads, is one
important step towards cleaning up ZONE_DEVICE refcounting.

I've split the compound pages on devmap part from the rest based on recent
discussions on devmap pending and future work planned[5][6].  There is
consensus that device-dax should be using compound pages to represent its
PMD/PUDs just like HugeTLB and THP, and that leads to less specialization
of the dax parts.  I will pursue the rest of the work in parallel once
this part is merged, in particular the GUP-{slow,fast} improvements [7] and
the tail struct page deduplication memory savings part[8].

To summarize what the series does:

Patch 1: Prepare hwpoisoning to work with dax compound pages.

Patches 2-3: Split the current utility function of prep_compound_page()
into head and tail and use those two helpers where appropriate to take
advantage of caches being warm after __init_single_page().  This is used
when initializing zone device when we bring up device-dax namespaces.

Patches 4-10: Add devmap support for compound pages in device-dax. 
memmap_init_zone_device() initializes its metadata as compound pages, and
it introduces a new devmap property known as vmemmap_shift which outlines
how the vmemmap is structured (defaults to base pages as done today).  The
property essentially describes the page order of the metadata.  While at it,
do a few cleanups in device-dax in patches 5-9.  Finally enable device-dax
usage of devmap @vmemmap_shift to a value based on its own @align
property.  @vmemmap_shift returns 0 by default (which is today's case of
base pages in devmap, like fsdax or the others) and the usage of compound
devmap is optional.  Starting with device-dax (*not* fsdax) we enable it
by default.  There are a few pinning improvements, particularly in the
unpinning case and altmap, as well as unpin_user_page_range_dirty_lock()
being just as effective as THP/hugetlb[0] pages.

    $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S -a -n 512 -w
    (pin_user_pages_fast 2M pages) put:~71 ms -> put:~22 ms
    [altmap]
    (pin_user_pages_fast 2M pages) get:~524ms put:~525 ms -> get: ~127ms put:~71ms
    
    $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S -a -n 512 -w
    (pin_user_pages_fast 2M pages) put:~513 ms -> put:~188 ms
    [altmap with -m 127004]
    (pin_user_pages_fast 2M pages) get:~4.1 secs put:~4.12 secs -> get:~1sec put:~563ms
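
As a rough illustration of how @vmemmap_shift can follow from @align as
described above, here is a minimal userspace sketch.  It assumes 4 KiB base
pages and assumes the shift is simply the base-2 order of @align counted in
base pages; the toy_* names are hypothetical and are not the kernel's
helpers or the code in this series.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB base pages, as on x86 */

/* Smallest order such that (1 << order) >= n, for n >= 1. */
static unsigned int toy_order_base_2(uint64_t n)
{
	unsigned int order = 0;

	assert(n >= 1);
	while (((uint64_t)1 << order) < n)
		order++;
	return order;
}

/*
 * Hypothetical mapping from a device alignment in bytes to a page order.
 * An alignment of one base page (or less) keeps the default of 0, which
 * corresponds to base pages in the devmap.
 */
static unsigned int toy_vmemmap_shift(uint64_t align)
{
	uint64_t page_size = (uint64_t)1 << TOY_PAGE_SHIFT;

	if (align <= page_size)
		return 0;
	return toy_order_base_2(align >> TOY_PAGE_SHIFT);
}
```

Under these assumptions a 2M alignment gives order 9 (512 base pages per
compound page) and a 1G alignment gives order 18.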

Tested on x86 with 1TB+ of pmem (alongside registering it with RDMA with
and without altmap), alongside gup_test selftests with dynamic dax regions
and static dax regions.  Coupled with ndctl unit tests for dynamic dax
devices that exercise all of this.  Note, for dynamic dax regions I had to
revert commit 8aa83e6395 ("x86/setup: Call early_reserve_memory()
earlier"); it is a known issue that this commit broke efi_fake_mem=.


This patch (of 10):

memory_failure_dev_pagemap() at the moment assumes base pages (e.g. 
dax_lock_page()).  For devmap with compound pages, fetch the compound_head
in case a tail page memory failure is being handled.

Currently this is a nop, but with the advent of compound pages in
dev_pagemap it allows memory_failure_dev_pagemap() to keep working.
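
To illustrate why resolving the head is a nop for base pages but matters
for tail pages, here is a minimal userspace model.  It is only a sketch:
struct toy_page and the toy_* helpers are hypothetical stand-ins, not the
kernel's struct page or compound_head().  The one detail borrowed from the
kernel is the encoding, where a tail page stores a pointer to its head
with bit 0 set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's struct page; not the real layout. */
struct toy_page {
	uintptr_t compound_head;	/* tail pages: head pointer with bit 0 set */
	unsigned long pfn;
};

/* Mark @tail as a tail page of the compound page starting at @head. */
static void toy_set_compound_head(struct toy_page *tail, struct toy_page *head)
{
	assert(((uintptr_t)head & 1) == 0);	/* bit 0 must be free for the tag */
	tail->compound_head = (uintptr_t)head | 1;
}

/* Return the head page for a tail page, or @page itself otherwise. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	if (page->compound_head & 1)
		return (struct toy_page *)(page->compound_head - 1);
	return page;
}

/* Build one compound page of @nr pages in @pages, starting at @base_pfn. */
static void toy_prep_compound(struct toy_page *pages, size_t nr,
			      unsigned long base_pfn)
{
	for (size_t i = 0; i < nr; i++) {
		pages[i].pfn = base_pfn + i;
		pages[i].compound_head = 0;
		if (i > 0)
			toy_set_compound_head(&pages[i], &pages[0]);
	}
}
```

In this model a failure reported against any of the 512 pages of a 2M
compound page resolves to the same head, while a page that was never
linked to a head resolves to itself, which matches the "nop today"
behaviour described above.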

Link: https://lkml.kernel.org/r/20211124191005.20783-1-joao.m.martins@xxxxxxxxxx
Link: https://lkml.kernel.org/r/20211124191005.20783-2-joao.m.martins@xxxxxxxxxx
Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
Reported-by: Jane Chu <jane.chu@xxxxxxxxxx>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
Cc: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Cc: Dave Jiang <dave.jiang@xxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxx>
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memory-failure.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/mm/memory-failure.c~memory-failure-fetch-compound_head-after-pgmap_pfn_valid
+++ a/mm/memory-failure.c
@@ -1558,6 +1558,12 @@ static int memory_failure_dev_pagemap(un
 	}
 
 	/*
+	 * Pages instantiated by device-dax (not filesystem-dax)
+	 * may be compound pages.
+	 */
+	page = compound_head(page);
+
+	/*
 	 * Prevent the inode from being freed while we are interrogating
 	 * the address_space, typically this would be handled by
 	 * lock_page(), but dax pages do not use the page lock. This
_

Patches currently in -mm which might be from joao.m.martins@xxxxxxxxxx are

memory-failure-fetch-compound_head-after-pgmap_pfn_valid.patch
mm-page_alloc-split-prep_compound_page-into-head-and-tail-subparts.patch
mm-page_alloc-refactor-memmap_init_zone_device-page-init.patch
mm-memremap-add-zone_device-support-for-compound-pages.patch
device-dax-use-align-for-determining-pgoff.patch
device-dax-use-struct_size.patch
device-dax-ensure-dev_dax-pgmap-is-valid-for-dynamic-devices.patch
device-dax-factor-out-page-mapping-initialization.patch
device-dax-set-mapping-prior-to-vmf_insert_pfn_pmdpud.patch
device-dax-compound-devmap-support.patch
