+ mm-compaction-drain-pcps-for-zone-when-kcompactd-fails.patch added to -mm tree

The patch titled
     Subject: mm, compaction: drain pcps for zone when kcompactd fails
has been added to the -mm tree.  Its filename is
     mm-compaction-drain-pcps-for-zone-when-kcompactd-fails.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-compaction-drain-pcps-for-zone-when-kcompactd-fails.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-compaction-drain-pcps-for-zone-when-kcompactd-fails.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: David Rientjes <rientjes@xxxxxxxxxx>
Subject: mm, compaction: drain pcps for zone when kcompactd fails

It's possible for buddy pages to become stranded on pcps that, if drained,
could be merged with other buddy pages on the zone's free area to form
large order pages, including up to MAX_ORDER.

Consider a verbose example using the tools/vm/page-types tool at the
beginning of a ZONE_NORMAL, where 'B' indicates a buddy page and 'S'
indicates a slab page, which the migration scanner is attempting to
defragment (and doing it well, absent coalescing up to cc.order):

109954  1       _______S________________________________________________________
109955  2       __________B_____________________________________________________
109957  1       ________________________________________________________________
109958  1       __________B_____________________________________________________
109959  7       ________________________________________________________________
109960  1       __________B_____________________________________________________
109961  9       ________________________________________________________________
10996a  1       __________B_____________________________________________________
10996b  3       ________________________________________________________________
10996e  1       __________B_____________________________________________________
10996f  1       ________________________________________________________________
109970  1       __________B_____________________________________________________
109971  f       ________________________________________________________________
...
109f88  1       __________B_____________________________________________________
109f89  3       ________________________________________________________________
109f8c  1       __________B_____________________________________________________
109f8d  2       ________________________________________________________________
109f8f  2       __________B_____________________________________________________
109f91  f       ________________________________________________________________
109fa0  1       __________B_____________________________________________________
109fa1  7       ________________________________________________________________
109fa8  1       __________B_____________________________________________________
109fa9  1       ________________________________________________________________
109faa  1       __________B_____________________________________________________
109fab  1       _______S________________________________________________________

These buddy pages, spanning 1,621 pages, could be coalesced and allow for
three transparent hugepages to be dynamically allocated.  Totaling all
hugepage-length spans that could be coalesced, this could yield over 400
hugepages on the zone's free area.  Yet at the time this /proc/kpageflags
snapshot was collected, there were _no_ order-9 or order-10 pages
available for allocation, even after triggering compaction through procfs.

When kcompactd fails to defragment memory such that a cc.order page can be
allocated, drain all pcps for the zone back to the buddy allocator so this
stranding cannot occur.  Compaction for that order will subsequently be
deferred, which acts as a ratelimit on this drain.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803010340100.88270@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/compaction.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff -puN mm/compaction.c~mm-compaction-drain-pcps-for-zone-when-kcompactd-fails mm/compaction.c
--- a/mm/compaction.c~mm-compaction-drain-pcps-for-zone-when-kcompactd-fails
+++ a/mm/compaction.c
@@ -1988,6 +1988,14 @@ static void kcompactd_do_work(pg_data_t
 			compaction_defer_reset(zone, cc.order, false);
 		} else if (status == COMPACT_PARTIAL_SKIPPED || status == COMPACT_COMPLETE) {
 			/*
+			 * Buddy pages may become stranded on pcps that could
+			 * otherwise coalesce on the zone's free area for
+			 * order >= cc.order.  This is ratelimited by the
+			 * upcoming deferral.
+			 */
+			drain_all_pages(zone);
+
+			/*
 			 * We use sync migration mode here, so we defer like
 			 * sync direct compaction does.
 			 */
_

Patches currently in -mm which might be from rientjes@xxxxxxxxxx are

mm-page_alloc-extend-kernelcore-and-movablecore-for-percent.patch
mm-page_alloc-extend-kernelcore-and-movablecore-for-percent-fix.patch
mm-page_alloc-move-mirrored_kernelcore-to-__meminitdata.patch
mm-compaction-drain-pcps-for-zone-when-kcompactd-fails.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


