+ mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations.patch added to -mm tree

The patch titled
     Subject: mm, page_alloc: actually ignore mempolicies for high priority allocations
has been added to the -mm tree.  Its filename is
     mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: Vlastimil Babka <vbabka@xxxxxxx>
Subject: mm, page_alloc: actually ignore mempolicies for high priority allocations

The __alloc_pages_slowpath() function has for a long time contained code
to ignore node restrictions from memory policies for high priority
allocations.  The current code that resets the zonelist iterator, however,
does effectively nothing after commit 7810e6781e0f ("mm, page_alloc: do
not break __GFP_THISNODE by zonelist reset") removed a buggy zonelist
reset.  Even before that commit, mempolicy restrictions were still not
ignored, as they are passed in ac->nodemask which is untouched by the
code.
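
To illustrate why resetting only the iterator changes nothing, here is a
minimal userspace sketch.  It is not kernel code: struct toy_zoneref,
toy_first_zone() and the single-word bitmask used as a nodemask are all
made up for this illustration, loosely modeled on the idea that
first_zones_zonelist() returns the first zonelist entry allowed by the
nodemask it is given.

```c
#include <stddef.h>

/* Hypothetical stand-in for a zonelist entry: only the node it lives on. */
struct toy_zoneref {
	int node;
};

/*
 * Hypothetical stand-in for first_zones_zonelist(): return the index of
 * the first entry whose node is allowed by *nodemask, or -1 if there is
 * none.  A NULL nodemask means "no restriction".  Assumes node ids are
 * smaller than the number of bits in an unsigned long.
 */
int toy_first_zone(const struct toy_zoneref *zl, size_t n,
		   const unsigned long *nodemask)
{
	for (size_t i = 0; i < n; i++) {
		if (!nodemask || (*nodemask & (1UL << zl[i].node)))
			return (int)i;
	}
	return -1;
}
```

With a toy zonelist covering nodes 0, 1 and 2 and a mask allowing only
node 2, repeating the lookup with the same mask returns the same entry
every time.  Only a NULL mask moves the starting point back to node 0.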

We can either remove the code, or make it work as intended.  Since
ac->nodemask can be set from the task's mempolicy via alloc_pages_current()
and thus also alloc_pages(), it may indeed affect kernel allocations, and
it makes sense to ignore it to allow progress for high priority
allocations.

Thus, this patch resets ac->nodemask to NULL in such cases.  This assumes
all callers can handle it (i.e.  there are no guarantees as in the case of
__GFP_THISNODE) which seems to be the case.  The same assumption has
already been present in check_retry_cpuset() for some time.

The expected effect is that high priority kernel allocations in the
context of userspace tasks (e.g.  OOM victims) restricted by mempolicies
will have a higher chance to succeed if they are restricted to nodes with
depleted memory, while there are other nodes with free memory left.
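
The expected effect can be sketched with a toy userspace model.  Again
this is not the allocator: struct toy_node, toy_get_page() and
toy_alloc() are hypothetical names, and the "retry with a NULL mask if
the request is high priority" step is a simplification of what the slow
path does once mempolicies may be ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node state: how many free pages remain. */
struct toy_node {
	unsigned long free_pages;
};

/*
 * Take one page from the first node that is allowed by *nodemask and
 * still has free pages.  Returns the node id, or -1 on failure.  A NULL
 * nodemask allows every node.  Assumes nr_nodes does not exceed the
 * number of bits in an unsigned long.
 */
int toy_get_page(struct toy_node *nodes, size_t nr_nodes,
		 const unsigned long *nodemask)
{
	for (size_t nid = 0; nid < nr_nodes; nid++) {
		if (nodemask && !(*nodemask & (1UL << nid)))
			continue;	/* node excluded by the policy */
		if (nodes[nid].free_pages == 0)
			continue;	/* node is depleted */
		nodes[nid].free_pages--;
		return (int)nid;
	}
	return -1;
}

/*
 * Hypothetical slow path: the first attempt honors the policy mask; a
 * high priority request then retries with the mask dropped (NULL).
 */
int toy_alloc(struct toy_node *nodes, size_t nr_nodes,
	      const unsigned long *policy_mask, bool high_priority)
{
	int nid = toy_get_page(nodes, nr_nodes, policy_mask);

	if (nid < 0 && high_priority)
		nid = toy_get_page(nodes, nr_nodes, NULL);
	return nid;
}
```

If node 1 is depleted, node 0 has free pages and the policy mask allows
only node 1, a normal request fails in this model, while a high priority
request falls back to node 0.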


It's not a new intention, but for the first time the code will match the
intention, AFAICS.  It was intended by commit 183f6371aac2 ("mm: ignore
mempolicies when using ALLOC_NO_WATERMARK") in v3.6 but I think it never
really worked, as mempolicy restriction was already encoded in nodemask,
not zonelist, at that time.

So originally that was for ALLOC_NO_WATERMARK only.  Then it was adjusted
by e46e7b77c909 ("mm, page_alloc: recalculate the preferred zoneref if the
context can ignore memory policies") and cd04ae1e2dc8 ("mm, oom: do not
rely on TIF_MEMDIE for memory reserves access") to the current state.  So
even GFP_ATOMIC would now ignore mempolicies after the initial attempts
fail - if the code worked as people thought it did.

Link: http://lkml.kernel.org/r/20180612122624.8045-1-vbabka@xxxxxxx
Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/page_alloc.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff -puN mm/page_alloc.c~mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations
+++ a/mm/page_alloc.c
@@ -4164,11 +4164,12 @@ retry:
 		alloc_flags = reserve_flags;
 
 	/*
-	 * Reset the zonelist iterators if memory policies can be ignored.
-	 * These allocations are high priority and system rather than user
-	 * orientated.
+	 * Reset the nodemask and zonelist iterators if memory policies can be
+	 * ignored. These allocations are high priority and system rather than
+	 * user oriented.
 	 */
 	if (!(alloc_flags & ALLOC_CPUSET) || reserve_flags) {
+		ac->nodemask = NULL;
 		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
 					ac->high_zoneidx, ac->nodemask);
 	}
_

Patches currently in -mm which might be from vbabka@xxxxxxx are

mm-page_alloc-actually-ignore-mempolicies-for-high-priority-allocations.patch
