[PATCH] slub: consider pfmemalloc_match() in get_partial_node()

Joonsoo Kim <js1304@xxxxxxxxx> · Sat, 25 Aug 2012 03:00:02 +0900

There is no consideration for pfmemalloc_match() in get_partial(). If we don't
consider that, we can't restrict access to PFMEMALLOC page mostly.

We may encounter following scenario.

Assume there is a request from normal allocation
and there is no objects in per cpu cache and no node partial slab.

In this case, slab_alloc go into slow-path and
new_slab_objects() is invoked. It may return PFMEMALLOC page.
Current user is not allowed to access PFMEMALLOC page,
deactivate_slab() is called (commit 5091b74a95d447e34530e713a8971450a45498b3),
then return object from PFMEMALLOC page.

Next time, when we meet another request from normal allocation,
slab_alloc() go into slow-path and re-go new_slab_objects().
In new_slab_objects(), we invoke get_partial() and we get a partial slab
which we have been deactivated just before, that is, PFMEMALLOC page.
We extract one object from it and re-deactivate.

"deactivate -> re-get in get_partial -> re-deactivate" occures repeatedly.

As a result, we can't restrict access to PFMEMALLOC page and
moreover, it introduce much performance degration to normal allocation
because of deactivation frequently.

Now, we need to consider pfmemalloc_match() in get_partial_node()
It prevent "deactivate -> re-get in get_partial".
Instead, new_slab() is called. It may return !PFMEMALLOC page,
so above situation will be suspended sometime.

Signed-off-by: Joonsoo Kim <js1304@xxxxxxxxx>
Cc: David Miller <davem@xxxxxxxxxxxxx>
Cc: Neil Brown <neilb@xxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Mike Christie <michaelc@xxxxxxxxxxx>
Cc: Eric B Munson <emunson@xxxxxxxxx>
Cc: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Cc: Sebastian Andrzej Siewior <sebastian@xxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
This patch based on Pekka's slab/next tree with my two patches.

[PATCH 1/2] slub: rename cpu_partial to max_cpu_object
https://lkml.org/lkml/2012/8/24/293

[PATCH 2/2] slub: correct the calculation of the number of cpu objects in get_partial_node
https://lkml.org/lkml/2012/8/24/290

diff --git a/mm/slub.c b/mm/slub.c
index c96e0e4..a21da3a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1529,12 +1529,13 @@ static inline void *acquire_slab(struct kmem_cache *s,
 }
 
 static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain);
+static inline bool pfmemalloc_match(struct page *page, gfp_t gfpflags);
 
 /*
  * Try to allocate a partial slab from a specific node.
  */
-static void *get_partial_node(struct kmem_cache *s,
-		struct kmem_cache_node *n, struct kmem_cache_cpu *c)
+static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n,
+				struct kmem_cache_cpu *c, gfp_t flags)
 {
 	struct page *page, *page2;
 	void *object = NULL;
@@ -1551,8 +1552,12 @@ static void *get_partial_node(struct kmem_cache *s,
 
 	spin_lock(&n->list_lock);
 	list_for_each_entry_safe(page, page2, &n->partial, lru) {
-		void *t = acquire_slab(s, n, page, object == NULL);
+		void *t;
 
+		if (!pfmemalloc_match(page, flags))
+			continue;
+
+		t = acquire_slab(s, n, page, object == NULL);
 		if (!t)
 			break;
 
@@ -1620,7 +1625,7 @@ static void *get_any_partial(struct kmem_cache *s, gfp_t flags,
 
 			if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
 					n->nr_partial > s->min_partial) {
-				object = get_partial_node(s, n, c);
+				object = get_partial_node(s, n, c, flags);
 				if (object) {
 					/*
 					 * Return the object even if
@@ -1649,7 +1654,7 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
 	void *object;
 	int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
 
-	object = get_partial_node(s, get_node(s, searchnode), c);
+	object = get_partial_node(s, get_node(s, searchnode), c, flags);
 	if (object || node != NUMA_NO_NODE)
 		return object;
 
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>