+ mm-page_alloc-enable-pcpu_drain-with-zone-capability.patch added to -mm tree

akpm@xxxxxxxxxxxxxxxxxxxx · Wed, 12 Dec 2018 16:34:32 -0800

The patch titled
     Subject: mm, page_alloc: enable pcpu_drain with zone capability
has been added to the -mm tree.  Its filename is
     mm-page_alloc-enable-pcpu_drain-with-zone-capability.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-enable-pcpu_drain-with-zone-capability.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-enable-pcpu_drain-with-zone-capability.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Wei Yang <richard.weiyang@xxxxxxxxx>
Subject: mm, page_alloc: enable pcpu_drain with zone capability

drain_all_pages is documented to drain per-cpu pages for a given zone (if
non-NULL).  The current implementation doesn't match the description
though.  It will drain all pcp pages for all zones that happen to have
cached pages on the same cpu as the given zone.  This will lead to
premature pcp cache draining for zones that are not of any interest to the
caller - e.g.  compaction, hwpoison or memory offline.

This forces the page allocator to take locks, with potential lock
contention as a result.

There is no real reason for this sub-optimal implementation.  Replace the
per-cpu work item with a dedicated structure which contains a pointer to
the zone and pass it over to the worker.  This carries the zone
information all the way down to the worker function, which can then drain
only the requested zone.
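
As an illustration only, the embed-and-recover pattern described above can
be sketched in userspace C.  This is not kernel code: struct zone, struct
work_struct, the zones[] array, drain_local_pages_sketch(), drain_work_fn()
and queue_drain() are simplified, hypothetical stand-ins, and the work item
runs immediately instead of being queued on a per-cpu workqueue.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-ins for the kernel types. */
struct zone {
	const char *name;
	int drained;		/* how many times this zone was drained */
};

struct work_struct {
	void (*func)(struct work_struct *work);
};

/* Same shape as the structure added by the patch. */
struct pcpu_drain {
	struct zone *zone;
	struct work_struct work;
};

#define NR_ZONES 3
static struct zone zones[NR_ZONES] = {
	{ "DMA", 0 }, { "Normal", 0 }, { "Movable", 0 },
};

/* Stand-in for drain_local_pages(): NULL means "drain every zone". */
static void drain_local_pages_sketch(struct zone *zone)
{
	int i;

	if (zone) {
		zone->drained++;
		return;
	}
	for (i = 0; i < NR_ZONES; i++)
		zones[i].drained++;
}

/* The worker only receives the work item and recovers its container. */
static void drain_work_fn(struct work_struct *work)
{
	struct pcpu_drain *drain =
		container_of(work, struct pcpu_drain, work);

	drain_local_pages_sketch(drain->zone);
}

/* Record the target zone, then run the work (the kernel would queue it). */
static void queue_drain(struct pcpu_drain *drain, struct zone *zone)
{
	drain->zone = zone;
	drain->work.func = drain_work_fn;
	drain->work.func(&drain->work);
}
```

Calling queue_drain(&d, &zones[1]) in this sketch touches only the
"Normal" zone, while queue_drain(&d, NULL) touches all three, which mirrors
the difference between passing drain->zone and passing NULL to the worker.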

[mhocko@xxxxxxxx: refactor the whole changelog]
Link: http://lkml.kernel.org/r/20181212142550.61686-1-richard.weiyang@xxxxxxxxx
Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

--- a/mm/page_alloc.c~mm-page_alloc-enable-pcpu_drain-with-zone-capability
+++ a/mm/page_alloc.c
@@ -97,8 +97,12 @@ int _node_numa_mem_[MAX_NUMNODES];
 #endif
 
 /* work_structs for global per-cpu drains */
+struct pcpu_drain {
+	struct zone *zone;
+	struct work_struct work;
+};
 DEFINE_MUTEX(pcpu_drain_mutex);
-DEFINE_PER_CPU(struct work_struct, pcpu_drain);
+DEFINE_PER_CPU(struct pcpu_drain, pcpu_drain);
 
 #ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY
 volatile unsigned long latent_entropy __latent_entropy;
@@ -2779,6 +2783,8 @@ void drain_local_pages(struct zone *zone
 
 static void drain_local_pages_wq(struct work_struct *work)
 {
+	struct pcpu_drain *drain =
+		container_of(work, struct pcpu_drain, work);
 	/*
 	 * drain_all_pages doesn't use proper cpu hotplug protection so
 	 * we can race with cpu offline when the WQ can move this from
@@ -2787,7 +2793,7 @@ static void drain_local_pages_wq(struct
 	 * a different one.
 	 */
 	preempt_disable();
-	drain_local_pages(NULL);
+	drain_local_pages(drain->zone);
 	preempt_enable();
 }
 
@@ -2858,12 +2864,14 @@ void drain_all_pages(struct zone *zone)
 	}
 
 	for_each_cpu(cpu, &cpus_with_pcps) {
-		struct work_struct *work = per_cpu_ptr(&pcpu_drain, cpu);
-		INIT_WORK(work, drain_local_pages_wq);
-		queue_work_on(cpu, mm_percpu_wq, work);
+		struct pcpu_drain *drain = per_cpu_ptr(&pcpu_drain, cpu);
+
+		drain->zone = zone;
+		INIT_WORK(&drain->work, drain_local_pages_wq);
+		queue_work_on(cpu, mm_percpu_wq, &drain->work);
 	}
 	for_each_cpu(cpu, &cpus_with_pcps)
-		flush_work(per_cpu_ptr(&pcpu_drain, cpu));
+		flush_work(&per_cpu_ptr(&pcpu_drain, cpu)->work);
 
 	mutex_unlock(&pcpu_drain_mutex);
 }
_

Patches currently in -mm which might be from richard.weiyang@xxxxxxxxx are

mm-slub-remove-validation-on-cpu_slab-in-__flush_cpu_slab.patch
mm-slub-page-is-always-non-null-for-node_match.patch
mm-slub-record-final-state-of-slub-action-in-deactivate_slab.patch
mm-slub-improve-performance-by-skipping-checked-node-in-get_any_partial.patch
mm-remove-reset-of-pcp-counter-in-pageset_init.patch
vmscan-return-node_reclaim_noscan-in-node_reclaim-when-config_numa-is-n.patch
drivers-base-memoryc-remove-an-unnecessary-check-on-nr_mem_sections.patch
mm-check-nr_initialised-with-pages_per_section-directly-in-defer_init.patch
mm-sparse-drop-pgdat_resize_lock-in-sparse_add-remove_one_section.patch
mm-sparse-drop-pgdat_resize_lock-in-sparse_add-remove_one_section-v4.patch
mm-sparse-pass-nid-instead-of-pgdat-to-sparse_add_one_section.patch
mm-hotplug-move-init_currently_empty_zone-under-zone_span_lock-protection.patch
mm-show_mem-drop-pgdat_resize_lock-in-show_mem.patch
mm-page_alloc-calculate-first_deferred_pfn-directly.patch
memory_hotplug-remove-duplicate-declaration-of-offline_pages.patch
mm-page_alloc-enable-pcpu_drain-with-zone-capability.patch