[patch 106/159] mm/cma: add PF flag to force non cma alloc

From: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx>
Subject: mm/cma: add PF flag to force non cma alloc

Patch series "mm/kvm/vfio/ppc64: Migrate compound pages out of CMA
region", v8.

ppc64 uses the CMA area for the allocation of the guest page table (hash
page table).  We cannot start a guest if we fail to allocate the hash
page table.  We have observed hash page table allocation failures because
we could not migrate pages out of the CMA region: they were pinned.  This
happens when using VFIO.  VFIO on ppc64 pins the entire guest RAM.  If
guest RAM pages are allocated from the CMA region, we cannot migrate
them, and they stay pinned for the lifetime of the guest.

Currently we only support migration of non-compound pages.  With THP and
with the addition of hugetlb migration we can end up allocating compound
pages from the CMA region.  This patch series adds support for migrating
compound pages.



This patch (of 4):

Add PF_MEMALLOC_NOCMA, which makes sure any allocation in that context is
marked non-movable and hence cannot be satisfied from the CMA region.

This is useful with get_user_pages_longterm(), where we want to take a
long-term page pin after migrating any already-allocated pages out of the
CMA region.  Marking the section with PF_MEMALLOC_NOCMA ensures that pages
faulted in during that section are not placed in the CMA region, so we
avoid unnecessary page migration later.
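
For illustration only (not part of this patch): a minimal sketch of how a
longterm-pin path might bracket its fault-in with the new helpers.  The
wrapper name pin_pages_outside_cma() is hypothetical; memalloc_nocma_save(),
memalloc_nocma_restore() and get_user_pages() are the only kernel interfaces
assumed here.

#include <linux/mm.h>
#include <linux/sched/mm.h>

/*
 * Hypothetical wrapper (illustration only): fault in and pin user pages
 * while PF_MEMALLOC_NOCMA is set, so that pages allocated during the
 * fault-in are not placed in MIGRATE_CMA pageblocks.
 */
static long pin_pages_outside_cma(unsigned long start, unsigned long nr_pages,
				  unsigned int gup_flags, struct page **pages)
{
	unsigned int noncma_flags;
	long pinned;

	noncma_flags = memalloc_nocma_save();	/* sets PF_MEMALLOC_NOCMA */

	down_read(&current->mm->mmap_sem);
	pinned = get_user_pages(start, nr_pages, gup_flags, pages, NULL);
	up_read(&current->mm->mmap_sem);

	memalloc_nocma_restore(noncma_flags);	/* restores the saved state */

	return pinned;
}

Pages faulted in under such a wrapper cannot be served from the CMA area,
so a later long-term pin does not need to migrate them first.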

Link: http://lkml.kernel.org/r/20190114095438.32470-2-aneesh.kumar@xxxxxxxxxxxxx
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
Suggested-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Reviewed-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Alexey Kardashevskiy <aik@xxxxxxxxx>
Cc: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/sched.h    |    1 
 include/linux/sched/mm.h |   48 ++++++++++++++++++++++++++++++-------
 2 files changed, 41 insertions(+), 8 deletions(-)

--- a/include/linux/sched.h~mm-cma-add-pf-flag-to-force-non-cma-alloc
+++ a/include/linux/sched.h
@@ -1407,6 +1407,7 @@ extern struct pid *cad_pid;
 #define PF_UMH			0x02000000	/* I'm an Usermodehelper process */
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_allowed */
 #define PF_MCE_EARLY		0x08000000      /* Early kill for mce process policy */
+#define PF_MEMALLOC_NOCMA	0x10000000	/* All allocation requests will have __GFP_MOVABLE cleared */
 #define PF_MUTEX_TESTER		0x20000000	/* Thread belongs to the rt mutex tester */
 #define PF_FREEZER_SKIP		0x40000000	/* Freezer should not count it as freezable */
 #define PF_SUSPEND_TASK		0x80000000      /* This thread called freeze_processes() and should not be frozen */
--- a/include/linux/sched/mm.h~mm-cma-add-pf-flag-to-force-non-cma-alloc
+++ a/include/linux/sched/mm.h
@@ -148,17 +148,25 @@ static inline bool in_vfork(struct task_
  * Applies per-task gfp context to the given allocation flags.
  * PF_MEMALLOC_NOIO implies GFP_NOIO
  * PF_MEMALLOC_NOFS implies GFP_NOFS
+ * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
  */
 static inline gfp_t current_gfp_context(gfp_t flags)
 {
-	/*
-	 * NOIO implies both NOIO and NOFS and it is a weaker context
-	 * so always make sure it makes precedence
-	 */
-	if (unlikely(current->flags & PF_MEMALLOC_NOIO))
-		flags &= ~(__GFP_IO | __GFP_FS);
-	else if (unlikely(current->flags & PF_MEMALLOC_NOFS))
-		flags &= ~__GFP_FS;
+	if (unlikely(current->flags &
+		     (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+		/*
+		 * NOIO implies both NOIO and NOFS and it is a weaker context
+		 * so always make sure it takes precedence
+		 */
+		if (current->flags & PF_MEMALLOC_NOIO)
+			flags &= ~(__GFP_IO | __GFP_FS);
+		else if (current->flags & PF_MEMALLOC_NOFS)
+			flags &= ~__GFP_FS;
+#ifdef CONFIG_CMA
+		if (current->flags & PF_MEMALLOC_NOCMA)
+			flags &= ~__GFP_MOVABLE;
+#endif
+	}
 	return flags;
 }
 
@@ -248,6 +256,30 @@ static inline void memalloc_noreclaim_re
 	current->flags = (current->flags & ~PF_MEMALLOC) | flags;
 }
 
+#ifdef CONFIG_CMA
+static inline unsigned int memalloc_nocma_save(void)
+{
+	unsigned int flags = current->flags & PF_MEMALLOC_NOCMA;
+
+	current->flags |= PF_MEMALLOC_NOCMA;
+	return flags;
+}
+
+static inline void memalloc_nocma_restore(unsigned int flags)
+{
+	current->flags = (current->flags & ~PF_MEMALLOC_NOCMA) | flags;
+}
+#else
+static inline unsigned int memalloc_nocma_save(void)
+{
+	return 0;
+}
+
+static inline void memalloc_nocma_restore(unsigned int flags)
+{
+}
+#endif
+
 #ifdef CONFIG_MEMCG
 /**
  * memalloc_use_memcg - Starts the remote memcg charging scope.
_
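
For illustration only (not part of the patch): a minimal sketch of the
masking that current_gfp_context() performs once PF_MEMALLOC_NOCMA is set.
The function alloc_nocma_example() is hypothetical, and in practice the
page allocator and reclaim paths apply this mask internally rather than
callers invoking current_gfp_context() themselves.

#include <linux/gfp.h>
#include <linux/sched/mm.h>

static struct page *alloc_nocma_example(void)
{
	unsigned int noncma_flags;
	struct page *page;
	gfp_t gfp;

	noncma_flags = memalloc_nocma_save();

	/*
	 * With CONFIG_CMA enabled and PF_MEMALLOC_NOCMA set,
	 * current_gfp_context() clears __GFP_MOVABLE, so
	 * GFP_HIGHUSER_MOVABLE is reduced to GFP_HIGHUSER and the
	 * allocation cannot be served from MIGRATE_CMA pageblocks.
	 */
	gfp = current_gfp_context(GFP_HIGHUSER_MOVABLE);
	page = alloc_pages(gfp, 0);

	memalloc_nocma_restore(noncma_flags);
	return page;
}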


