+ mm-export-get_vma_policy.patch added to -mm tree

The patch titled
     mm: export get_vma_policy()
has been added to the -mm tree.  Its filename is
     mm-export-get_vma_policy.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find
out what to do about this

The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/

------------------------------------------------------
Subject: mm: export get_vma_policy()
From: Stephen Wilson <wilsons@xxxxxxxx>

Recently a concern was raised[1] that performing an allocation while
holding a reference on a task's mm could lead to a stalemate in the OOM
killer.  The concern was specific to the goings-on in /proc.  Hugh Dickins
stated the issue as follows:

    ...imagine what happens if the system is out of memory, and the mm
    we're looking at is selected for killing by the OOM killer: while we
    wait in __get_free_page for more memory, no memory is freed from the
    selected mm because it cannot reach exit_mmap while we hold that
    reference.

The primary goal of this series is to eliminate repeated allocation/free
cycles currently happening in show_numa_maps() while we hold a reference
to an mm.

The strategy is to perform the allocation once when /proc/pid/numa_maps is
opened, before a reference on the target task's mm is taken.
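
The allocate-once strategy can be illustrated with a small user-space
sketch.  This is not the code from this series: every name below
(numa_stats_sketch, numa_private_sketch, sketch_open() and friends) is
hypothetical, and the "mm reference" is simulated with a flag.  It only
shows the shape of the idea, in which open() allocates, show() reuses the
storage, and nothing allocates while the reference is held.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-VMA statistics gathered by show(). */
struct numa_stats_sketch {
	unsigned long pages;
	unsigned long dirty;
};

/* Hypothetical per-open state; not the kernel's struct proc_maps_private. */
struct numa_private_sketch {
	struct numa_stats_sketch stats;	/* allocated once, at open time */
	int mm_held;			/* nonzero while the simulated mm reference is held */
};

/* open(): the only allocation, made before any mm reference exists. */
struct numa_private_sketch *sketch_open(void)
{
	return calloc(1, sizeof(struct numa_private_sketch));
}

/* start()/stop(): bracket the window in which the mm reference is held. */
void sketch_start(struct numa_private_sketch *p) { p->mm_held = 1; }
void sketch_stop(struct numa_private_sketch *p) { p->mm_held = 0; }

/* show(): reset and reuse the preallocated storage; never allocates. */
unsigned long sketch_show(struct numa_private_sketch *p,
			  unsigned long pages, unsigned long dirty)
{
	assert(p->mm_held);	/* show() only runs between start() and stop() */
	memset(&p->stats, 0, sizeof(p->stats));
	p->stats.pages = pages;
	p->stats.dirty = dirty;
	return p->stats.pages;
}

/* release(): free the storage after the last stop(). */
void sketch_release(struct numa_private_sketch *p)
{
	free(p);
}
```

Because sketch_show() only clears and refills the existing buffer, the
window between sketch_start() and sketch_stop() contains no allocation
that could block on memory the pinned mm would otherwise release.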

Unfortunately, show_numa_maps() is implemented in mm/mempolicy.c while the
primary procfs implementation lives in fs/proc/task_mmu.c.  This makes
clean cooperation between show_numa_maps() and the other seq_file
operations (start(), stop(), etc) difficult.


Patches 1-5 convert show_numa_maps() to use the generic walk_page_range()
functionality instead of the mempolicy.c-specific page table walking
logic.  Also, get_vma_policy() is exported.  This makes the
show_numa_maps() implementation independent of mempolicy.c.
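
For readers unfamiliar with callback-driven walkers, the following
user-space sketch shows the general pattern of a generic walker that
calls a caller-supplied function with a private pointer.  It is an
analogy only: walk_sketch, walk_range_sketch() and gather_entry_sketch()
are hypothetical names, the "pages" are plain array slots, and none of
this reproduces the kernel's walk_page_range() interface.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 4

/* Hypothetical walker description: one callback plus opaque caller data. */
struct walk_sketch {
	int (*entry)(unsigned long addr, int node, void *private);
	void *private;
};

/* Generic walker: knows how to iterate, but nothing about statistics. */
int walk_range_sketch(const int *page_nodes, unsigned long start,
		      unsigned long end, struct walk_sketch *walk)
{
	unsigned long addr;

	assert(walk != NULL && walk->entry != NULL);
	for (addr = start; addr < end; addr++) {
		int err = walk->entry(addr, page_nodes[addr], walk->private);

		if (err)
			return err;	/* a nonzero return stops the walk early */
	}
	return 0;
}

/* Hypothetical per-node page counters filled in by the callback. */
struct gather_sketch {
	unsigned long node[SKETCH_MAX_NODES];
};

/* Callback: count one page against the node it resides on. */
int gather_entry_sketch(unsigned long addr, int node, void *private)
{
	struct gather_sketch *g = private;

	(void)addr;
	if (node < 0 || node >= SKETCH_MAX_NODES)
		return -1;
	g->node[node]++;
	return 0;
}
```

The point of the split is that the statistics-gathering callback can live
next to its caller, while the iteration logic stays in one shared place.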

Patch 6 moves show_numa_maps() and supporting routines over to
fs/proc/task_mmu.c.

Finally, patches 7 and 8 provide minor cleanup and eliminate the
troublesome allocation.

Please note that moving show_numa_maps() into fs/proc/task_mmu.c
essentially reverts 1a75a6c825 ("Fold numa_maps into mempolicies.c") and
48fce3429d ("mempolicies: unexport get_vma_policy()").  Also, please see
the discussion at [2].  My main justifications for moving the code back
into task_mmu.c are:

  - Having the show() operation "miles away" from the corresponding
    seq_file iteration operations is a maintenance burden.

  - The need to export ad hoc info like struct proc_maps_private is
    eliminated.


This patch:

In commit 48fce3429df84a ("mempolicies: unexport get_vma_policy()")
get_vma_policy() was marked static as all clients were local to
mempolicy.c.

However, the decision to generate /proc/pid/numa_maps in the NUMA memory
policy code rather than in the procfs subsystem introduces an artificial
interdependency between the two systems.  Exporting get_vma_policy() once
again is the first step toward cleaning up this interdependency.

Signed-off-by: Stephen Wilson <wilsons@xxxxxxxx>
Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Lee Schermerhorn <lee.schermerhorn@xxxxxx>
Cc: Alexey Dobriyan <adobriyan@xxxxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/mempolicy.h |    3 +++
 mm/mempolicy.c            |    2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff -puN include/linux/mempolicy.h~mm-export-get_vma_policy include/linux/mempolicy.h
--- a/include/linux/mempolicy.h~mm-export-get_vma_policy
+++ a/include/linux/mempolicy.h
@@ -199,6 +199,9 @@ void mpol_free_shared_policy(struct shar
 struct mempolicy *mpol_shared_policy_lookup(struct shared_policy *sp,
 					    unsigned long idx);
 
+struct mempolicy *get_vma_policy(struct task_struct *tsk,
+		struct vm_area_struct *vma, unsigned long addr);
+
 extern void numa_default_policy(void);
 extern void numa_policy_init(void);
 extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new,
diff -puN mm/mempolicy.c~mm-export-get_vma_policy mm/mempolicy.c
--- a/mm/mempolicy.c~mm-export-get_vma_policy
+++ a/mm/mempolicy.c
@@ -1489,7 +1489,7 @@ asmlinkage long compat_sys_mbind(compat_
  * freeing by another task.  It is the caller's responsibility to free the
  * extra reference for shared policies.
  */
-static struct mempolicy *get_vma_policy(struct task_struct *task,
+struct mempolicy *get_vma_policy(struct task_struct *task,
 		struct vm_area_struct *vma, unsigned long addr)
 {
 	struct mempolicy *pol = task->mempolicy;
_

Patches currently in -mm which might be from wilsons@xxxxxxxx are

mm-export-get_vma_policy.patch
mm-use-walk_page_range-instead-of-custom-page-table-walking-code.patch
mm-remove-mpol_mf_stats.patch
mm-make-gather_stats-type-safe-and-remove-forward-declaration.patch
mm-remove-check_huge_range.patch
mm-proc-move-show_numa_map-to-fs-proc-task_mmuc.patch
proc-make-struct-proc_maps_private-truly-private.patch
proc-allocate-storage-for-numa_maps-statistics-once.patch
proc-put-check_mem_permission-after-__get_free_page-in-mem_write.patch
proc-fix-pagemap_read-error-case.patch
