Re: [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone

On Mon, Jan 6, 2020 at 4:28 PM lixinhai.lxh@xxxxxxxxx
<lixinhai.lxh@xxxxxxxxx> wrote:
>
> On 2020-01-06 at 18:43 Konstantin Khlebnikov wrote:
> >On 06/01/2020 09.37, Li Xinhai wrote:
> >> In the fork case, dst->vm_prev is always the same as src->vm_prev when
> >> anon_vma_clone() is called. Remove the assignment from
> >> dst->vm_prev->anon_vma to dst->anon_vma, and explicitly assign from the
> >> anon_vma which is shared by its parent vmas.
> >
> >This doesn't sound right.
> >
> >I see dst->vm_prev is set after anon_vma_fork(), so here it still points to the parent's prev.
> >So this isn't working as it is supposed to.
> >
> >I expect this logic: If parent SRC1 SRC2 .. SRCn share ANON0
> >then in child related DST1 DST2 .. DSTn should fork and share ANON1:
> >Forking DST1 creates new ANON1 and then DST2 and following share it.
>
> This logic was not fully clarified in
> https://lore.kernel.org/linux-mm/20191011072256.16275-2-richardw.yang@xxxxxxxxxxxxxxx/
> I assumed that sharing the parent vma's anon_vma with the child vma was the
> purpose of that patch, and that it intentionally wants the first child vma to
> have its own new anon_vma (not shared, unlike the other child vmas).

Well, this more or less follows from the original design.
A page's anon-vma, together with the page offset, limits the set of vmas
scanned by rmap: it skips vmas where the page certainly cannot be mapped.

If vmas in one process share an anon-vma, they likely have non-overlapping
offsets, so there is no reason to fork a separate anon-vma for each of
them when the process forks.
But it is good to fork one new anon-vma for all of them together: then rmap
can skip scanning the parent's vmas for pages allocated or COWed in the
child process. Together they act like one big vma.
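
To make the intended behaviour concrete, here is a minimal userspace sketch.
It is a toy model, not kernel code: toy_anon, toy_vma and toy_fork() are
hypothetical names, and the reuse rule (share the previous child vma's
anon-vma when it was forked from the same parent anon-vma) is my reading of
the logic described above, not the actual mm/rmap.c implementation. It also
skips vmas marked dontcopy, so the child's previous vma is tracked separately
from the parent's.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct anon_vma and struct vm_area_struct. */
struct toy_anon {
	const struct toy_anon *parent;	/* anon-vma this one was forked from */
};

struct toy_vma {
	const struct toy_anon *anon;	/* NULL if no anonymous pages yet */
	int dontcopy;			/* models VM_DONTCOPY */
};

/*
 * Copy the parent's vmas src[0..n) into dst[], allocating new anon-vmas
 * from pool[0..pool_cap). Returns the number of child vmas; *nr_new is
 * the number of anon-vmas created for the child.
 */
size_t toy_fork(const struct toy_vma *src, size_t n, struct toy_vma *dst,
		struct toy_anon *pool, size_t pool_cap, size_t *nr_new)
{
	size_t out = 0;

	*nr_new = 0;
	for (size_t i = 0; i < n; i++) {
		const struct toy_anon *anon = NULL;

		if (src[i].dontcopy)
			continue;	/* not present in the child at all */

		if (src[i].anon) {
			/* Previous vma in the CHILD, not in the parent. */
			const struct toy_vma *prev = out ? &dst[out - 1] : NULL;

			if (prev && prev->anon &&
			    prev->anon->parent == src[i].anon) {
				/* Same parent anon-vma: share the forked one. */
				anon = prev->anon;
			} else {
				/* First vma of a group: fork a new anon-vma. */
				assert(*nr_new < pool_cap);
				pool[*nr_new].parent = src[i].anon;
				anon = &pool[(*nr_new)++];
			}
		}
		dst[out].anon = anon;
		dst[out].dontcopy = 0;
		out++;
	}
	return out;
}
```

With three parent vmas sharing ANON0, this model creates a single ANON1 whose
parent is ANON0 and all three child vmas point to it. If the first parent vma
is marked dontcopy, the second one becomes the first child vma, forks ANON1,
and the third shares it.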

>
> >
> >Also this assumption is wrong:
> > > Parent has vm_prev, which implies we have vm_prev.
> >If the prev VMA in the parent has VM_DONTCOPY, then the prev VMA in the child will
> >not match pprev, or could even be NULL if it was the first in the mm.
> >
> >See patch:
> >https://lore.kernel.org/lkml/157830736034.8148.7070851958306750616.stgit@buzz/T/#u
> >
> >I've tested it using this:
> >
> >--- a/fs/proc/task_mmu.c
> >+++ b/fs/proc/task_mmu.c
> >@@ -847,6 +847,12 @@ static int show_smap(struct seq_file *m, void *v)
> >                 seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> >         show_smap_vma_flags(m, vma);
> >
> >+       if (vma->anon_vma)
> >+               seq_printf(m, "AnonVMA: %p %p %d\n",
> >+                          vma->anon_vma,
> >+                          vma->anon_vma->parent,
> >+                          vma->anon_vma->degree);
> >+
> >         m_cache_vma(m, vma);
> >
> >         return 0;
> >
> >---
> >
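> >#include <sys/mman.h>
> >#include <sys/wait.h>
> >#include <stdlib.h>
> >#include <unistd.h>
> >#include <string.h>
> >#include <stdio.h>
> >
> >int main(int argc, char **argv) {
> >       void *ptr;
> >       char buf[100];
> >
> >       /* Map three anonymous pages and touch them so an anon_vma is allocated. */
> >       ptr = mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> >       memset(ptr, 0, 0x3000);
> >       /* Change the middle page's protection to split the mapping into three vmas. */
> >       mprotect((char *)ptr + 0x1000, 0x1000, PROT_READ);
> >
> >       /* Dump smaps in the parent. */
> >       sprintf(buf, "cat /proc/%d/smaps", getpid());
> >       system(buf);
> >
> >       if (fork()) {
> >               wait(NULL);
> >       } else {
> >               /* Dump smaps again in the child for comparison. */
> >               printf("\n\n\n");
> >               fflush(stdout);
> >               sprintf(buf, "cat /proc/%d/smaps", getpid());
> >               system(buf);
> >       }
> >       return 0;
> >}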
> >
> >---
> >
> >>
> >> Signed-off-by: Li Xinhai <lixinhai.lxh@xxxxxxxxx>
> >> Cc: Wei Yang <richardw.yang@xxxxxxxxxxxxxxx>
> >> Cc: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
> >> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> >> ---
> >>   mm/rmap.c | 7 +++----
> >>   1 file changed, 3 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index b3e3819..3c912a6c 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -269,10 +269,10 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >>   {
> >>   struct anon_vma_chain *avc, *pavc;
> >>   struct anon_vma *root = NULL;
> >> -    struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
> >> +    struct vm_area_struct *pprev = src->vm_prev;
> >>
> >>   /*
> >> -    * If parent share anon_vma with its vm_prev, keep this sharing in in
> >> +    * If parent share anon_vma with its vm_prev, keep this sharing in
> >>   * child.
> >>   *
> >>   * 1. Parent has vm_prev, which implies we have vm_prev.
> >> @@ -280,8 +280,7 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >>   */
> >>   if (!dst->anon_vma && src->anon_vma &&
> >>       pprev && pprev->anon_vma == src->anon_vma)
> >> -    dst->anon_vma = prev->anon_vma;
> >> -
> >> +    dst->anon_vma = pprev->anon_vma;
> >>
> >>   list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> >>   struct anon_vma *anon_vma;
> >>



