Re: [PATCH v2] vma remove the unneeded avc bound with non-CoWed folio

On Sun, Aug 25, 2024 at 01:06:40PM GMT, zhiguojiang wrote:
>
>
> On 2024/8/25 0:26, Lorenzo Stoakes wrote:
> > On Fri, Aug 23, 2024 at 11:02:06PM GMT, Zhiguo Jiang wrote:
> > > After a CoW in do_wp_page(), the vma is mapped to the CoWed folio
> > > instead of the non-CoWed folio. However, when vma->anon_vma and the
> > > non-CoWed folio's anon_vma are not the same, the avc binding between
> > > them is no longer needed, so it is a problem that this avc binding
> > > still exists.
> > >
> > > This patch removes the avc binding between the vma and the non-CoWed
> > > folio's anon_vma, since each has its own independent anon_vma. It
> > > also reduces rmap overhead.
> > >
> > > Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx>
> >
> > NACK (until fixed). This is broken (see below).
> >
> Hi Lorenzo Stoakes,
>
> Thank you for your comments.
> > I'm not seeing any numbers to back anything up here as to why we want to
> > make changes to this incredibly sensitive code?
> I added a debug trace log (as follows) in wp_page_copy() and observed
> that a large number of these orphan avc-objects still exist. I believe
> this will have a certain redundant overhead impact on anonymous folios'
> rmap avcs, so I want to remove it, which is also the most essential
> value of this patch.

Sorry nack to that idea unless you can provide actual _data_ to demonstrate
an overhead.

And even if you did, given the original patch was so completely broken, and
in such a sensitive area, I'm going to need to be VERY confident you didn't
break anything, so we're going to need tests.

>
> -- the vital part of debug trace patch:

Thanks for providing! Will snip for sake of making it easier to reply.

[snip]

> >
> > Also anon_vma logic is very complicated and confusing, this commit message
> > feels about 3 paragraphs too light.
> >
> > Under what circumstances will vma->anon_vma be different from
> > folio_anon_vma(non_cowed_folio)? etc.
> In anon_vma_fork() --> anon_vma_clone(), child vma is bound with parent
> vma's anon_vma firstly.
>     /*
>      * First, attach the new VMA to the parent VMA's anon_vmas,
>      * so rmap can find non-COWed pages in child processes.
>      */
>     error = anon_vma_clone(vma, pvma);
>
> When child vma->anon_vma is NULL in anon_vma_fork(),
>     /* An existing anon_vma has been reused, all done then. */
>     if (vma->anon_vma)
>         return 0;
>
>     /* Then add our own anon_vma. */
>     anon_vma = anon_vma_alloc();
>
> a new anon_vma is allocated and stored in this child's vma->anon_vma.
> Then, during CoW in do_wp_page() --> wp_page_copy(), this child vma's
> new anon_vma will be different from folio_anon_vma(non_cowed_folio).

Thanks for the explanation, but I was suggesting you have to put this in
the commit message rather than in reply to me :)
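
For readers following along, the scenario described above can be modelled
in a few lines of userspace C. This is only a toy sketch, not kernel code:
every toy_* name is hypothetical, the anon_vma reuse case in
anon_vma_fork() is deliberately ignored, and the model says nothing about
whether it is safe to drop any link afterwards.

```c
#include <stddef.h>

#define TOY_MAX_LINKS 8

/* Hypothetical stand-ins for anon_vma, vm_area_struct and folio. */
struct toy_anon_vma { int id; };

struct toy_vma {
    struct toy_anon_vma *anon_vma;              /* where new folios get attached */
    struct toy_anon_vma *links[TOY_MAX_LINKS];  /* one entry per avc */
    size_t nr_links;
};

struct toy_folio { struct toy_anon_vma *anon_vma; };

/* Models one avc: record that 'vma' is reachable from 'av'. */
static int toy_link(struct toy_vma *vma, struct toy_anon_vma *av)
{
    if (vma->nr_links >= TOY_MAX_LINKS)
        return -1;
    vma->links[vma->nr_links++] = av;
    return 0;
}

/*
 * Models the non-reuse path of anon_vma_fork(): first attach the child
 * to every anon_vma the parent is linked to, then give the child its
 * own anon_vma ('fresh') and link that as well.
 */
static int toy_fork(struct toy_vma *child, const struct toy_vma *parent,
                    struct toy_anon_vma *fresh)
{
    size_t i;

    child->anon_vma = NULL;
    child->nr_links = 0;
    for (i = 0; i < parent->nr_links; i++)
        if (toy_link(child, parent->links[i]))
            return -1;
    child->anon_vma = fresh;
    return toy_link(child, fresh);
}

/* Models the CoW copy: the new folio belongs to the faulting vma's anon_vma. */
static struct toy_folio toy_cow(const struct toy_vma *vma)
{
    struct toy_folio folio = { .anon_vma = vma->anon_vma };
    return folio;
}
```

In this model, a folio faulted in by the parent before the fork keeps
pointing at the parent's anon_vma, while the child's vma->anon_vma is the
freshly allocated one, so the two differ once the child takes a CoW fault.
The child still holds a link to the parent's anon_vma afterwards, which is
the avc the patch wants to remove.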

> > Confusing topics strongly require explanations that help (somewhat)
> > compensate. This is one of them.

[snip]

> > > index 93c0c25433d0..4c89cb1cb73e
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
> > >                         * old page will be flushed before it can be reused.
> > >                         */
> > >                        folio_remove_rmap_pte(old_folio, vmf->page, vma);
> > > +
> > > +                     /*
> > > +                      * If the new_folio's anon_vma is different from the
> > > +                      * old_folio's anon_vma, the avc binding relationship
> > > +                      * between vma and the old_folio's anon_vma is removed,
> > > +                      * avoiding rmap redundant overhead.
> > What overhead? Worth spelling out for instance if it's unnecessary to
> > traverse avc's.
> I think this will have a certain redundant overhead impact on anonymous
> folios rmap traverse avcs process.

This is again nowhere near detailed enough, and again I'm asking you to
write this _IN THE COMMENT_ not in review.

I already understand what you're trying to do (I think the fact I provided
a _working_ version of your patch as an attachment in this thread should
give a clue ;), this is for the benefit of people coming to read this code.

[snip]

> > Again I question the value of this change. Are we REALLY seeing a big
> > problem due to unneeded avc's hanging around? This is very sensitive,
> > fiddly, confusing code, do we REALLY want to be playing with it?
> Thank you for helping to identify many issues with this patch. However,
> I think this will have certain benefits for the overhead of traversing
> avcs during anonymous folio rmap.
> >
> > It'd be good to get some tests though unless you move this to vma.c with
> > its userland testing (probably a good idea actually as Andrew suggested)
> > this might be tricky.
> This patch belongs to anon_vma rmap's content, and it seems more
> appropriate in mm/rmap.c?
> >
> > NACK until the issues are fixed and the approach at least seems more
> > correct.
> Thanks
> Zhiguo
>

Please see the attachment in thread for an example of a working version of
this; the patch as posted is sadly fundamentally broken.

But you're going to really need to sell this a lot better, provide some
numbers, and provide extensive testing and a much, much better test for
this to stand any chance.

I appreciate what you're trying to do here, and it's not totally crazy, but
we have to be so, so careful around this code.

anon_vma code is horrendously subtle and confusing (I actually had to
reference my unpublished book to remind myself how this stuff works :)), so
we have to tread very carefully.

I definitely think we need ASCII diagrams if we were to go ahead with a new
version of this. But then again I'm a bit of a fan of ASCII diagrams...

Please cc me on future revisions of this series, thanks :)



