On Sun, Aug 25, 2024 at 01:06:40PM GMT, zhiguojiang wrote: > > > 在 2024/8/25 0:26, Lorenzo Stoakes 写道: > > [Some people who received this message don't often get email from lorenzo.stoakes@xxxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > > > On Fri, Aug 23, 2024 at 11:02:06PM GMT, Zhiguo Jiang wrote: > > > After CoWed by do_wp_page, the vma established a new mapping relationship > > > with the CoWed folio instead of the non-CoWed folio. However, regarding > > > the situation where vma->anon_vma and the non-CoWed folio's anon_vma are > > > not same, the avc binding relationship between them will no longer be > > > needed, so it is issue for the avc binding relationship still existing > > > between them. > > > > > > This patch will remove the avc binding relationship between vma and the > > > non-CoWed folio's anon_vma, which each has their own independent > > > anon_vma. It can also alleviates rmap overhead simultaneously. > > > > > > Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx> > > > > NACK (until fixed). This is broken (see below). > > > Hi Lorenzo Stoakes, > > Thank you for your comments. > > I'm not seeing any numbers to back anything up here as to why we want to > > make changes to this incredibly sensitive code? > I added a debug trace log (as follows) in wp_page_copy() and observed > that a large number of these orphan avc-objects still exist. I believe > this will have a certain redundant overhead impact on anonymous folios' > rmap avcs, so I want to remove it, which is also the most essential > value of this patch. Sorry nack to that idea unless you can provide actual _data_ to demonstrate an overhead. And even if you did, given the original patch was so completely broken, and in such a sensitive area, I'm going to need to be VERY confident you didn't break anything, so we're going to need tests. > > -- the vital part of debug trace patch: Thanks for providing! Will snip for sake of making it easier to reply. [snip] > > > > Also anon_vma logic is very complicated and confusing, this commit message > > feels about 3 paragraphs too light. > > > > Under what circumstances will vma->anon_vma be different from > > folio_anon_vma(non_cowed_folio)? etc. > In anon_vma_fork() --> anon_vma_clone(), child vma is bound with parent > vma's anon_vma firstly. > /* > * First, attach the new VMA to the parent VMA's anon_vmas, > * so rmap can find non-COWed pages in child processes. > */ > error = anon_vma_clone(vma, pvma); > > When child vma->anon_vma is NULL in anon_vma_fork(), > /* An existing anon_vma has been reused, all done then. */ > if (vma->anon_vma) > return 0; > > /* Then add our own anon_vma. */ > anon_vma = anon_vma_alloc(); > > new anon_vma will be alloced and filled in this child vma->anon_vma. > Then during CoWed in do_wp_page() --> wp_page_copy(), this child vma's > new anon_vma will be different from folio_anon_vma(non_cowed_folio). Thanks for the explanation, but I was suggesting you have to put this in the commit message rather than in repy to me :) > > Confusing topics strongly require explanations that help (somewhat) > > compensate. This is one of them. [snip] > > > index 93c0c25433d0..4c89cb1cb73e > > > --- a/mm/memory.c > > > +++ b/mm/memory.c > > > @@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) > > > * old page will be flushed before it can be reused. > > > */ > > > folio_remove_rmap_pte(old_folio, vmf->page, vma); > > > + > > > + /* > > > + * If the new_folio's anon_vma is different from the > > > + * old_folio's anon_vma, the avc binding relationship > > > + * between vma and the old_folio's anon_vma is removed, > > > + * avoiding rmap redundant overhead. > > What overhead? Worth spelling out for instance if it's unnecessary to > > traverse avc's. > I think this will have a certain redundant overhead impact on anonymous > folios rmap traverse avcs process. This is again nowhere near detailed enough, and again I'm asking you to write this _IN THE COMMENT_ not in review. I already understand what you're trying to do (I think the fact I provided a _working_ version of your patch as an attachment in this thread should give a clue ;), this is for the benefit of people coming to read this code. [snip] > > Again I question the value of this change. Are we REALLY seeing a big > > problem due to unneeded avc's hanging around? This is very sensitive, > > fiddly, confusing code, do we REALLY want to be playing with it? > Thank you for helping to identify mang issues with this patch. However, > I think this will have a certain benefits for anonymous folio rmap > traverse avc overhead. > > > > It'd be good to get some tests though unless you move this to vma.c with > > its userland testing (probably a good idea actually as Andrew suggested) > > this might be tricky. > This patch belongs to anon_vma rmap's content, and it seems more > appropriate in mm/rmap.c? > > > > NACK until the issues are fixed and the approach at least seems more > > correct. > Thanks > Zhiguo > Please see the attachment in thread for an example of a working version of this, this is sadly fundamentally broken. But you're going to really need to sell this a lot better, provide some numbers, and provide extensive testing and a much, much better test for this to stand any chance. I appreciate what you're trying to do here, and it's not totally crazy, but we have to be so, so careful around this code. anon_vma code is horrendously subtle and confusing (I actually had to reference my unpublished book to remind myself how this stuff works :)), so we have to tread very carefully. I definitely think we need ASCII diagrams if we were to go ahead with a new version of this. But then again I'm a bit of a fan of ASCII diagrams... Please cc- me on future revisions of this series, thanks :)