在 2024/8/25 14:42, Lorenzo Stoakes 写道:
[Some people who received this message don't often get email from lorenzo.stoakes@xxxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
On Sun, Aug 25, 2024 at 01:06:40PM GMT, zhiguojiang wrote:
在 2024/8/25 0:26, Lorenzo Stoakes 写道:
[Some people who received this message don't often get email from lorenzo.stoakes@xxxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
On Fri, Aug 23, 2024 at 11:02:06PM GMT, Zhiguo Jiang wrote:
After CoWed by do_wp_page, the vma established a new mapping relationship
with the CoWed folio instead of the non-CoWed folio. However, regarding
the situation where vma->anon_vma and the non-CoWed folio's anon_vma are
not same, the avc binding relationship between them will no longer be
needed, so it is issue for the avc binding relationship still existing
between them.
This patch will remove the avc binding relationship between vma and the
non-CoWed folio's anon_vma, which each has their own independent
anon_vma. It can also alleviates rmap overhead simultaneously.
Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx>
NACK (until fixed). This is broken (see below).
Hi Lorenzo Stoakes,
Thank you for your comments.
I'm not seeing any numbers to back anything up here as to why we want to
make changes to this incredibly sensitive code?
I added a debug trace log (as follows) in wp_page_copy() and observed
that a large number of these orphan avc-objects still exist. I believe
this will have a certain redundant overhead impact on anonymous folios'
rmap avcs, so I want to remove it, which is also the most essential
value of this patch.
Sorry nack to that idea unless you can provide actual _data_ to demonstrate
an overhead.
And even if you did, given the original patch was so completely broken, and
in such a sensitive area, I'm going to need to be VERY confident you didn't
break anything, so we're going to need tests.
-- the vital part of debug trace patch:
Thanks for providing! Will snip for sake of making it easier to reply.
[snip]
Also anon_vma logic is very complicated and confusing, this commit message
feels about 3 paragraphs too light.
Under what circumstances will vma->anon_vma be different from
folio_anon_vma(non_cowed_folio)? etc.
In anon_vma_fork() --> anon_vma_clone(), child vma is bound with parent
vma's anon_vma firstly.
/*
* First, attach the new VMA to the parent VMA's anon_vmas,
* so rmap can find non-COWed pages in child processes.
*/
error = anon_vma_clone(vma, pvma);
When child vma->anon_vma is NULL in anon_vma_fork(),
/* An existing anon_vma has been reused, all done then. */
if (vma->anon_vma)
return 0;
/* Then add our own anon_vma. */
anon_vma = anon_vma_alloc();
new anon_vma will be alloced and filled in this child vma->anon_vma.
Then during CoWed in do_wp_page() --> wp_page_copy(), this child vma's
new anon_vma will be different from folio_anon_vma(non_cowed_folio).
Thanks for the explanation, but I was suggesting you have to put this in
the commit message rather than in repy to me :)
Ok, I will update it in next version patch.
Confusing topics strongly require explanations that help (somewhat)
compensate. This is one of them.
[snip]
index 93c0c25433d0..4c89cb1cb73e
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
* old page will be flushed before it can be reused.
*/
folio_remove_rmap_pte(old_folio, vmf->page, vma);
+
+ /*
+ * If the new_folio's anon_vma is different from the
+ * old_folio's anon_vma, the avc binding relationship
+ * between vma and the old_folio's anon_vma is removed,
+ * avoiding rmap redundant overhead.
What overhead? Worth spelling out for instance if it's unnecessary to
traverse avc's.
I think this will have a certain redundant overhead impact on anonymous
folios rmap traverse avcs process.
This is again nowhere near detailed enough, and again I'm asking you to
write this _IN THE COMMENT_ not in review.
Ok, I will try to test and update it in next version patch.
I already understand what you're trying to do (I think the fact I provided
a _working_ version of your patch as an attachment in this thread should
give a clue ;), this is for the benefit of people coming to read this code.
[snip]
Again I question the value of this change. Are we REALLY seeing a big
problem due to unneeded avc's hanging around? This is very sensitive,
fiddly, confusing code, do we REALLY want to be playing with it?
Thank you for helping to identify mang issues with this patch. However,
I think this will have a certain benefits for anonymous folio rmap
traverse avc overhead.
It'd be good to get some tests though unless you move this to vma.c with
its userland testing (probably a good idea actually as Andrew suggested)
this might be tricky.
This patch belongs to anon_vma rmap's content, and it seems more
appropriate in mm/rmap.c?
NACK until the issues are fixed and the approach at least seems more
correct.
Thanks
Zhiguo
Please see the attachment in thread for an example of a working version of
this, this is sadly fundamentally broken.
Yes, I have seen the attachment you provided and thank you very much for
your help. I will update it based on your modifications in next version.
Currently I am thinking about how to implement the anon_vma->root code.
But you're going to really need to sell this a lot better, provide some
numbers, and provide extensive testing and a much, much better test for
this to stand any chance.
Ok, I will try to do some tests.
I appreciate what you're trying to do here, and it's not totally crazy, but
we have to be so, so careful around this code.
anon_vma code is horrendously subtle and confusing (I actually had to
reference my unpublished book to remind myself how this stuff works :)), so
we have to tread very carefully.
You are right, anon_vma code is indeed complex and sensitive.
I definitely think we need ASCII diagrams if we were to go ahead with a new
version of this. But then again I'm a bit of a fan of ASCII diagrams...
Please cc- me on future revisions of this series, thanks :)
Ok.
Thanks
Zhiguo