Re: [PATCH v5 10/13] mm/madvise: add MADV_COLLAPSE to process_madvise()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 5/11/22 11:34 PM, Zach O'Keefe wrote:
Hey Rongwei,

Thanks for taking the time to review!

On Tue, May 10, 2022 at 5:49 PM Rongwei Wang
<rongwei.wang@xxxxxxxxxxxxxxxxx> wrote:

Hi, Zach

Thanks for your great patchset!
Recently, We also try to collapse THP in this way, likes performance
degradation due to using too much hugepages in our scenes.

And there is a doubt about process_madvise(MADV_COLLAPSE) when we test
this patchset:. It seems that process_madvise(MADV_COLLAPSE) rely on
madvise(MADV_HUGEPAGE)? If the vma wasn't marked with 'hg',
process_madvise(MADV_COLLAPSE) will fail to collapse. And if I miss
something, please let me know.


I tried to have MADV_COLLAPSE follow the same THP eligibility
semantics as khugepaged and at-fault: either THP=always, or
THP=madvise and the vma is marked with MADV_HUGEPAGE, as you point
out.

If I understand you correctly, the usefulness of
process_madvise(MADV_COLLAPSE) is limited in the case where
THP=madvise and a CAP_SYS_ADMIN user is requesting a collapse of
behalf of another process since they don't have a way to mark the
target memory as eligible (which requires VM_HUGEPAGE).
Hi Zach

Yes, that's my means. It seems that MADV_[NO]HUGEPAGE for process_madivse(2) just needs little codes.

Anyway, looking forward to your next patchset.

Thanks
-wrw


If so, I think that's a valid point, and your suggestion below of a
supporting MADV_[NO]HUGEPAGE for process_madvise(2) makes sense. For
the sake of exploring all options, I'll mention that there was also a
previous idea suggested by Yang Shi where MADV_COLLAPSE could also set
VM_HUGEPAGE[1].

Since it's possible supporting MADV_[NO]HUGEPAGE for
process_madivse(2) has applications outside a subsequent
MADV_COLLAPSE, and since I don't see process_madvise(MADV_COLLAPSE) to
be in a hot path, I'd vote in favor of your suggestion and include
process_madvise(MADV_[NO]HUGEPAGE) support in v6 unless others object.

Thanks again for your review and your suggestion!
Zach

[1] https://lore.kernel.org/linux-mm/CAHbLzkqLRBd6u3qn=KqpOhRcPZtpGXbTXLUjK1z=4d_dQ06Pvw@xxxxxxxxxxxxxx/

If so, how about introducing process_madvise(MADV_HUGEPAGE) or
process_madvise(MADV_NOHUGEPAGE)? The former helps to mark the target
vma with 'hg', and the collapse process can be finished completely with
the help of other processes. the latter could let some special vma avoid
collapsing when setting 'THP=always'.

Best regards,
-wrw

On 5/5/22 5:44 AM, Zach O'Keefe wrote:
Allow MADV_COLLAPSE behavior for process_madvise(2) if caller has
CAP_SYS_ADMIN or is requesting collapse of it's own memory.

Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
---
   mm/madvise.c | 6 ++++--
   1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 638517952bd2..08c11217025a 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1168,13 +1168,15 @@ madvise_behavior_valid(int behavior)
   }

   static bool
-process_madvise_behavior_valid(int behavior)
+process_madvise_behavior_valid(int behavior, struct task_struct *task)
   {
       switch (behavior) {
       case MADV_COLD:
       case MADV_PAGEOUT:
       case MADV_WILLNEED:
               return true;
+     case MADV_COLLAPSE:
+             return task == current || capable(CAP_SYS_ADMIN);
       default:
               return false;
       }
@@ -1452,7 +1454,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
               goto free_iov;
       }

-     if (!process_madvise_behavior_valid(behavior)) {
+     if (!process_madvise_behavior_valid(behavior, task)) {
               ret = -EINVAL;
               goto release_task;
       }




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux