Re: [PATCH v5 10/13] mm/madvise: add MADV_COLLAPSE to process_madvise()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Zach

Thanks for your great patchset!
Recently, We also try to collapse THP in this way, likes performance
degradation due to using too much hugepages in our scenes.

And there is a doubt about process_madvise(MADV_COLLAPSE) when we test this patchset:. It seems that process_madvise(MADV_COLLAPSE) rely on madvise(MADV_HUGEPAGE)? If the vma wasn't marked with 'hg', process_madvise(MADV_COLLAPSE) will fail to collapse. And if I miss something, please let me know.

If so, how about introducing process_madvise(MADV_HUGEPAGE) or process_madvise(MADV_NOHUGEPAGE)? The former helps to mark the target vma with 'hg', and the collapse process can be finished completely with the help of other processes. the latter could let some special vma avoid
collapsing when setting 'THP=always'.

Best regards,
-wrw

On 5/5/22 5:44 AM, Zach O'Keefe wrote:
Allow MADV_COLLAPSE behavior for process_madvise(2) if caller has
CAP_SYS_ADMIN or is requesting collapse of it's own memory.

Signed-off-by: Zach O'Keefe <zokeefe@xxxxxxxxxx>
---
  mm/madvise.c | 6 ++++--
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 638517952bd2..08c11217025a 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1168,13 +1168,15 @@ madvise_behavior_valid(int behavior)
  }
static bool
-process_madvise_behavior_valid(int behavior)
+process_madvise_behavior_valid(int behavior, struct task_struct *task)
  {
  	switch (behavior) {
  	case MADV_COLD:
  	case MADV_PAGEOUT:
  	case MADV_WILLNEED:
  		return true;
+	case MADV_COLLAPSE:
+		return task == current || capable(CAP_SYS_ADMIN);
  	default:
  		return false;
  	}
@@ -1452,7 +1454,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
  		goto free_iov;
  	}
- if (!process_madvise_behavior_valid(behavior)) {
+	if (!process_madvise_behavior_valid(behavior, task)) {
  		ret = -EINVAL;
  		goto release_task;
  	}




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux