On 03.04.23 17:50, Stefan Roesch wrote:
I guess the interpreter could enable it (like a memory allocator could enable it
for the whole heap). But I get that it's much easier to enable this per-process,
and possibly only when a lot of the same processes are running in that
particular environment.
We don't want it to get enabled for all workloads of that interpreter,
instead we want to be able to select for which workloads we enable KSM.
Right.
1. New options for prctl system call
This patch series adds two new options to the prctl system call.
The first one allows enabling KSM at the process level and the second
one allows querying the setting.
The setting will be inherited by child processes.
With the above setting, KSM can be enabled for the seed process of a
cgroup and all processes in the cgroup will inherit the setting.
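For illustration, a seed process could use the two options roughly as follows. This is only a user-space sketch: the numeric values 67 and 68 are an assumption (they are the values later found in mainline <linux/prctl.h> and may differ in this revision of the series), and the helper names ksm_process_set() and ksm_process_query() are made up for the example. With the patch as posted, enabling also requires CAP_SYS_RESOURCE.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Assumption: fallback values as in later mainline headers. */
#ifndef PR_SET_MEMORY_MERGE
#define PR_SET_MEMORY_MERGE 67
#endif
#ifndef PR_GET_MEMORY_MERGE
#define PR_GET_MEMORY_MERGE 68
#endif

/* Hypothetical helper: enable (nonzero) or disable (0) KSM for the
 * calling process. Returns 0 on success, -1 with errno set on failure
 * (e.g. EINVAL on kernels without the option, EPERM without the
 * required capability). */
int ksm_process_set(int enable)
{
	return prctl(PR_SET_MEMORY_MERGE, enable ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: query the per-process setting.
 * Returns 1 if enabled, 0 if disabled, -1 with errno set on failure. */
int ksm_process_query(void)
{
	return prctl(PR_GET_MEMORY_MERGE, 0UL, 0UL, 0UL, 0UL);
}

/* Enable KSM, then report what the kernel says; intended to be called
 * by the seed process before it forks its children. */
void ksm_process_enable_and_report(void)
{
	if (ksm_process_set(1) != 0)
		fprintf(stderr, "PR_SET_MEMORY_MERGE: %s\n", strerror(errno));

	int state = ksm_process_query();
	if (state < 0)
		fprintf(stderr, "PR_GET_MEMORY_MERGE: %s\n", strerror(errno));
	else
		printf("process-wide KSM: %s\n", state ? "enabled" : "disabled");
}
```

Calling ksm_process_enable_and_report() early in the seed process, before any fork(), is what would let the children inherit the setting as described above.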
2. Changes to KSM processing
When KSM is enabled at the process level, the KSM code will iterate
over all the VMAs and enable KSM for the eligible VMAs.
When forking a process that has KSM enabled, the setting will be
inherited by the new child process.
In addition, when KSM is disabled for a process, KSM will be disabled
for the VMAs where KSM has been enabled.
Do we want to make MADV_MERGEABLE/MADV_UNMERGEABLE fail while the new prctl is
enabled for a process?
I decided to allow enabling KSM with prctl even when MADV_MERGEABLE has
already been used; this allows more flexibility.
MADV_MERGEABLE will be a nop. But IIUC, MADV_UNMERGEABLE will end up
calling unmerge_ksm_pages() and clear VM_MERGEABLE. But then, the next
KSM scan will merge the pages in there again.
Not sure if that flexibility is worth having.
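To make that interaction concrete, here is a tiny user-space toy model (not kernel code). All toy_* names and TOY_* flags are invented for the sketch, and it assumes the scanner merges any VMA that either has the per-VMA flag or belongs to a process with the process-wide flag set.

```c
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel's definitions. */
#define TOY_VM_MERGEABLE   (1UL << 0)	/* per-VMA flag */
#define TOY_MMF_MERGE_ANY  (1UL << 0)	/* per-process flag */
#define TOY_NR_VMAS        2

struct toy_vma {
	unsigned long vm_flags;
	bool merged;		/* true once a scan merged this VMA's pages */
};

struct toy_mm {
	unsigned long flags;
	struct toy_vma vma[TOY_NR_VMAS];
};

/* Assumed scan rule: merge if the VMA is flagged or the process opted in. */
void toy_ksm_scan(struct toy_mm *mm)
{
	for (int i = 0; i < TOY_NR_VMAS; i++) {
		struct toy_vma *v = &mm->vma[i];

		if ((v->vm_flags & TOY_VM_MERGEABLE) ||
		    (mm->flags & TOY_MMF_MERGE_ANY))
			v->merged = true;
	}
}

/* MADV_UNMERGEABLE analogue: unmerge and clear the per-VMA flag. */
void toy_madvise_unmergeable(struct toy_vma *v)
{
	v->merged = false;
	v->vm_flags &= ~TOY_VM_MERGEABLE;
}

/* Returns true if the VMA ends up merged again after the next scan. */
bool toy_remerged_after_unmerge(bool merge_any)
{
	struct toy_mm mm = { .flags = merge_any ? TOY_MMF_MERGE_ANY : 0 };

	mm.vma[0].vm_flags = TOY_VM_MERGEABLE;
	toy_ksm_scan(&mm);			/* pages get merged */
	toy_madvise_unmergeable(&mm.vma[0]);	/* user asks to unmerge */
	toy_ksm_scan(&mm);			/* next scan runs */
	return mm.vma[0].merged;
}
```

In this model, toy_remerged_after_unmerge(true) reports the VMA as merged again, while toy_remerged_after_unmerge(false) does not, which is the behaviour questioned above.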
[...]
@@ -2661,6 +2662,32 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_SET_VMA:
 		error = prctl_set_vma(arg2, arg3, arg4, arg5);
 		break;
+#ifdef CONFIG_KSM
+	case PR_SET_MEMORY_MERGE:
+		if (!capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+
+		if (arg2) {
+			if (mmap_write_lock_killable(me->mm))
+				return -EINTR;
+
+			if (!test_bit(MMF_VM_MERGE_ANY, &me->mm->flags))
+				error = __ksm_enter(me->mm, MMF_VM_MERGE_ANY);
Hm, I think this might be problematic if we already called __ksm_enter() via
madvise(). Maybe we should really consider making MMF_VM_MERGE_ANY set
MMF_VM_MERGEABLE instead. Like:
	error = 0;
	if (!test_bit(MMF_VM_MERGEABLE, &me->mm->flags))
		error = __ksm_enter(me->mm);
	if (!error)
		set_bit(MMF_VM_MERGE_ANY, &me->mm->flags);
If we make that change, we would no longer be able to distinguish
whether MMF_VM_MERGEABLE or MMF_VM_MERGE_ANY has been set.
Why would you need that exactly? To cleanup? See below.
+			mmap_write_unlock(me->mm);
+		} else {
+			__ksm_exit(me->mm, MMF_VM_MERGE_ANY);
Hm, I'd prefer if we only call __ksm_exit() when we really exit the
process. Is there a strong requirement to optimize disabling of KSM or would it
be sufficient to clear the MMF_VM_MERGE_ANY flag here?
Then we still have the mm_slot allocated until the process gets
terminated.
Which is the same as using MADV_UNMERGEABLE, no?
Also, I wonder what happens if we have another VMA in that process that has it
enabled ..
Last but not least, wouldn't we want to do the same thing as MADV_UNMERGEABLE
and actually unmerge the KSM pages?
Do you want to call unmerge for all VMAs?
The question is what clearing MMF_VM_MERGE_ANY is supposed to do. If
it's supposed to disable KSM (like MADV_UNMERGEABLE would), then I guess
you should go over all VMAs and unmerge.
Also, it depends on how you want to handle VM_MERGEABLE with
MMF_VM_MERGE_ANY. If MMF_VM_MERGE_ANY would not set VM_MERGEABLE, then
you'd only unmerge where VM_MERGEABLE is not set. Otherwise, you'd
unshare everywhere where VM_MERGEABLE is set (and clear VM_MERGEABLE)
while at it.
Unsharing when clearing MMF_VM_MERGE_ANY might be the right thing to do
IMHO.
I guess the main questions regarding implementation are:
1) Do we want setting MMF_VM_MERGE_ANY to set VM_MERGEABLE on all
   candidate VMAs (go over all VMAs and set VM_MERGEABLE)? Then,
   clearing MMF_VM_MERGE_ANY would simply unmerge and clear VM_MERGEABLE
   on all VMAs.
2) Do we want to make MMF_VM_MERGE_ANY imply MMF_VM_MERGEABLE? You could
   still disable KSM (__ksm_exit()) when clearing MMF_VM_MERGE_ANY,
   after going over all VMAs (where you might want to unshare already
   either way).
I guess the code will end up simpler if you make MMF_VM_MERGE_ANY simply
piggy-back on MMF_VM_MERGEABLE + VM_MERGEABLE. I might be wrong, of course.
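To sketch what I mean by piggy-backing, here is a user-space toy model (not kernel code, all toy_* names and TOY_* flags are invented, and "eligible" stands in for whatever check decides that a VMA is a KSM candidate). Enabling registers the mm only if that has not happened yet and then flags every candidate VMA; disabling unmerges and clears the per-VMA flag everywhere but leaves the registration alone until exit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's definitions. */
#define TOY_VM_MERGEABLE    (1UL << 0)	/* per-VMA flag */
#define TOY_MMF_MERGEABLE   (1UL << 0)	/* mm is registered with KSM */
#define TOY_MMF_MERGE_ANY   (1UL << 1)	/* process-wide opt-in */
#define TOY_NR_VMAS         3

struct toy_vma {
	unsigned long vm_flags;
	bool eligible;		/* may KSM merge this VMA at all? */
	bool merged;
};

struct toy_mm {
	unsigned long flags;
	int enter_calls;	/* how often the mm got registered */
	struct toy_vma vma[TOY_NR_VMAS];
};

/* __ksm_enter() analogue: register the mm with the scanner. */
int toy_ksm_enter(struct toy_mm *mm)
{
	mm->enter_calls++;
	mm->flags |= TOY_MMF_MERGEABLE;
	return 0;
}

/* Enable: register once, then set the per-VMA flag on all candidates. */
int toy_set_merge_any(struct toy_mm *mm)
{
	int error = 0;

	if (!(mm->flags & TOY_MMF_MERGEABLE))
		error = toy_ksm_enter(mm);
	if (error)
		return error;

	mm->flags |= TOY_MMF_MERGE_ANY;
	for (size_t i = 0; i < TOY_NR_VMAS; i++)
		if (mm->vma[i].eligible)
			mm->vma[i].vm_flags |= TOY_VM_MERGEABLE;
	return 0;
}

/* Disable: unmerge and clear the per-VMA flag on all VMAs. The mm
 * stays registered, as it would after MADV_UNMERGEABLE. */
void toy_clear_merge_any(struct toy_mm *mm)
{
	mm->flags &= ~TOY_MMF_MERGE_ANY;
	for (size_t i = 0; i < TOY_NR_VMAS; i++) {
		mm->vma[i].merged = false;
		mm->vma[i].vm_flags &= ~TOY_VM_MERGEABLE;
	}
}
```

With this shape, an earlier madvise()-style registration followed by toy_set_merge_any() leaves enter_calls at 1, and there is no separate cleanup path for the process-wide flag.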
--
Thanks,
David / dhildenb