Johannes Weiner <hannes@xxxxxxxxxxx> writes: > Hi Stefan, > > On Fri, Feb 10, 2023 at 01:50:04PM -0800, Stefan Roesch wrote: >> So far KSM can only be enabled by calling madvise for memory regions. What is >> required to enable KSM for more workloads is to enable / disable it at the >> process / cgroup level. >> >> Use case: >> The madvise call is not available in the programming language. An example for >> this are programs with forked workloads using a garbage collected language without >> pointers. In such a language madvise cannot be made available. >> >> In addition the addresses of objects get moved around as they are garbage >> collected. KSM sharing needs to be enabled "from the outside" for these type of >> workloads. > > It would be good to expand on the argument that Rik made about the > interpreter being used for things were there are no merging > opportunities, and the KSM scanning overhead isn't amortized. > > There is a fundamental mismatch in scopes. madvise() is a > workload-local decision, whereas sizable sharing opportunities may or > may not exist across multiple workloads. Only a higher-level entity > like a job scheduler can know for certain whether it's running one or > more instances of a job. That job scheduler in turn doesn't have the > necessary knowledge of the workload's internals to make targeted and > well-timed advise calls with, say, process_madvise(). > > This also applies to the security concerns brought up in previous > threads. An individual workload doesn't know what else is running on > the machine, so it needs to be highly conservative about what it can > give up for system-wide merging. However, if the system is dedicated > to running multiple jobs within the same security domain, it's the job > scheduler that knows that sharing isn't a problem, and even desirable. > > So I think this series makes sense, but it would be good to expand a > bit on the reasoning and address the security aspect in the cover/doc. > These are good points Johannes, I'll elaborate on them with the next version of the patch. >> Stefan Roesch (19): >> mm: add new flag to enable ksm per process >> mm: add flag to __ksm_enter >> mm: add flag to __ksm_exit call >> mm: invoke madvise for all vmas in scan_get_next_rmap_item >> mm: support disabling of ksm for a process >> mm: add new prctl option to get and set ksm for a process > > The implementation looks sound to me as well. > > I think it would be a bit easier to review if you folded these ^^^ > patches, the tools patch below, and the prctl selftests, all into one > single commit. It's one logical change. This way the new flags and > helper functions can be reviewed against the new users and callsites > without having to jump back and forth between emails. > I'll fold them in the next version. >> mm: split off pages_volatile function >> mm: expose general_profit metric >> docs: document general_profit sysfs knob >> mm: calculate ksm process profit metric >> mm: add ksm_merge_type() function >> mm: expose ksm process profit metric in ksm_stat >> mm: expose ksm merge type in ksm_stat >> docs: document new procfs ksm knobs > > Same with the new knobs/stats and their documentation. > I'll fold them in the next version. > Logical splitting is easier to follow than geographical splitting. > > Thanks!