Hi,
KSM is a powerful tool for deduplicating memory: it reduces usage by merging
identical pages across processes. However, certain aspects of its interface
and implementation prevent its deployment in our use case, where security and
efficiency (CPU overhead due to background scanning) are of greater
importance.
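For reference, the existing opt-in looks roughly like the sketch below: a
process marks a range with madvise(MADV_MERGEABLE), and the ksmd background
thread scans and merges it later (when /sys/kernel/mm/ksm/run is set to 1).
map_mergeable() is a helper name made up for illustration, and the fill
pattern is arbitrary.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory, fill it with identical content and
 * mark it as a KSM candidate. Returns the mapping, or NULL if mmap() failed.
 * *advice_err receives 0 if madvise() succeeded, else its errno
 * (e.g. EINVAL when the kernel was built without CONFIG_KSM).
 */
static void *map_mergeable(size_t len, int *advice_err)
{
	assert(advice_err != NULL);

	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* Identical bytes in every page, so the pages are merge candidates. */
	memset(p, 0x5a, len);

	*advice_err = 0;
	if (madvise(p, len, MADV_MERGEABLE) != 0)
		*advice_err = errno;
	return p;
}
```

Note that the caller has no say in when the merge happens or which other
processes' pages it is merged with.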
We propose Selective KSM, a mechanism to control when merging takes place
and which pages can be merged together. We do this by partitioning the
merge space by security domain and carrying out the merging as part of a
synchronous syscall. This ensures that sensitive content is not merged
with non-sensitive content.
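To illustrate the partitioning idea, here is a toy userspace model (not kernel
code, and not how the implementation would look): identical pages are merged
only when they carry the same domain id. struct toy_page, toy_merge() and the
integer domain ids are all made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

struct toy_page {
	int domain;                         /* hypothetical security-domain id */
	unsigned char data[TOY_PAGE_SIZE];  /* page contents */
	int merged_into;                    /* index of canonical page, -1 if canonical */
};

/*
 * Merge identical pages, but only within the same domain.
 * Returns the number of pages that were merged into an earlier page.
 */
static size_t toy_merge(struct toy_page *pages, size_t n)
{
	size_t merged = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		pages[i].merged_into = -1;
		for (size_t j = 0; j < i; j++) {
			if (pages[j].merged_into != -1)
				continue;	/* compare against canonical pages only */
			if (pages[j].domain != pages[i].domain)
				continue;	/* never merge across domains */
			if (memcmp(pages[j].data, pages[i].data,
				   TOY_PAGE_SIZE) == 0) {
				pages[i].merged_into = (int)j;
				merged++;
				break;
			}
		}
	}
	return merged;
}
```

Two zero-filled pages in domain 1 collapse into one, while a zero-filled page
in domain 2 stays separate even though its content is identical.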
Our overall goal is to optimize memory utilization in virtualized
environments, where there is significant duplication across guest
instances (e.g., the kernel). By giving the operator a better ability to group
pages by security and similarity, Selective KSM improves both security and
efficiency. Beyond virtualized environments, we also want Selective KSM to
work well in containerized environments.
An example API could look like this (alternatively, we could do it through
sysfs without adding syscalls):
// This feature shall be gated by a Kconfig option: "CONFIG_SELECTIVE_KSM"
// Create a unique identifier known to userland.
const char *ksm_name = "some_name";
// ksm_open() creates and opens a new, or opens an existing, ksm partition obj.
// flags is a bit mask to determine if the merging is sync, etc.
// KSM_SYNC: Carry out synchronous merging (no background scanning).
// KSM_CREAT: Creates a KSM partition obj if it does not exist.
// KSM_EXCL: If KSM partition obj with name already exists and
// KSM_CREAT is also specified, return err.
// mode is used to handle permissions:
// O_RDONLY, O_WRONLY, O_RDWR, S_IRUSR, S_IWUSR, S_IXUSR
// On success, returns a file descriptor (a nonnegative integer) and creates the
// sysfs path:
// /sys/kernel/mm/ksm/partition/<ksm_name>/
// On failure, it returns -1 and sets errno to indicate the error.
int ksm_fd = ksm_open(ksm_name, flags, mode);
// Destroy the name. The named object will be removed only after all open
// references are closed. On success, ksm_unlink() returns 0.
// On failure, it returns -1 and sets errno to indicate the error.
ksm_unlink(ksm_name);
// Trigger merge. Only valid if KSM_SYNC is set during ksm_open().
ksm_merge(ksm_fd, pid, addr, size);
// Trigger unmerge. Only valid if KSM_SYNC is set during ksm_open().
ksm_unmerge(ksm_fd, pid, addr, size);
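Put together, a caller might use the proposed calls roughly as sketched below.
None of these functions exist today, so they are stubbed out to fail with
ENOSYS just to keep the sketch compilable. The flag values, the argument types
(mode_t, pid_t, void *, size_t), the use of close() on the returned descriptor
and the two helper names are assumptions of this sketch, not part of the
proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag values; the proposal does not assign numbers. */
#define KSM_SYNC  0x1
#define KSM_CREAT 0x2
#define KSM_EXCL  0x4

/* Placeholder stubs for the proposed calls: always fail with ENOSYS. */
static int ksm_open(const char *name, int flags, mode_t mode)
{ (void)name; (void)flags; (void)mode; errno = ENOSYS; return -1; }
static int ksm_unlink(const char *name)
{ (void)name; errno = ENOSYS; return -1; }
static int ksm_merge(int fd, pid_t pid, void *addr, size_t size)
{ (void)fd; (void)pid; (void)addr; (void)size; errno = ENOSYS; return -1; }
static int ksm_unmerge(int fd, pid_t pid, void *addr, size_t size)
{ (void)fd; (void)pid; (void)addr; (void)size; errno = ENOSYS; return -1; }

/* Synchronously merge [addr, addr + size) of `pid` into partition `name`. */
static int merge_into_partition(const char *name, pid_t pid,
				void *addr, size_t size)
{
	assert(name != NULL);

	int fd = ksm_open(name, KSM_CREAT | KSM_SYNC, S_IRUSR | S_IWUSR);
	if (fd < 0)
		return -1;

	int ret = ksm_merge(fd, pid, addr, size);
	int saved = errno;	/* close() may clobber errno */
	close(fd);
	errno = saved;
	return ret;
}

/* Undo the merge for the range, then drop the partition name. */
static int unmerge_and_remove(const char *name, pid_t pid,
			      void *addr, size_t size)
{
	assert(name != NULL);

	int fd = ksm_open(name, KSM_SYNC, S_IRUSR | S_IWUSR);
	if (fd < 0)
		return -1;

	int ret = ksm_unmerge(fd, pid, addr, size);
	int saved = errno;
	close(fd);
	errno = saved;
	if (ret < 0)
		return -1;
	return ksm_unlink(name);
}
```

An operator-side agent would call merge_into_partition() once per guest
region it considers safe to share within a given security domain, so no
background scanning is involved.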
With regards,
Sourav Panda