Document the usage of the memcg= mount option, as well as permission
restrictions on its use and caveats with remote charging.

Signed-off-by: Mina Almasry <almasrymina@xxxxxxxxxx>

---

Changes in v4:
- Added more info about the permissions to mount with memcg=, and the
  importance of restricting write access to the mount point.
- Changed documentation to describe the ENOSPC/SIGBUS behavior rather
  than the ENOMEM behavior implemented in earlier patches.
- I did not find a good place to put this documentation after making the
  mount option generic. Please let me know if there is a good place to
  add this, and if not I can add a new file. Thanks!

---
 Documentation/filesystems/tmpfs.rst | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
index 0408c245785e3..dc1f46e16eaf4 100644
--- a/Documentation/filesystems/tmpfs.rst
+++ b/Documentation/filesystems/tmpfs.rst
@@ -137,6 +137,34 @@
 mount options. It can be added later, when the tmpfs is already mounted
 on MountPoint, by 'mount -o remount,mpol=Policy:NodeList MountPoint'.
 
+If CONFIG_MEMCG is enabled, filesystems (including tmpfs) have a mount option to
+specify the memory cgroup to be charged for page allocations.
+
+memcg=/sys/fs/cgroup/unified/test/: data page allocations are charged to
+cgroup /sys/fs/cgroup/unified/test/.
+
+Only processes that have write access to
+/sys/fs/cgroup/unified/test/cgroup.procs can mount a tmpfs with
+memcg=/sys/fs/cgroup/unified/test. Thus, a process is able to charge memory to a
+cgroup only if it itself is able to enter that cgroup and allocate memory
+there. This is to prevent random processes from mounting filesystems in user
+namespaces and intentionally DoSing random cgroups running on the system.
+
+Once a mount point is created with memcg=, any process that has write access to
+this mount point is able to use this mount point and direct charges to the
+cgroup provided. Thus, it is important to limit write access to the mount point
+to the intended users if untrusted code is running on the machine. This is
+generally required regardless of whether the mount is done with memcg= or not.
+
+When charging memory to the remote memcg (memcg specified with memcg=) and
+hitting that memcg's limit, the oom-killer will be invoked (if enabled) and will
+attempt to kill a process in the remote memcg. If no killable processes are
+found, the remote charging process gets an ENOSPC error. If the remote charging
+process is in the pagefault path, it gets a SIGBUS signal. It's recommended
+that processes executing remote charges are able to handle a SIGBUS signal or
+ENOSPC error that may arise while executing the remote charges.
+
+
 To specify the initial root directory you can use the following mount
 options:
 
-- 
2.34.0.rc2.393.gf8c9666880-goog
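
The recommendation that remote-charging processes handle ENOSPC and SIGBUS
could be illustrated with a small userspace sketch in C. It assumes the
behavior described in this patch: write(2) fails with ENOSPC, and a page
fault raises SIGBUS, when the charge to the memcg= cgroup cannot be
satisfied. The names charged_write(), charged_store(), fill_page() and
enum charge_result are hypothetical helpers, not kernel or libc API. SIGBUS
can also have other causes (for example, touching a mapping past
end-of-file), so reporting it as "no space" is itself an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustrative result codes; these names are not part of any kernel API. */
enum charge_result { CHARGE_OK, CHARGE_NO_SPACE, CHARGE_IO_ERROR };

/* Process-wide state: this sketch is not thread-safe. */
static sigjmp_buf sigbus_env;
static volatile sig_atomic_t sigbus_armed;

static void on_sigbus(int sig)
{
    if (sigbus_armed)
        siglongjmp(sigbus_env, 1);
    /* Not raised by a guarded store: fall back to the default action. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* write(2) path: per the patch text, a failed remote charge is ENOSPC. */
enum charge_result charged_write(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n < 0)
            return errno == ENOSPC ? CHARGE_NO_SPACE : CHARGE_IO_ERROR;
        if (n == 0)
            return CHARGE_IO_ERROR; /* no progress, avoid spinning */
        p += n;
        len -= (size_t)n;
    }
    return CHARGE_OK;
}

/* Page-fault path: per the patch text, a failed remote charge is SIGBUS. */
enum charge_result charged_store(void *dst, const void *src, size_t len)
{
    struct sigaction sa, old;
    enum charge_result res = CHARGE_OK;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return CHARGE_IO_ERROR;

    if (sigsetjmp(sigbus_env, 1) == 0) {
        sigbus_armed = 1;
        atomic_signal_fence(memory_order_seq_cst); /* arm before storing */
        memcpy(dst, src, len);                     /* may fault pages in */
        atomic_signal_fence(memory_order_seq_cst); /* store before disarm */
    } else {
        res = CHARGE_NO_SPACE;                     /* came from on_sigbus */
    }
    sigbus_armed = 0;
    sigaction(SIGBUS, &old, NULL);                 /* restore old handler */
    return res;
}

/* Usage sketch: fill the first page of a file through a shared mapping. */
enum charge_result fill_page(const char *path, const void *src, size_t len)
{
    enum charge_result res = CHARGE_IO_ERROR;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return res;
    if (len <= page && ftruncate(fd, (off_t)page) == 0) {
        void *map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

        if (map != MAP_FAILED) {
            res = charged_store(map, src, len);
            munmap(map, page);
        }
    }
    close(fd);
    return res;
}
```

A caller would pass fill_page() a path under the memcg= mount point and
treat CHARGE_NO_SPACE as a signal to back off or free data on that mount.
The handler is installed only around the guarded store, so that an
unrelated SIGBUS elsewhere in the program keeps its previous disposition.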