Hi. First, I hope you are fine and the same for your relatives. Capabilities are used to check if a thread can perform a given action [1]. For example, a thread with CAP_BPF set can use the bpf() syscall. Capabilities are used in the container world. In terms of code, several projects related to container maintain code where the capabilities are written alike include/uapi/linux/capability.h [2][3][4][5]. For these projects, their codebase should be updated when a new capability is added to the kernel. Some other projects rely on <sys/capability.h> [6]. In this case, this header file should reflect the capabilities offered by the kernel. The delay between adding a new capability to the kernel and this capability being used by "container stack" software users can be long. Indeed, CAP_BPF was added in a17b53c4a4b5 which was part of v5.8 released in August 2020. Almost 2 years later, none of the "container stack" software authorize using this capability in their last stable release. The only way to use CAP_BPF with moby is to use v22.06.0-beta.0 release which contains a commit enabling CAP_BPF, CAP_PERFMON and CAP_CHECKPOINT_RESTORE [7]. This situation can be easily explained by the following: 1. moby depends on containerd which in turns depends on runc. 2. runc depends on github.com/syndtr/gocapability which is golang package to deal with capabilities. This high number of dependencies explain the delay and the big amount of human work to add support in the "container stack" software for a new capability. A solution to this problem could be to add a way for the userspace to ask the kernel about the capabilities it offers. So, in this series, I added a new file to securityfs: /sys/kernel/security/capabilities. The goal of this file is to be used by "container world" software to know kernel capabilities at run time instead of compile time. The "file" is read-only and its content is the capability number associated with the capability name: root@vm-amd64:~# cat /sys/kernel/security/capabilities 0 CAP_CHOWN 1 CAP_DAC_OVERRIDE ... 40 CAP_CHECKPOINT_RESTORE root@vm-amd64:~# wc -c /sys/kernel/security/capabilities 698 /sys/kernel/security/capabilities So, the "container stack" software just have to read this file to know if they can use the capabilities the user asked for. For example, if user asks for CAP_BPF on kernel 5.8, then this capability will be present in the file and so it can be used. Nonetheless, if the underlying kernel is 5.4, this capability will not be present and so it cannot be used. The kernel already exposes the last capability number under: /proc/sys/kernel/cap_last_cap So, I think there should not be any issue exposing all the capabilities it offers. If there is any, please share it as I do not want to introduce issue with this series. Also, the data exchanged with userspace are less than 700 bytes long which represent 17% of PAGE_SIZE. Note that I am open to any better way for the userspace to ask the kernel for known capabilities. And if you see any way to improve this series please share it as it would increase this contribution quality. Change since: v3: * Use securityfs_create_file() to create securityfs file. v2: * Use a char * for cap_string instead of an array, each line of this char * contains the capability number and its name. * Move the file under /sys/kernel/security instead of /sys/kernel. Francis Laniel (2): capability: Add cap_string. security/inode.c: Add capabilities file. include/uapi/linux/capability.h | 1 + kernel/capability.c | 45 +++++++++++++++++++++++++++++++++ security/inode.c | 16 ++++++++++++ 3 files changed, 62 insertions(+) Best regards and thank you in advance for your reviews. --- [1] man capabilities [2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1 moby relies on containerd code. [4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47 [5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on syndtr package. [6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35 [7] https://github.com/moby/moby/commit/c1c973e81b0ff36c697fbeabeb5ea7d09566ddc0 -- 2.25.1