It's that time of the year again where we debate security settings for
user namespaces ;)

I've been experimenting with different approaches to address the gripe
around user namespaces being used as attack vectors. After invaluable
feedback from Serge and Christian offline, this is what I came up with.

There are obviously a lot of things we could do differently, but I feel
this is the right balance between functionality, simplicity and
security. This also serves as a good foundation and could always be
extended if the need arises in the future.

Notes:

- Adding a new capability set is far from ideal, but trying to reuse the
  existing capability framework was deemed both impractical and
  questionable security-wise, so here we are.

- We might want to add new capabilities for some of the checks instead
  of reusing CAP_SETPCAP every time. Serge mentioned something like
  CAP_SYS_LIMIT?

- In the last patch, we could decide to have stronger requirements and
  perform checks inside cap_capable() in case we want to retroactively
  prevent capabilities in old namespaces. This might be an overreach
  though, so I left it out.

  I'm also not fond of the ulong logic for setting the sysctl parameter.
  On the other hand, the usermodhelper code always uses two u32s, which
  makes it very confusing to set in userspace.

Jonathan Calmels (3):
  capabilities: user namespace capabilities
  capabilities: add securebit for strict userns caps
  capabilities: add cap userns sysctl mask

 fs/proc/array.c                 |  9 ++++
 include/linux/cred.h            |  3 ++
 include/linux/securebits.h      |  1 +
 include/linux/user_namespace.h  |  7 +++
 include/uapi/linux/prctl.h      |  7 +++
 include/uapi/linux/securebits.h | 11 ++++-
 kernel/cred.c                   |  3 ++
 kernel/sysctl.c                 | 10 ++++
 kernel/umh.c                    | 16 +++++++
 kernel/user_namespace.c         | 83 ++++++++++++++++++++++++++++++---
 security/commoncap.c            | 59 +++++++++++++++++++++++
 security/keys/process_keys.c    |  3 ++
 12 files changed, 204 insertions(+), 8 deletions(-)

-- 
2.45.0