On Tue, Apr 9, 2019 at 4:22 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> But there's a reason why user namespaces are all-or-nothing on these
> things. If the kernel does not explicitly make a sysctl available to a
> container, the sysctl has global effects, and therefore probably
> shouldn't be exposed to anything other than someone with
> administrative privileges across the whole system. If the kernel does
> make it available to a container, the sysctl's effects are limited to
> the container (or otherwise it's a kernel bug).
>
> Can you give examples of sysctls that you want to permit using from
> containers, that wouldn't be accessible in a user namespace?

I think this discussion has started with incorrect assumptions
about the goal of the patch set.
There is no _security_ part here.
The sysctl hook is there to prevent chef and apps from doing silly things.
Most interesting sysctls need root anyway.
Root can detach all progs and do its thing.
Consider the tcp_mem sysctl. We've seen it misconfigured, which
caused performance issues.
A bpf prog can track what is being written, raise an alarm, etc.
User namespaces are not applicable here.
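
To make the tcp_mem example concrete, here is a minimal sketch in plain
user-space C (not a BPF program, and not code from the patch set) of the
kind of sanity check a prog attached to the sysctl hook might run on the
value being written. The function name and the policy are assumptions for
illustration only: tcp_mem takes three page counts (min, pressure, max),
and the sketch treats a write as sane only if it parses as exactly three
numbers in non-decreasing order.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical policy check, not taken from the patch set.
 * tcp_mem holds three page counts: min, pressure, max.
 * Assumption: a write is "sane" if it has exactly three numeric
 * fields and they are in non-decreasing order.
 */
static bool tcp_mem_write_is_sane(const char *new_value)
{
    unsigned long min, pressure, max;
    char trailing;

    /*
     * The final %c only matches if a fourth non-whitespace token is
     * present, so a return value of exactly 3 means three fields and
     * nothing else (a trailing newline is fine).
     */
    if (sscanf(new_value, "%lu %lu %lu %c",
               &min, &pressure, &max, &trailing) != 3)
        return false;

    return min <= pressure && pressure <= max;
}
```

With this sketch, tcp_mem_write_is_sane("190791 254389 381582") holds, while
a reversed triple or a write with too few or too many fields is flagged. A
real prog would take the new value from the hook's context and could log or
alarm on a failed check instead of just returning a boolean.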