On Mon, Aug 08, 2022 at 03:16:16PM -0400, Paul Moore wrote:
> On Mon, Aug 8, 2022 at 2:56 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> > Paul Moore <paul@xxxxxxxxxxxxxx> writes:
> > > On Mon, Aug 1, 2022 at 10:56 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> > >> Frederick Lawler <fred@xxxxxxxxxxxxxx> writes:
> > >>
> > >> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > >> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > >> > a call to create_user_ns().
> > >>
> > >> Re-nack for all of the same reasons.
> > >> AKA This can only break the users of the user namespace.
> > >>
> > >> Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> > >>
> > >> You aren't fixing what your problem you are papering over it by denying
> > >> access to the user namespace.
> > >>
> > >> Nack Nack Nack.
> > >>
> > >> Stop.
> > >>
> > >> Go back to the drawing board.
> > >>
> > >> Do not pass go.
> > >>
> > >> Do not collect $200.
> > >
> > > If you want us to take your comments seriously Eric, you need to
> > > provide the list with some constructive feedback that would allow
> > > Frederick to move forward with a solution to the use case that has
> > > been proposed. You response above may be many things, but it is
> > > certainly not that.
> >
> > I did provide constructive feedback. My feedback to his problem
> > was to address the real problem of bugs in the kernel.
>
> We've heard from several people who have use cases which require
> adding LSM-level access controls and observability to user namespace
> creation. This is the problem we are trying to solve here; if you do
> not like the approach proposed in this patchset please suggest another
> implementation that allows LSMs visibility into user namespace
> creation.

Regarding the observability - can someone concisely lay out why just
auditing userns creation would not suffice?
Userspace could decide what to report based on whether the creating
user_ns == /proc/1/ns/user...

Regarding limiting the tweaking of otherwise-privileged code by
unprivileged users, I wonder whether we could instead add smarts to
ns_capable(). Point being, uid mapping would still work, but we'd
break the "privileged against resources you own" part of user
namespaces. I would want it to default to allow, but then when a
0-day is found which requires reaching ns_capable() code, admins
could easily prevent exploitation until they can reboot into a
fixed kernel.
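
To make the "user_ns == /proc/1/ns/user" comparison above concrete,
here is a rough userspace sketch. It relies on the usual rule that
two /proc/<pid>/ns/* links refer to the same namespace when their
device and inode numbers match. The helper names same_namespace()
and in_initial_userns() are made up for illustration, and this is
not an existing audit interface. Dereferencing /proc/1/ns/user is
subject to a ptrace access check, so the second helper will
typically need privilege.

```python
import os


def same_namespace(path_a, path_b):
    """Return True if both paths resolve to the same (st_dev, st_ino) pair.

    For /proc/<pid>/ns/* links, os.stat() follows the link to the
    namespace object, so matching device and inode numbers mean the
    two links refer to the same namespace.
    """
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def in_initial_userns(pid="self"):
    """Hypothetical helper: does <pid> share a user namespace with pid 1?

    Assumes a Linux /proc mount. May raise PermissionError when the
    caller is not allowed to dereference /proc/1/ns/user.
    """
    return same_namespace(f"/proc/{pid}/ns/user", "/proc/1/ns/user")
```

An audit consumer could then report only those creation events
whose creating process has in_initial_userns(pid) true, or the
reverse, depending on local policy.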