> On Aug 25, 2022, at 3:10 PM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 25, 2022 at 5:58 PM Song Liu <songliubraving@xxxxxx> wrote:
>>> On Aug 25, 2022, at 12:19 PM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>>>
>>> On Thu, Aug 25, 2022 at 2:15 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>>>> Paul Moore <paul@xxxxxxxxxxxxxx> writes:
>>>>> On Fri, Aug 19, 2022 at 10:45 AM Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
>>>>>> I am hoping we can come up with
>>>>>> "something better" to address people's needs, make everyone happy, and
>>>>>> bring forth world peace. Which would stack just fine with what's here
>>>>>> for defense in depth.
>>>>>>
>>>>>> You may well not be interested in further work, and that's fine. I need
>>>>>> to set aside a few days to think on this.
>>>>>
>>>>> I'm happy to continue the discussion as long as it's constructive; I
>>>>> think we all are. My gut feeling is that Frederick's approach falls
>>>>> closest to the sweet spot of "workable without being overly offensive"
>>>>> (*cough*), but if you've got an additional approach in mind, or an
>>>>> alternative approach that solves the same use case problems, I think
>>>>> we'd all love to hear about it.
>>>>
>>>> I would love to actually hear the problems people are trying to solve so
>>>> that we can have a sensible conversation about the trade offs.
>>>
>>> Here are several taken from the previous threads, it's surely not a
>>> complete list, but it should give you a good idea:
>>>
>>> https://lore.kernel.org/linux-security-module/CAHC9VhQnPAsmjmKo-e84XDJ1wmaOFkTKPjjztsOa9Yrq+AeAQA@xxxxxxxxxxxxxx/
>>>
>>>> As best I can tell without more information people want to use
>>>> the creation of a user namespace as a signal that the code is
>>>> attempting an exploit.
>>>
>>> Some use cases are like that, there are several other use cases that
>>> go beyond this; see all of our previous discussions on this
>>> topic/patchset.
>>> As has been mentioned before, there are use cases
>>> that require improved observability, access control, or both.
>>>
>>>> As such let me propose instead of returning an error code which will let
>>>> the exploit continue, have the security hook return a bool. With true
>>>> meaning the code can continue and on false it will trigger using SIGSYS
>>>> to terminate the program like seccomp does.
>>>
>>> Having the kernel forcibly exit the process isn't something that most
>>> LSMs would likely want. I suppose we could modify the hook/caller so
>>> that *if* an LSM wanted to return SIGSYS the system would kill the
>>> process, but I would want that to be something in addition to
>>> returning an error code like LSMs normally do (e.g. EACCES).
>>
>> I am new to user_namespace and security work, so please pardon me if
>> anything below is very wrong.
>>
>> IIUC, user_namespace is a tool that enables trusted userspace code to
>> control the behavior of untrusted (or less trusted) userspace code.
>> Failing create_user_ns() doesn't make the system more reliable.
>> Specifically, we call create_user_ns() via two paths: fork/clone and
>> unshare. For both paths, we need the userspace to use user_namespace,
>> and to honor failed create_user_ns().
>>
>> On the other hand, I would echo that killing the process is not
>> practical in some use cases. Specifically, allowing the application to
>> run in a less secure environment for a short period of time might be
>> much better than killing it and taking down the whole service. Of
>> course, there are other cases that security is more important, and
>> taking down the whole service is the better choice.
>>
>> I guess the ultimate solution is a way to enforce using user_namespace
>> in the kernel (if it ever makes sense...).
>
> The LSM framework, and the BPF and SELinux LSM implementations in this
> patchset, provide a mechanism to do just that: kernel enforced access
> controls using flexible security policies which can be tailored by the
> distro, solution provider, or end user to meet the specific needs of
> their use case.

In this case, I wouldn't say the kernel is enforcing access control. (I
might be wrong.) There are 3 components here: the kernel, the LSM, and
trusted userspace (whoever calls unshare). AFAICT, the kernel simply
passes the decision made by the LSM (BPF or SELinux) to the trusted
userspace. It is up to the trusted userspace to honor the return value
of unshare(). If the userspace simply ignores unshare failures, or does
not call unshare(CLONE_NEWUSER), the kernel and the LSM cannot do much
about it, right?

This might still be useful in some cases. (I am far from an expert on
these.) I just feel this is not the typical solution to enforce
something.

Thanks,
Song

PS: If I said something very stupid, I would not feel offended if someone
pointed it out loudly. :)