> On Aug 26, 2022, at 8:02 AM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 25, 2022 at 6:42 PM Song Liu <songliubraving@xxxxxx> wrote:
>>> On Aug 25, 2022, at 3:10 PM, Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>>> On Thu, Aug 25, 2022 at 5:58 PM Song Liu <songliubraving@xxxxxx> wrote:
>
> ...
>
>>>> I am new to user_namespace and security work, so please pardon me if
>>>> anything below is very wrong.
>>>>
>>>> IIUC, user_namespace is a tool that enables trusted userspace code to
>>>> control the behavior of untrusted (or less trusted) userspace code.
>>>> Failing create_user_ns() doesn't make the system more reliable.
>>>> Specifically, we call create_user_ns() via two paths: fork/clone and
>>>> unshare. For both paths, we need the userspace to use user_namespace,
>>>> and to honor failed create_user_ns().
>>>>
>>>> On the other hand, I would echo that killing the process is not
>>>> practical in some use cases. Specifically, allowing the application to
>>>> run in a less secure environment for a short period of time might be
>>>> much better than killing it and taking down the whole service. Of
>>>> course, there are other cases that security is more important, and
>>>> taking down the whole service is the better choice.
>>>>
>>>> I guess the ultimate solution is a way to enforce using user_namespace
>>>> in the kernel (if it ever makes sense...).
>>>
>>> The LSM framework, and the BPF and SELinux LSM implementations in this
>>> patchset, provide a mechanism to do just that: kernel enforced access
>>> controls using flexible security policies which can be tailored by the
>>> distro, solution provider, or end user to meet the specific needs of
>>> their use case.
>>
>> In this case, I wouldn't call the kernel is enforcing access control.
>> (I might be wrong). There are 3 components here: kernel, LSM, and
>> trusted userspace (whoever calls unshare).
>
> The LSM layer, and the LSMs themselves are part of the kernel; look at
> the changes in this patchset to see the LSM, BPF LSM, and SELinux
> kernel changes. Explaining how the different LSMs work is quite a bit
> beyond the scope of this discussion, but there is plenty of
> information available online that should be able to serve as an
> introduction, not to mention the kernel source itself. However, in
> very broad terms you can think of the individual LSMs as somewhat
> analogous to filesystem drivers, e.g. ext4, and the LSM itself as the
> VFS layer.

Thanks for the explanation. This matches my understanding with LSM.

>
>> AFAICT, kernel simply passes
>> the decision made by LSM (BPF or SELinux) to the trusted userspace. It
>> is up to the trusted userspace to honor the return value of unshare().
>
> With a LSM enabled and enforcing a security policy on user namespace
> creation, which appears to be the case of most concern, the kernel
> would make a decision on the namespace creation based on various
> factors (e.g. for SELinux this would be the calling process' security
> domain and the domain's permission set as determined by the configured
> security policy) and if the operation was rejected an error code would
> be returned to userspace and the operation rejected. It is the exact
> same thing as what would happen if the calling process is chrooted or
> doesn't have a proper UID/GID mapping. Don't forget that the
> create_user_ns() function already enforces a security policy and
> returns errors to userspace; this patchset doesn't add anything new in
> that regard, it just allows for a richer and more flexible security
> policy to be built on top of the existing constraints.

I believe I don't understand user namespace enough to agree or disagree
here. I guess I should read more.

Thanks,
Song

>
>> If the userspace simply ignores unshare failures, or does not call
>> unshare(CLONE_NEWUSER), kernel and LSM cannot do much about it, right?
>
> The process is still subject to any security policies that are active
> and being enforced by the kernel. A malicious or misconfigured
> application can still be constrained by the kernel using both the
> kernel's legacy Discretionary Access Controls (DAC) as well as the
> more comprehensive Mandatory Access Controls (MAC) provided by many of
> the LSMs.
>
> --
> paul-moore.com
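
A minimal sketch of what "honoring the return value of unshare()" could
look like in a trusted userspace launcher, assuming Linux and a libc that
exports unshare(). The names try_unshare_user_ns and run_sandboxed and the
fail_closed flag are hypothetical and do not come from the patchset; which
errno the kernel reports when an LSM denies the request depends on the
policy, so the sketch only distinguishes success from failure.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # flag value from <linux/sched.h>


def try_unshare_user_ns():
    """Call unshare(CLONE_NEWUSER); return 0 on success, else the errno."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWUSER) == 0:
        return 0
    return ctypes.get_errno()


def run_sandboxed(fail_closed=True):
    """Fork a child that tries to enter a new user namespace.

    Hypothetical policy knob: with fail_closed=True the child refuses to
    continue when the kernel rejects the request; with fail_closed=False
    it continues without the namespace (the "less secure for a short
    period" trade-off raised in the thread).
    """
    pid = os.fork()
    if pid == 0:
        code = 3  # unexpected error in the child
        try:
            err = try_unshare_user_ns()
            if err == 0:
                code = 0  # namespace created
            elif fail_closed:
                code = 1  # honor the failure: do not run the workload
            else:
                code = 2  # ignore the failure: run without a namespace
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else 3
    return {0: "sandboxed", 1: "refused", 2: "degraded"}.get(code, "error")
```

Either way the child remains subject to whatever DAC and LSM policy the
kernel is enforcing; the flag only controls whether the launcher itself
proceeds after a rejected namespace creation.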