On 6/15/2022 7:14 AM, Paul Moore wrote:
On Wed, Jun 15, 2022 at 6:30 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
On Tue, Jun 14, 2022 at 01:59:08PM -0500, Frederick Lawler wrote:
On 6/14/22 11:30 AM, Eric W. Biederman wrote:
Frederick Lawler <fred@xxxxxxxxxxxxxx> writes:
On 6/13/22 11:44 PM, Eric W. Biederman wrote:
Frederick Lawler <fred@xxxxxxxxxxxxxx> writes:
Hi Eric,
On 6/13/22 12:04 PM, Eric W. Biederman wrote:
Frederick Lawler <fred@xxxxxxxxxxxxxx> writes:
While experimenting with the security_prepare_creds() LSM hook, we
noticed that our EPERM error code was not propagated up the callstack.
Instead, ENOMEM is always returned. As a result, some tools may send a
confusing error message to the user:
$ unshare -rU
unshare: unshare failed: Cannot allocate memory
A user would think that the system didn't have enough memory, when
instead the action was denied.
This problem occurs because prepare_creds() and prepare_kernel_cred()
return NULL when security_prepare_creds() returns an error code. Later,
functions calling prepare_creds() and prepare_kernel_cred() return
ENOMEM because they assume that a NULL meant there was no memory
allocated.
Fix this by propagating an error code from security_prepare_creds() up
the callstack.
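
To make the failure mode concrete, here is a small userspace model of the two behaviours. It is only a sketch, not the patch itself: err_ptr(), ptr_err() and is_err() are local re-implementations modelled on the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers, and fake_security_hook(), hook_result and the two caller_*() functions are hypothetical names that stand in for security_prepare_creds() and its callers. Encoding the error in the returned pointer is one possible way to propagate the code, assumed here for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins modelled on the kernel's ERR_PTR helpers. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct cred { int uid; };

/* Hypothetical hook: returns 0 or a negative errno chosen by the policy. */
static int hook_result;
static int fake_security_hook(struct cred *new) { (void)new; return hook_result; }

/* Behaviour described in the report: every failure collapses to NULL. */
static struct cred *prepare_creds_null(void)
{
	struct cred *new = calloc(1, sizeof(*new));

	if (!new)
		return NULL;
	if (fake_security_hook(new) < 0) {
		free(new);
		return NULL;	/* the hook's error code is lost here */
	}
	return new;
}

/* A caller can only guess at the reason, so it reports -ENOMEM. */
static int caller_null(void)
{
	struct cred *new = prepare_creds_null();

	if (!new)
		return -ENOMEM;
	free(new);
	return 0;
}

/* Sketch of propagation: the error code travels inside the pointer. */
static struct cred *prepare_creds_errptr(void)
{
	struct cred *new = calloc(1, sizeof(*new));
	int ret;

	if (!new)
		return err_ptr(-ENOMEM);
	ret = fake_security_hook(new);
	if (ret < 0) {
		free(new);
		return err_ptr(ret);
	}
	return new;
}

/* The caller now returns whatever the hook decided. */
static int caller_errptr(void)
{
	struct cred *new = prepare_creds_errptr();

	if (is_err(new))
		return (int)ptr_err(new);
	free(new);
	return 0;
}
```

With hook_result set to -EPERM, caller_null() still returns -ENOMEM while caller_errptr() returns -EPERM, which is the difference a tool like unshare would surface to the user.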
Why would it make sense for security_prepare_creds to return an error
code other than ENOMEM?
That seems a bit of a violation of what that function is supposed to do.
The API allows LSM authors to decide what error code is returned from the
cred_prepare hook. security_task_alloc() is a similar hook, and has its return
code propagated.
It is not an API. It is an implementation detail of the Linux kernel.
It is a set of convenient functions that do a job.
The general rule is we don't support cases without an in-tree user. I
don't see an in-tree user.
I'm proposing we follow security_task_alloc()'s pattern, and add visibility for
failure cases in prepare_creds().
I am asking why we would want to. Especially as it is not an API, and I
don't see any good reason for anything but an -ENOMEM failure to be
supported.
We're writing an LSM BPF policy, and not a new LSM. Our policy aims to solve
unprivileged unshare, similar to Debian's patch [1]. We're in a position such
that we can't use that patch because we can't block _all_ of our applications
from performing an unshare. We prefer a granular approach. LSM BPF seems like a
good choice.
I am quite puzzled: why doesn't /proc/sys/user/max_user_namespaces work
for you?
We have the following requirements:
1. Allow list criteria
2. root user must be able to create namespaces whenever
3. Everything else not in 1 & 2 must be denied
We use per task attributes to determine whether or not we allow/deny the
current call to unshare().
/proc/sys/user/max_user_namespaces limits are a bit broad for this level of
detail.
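
As a rough illustration of the three requirements, the decision can be modelled in plain C as below. This is a sketch under stated assumptions, not our actual BPF program: struct task_attrs, allow_list and may_unshare_userns() are hypothetical names, and matching on the command name is only one example of a per-task attribute.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-task attributes the policy looks at. */
struct task_attrs {
	unsigned int uid;
	const char *comm;
};

/* Hypothetical allow list (requirement 1); the entries are made up. */
static const char *const allow_list[] = { "trusted-sandbox", "build-runner" };

/* Returns 0 to allow the unshare, -EPERM to deny it. */
static int may_unshare_userns(const struct task_attrs *task)
{
	size_t i;

	/* Requirement 2: root may always create namespaces. */
	if (task->uid == 0)
		return 0;

	/* Requirement 1: tasks matching the allow list are permitted. */
	for (i = 0; i < sizeof(allow_list) / sizeof(allow_list[0]); i++)
		if (strcmp(task->comm, allow_list[i]) == 0)
			return 0;

	/* Requirement 3: everything else is denied. */
	return -EPERM;
}
```

A single global limit cannot express this, because the outcome depends on which task is asking rather than on how many namespaces already exist.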
Because LSM BPF exposes these hooks, we should probably treat them as an
API. From that perspective, userspace expects unshare to return an EPERM
when the call is denied.
The BPF code gets to be treated as an out-of-tree kernel module.
Without an in-tree user that cares it is probably better to go the
opposite direction and remove the possibility of returning anything but
memory allocation failure. That will make it clearer to implementors
that a general error code is not supported and this is not a location
to implement policy, this is only a hook to allocate state for the LSM.
That's a good point, and it's possible we're using the wrong hook for the
policy. Do you know of other hooks we can look into?
Fwiw, from this commit it wasn't very clear what you wanted to achieve
with this. It might be worth considering adding a new security hook for
this. Within msft it recently came up that SELinux might have an interest in
something like this as well.
Just to clarify things a bit, I believe SELinux would have an interest
in an LSM hook capable of implementing an access control point for user
namespaces regardless of Microsoft's current needs. I suspect that, due to
the security-relevant nature of user namespaces, most other LSMs would
be interested as well; it seems like a well-crafted hook would be
welcomed by most folks.
Smack isn't going to be interested in such a hook with the current
user namespace behavior. User namespaces are a discretionary access
control and privilege (capabilities) feature. Smack implements only
mandatory access control. I would still endorse adding the hook
as I could see MAC aspects (e.g. general xattr mapping) being
implemented as part of user namespaces.