On 12 Jul 2022, at 10:16, Eric W. Biederman wrote:
Adding the containers list to the discussion so more interested people
have a chance of seeing this.
Benjamin Coddington <bcodding@xxxxxxxxxx> writes:
A persistent unsolved problem exists: how can the kernel find and/or
create the appropriate "container" within which to execute a userspace
program to construct keys or satisfy users of call_usermodehelper()?
I believe the latest serious attempt to solve this problem was David's
"Make containers kernel objects":
https://lore.kernel.org/lkml/149547014649.10599.12025037906646164347.stgit@xxxxxxxxxxxxxxxxxxxxxx/
Over in NFS' space, we've most recently pondered this issue while
looking at ways to pass a kernel socket to userspace in order to handle
TLS events:
https://lore.kernel.org/linux-nfs/E2BF9CFF-9361-400B-BDEE-CF5E0AFDCA63@xxxxxxxxxx/
The problem is that containers are not kernel objects, rather a
collection of namespaces, cgroups, etc. Attempts at making the kernel
aware of containers have been mired in discussion and problems. It has
been suggested that the best representation of a "container" from the
kernel's perspective is a process.
Keyagents are processes represented by a key. If a keyagent's key is
linked to a session_keyring, it can be sent a realtime signal when a
calling process requests a matching key_type. That signal will dispatch
the process to construct the desired key within the keyagent process
context. Keyagents are similar to ssh-agents. To use a keyagent, one
must execute a keyagent process in the desired context, and then link
the keyagent's key onto other processes' session_keyrings.
This method of linking keyagent keys to session_keyrings can be used to
construct the various mappings of callers to keyagents that containers
may need. A single keyagent process can answer request-key upcalls
across container boundaries, or upcalls can be restricted to specific
containers.
I'm aware that building on realtime signals may not be a popular choice,
but using realtime signals makes this work simple and ensures delivery.
Realtime signals are able to convey everything needed to construct keys
in userspace: the under-construction key's serial number.
This work is not complete; it has security implications, it needs
documentation, it has not been reviewed by anyone. Thanks for reading
this RFC. I wish to collect criticism and validate this approach.
At a high level I do agree that we need to send a message to a userspace
process and that message should contain enough information to start the
user mode helper.
Then a daemon or possibly the container init can receive the message
and dispatch the user mode helper.
Fundamentally that design solves all of the container issues, and I
think solves a few of the user mode helper issues as well.
The challenge with this design is that it requires someone standing up a
daemon to receive the messages and call a user mode helper to retain
compatibility with current systems.
Yes..
I would prefer to see a file descriptor rather than a signal used to
deliver the message. Signals suck for many many reasons and a file
descriptor based notification potentially can be much simpler.
In the example keyagent on userspace side, signal handling is done with
signalfd(2), which greatly simplifies things.
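
For readers following along, here is a minimal sketch of what receiving such a signal through signalfd(2) could look like. It is not the example keyagent from the patch set. The signal number (KEYAGENT_SIG) and the idea that the key serial arrives in the sigqueue-style integer payload (ssi_int) are assumptions made for illustration; the function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Assumption: which realtime signal is used is not stated in this thread. */
#define KEYAGENT_SIG (SIGRTMIN + 1)

/* Block KEYAGENT_SIG and return a signalfd that reports it, or -1 on error. */
int keyagent_open_sigfd(void)
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, KEYAGENT_SIG);
    /* The signal must be blocked, otherwise it is delivered normally. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/*
 * Read one queued signal and extract the key serial.
 * Assumption: the serial travels in the integer payload (ssi_int).
 * Returns 0 on success, -1 on a short read or an unexpected signal.
 */
int keyagent_read_serial(int sfd, int32_t *serial)
{
    struct signalfd_siginfo si;
    ssize_t n = read(sfd, &si, sizeof(si));

    if (n != (ssize_t)sizeof(si))
        return -1;
    if (si.ssi_signo != (uint32_t)KEYAGENT_SIG)
        return -1;
    *serial = si.ssi_int;
    return 0;
}
```

A real agent would presumably poll the descriptor and then instantiate the key for the serial it received; that part is left out here.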
One of those many reasons is that by not following the common pattern
for filling in kernel_siginfo you have left uninitialized padding in
your structure that will be copied to userspace thus creating a kernel
information leak. Similarly your code doesn't fill in about half the
fields that are present in the siginfo union for the _rt case.
Yes, I just used the stack and only filled in the bare minimum.
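
To illustrate the leak being described, here is a userspace sketch of the zero-first pattern. The struct is a hypothetical stand-in, not the layout of kernel_siginfo; in the kernel the equivalent first step would be clear_siginfo() before filling in the fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a siginfo-like record. */
struct demo_info {
    int  signo;
    char code;   /* the compiler typically pads after this member */
    long value;
};

/* Zero the whole object first so padding never carries stale stack data. */
void demo_fill(struct demo_info *info, int signo, long value)
{
    memset(info, 0, sizeof(*info));
    info->signo = signo;
    info->code = 1;
    info->value = value;
}

/* Return 1 if every byte between 'code' and 'value' is zero, else 0. */
int demo_padding_is_zero(const struct demo_info *info)
{
    const unsigned char *p = (const unsigned char *)info;
    size_t start = offsetof(struct demo_info, code) + sizeof(info->code);
    size_t end = offsetof(struct demo_info, value);

    for (size_t i = start; i < end; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

Without the memset(), whatever was previously on the stack in the padding bytes would be copied out along with the fields that were set.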
I think a file descriptor based design could additionally address the
back and forth your design needs with keys to figure out what event has
happened and what user mode helper to invoke.
The keys have already built out a fairly rich interface for accepting
authorization keys, and instantiating partially-constructed keys. I
think the only communication needed (currently) is to dispatch and pass
the key serial value.
If we used file descriptors instead of rt signals, there'd be some
protocol engineering to do.
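
As a rough idea of how small that protocol could start out, here is a purely hypothetical sketch: a fixed-size message carrying an event type and the key serial, written to and read from a file descriptor. Nothing here exists in the kernel or in the patch set; the message layout, the UPCALL_REQUEST_KEY constant, and the function names are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical event type; a real protocol would define its own. */
enum { UPCALL_REQUEST_KEY = 1 };

/* Hypothetical fixed-size upcall message. */
struct upcall_msg {
    uint32_t type;        /* which kind of helper is wanted */
    int32_t  key_serial;  /* serial of the key under construction */
};

/* Write one message; returns 0 on success, -1 on error or short write. */
int upcall_send(int fd, uint32_t type, int32_t key_serial)
{
    struct upcall_msg msg;

    memset(&msg, 0, sizeof(msg));  /* no uninitialized bytes on the wire */
    msg.type = type;
    msg.key_serial = key_serial;
    return write(fd, &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/* Read one message; returns 0 on success, -1 on error or short read. */
int upcall_recv(int fd, struct upcall_msg *msg)
{
    return read(fd, msg, sizeof(*msg)) == (ssize_t)sizeof(*msg) ? 0 : -1;
}
```

The type field is what would let the same channel carry events other than request_key, at the cost of having to version and extend the format over time.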
Ideally I would also like to see a design less tied to keys. So that we
could use this for the other user mode helper cases as well. That said
solving request_key appears to be the truly important part, there aren't
many other user mode helpers. Still it would be nice if in theory the
design could be used to dispatch the coredump helper as well.
What if there was a key_type "usermode_helper"? Requesting a key of that
type executes the binary specified in the callout info. A keyagent could
satisfy the creation of this key, which would allow the usermode_helper
process to execute in the context of a container. If there is no
keyagent, fall back to the legacy call_usermodehelper().
Thanks for the look,
Ben