On 04/04/2022 20:47, Linus Torvalds wrote:
> On Mon, Apr 4, 2022 at 11:40 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> It looks like this didn't get pulled for -rc1 even though it was sent
>> during the merge window and has been in -next for a while. It would be
>> really nice to get this landed since userspace can't make any forward
>> progress without the kernel support.
> Honestly, I need a *lot* better reasoning for random new non-standard
> system calls than this had.
> And this kind of "completely random interface with no semantics except
> for random 'future flags'" I will not pull even *with* good reasoning.
I think the semantics are well defined:
"This new syscall enables user space to ask the kernel: is this file
descriptor's content trusted to be used for this purpose?"
See the trusted_for_policy sysctl documentation:
https://lore.kernel.org/all/20220104155024.48023-3-mic@xxxxxxxxxxx/
There is currently only one defined and implemented purpose: execution
(or script interpretation). There is room for other flags because it is
a good practice to do so, and other purposes were proposed.
> I already told Mickaël in private that I wouldn't pull this.
> Honestly, we have a *horrible* history with non-standard system calls,
> and that's been true even for well-designed stuff that actually
> matters, that people asked for.
> Something like this, which adds one very special system call and
> where the whole thing is designed for "let's add something random
> later because we don't even know what we want" is right out.
> What the system call seems to actually *want* is basically a new flag
> to access() (and faccessat()). One that is very close to what X_OK
> already is.
I agree.
> But that wasn't how it was sold.
> So no. No way will this ever get merged, and whoever came up with that
> disgusting "trusted_for()" (for WHAT? WHO TRUSTS? WHY?) should look
> themselves in the mirror.
Well, naming is difficult, but I'm open to suggestions. :)
As explained in the description, the WHAT is the file descriptor
content, the WHO TRUSTS is the system security policy (e.g. the mount
point options) and the WHY is defined by the usage flag
(TRUSTED_FOR_EXECUTION).
This translates to: is this file descriptor's content trusted to be used
for this specified purpose/usage?
> If you add a new X_OK variant to access(), maybe that could fly.
As answered in private, that was the approach I took for one of the
early versions but a dedicated syscall was requested by Al Viro:
https://lore.kernel.org/r/2ed377c4-3500-3ddc-7181-a5bc114ddf94@xxxxxxxxxxx
The main reason behind this request was that it doesn't have exactly
the same semantics as faccessat(2). The changes for this syscall are
documented here:
https://lore.kernel.org/all/20220104155024.48023-3-mic@xxxxxxxxxxx/
The whole history is linked in the cover letter:
https://lore.kernel.org/all/2ed377c4-3500-3ddc-7181-a5bc114ddf94@xxxxxxxxxxx/
The initial proposal used a new faccessat2(2) flag,
AT_INTERPRETED; see
https://lore.kernel.org/all/20200908075956.1069018-2-mic@xxxxxxxxxxx/
What do you think about that? I'm happy to get back to this version if
everyone is OK with it.
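
For illustration only, here is a minimal sketch of what an interpreter-side check could look like with the faccessat()-based approach. It is not code from the patch series: script_exec_allowed() is a hypothetical helper name, and AT_INTERPRETED is only used if a patched header defines it (mainline headers do not, so the sketch falls back to a plain X_OK check with AT_EACCESS).

```c
#define _GNU_SOURCE
#include <fcntl.h>   /* AT_FDCWD, AT_EACCESS */
#include <unistd.h>  /* faccessat(), X_OK */

/*
 * Hypothetical helper (not part of the patch series): returns 1 if the
 * X_OK check passes for `path`, 0 otherwise (including on any error).
 */
int script_exec_allowed(const char *path)
{
    /* Check against the effective IDs rather than the real ones. */
    int flags = AT_EACCESS;

#ifdef AT_INTERPRETED
    /* Assumption: only defined when building against the proposed patch. */
    flags |= AT_INTERPRETED;
#endif

    return faccessat(AT_FDCWD, path, X_OK, flags) == 0;
}
```

An interpreter would call such a helper before reading a script and refuse to run it when the check fails. Without the proposed flag, this only reports what X_OK already reports today.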