On Mon, Apr 4, 2022 at 1:29 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote: > > This initial proposal was using a new faccessat2(2) flag: > AT_INTERPRETED, see > https://lore.kernel.org/all/20200908075956.1069018-2-mic@xxxxxxxxxxx/ > What do you think about that? I'm happy to get back to this version if > everyone is OK with it. I'm certainly happi_er_ with that, but I find that particular patch odd for other reasons. In no particular order: - what's with the insane non-C syntax? Double parentheses have no actual meaning in C: if ((flags & AT_INTERPRETED)) { if ((mode & MAY_EXEC)) { so I don't understand why you'd use that strance thing. - why is this an AT_INTERPRETED flag? I don't understand the name, I don't understand the semantics. Why would that flag have the same value as AT_SYMLINK_FOLLOW? Why isn't this just a new _mode_ bit, which is what I feel is sensible? We only use three bits (with no bits set meaning "existence"), so we have *tons* of bits left in that namespace, and it would make much more sense to me to have #define EXECVE_OK 8 which is the same as the "group exec" bit, so it actually makes some kind of sense too. - related to that "I don't understand the semantics", the "documentation" for that thing doesn't make sense either: + The + main usage is for script + interpreters to enforce a policy + consistent with the kernel's one + (through sysctl configuration or LSM + policy). */ Now, what I *think* you mean is (1) user-space executable loaders want to be able to test the *same* policy as the kernel does for execve() (2) access(path, EXECVE_OK) will do the same permission checks as "execve()" would do for that path (3) if you already have the fd open, use "faccess(fd, NULL, F_OK_TO_EXECUTE, AT_EMPTY_PATH)" (4) maybe we want to add a flag for the "euid vs real uid", and that would be in the "flags" field, since that changes the actual *lookup* semantics Note that that (4) is something that some normal user space has wanted in the past too (GNU libcs has a "eaccess()" thing for "effective uid access"). And yes, we still need to talk details: - no backwards compatibility issues, because we've happily always checked 'mode' for being valid, so old kernels will always return -EINVAL. - POSIX namespace issues for EXECVE_OK means that the name probably needs some thinking (maybe we just need to call it __ACCESS_OK_EXECVE internally or something - the kernel actually doesn't even export the existing [FRWX]_OK values, because we "know" they map tho the usual "owner RWX" bits, with F being "no bit set") - I really want the exact semantics very clearly defined. I think it's ok to say "exact same security check as for 'execve()'", but even then we need to have that discussion about (a) "what about suid bits that user space cannot react to" (b) that whole "effective vs real" discussion So to recap: I'm very much ok with some access() extension, but I think even that very much needs clarification, and the existing patch is just odd in many many ways. Linus