On 19/05/2020 04:23, Aleksa Sarai wrote:
> On 2020-05-15, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> On Fri, May 15, 2020 at 04:43:37PM +0200, Florian Weimer wrote:
>>> * Kees Cook:
>>>
>>>> On Fri, May 15, 2020 at 10:43:34AM +0200, Florian Weimer wrote:
>>>>> * Kees Cook:
>>>>>
>>>>>> Maybe I've missed some earlier discussion that ruled this out, but I
>>>>>> couldn't find it: let's just add O_EXEC and be done with it. It actually
>>>>>> makes the execve() path more like openat2() and is much cleaner after
>>>>>> a little refactoring. Here are the results, though I haven't emailed it
>>>>>> yet since I still want to do some more testing:
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/o_exec/v1
>>>>>
>>>>> I think POSIX specifies O_EXEC in such a way that it does not confer
>>>>> read permissions. This seems incompatible with what we are trying to
>>>>> achieve here.
>>>>
>>>> I was trying to retain this behavior, since we already make this
>>>> distinction between execve() and uselib() with the MAY_* flags:
>>>>
>>>> execve():
>>>> struct open_flags open_exec_flags = {
>>>>     .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
>>>>     .acc_mode = MAY_EXEC,
>>>>
>>>> uselib():
>>>> static const struct open_flags uselib_flags = {
>>>>     .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC,
>>>>     .acc_mode = MAY_READ | MAY_EXEC,
>>>>
>>>> I tried to retain this in my proposal, in the O_EXEC does not imply
>>>> MAY_READ:
>>>
>>> That doesn't quite parse for me, sorry.
>>>
>>> The point is that the script interpreter actually needs to *read* those
>>> files in order to execute them.
>>
>> I think I misunderstood what you meant (Mickaël got me sorted out
>> now). If O_EXEC is already meant to be "EXEC and _not_ READ nor WRITE",
>> then yes, this new flag can't be O_EXEC. I was reading the glibc
>> documentation (which treats it as a permission bit flag, not POSIX,
>> which treats it as a complete mode description).
>
> On the other hand, if we had O_EXEC (or O_EXONLY a-la O_RDONLY) then the
> interpreter could re-open the file descriptor as O_RDONLY after O_EXEC
> succeeds. Not ideal, but I don't think it's a deal-breaker.
>
> Regarding O_MAYEXEC, I do feel a little conflicted.
>
> I do understand that its goal is not to be what O_EXEC was supposed to
> be (which is loosely what O_PATH has effectively become), so I think
> that this is not really a huge problem -- especially since you could
> just do O_MAYEXEC|O_PATH if you wanted to disallow reading explicitly.
> It would be nice to have an O_EXONLY concept, but it's several decades
> too late to make it mandatory (and making it optional has questionable
> utility IMHO).
>
> However, the thing I still feel mildly conflicted about is the sysctl. I
> do understand the argument for it (ultimately, whether O_MAYEXEC is
> usable on a system depends on the distribution) but it means that any
> program which uses O_MAYEXEC cannot rely on it to provide the security
> guarantees they expect. Even if the program goes and reads the sysctl
> value, it could change underneath them. If this is just meant to be a
> best-effort protection then this doesn't matter too much, but I just
> feel uneasy about these kinds of best-effort protections.

I think there is a cognitive bias here. There is a difference between
application-centric policies and system policies. For example, the
openat2 RESOLVE_* flags target application developers and are
self-sufficient: the kernel provides features (applied to FDs, owned and
managed by user space) which must be known (by the application) to be
supported (by the kernel), otherwise the application may give more
privileges than expected. However, the O_MAYEXEC flag targets system
administrators: it does not make sense to enable an application to know
or enforce the system(-wide) policy, but only to enable applications to
follow this policy (i.e. best-effort *from the application developer
point of view*).
Indeed, access control such as file executability depends on multiple
layers (e.g. file permission, mount options, ACL, SELinux policy), most
of them managed and enforced in a consistent way by (multiple parts of)
the system. Applications should not expect anything from O_MAYEXEC, and
it would not make sense for them to do so. This flag only enables the
system to enforce a security policy and that's all. It is really a
different use case than FD management. This feature is meant to extend
the system's abilities thanks to the collaboration of applications. Here
the sysctl should not be looked at by applications, the same way an
application should not look at the currently enforced SELinux policy nor
the mount options. An application may be launched differently according
to the system-wide policy, but this is again a system configuration.

There is a difference between ABI compatibility (i.e. is this feature
supported by the kernel?) and system-wide security policy (what is the
policy of the running system?). In the latter case, (common)
applications should not care about system-wide policy management but
only about policy enforcement (at their level, if it makes sense from
the system point of view). If the feature is not provided by the system,
then it is not the job of applications to change their behavior:
applications do their job by using O_MAYEXEC, and they do not care
whether it is enforced or not. It does not make sense for an application
to stop because the system does not provide a system-centric security
feature, all the more so when that decision relies on system
introspection (i.e. reading a sysctl). It is the system's role to
provide and *manage* the executability of other components.
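
To make the application side of this concrete, here is a minimal sketch
of how an interpreter could follow the policy without ever looking at
it. It is only an illustration under assumptions: O_MAYEXEC is the flag
proposed in this thread and is not defined by released kernel headers,
so the sketch defines it as 0 when it is missing, and the helper names
open_script() and run_script() are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * ASSUMPTION: O_MAYEXEC is the flag proposed in this thread; it is not
 * defined by released kernel headers. Defining it as 0 when absent makes
 * the open() below behave like a plain read-only open.
 */
#ifndef O_MAYEXEC
#define O_MAYEXEC 0
#endif

/*
 * Hypothetical interpreter helper: open a script for reading and let the
 * system decide whether it may be executed. Returns a file descriptor, or
 * -1 with errno set (for example EACCES if a policy denies the open).
 * It deliberately never reads any sysctl.
 */
static int open_script(const char *path)
{
	return open(path, O_RDONLY | O_CLOEXEC | O_MAYEXEC);
}

/*
 * Hypothetical caller: interpret the script only if the open succeeded.
 * Returns 0 on success or a negative errno value.
 */
static int run_script(const char *path)
{
	char buf[256];
	ssize_t n;
	int err;
	int fd = open_script(path);

	if (fd < 0)
		return -errno;	/* denied or missing: do not interpret */

	n = read(fd, buf, sizeof(buf));	/* stand-in for parsing the script */
	err = n < 0 ? errno : 0;
	close(fd);
	return -err;
}
```

The only policy-related code path is the error return from open(): the
same binary behaves correctly whether or not the system enforces
anything, which is the "best-effort from the application developer point
of view" behavior described above.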
More explanation can be found in a separate thread:
https://lore.kernel.org/lkml/d5df691d-bfcb-2106-08a2-cfe589b0a86c@xxxxxxxxxxx/

>
> I do wonder if we could require that fexecve(3) can only be done with
> file descriptors that have been opened with O_MAYEXEC (obviously this
> would also need to be a sysctl -- *sigh*). This would tie in to some of
> the magic-link changes I wanted to push (namely, upgrade_mask).
>

An O_EXEC flag could make sense for execveat(2), but O_MAYEXEC targets a
different and complementary use case. See
https://lore.kernel.org/lkml/1e2f6913-42f2-3578-28ed-567f6a4bdda1@xxxxxxxxxxx/
But again, see the above comment about the rationale for system-wide
policy management.