On Wed, Apr 24, 2024 at 01:52:46PM +0300, Stas Sergeev wrote:
> This patch-set implements the OA2_INHERIT_CRED flag for openat2() syscall.
> It is needed to perform an open operation with the creds that were in
> effect when the dir_fd was opened. This allows the process to pre-open
> some dirs and switch eUID (and other UIDs/GIDs) to the less-privileged
> user, while still retaining the possibility to open/create files within
> the pre-opened directory set.
>
> The sand-boxing is security-oriented: symlinks leading outside of a
> sand-box are rejected. /proc magic links are rejected.
> The more detailed description (including security considerations)
> is available in the log messages of individual patches.
>
> Changes in v4:
> - add optimizations suggested by David Laight <David.Laight@xxxxxxxxxx>
> - move security checks to build_open_flags()
> - force RESOLVE_NO_MAGICLINKS as suggested by Andy Lutomirski <luto@xxxxxxxxxx>
>
> Changes in v3:
> - partially revert v2 changes to avoid overriding capabilities.
>   Only the bare minimum is overridden: fsuid, fsgid and group_info.
>   Document the fact the full cred override is unwanted, as it may
>   represent an unneeded security risk.
>
> Changes in v2:
> - capture full struct cred instead of just fsuid/fsgid.
>   Suggested by Stefan Metzmacher <metze@xxxxxxxxx>

This smells ripe enough to serve as an attack vector in non-obvious
ways. And in general this has the potential to confuse the hell out of
unsuspecting userspace. They can now suddenly get sent special-sauce
files such as this that they have no way of recognizing, as there's
neither an FMODE_* flag nor is the OA2_* flag recorded, so it's not
available in F_GETFL.

There's not even a way to restrict that new flag because no LSM ever
sees it. So that behavior might break LSM assumptions as well.

And it is effectively usable to steal credentials.
If process A opens a directory with uid/gid 0 and then sends that
directory fd via AF_UNIX or something to process B, then process B can
inherit the uid/gid of process A by specifying OA2_* with no way for
process A to prevent this - not even through an LSM.

The permission checking model that we have right now is already
baroque. I see zero reason to add more complexity for the sake of
"lightweight sandboxing". We have LSMs and namespaces for stuff like
this.

NAK.
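
For illustration only, the fd-passing step in the scenario above needs
nothing beyond what exists today. The following is a minimal sketch, not
taken from the patch set: it assumes Python 3.9+ on Linux, uses a
socketpair inside a single process as a stand-in for processes A and B,
and the helper name pass_directory_fd is hypothetical. It does not use
or emulate OA2_INHERIT_CRED; it only shows that the receiver ends up with
an ordinary directory fd whose F_GETFL result carries the usual open
flags and nothing about how the sender intended it to be used.

```python
import fcntl
import os
import socket


def pass_directory_fd(path="/"):
    """Send a directory fd over AF_UNIX (SCM_RIGHTS) and inspect the copy.

    Hypothetical helper for illustration; both socket ends live in one
    process here, standing in for "process A" and "process B".
    """
    sock_a, sock_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # "Process A": attach the directory fd as SCM_RIGHTS ancillary data.
        socket.send_fds(sock_a, [b"x"], [dir_fd])

        # "Process B": receive a new fd referring to the same open file.
        _data, fds, _msg_flags, _addr = socket.recv_fds(sock_b, 1, 1)
        received = fds[0]
        try:
            # Both descriptors resolve to the same device/inode pair.
            same_inode = os.path.sameopenfile(dir_fd, received)
            # All the receiver can query: access mode and status flags.
            status_flags = fcntl.fcntl(received, fcntl.F_GETFL)
        finally:
            os.close(received)
    finally:
        os.close(dir_fd)
        sock_a.close()
        sock_b.close()
    return same_inode, status_flags


if __name__ == "__main__":
    same_inode, status_flags = pass_directory_fd("/")
    print("same inode:", same_inode)
    print("access mode is O_RDONLY:",
          status_flags & os.O_ACCMODE == os.O_RDONLY)
```

Nothing in what the receiver can query says who opened the directory or
under which credentials, which is the point made above about userspace
having no way to recognize such an fd.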