On Sat, Oct 22, 2022 at 6:22 AM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > On Fri, Oct 21, 2022 at 09:05:26AM -0700, Andrii Nakryiko wrote: > > On Thu, Oct 20, 2022 at 1:14 PM Dave Marchevsky <davemarchevsky@xxxxxx> wrote: > > > > > > This is a followup to a previous commit of mine [0], which added the > > > allow_sys_admin_access && capable(CAP_SYS_ADMIN) check. This patch > > > rearranges the order of checks in fuse_allow_current_process without > > > changing functionality. > > > > > > [0] added allow_sys_admin_access && capable(CAP_SYS_ADMIN) check to the > > > beginning of the function, with the reasoning that > > > allow_sys_admin_access should be an 'escape hatch' for users with > > > CAP_SYS_ADMIN, allowing them to skip any subsequent checks. > > > > > > However, placing this new check first results in many capable() calls when > > > allow_sys_admin_access is set, where another check would've also > > > returned 1. This can be problematic when a BPF program is tracing > > > capable() calls. > > > > > > At Meta we ran into such a scenario recently. On a host where > > > allow_sys_admin_access is set but most of the FUSE access is from > > > processes which would pass other checks - i.e. they don't need > > > CAP_SYS_ADMIN 'escape hatch' - this results in an unnecessary capable() > > > call for each fs op. We also have a daemon tracing capable() with BPF and > > > doing some data collection, so tracing these extraneous capable() calls > > > has the potential to regress performance for an application doing many > > > FUSE ops. > > > > > > So rearrange the order of these checks such that CAP_SYS_ADMIN 'escape > > > hatch' is checked last. Previously, if allow_other is set on the > > > fuse_conn, uid/gid checking doesn't happen as current_in_userns result > > > is returned. It's necessary to add a 'goto' here to skip past uid/gid > > > check to maintain those semantics here. > > > > > > [0]: commit 9ccf47b26b73 ("fuse: Add module param for CAP_SYS_ADMIN access bypassing allow_other") > > > > > > Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx> > > > Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > Cc: Christian Brauner <brauner@xxxxxxxxxx> > > > --- > > > > LGTM! > > > > Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > > > > > But I would also be curious to hear from Miklos or others whether > > skipping uid/gid check if fc->allow_other is true wa an intentional > > logic, oversight, or just doesn't matter. Because doing: > > Originally, setting fc->allow_other granted access to everyone skipping > all further permission checks. > > When Seth (Cced) made it possible to mount fuse in userns > fc->allow_other needed to be restricted to callers who are in the > superblock's userns or a descendant of it. The reason is that an > unprivileged user can mount a fuse filesystem with fc->allow_other > turned on. But then the user could mess with processes outside its > userns when they access the filesystem. > > Without fc->allow_other it doesn't matter what userns the filesystem is > accessed from as long as the uidgid is permissible. This could e.g., > happen if you unshare a userns but dont set{g,u}id. > > In general, I see two obvious permission models. In the first model we > restrict access to the owner of the mount by uidgid but also require > callers to be within the userns hierarchy. If allow_other is specified > the requirement would be relaxed to only require the caller to be within > the userns hierarchy. That would make the permission checking > consistent. > > The second permission model would only require permissible uidgid by > default and for allow other either permissible uidgid or for the caller > to be within the userns hierarchy (Which is what you're asking about > afaiu.). > Great, thanks for explaining! It does feel the latter is a more natural behavior, but I was just mostly curious, don't really have any strong opinion either way. > I don't see any obvious reasons why this wouldn't work from a security > perspective; maybe Seth has additional insights. The problem however is > that it would be a non-trivial change for containers to lift the > restriction now. It is quite useful to be able to mount a fuse > filesystem with allow_other and not have it accessible by anything > outside your userns hierarchy. So I wouldn't like to see that changed. > > If so we should probably make it yet another fuse module option or sm. > But in any case that should be a separate patch with a proper > justification why this semantic change is needed other than that it > simplifies the logic.