On 11/15/21 10:28 AM, Miklos Szeredi wrote: > On Sat, 13 Nov 2021 at 00:29, Dave Marchevsky <davemarchevsky@xxxxxx> wrote: > >>> If your tracing daemon runs in init_user_ns with CAP_SYS_ADMIN why can't >>> it simply use a helper process/thread to >>> setns(userns_fd/pidfd, CLONE_NEWUSER) >>> to the target userns? This way we don't need to special-case >>> init_user_ns at all. >> >> helper process + setns could work for my usecase. But the fact that there's no >> way to say "I know what I am about to do is potentially stupid and dangerous, >> but I am root so let me do it", without spawning a helper process in this case, >> feels like it'll result in special-case userspace workarounds for anyone doing >> symbolication of backtraces. > > Note: any mechanism that grants filesystem access to users that have > higher privileges than the daemon serving the filesystem will > potentially open DoS attacks against the higher privilege task. This > would be somewhat mitigated if the filesystem is only mounted in a > private mount namespace, but AFAICS that's not guaranteed. > > The above obviously applies to your original patch but it also applies > to any other mechanism where the high privilege user doesn't > explicitly acknowledge and accept the consequences. IOW granting the > exception has to be initiated by the high privleged user. > > Thanks, > Miklos > Sorry to ressurect this old thread. My proposed alternate approach of "special ioctl to grant exception to descendant userns check" proved unnecessarily complex: ioctls also go through fuse_allow_current_process check, so a special carve-out would be necessary for in both ioctl and fuse_permission check in order to make it possible for non-descendant-userns user to opt in to exception. How about a version of this patch with CAP_DAC_READ_SEARCH check? This way there's more of a clear opt-in vs CAP_SYS_ADMIN. FWIW we've been running CAP_SYS_ADMIN version of this patch internally and can confirm it fixes tracing tools' ability to symbolicate binaries in FUSE. > >> >> e.g. perf will have to add some logic: "did I fail >> to grab this exe that some process had mapped? Is it in a FUSE mounted by some >> descendant userns? let's fork a helper process..." Not the end of the world, >> but unnecessary complexity nonetheless. >> >> That being said, I agree that this patch's special-casing of init_user_ns is >> hacky. What do you think about a more explicit and general "let me do this >> stupid and dangerous thing" mechanism - perhaps a new struct fuse_conn field >> containing a set of exception userns', populated with ioctl or similar. > > > >> >>> >>>> >>>> Note: I was unsure whether CAP_SYS_ADMIN or CAP_SYS_PTRACE was the best >>>> choice of capability here. Went with the former as it's checked >>>> elsewhere in fs/fuse while CAP_SYS_PTRACE isn't. >>>> >>>> fs/fuse/dir.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c >>>> index 0654bfedcbb0..2524eeb0f35d 100644 >>>> --- a/fs/fuse/dir.c >>>> +++ b/fs/fuse/dir.c >>>> @@ -1134,7 +1134,7 @@ int fuse_allow_current_process(struct fuse_conn *fc) >>>> const struct cred *cred; >>>> >>>> if (fc->allow_other) >>>> - return current_in_userns(fc->user_ns); >>>> + return current_in_userns(fc->user_ns) || capable(CAP_SYS_ADMIN); >>>> >>>> cred = current_cred(); >>>> if (uid_eq(cred->euid, fc->user_id) && >>>> -- >>>> 2.30.2 >>>> >>>> >>