On Thu, Apr 11, 2024 at 11:04:59AM +0200, Christian Brauner wrote:
> On Wed, Apr 10, 2024 at 07:39:49PM -0700, Linus Torvalds wrote:
> > On Wed, 10 Apr 2024 at 17:10, Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > +		if (flags & LOOKUP_DFD_MATCH_CREDS) {
> > > +			if (f.file->f_cred != current_cred() &&
> > > +			    !capable(CAP_DAC_READ_SEARCH)) {
> > > +				fdput(f);
> > > +				return ERR_PTR(-ENOENT);
> > > +			}
> > > +		}
> >
> > Side note: I suspect that this could possibly be relaxed further, by
> > making the rule be that if something has been explicitly opened to be
> > used as a path (ie O_PATH was used at open time), we can link to it
> > even across different credentials.
>
> I had a similar discussion a while back someone requested that we relax
> permissions so linkat can be used in containers. And I drafted the
> following patch back then:
>
> https://lore.kernel.org/all/20231113-undenkbar-gediegen-efde5f1c34bc@brauner
>
> IOW, I had intended to make this work with containers so that we check
> CAP_DAC_READ_SEARCH in the namespace of the opener of the file. My
> thinking had been that this can serve as a way to say "Hey, I could've
> opened this file in the openers namespace therefore let me make a path
> to it.". I didn't actually send it because I thought the original author
> would but imho, that would be a worthwhile addition to your patch if
> this makes sense...

For example, say someone opened an O_PATH fd in the initial user ns and
then sent that fd over an AF_UNIX socket to some other container. There,
ns_capable(f_cred->user_ns, CAP_DAC_READ_SEARCH) would always be false.
The other way around would work, though. Which imho is exactly what we
want to make such cross-container interactions with linkat() safe.

And this didn't aim to solve the problem of allowing unprivileged users
in the initial namespace to do linkat(), of course, which yours does.

Btw, I think we should try to avoid putting this into path_init() and
confine this to linkat() itself. The way I tried to do it was by
presetting a root for filename_lookup(). That means we also don't need a
LOOKUP_* flag for this, as this is mostly a linkat thing.

So maybe your suggestion combined with my own attempt would make this
work for unprivileged users and containers?

	if (f.file->f_cred != current_cred() &&
	    !ns_capable(f.file->f_cred->user_ns, CAP_DAC_READ_SEARCH))

Worst case we get a repeat of the revert and get to make this a 10 year
anniversary patch attempt?
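
To make the cross-namespace cases concrete, here is a rough userspace
model of the combined check. It is only a sketch and not kernel code:
struct user_ns, struct cred, model_ns_capable() and model_may_linkat()
are hypothetical stand-ins, and the capability walk is a simplified
approximation of how capabilities are resolved across user namespaces
(same namespace: look at the effective set; ancestor namespace: never
capable; owner of a direct child namespace: capable).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's user namespace. */
struct user_ns {
	const struct user_ns *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
	unsigned int owner_uid;		/* uid that created the namespace */
};

/* Hypothetical stand-in for the kernel's credentials. */
struct cred {
	const struct user_ns *user_ns;
	unsigned int euid;
	bool has_dac_read_search;	/* CAP_DAC_READ_SEARCH is effective */
};

/*
 * Simplified model of ns_capable(target, CAP_DAC_READ_SEARCH) for the
 * credentials in @c: walk from @target up towards the initial namespace.
 */
bool model_ns_capable(const struct cred *c, const struct user_ns *target)
{
	for (const struct user_ns *ns = target; ns != NULL; ns = ns->parent) {
		/* Reached the caller's own namespace: check its caps. */
		if (ns == c->user_ns)
			return c->has_dac_read_search;
		/* @target is not a descendant of the caller's namespace. */
		if (ns->level <= c->user_ns->level)
			return false;
		/* The owner of a direct child namespace has all caps in it. */
		if (ns->parent == c->user_ns && ns->owner_uid == c->euid)
			return true;
	}
	return false;
}

/*
 * Model of the combined check: linking is allowed if the caller uses the
 * very same credentials as the opener, or if the caller is capable in the
 * opener's user namespace.
 */
bool model_may_linkat(const struct cred *opener, const struct cred *caller)
{
	if (opener == caller)
		return true;
	return model_ns_capable(caller, opener->user_ns);
}
```

In this model an fd opened in the initial namespace and passed into a
container is refused, because the container's credentials are never
capable in an ancestor namespace. An fd opened inside the container and
used by a caller in the initial namespace that holds
CAP_DAC_READ_SEARCH is accepted, and a caller with the opener's own
credentials is accepted without any capability.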