Re: [PATCH 0/2] fs: fix capable() call in simple_xattr_list()

Ondrej Mosnacek <omosnace@xxxxxxxxxx> · Thu, 3 Nov 2022 10:04:25 +0100

On Wed, Nov 2, 2022 at 7:25 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote:
> > On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote:
> > > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote:
> > > > > The goal of these patches is to avoid calling capable() unconditionally
> > > > > in simple_xattr_list(), which causes issues under SELinux (see
> > > > > explanation in the second patch).
> > > > >
> > > > > The first patch tries to make this change safer by converting
> > > > > simple_xattrs to use the RCU mechanism, so that capable() is not called
> > > > > while the xattrs->lock is held. I didn't find evidence that this is an
> > > > > issue in the current code, but it can't hurt to make that change
> > > > > either way (and it was quite straightforward).
> > > >
> > > > Hey Ondrej,
> > > >
> > > > There's another patchset I'd like to see first which switches from a
> > > > linked list to an rbtree to get rid of performance issues in this code
> > > > that can be used to DoS tmpfs in containers:
> > > >
> > > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@xxxxxxxxxx
> > > >
> > > > I don't think Vasily has time to continue with this so I'll just pick it
> > > > up hopefully this or the week after LPC.
> > >
> > > Hm... does rbtree support lockless traversal? Because if not, that
> >
> > The rfc that Vasily sent didn't allow for that at least.
> >
> > > would make it impossible to fix the issue without calling capable()
> > > inside the critical section (or doing something complicated), AFAICT.
> > > Would rhashtable be a workable alternative to rbtree for this use
> > > case? Skimming <linux/rhashtable.h> it seems to support both lockless
> > > lookup and traversal using RCU. And according to its manpage,
> > > *listxattr(2) doesn't guarantee that the returned names are sorted.
> >
> > I've never used the rhashtable infrastructure in any meaningful way. All
> > I can say from looking at current users is that it looks like it could
> > work well for us here:
> >
> > struct simple_xattr {
> >       struct rhlist_head rhlist_head;
> >       char *name;
> >       size_t size;
> >       char value[];
> > };
> >
> > static const struct rhashtable_params simple_xattr_rhashtable = {
> >       .head_offset = offsetof(struct simple_xattr, rhlist_head),
> >       .key_offset = offsetof(struct simple_xattr, name),
> >
> > or something like this.
>
> I have a patch in rough shape that converts struct simple_xattr to use
> an rhashtable:
>
> https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/
>
> Light testing, not a lot of useful comments and no meaningful commit
> message as of yet but I'll get to that.

Looks mostly good at first glance. I left comments for some minor
stuff I noticed.

> Even though your issue is orthogonal to the performance issues I'm
> trying to fix I went back to your patch, Ondrej, to apply it on top.
> But I think it has one problem.
>
> Afaict, by moving the capable() call from the top of the function into
> the actual traversal portion an unprivileged user can potentially learn
> whether a file has trusted.* xattrs set. At least if dmesg isn't
> restricted on the kernel. That may very well be the reason why the
> capable() call is on top.

Technically it would be possible, for example with SELinux if the
audit daemon is dead. Not a likely situation, but I agree it's better
to be safe.

> (Because the straightforward fix for this would be to just call
> capable() a single time if at least one trusted xattr is encountered and
> store the result. That's pretty easy to do by turning the trusted
> variable into an int, setting it to -1, and only if it's -1 and a
> trusted xattr has been found call capable() and store the result.)

That would also run into the conundrum of holding a lock while
(potentially) calling into the LSM subsystem. And would it even fix
the information leak? Unless I'm missing something it would only
prevent a leak of the trusted xattr count, but not the presence of any
trusted xattr.
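
To make the second point concrete, here is a rough userspace model of
the cached-result idea. It is not kernel code: mock_capable(),
audit_events and list_cached() are names I made up for illustration,
and the mock simply counts one "audit event" per call and always
denies. It does not model the locking concern at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter standing in for the audit log. */
static int audit_events;

/* Hypothetical stand-in for capable(): logs a denial and refuses. */
static bool mock_capable(void)
{
	audit_events++;		/* a denial would be logged here */
	return false;		/* model an unprivileged caller */
}

/* Returns how many names the caller would get to see. */
static int list_cached(const char *const *names, size_t n)
{
	int trusted = -1;	/* -1: not checked yet, 0: denied, 1: allowed */
	int visible = 0;

	assert(names != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (strncmp(names[i], "trusted.", 8) == 0) {
			/* Ask only once, on the first trusted.* name. */
			if (trusted == -1)
				trusted = mock_capable();
			if (!trusted)
				continue;
		}
		visible++;
	}
	return visible;
}
```

In this model the counter ends at 1 as soon as the list contains one
or more trusted.* names and stays at 0 otherwise, so the number of
trusted xattrs is hidden but their presence is not.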

> One option to fix all of that is to switch simple_xattr_list() to use
>
>         ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)
>
> which doesn't generate an audit event.
>
> I think this is even the correct thing to do as listing xattrs isn't a
> targeted operation. IOW, if the user had used getxattr() to request
> a trusted.* xattr then logging a denial makes sense as the user
> explicitly wanted to retrieve a trusted.* xattr. But if the user just
> requested to list all xattrs then silently skipping trusted xattrs
> without logging an explicit denial makes sense.
>
> Does that sound acceptable?

Yes, I can't see any reason why that wouldn't be the best solution.
Why haven't I thought of that? :)
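
Just to check my understanding, a userspace toy version of the
resulting flow might look like the sketch below. Again this is only a
model under my own assumptions: mock_capable_noaudit() and
list_noaudit() are made-up names, and the "privileged" parameter
stands in for whatever ns_capable_noaudit(&init_user_ns,
CAP_SYS_ADMIN) would return in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for ns_capable_noaudit(): it answers the
 * question but, unlike the audited variant, records nothing.
 */
static bool mock_capable_noaudit(bool privileged)
{
	return privileged;
}

/* Returns how many names the caller would get to see. */
static int list_noaudit(const char *const *names, size_t n, bool privileged)
{
	/* Checked once, up front, before any traversal or locking. */
	bool trusted = mock_capable_noaudit(privileged);
	int visible = 0;

	assert(names != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* Silently skip trusted.* names for unprivileged callers. */
		if (!trusted && strncmp(names[i], "trusted.", 8) == 0)
			continue;
		visible++;
	}
	return visible;
}
```

The check happens unconditionally and leaves no trace, so what an
unprivileged caller can observe does not depend on whether any
trusted.* xattr exists.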

I guess you will want to submit a patch for it along with your
rhashtable patch to avoid a conflict? Or would you like me to submit
it separately?

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.