On Thu, Nov 3, 2022 at 10:12 AM Christian Brauner <brauner@xxxxxxxxxx> wrote: > On Thu, Nov 03, 2022 at 10:04:25AM +0100, Ondrej Mosnacek wrote: > > On Wed, Nov 2, 2022 at 7:25 PM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > On Mon, Sep 05, 2022 at 05:30:36PM +0200, Christian Brauner wrote: > > > > On Mon, Sep 05, 2022 at 12:15:01PM +0200, Ondrej Mosnacek wrote: > > > > > On Mon, Sep 5, 2022 at 11:08 AM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > > On Thu, Sep 01, 2022 at 05:26:30PM +0200, Ondrej Mosnacek wrote: > > > > > > > The goal of these patches is to avoid calling capable() unconditionally > > > > > > > in simple_xattr_list(), which causes issues under SELinux (see > > > > > > > explanation in the second patch). > > > > > > > > > > > > > > The first patch tries to make this change safer by converting > > > > > > > simple_xattrs to use the RCU mechanism, so that capable() is not called > > > > > > > while the xattrs->lock is held. I didn't find evidence that this is an > > > > > > > issue in the current code, but it can't hurt to make that change > > > > > > > either way (and it was quite straightforward). > > > > > > > > > > > > Hey Ondrey, > > > > > > > > > > > > There's another patchset I'd like to see first which switches from a > > > > > > linked list to an rbtree to get rid of performance issues in this code > > > > > > that can be used to dos tmpfs in containers: > > > > > > > > > > > > https://lore.kernel.org/lkml/d73bd478-e373-f759-2acb-2777f6bba06f@xxxxxxxxxx > > > > > > > > > > > > I don't think Vasily has time to continue with this so I'll just pick it > > > > > > up hopefully this or the week after LPC. > > > > > > > > > > Hm... does rbtree support lockless traversal? Because if not, that > > > > > > > > The rfc that Vasily sent didn't allow for that at least. > > > > > > > > > would make it impossible to fix the issue without calling capable() > > > > > inside the critical section (or doing something complicated), AFAICT. > > > > > Would rhashtable be a workable alternative to rbtree for this use > > > > > case? Skimming <linux/rhashtable.h> it seems to support both lockless > > > > > lookup and traversal using RCU. And according to its manpage, > > > > > *listxattr(2) doesn't guarantee that the returned names are sorted. > > > > > > > > I've never used the rhashtable infrastructure in any meaningful way. All > > > > I can say from looking at current users that it looks like it could work > > > > well for us here: > > > > > > > > struct simple_xattr { > > > > struct rhlist_head rhlist_head; > > > > char *name; > > > > size_t size; > > > > char value[]; > > > > }; > > > > > > > > static const struct rhashtable_params simple_xattr_rhashtable = { > > > > .head_offset = offsetof(struct simple_xattr, rhlist_head), > > > > .key_offset = offsetof(struct simple_xattr, name), > > > > > > > > or sm like this. > > > > > > I have a patch in rough shape that converts struct simple_xattr to use > > > an rhashtable: > > > > > > https://gitlab.com/brauner/linux/-/commits/fs.xattr.simple.rework/ > > > > > > Light testing, not a lot useful comments and no meaningful commit > > > message as of yet but I'll get to that. > > > > Looks mostly good at first glance. I left comments for some minor > > stuff I noticed. > > > > > Even though your issue is orthogonal to the performance issues I'm > > > trying to fix I went back to your patch, Ondrej to apply it on top. > > > But I think it has one problem. > > > > > > Afaict, by moving the capable() call from the top of the function into > > > the actual traversal portion an unprivileged user can potentially learn > > > whether a file has trusted.* xattrs set. At least if dmesg isn't > > > restricted on the kernel. That may very well be the reason why the > > > capable() call is on top. > > > > Technically it would be possible, for example with SELinux if the > > audit daemon is dead. Not a likely situation, but I agree it's better > > to be safe. > > > > > (Because the straightforward fix for this would be to just call > > > capable() a single time if at least one trusted xattr is encountered and > > > store the result. That's pretty easy to do by making turning the trusted > > > variable into an int, setting it to -1, and only if it's -1 and a > > > trusted xattr has been found call capable() and store the result.) > > > > That would also run into the conundrum of holding a lock while > > (potentially) calling into the LSM subsystem. And would it even fix > > the information leak? Unless I'm missing something it would only > > prevent a leak of the trusted xattr count, but not the presence of any > > trusted xattr. > > No it wouldn't. I just meant this to illustrate that with your patch we > could've made it so that capable() would've only been called once. > > > > > > One option to fix all of that is to switch simple_xattr_list() to use > > > > > > ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN) > > > > > > which doesn't generate an audit event. > > > > > > I think this is even the correct thing to do as listing xattrs isn't a > > > targeted operation. IOW, if the the user had used getxattr() to request > > > a trusted.* xattr then logging a denial makes sense as the user > > > explicitly wanted to retrieve a trusted.* xattr. But if the user just > > > requested to list all xattrs then silently skipping trusted without > > > logging an explicit denial xattrs makes sense. > > > > > > Does that sound acceptable? > > > > Yes, I can't see any reason why that wouldn't be the best solution. > > Why haven't I thought of that? :) > > > > I guess you will want to submit a patch for it along with your > > rhashtable patch to avoid a conflict? Or would you like me to submit > > it separately? > > I think you can send a patch for this separately as we don't need to > massage the data structure for this. Ok, will do. > I think we can reasonably give this a > > Fixes: 38f38657444d ("xattr: extract simple_xattr code from tmpfs") # no backport > > But note the "# no backport" as imho it isn't worth backporting this to > older kernels unless that's really desirable. Actually, it would be valuable to have it backported to linux-stable at least, since we have users encountering this on Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=2122888 In the end it's up to the backporter to assess each commit, but at least I wouldn't want to outright discourage the backport in the commit message. -- Ondrej Mosnacek Senior Software Engineer, Linux Security - SELinux kernel Red Hat, Inc.