On Thu, 2024-08-08 at 21:22 -0400, Paul Moore wrote:
> On Thu, Aug 8, 2024 at 8:33 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > On Thu, 2024-08-08 at 20:28 -0400, Paul Moore wrote:
> > > On Thu, Aug 8, 2024 at 7:43 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > On Thu, 2024-08-08 at 17:12 -0400, Paul Moore wrote:
> > > > > On Thu, Aug 8, 2024 at 1:11 PM Jan Kara <jack@xxxxxxx> wrote:
> > > > > > On Thu 08-08-24 12:36:07, Christian Brauner wrote:
> > > > > > > On Wed, Aug 07, 2024 at 10:36:58AM GMT, Jeff Layton wrote:
> > > > > > > > On Wed, 2024-08-07 at 16:26 +0200, Christian Brauner wrote:
> > > > > > > > > > +static struct dentry *lookup_fast_for_open(struct nameidata *nd, int open_flag)
> > > > > > > > > > +{
> > > > > > > > > > +	struct dentry *dentry;
> > > > > > > > > > +
> > > > > > > > > > +	if (open_flag & O_CREAT) {
> > > > > > > > > > +		/* Don't bother on an O_EXCL create */
> > > > > > > > > > +		if (open_flag & O_EXCL)
> > > > > > > > > > +			return NULL;
> > > > > > > > > > +
> > > > > > > > > > +		/*
> > > > > > > > > > +		 * FIXME: If auditing is enabled, then we'll have to unlazy to
> > > > > > > > > > +		 * use the dentry. For now, don't do this, since it shifts
> > > > > > > > > > +		 * contention from parent's i_rwsem to its d_lockref spinlock.
> > > > > > > > > > +		 * Reconsider this once dentry refcounting handles heavy
> > > > > > > > > > +		 * contention better.
> > > > > > > > > > +		 */
> > > > > > > > > > +		if ((nd->flags & LOOKUP_RCU) && !audit_dummy_context())
> > > > > > > > > > +			return NULL;
> > > > > > > > >
> > > > > > > > > Hm, the audit_inode() on the parent is done independent of whether the
> > > > > > > > > file was actually created or not.
> > > > > > > > > But the audit_inode() on the file itself is only done
> > > > > > > > > when it was actually created. Imho, there's no need
> > > > > > > > > to do audit_inode() on the parent when we immediately find that file
> > > > > > > > > already existed. If we accept that then this makes the change a lot
> > > > > > > > > simpler.
> > > > > > > > >
> > > > > > > > > The inconsistency would partially remain though. When the file doesn't
> > > > > > > > > exist audit_inode() on the parent is called but by the time we've
> > > > > > > > > grabbed the inode lock someone else might already have created the file
> > > > > > > > > and then again we wouldn't audit_inode() on the file but we would have
> > > > > > > > > on the parent.
> > > > > > > > >
> > > > > > > > > I think that's fine. But if that's bothersome the more aggressive thing
> > > > > > > > > to do would be to pull that audit_inode() on the parent further down
> > > > > > > > > after we created the file. Imho, that should be fine?...
> > > > > > > > >
> > > > > > > > > See
> > > > > > > > > https://gitlab.com/brauner/linux/-/commits/vfs.misc.jeff/?ref_type=heads
> > > > > > > > > for a completely untested draft of what I mean.
> > > > > > > >
> > > > > > > > Yeah, that's a lot simpler. That said, my experience when I've worked
> > > > > > > > with audit in the past is that people who are using it are _very_
> > > > > > > > sensitive to changes of when records get emitted or not. I don't like
> > > > > > > > this, because I think the rules here are ad-hoc and somewhat arbitrary,
> > > > > > > > but keeping everything working exactly the same has been my MO whenever
> > > > > > > > I have to work in there.
> > > > > > > >
> > > > > > > > If a certain access pattern suddenly generates a different set of
> > > > > > > > records (or some are missing, as would be in this case), we might get
> > > > > > > > bug reports about this. I'm ok with simplifying this code in the way
> > > > > > > > you suggest, but we may want to do it in a patch on top of mine, to
> > > > > > > > make it simple to revert later if that becomes necessary.
> > > > > > >
> > > > > > > Fwiw, even with the rearranged checks in v3 of the patch audit records
> > > > > > > will be dropped because we may find a positive dentry but the path may
> > > > > > > have trailing slashes. At that point we just return without audit
> > > > > > > whereas before we always would've done that audit.
> > > > > > >
> > > > > > > Honestly, we should move that audit event as right now it's just really
> > > > > > > weird and see if that works. Otherwise the change is somewhat horrible
> > > > > > > complicating the already convoluted logic even more.
> > > > > > >
> > > > > > > So I'm appending the patches that I have on top of your patch in
> > > > > > > vfs.misc. Can you (other as well ofc) take a look and tell me whether
> > > > > > > that's not breaking anything completely other than later audit events?
> > > > > >
> > > > > > The changes look good as far as I'm concerned but let me CC audit guys if
> > > > > > they have some thoughts regarding the change in generating audit event for
> > > > > > the parent. Paul, does it matter if open(O_CREAT) doesn't generate audit
> > > > > > event for the parent when we are failing open due to trailing slashes in
> > > > > > the pathname?
> > > > > > Essentially we are speaking about moving:
> > > > > >
> > > > > > 	audit_inode(nd->name, dir, AUDIT_INODE_PARENT);
> > > > > >
> > > > > > from open_last_lookups() into lookup_open().
> > > > >
> > > > > Thanks for adding the audit mailing list to the CC, Jan. I would ask
> > > > > for others to do the same when discussing changes that could impact
> > > > > audit (similar requests for the LSM framework, SELinux, etc.).
> > > > >
> > > > > The inode/path logging in audit is ... something. I have a
> > > > > longstanding todo item to go revisit the audit inode logging, both to
> > > > > fix some known bugs, and see what we can improve (I'm guessing quite a
> > > > > bit). Unfortunately, there is always something else which is burning
> > > > > a little bit hotter and I haven't been able to get to it yet.
> > > > >
> > > >
> > > > It is "something" alright. The audit logging just happens at strange
> > > > and inconvenient times vs. what else we're trying to do wrt pathwalking
> > > > and such. In particular here, the fact __audit_inode can block is what
> > > > really sucks.
> > > >
> > > > Since we're discussing it...
> > > >
> > > > ISTM that the inode/path logging here is something like a tracepoint.
> > > > In particular, we're looking to record a specific set of information at
> > > > specific points in the code. One of the big differences between them
> > > > however is that tracepoints don't block. The catch is that we can't
> > > > just drop messages if we run out of audit logging space, so that would
> > > > have to be handled reasonably.
> > >
> > > Yes, the buffer allocation is the tricky bit.
> > > Audit does preallocate some structs for tracking names which ideally
> > > should handle the vast majority of the cases, but yes, we need
> > > something to handle all of the corner cases too without having to
> > > resort to audit_panic().
> > >
> > > > I wonder if we could leverage the tracepoint infrastructure to help us
> > > > record the necessary info somehow? Copy the records into a specific
> > > > ring buffer, and then copy them out to the audit infrastructure in
> > > > task_work?
> > >
> > > I believe using task_work will cause a number of challenges for the
> > > audit subsystem as we try to bring everything together into a single
> > > audit event. We've had a lot of problems with io_uring doing similar
> > > things, some of which are still unresolved.
> > >
> > > > I don't have any concrete ideas here, but the path/inode audit code has
> > > > been a burden for a while now and it'd be good to think about how we
> > > > could do this better.
> > >
> > > I've got some grand ideas on how to cut down on a lot of our
> > > allocations and string generation in the critical path, not just with
> > > the inodes, but with audit records in general. Sadly I just haven't
> > > had the time to get to any of it.
> > >
> > > > > The general idea with audit is that you want to record the information
> > > > > both on success and failure. It's easy to understand the success
> > > > > case, as it is a record of what actually happened on the system, but
> > > > > you also want to record the failure case as it can provide some
> > > > > insight on what a process/user is attempting to do, and that can be
> > > > > very important for certain classes of users.
> > > > > I haven't dug into the patches in Christian's tree, but in general
> > > > > I think Jeff's guidance about not changing what is recorded in the
> > > > > audit log is probably good advice (there will surely be exceptions
> > > > > to that, but it's still good guidance).
> > > > >
> > > >
> > > > In this particular case, the question is:
> > > >
> > > > Do we need to emit a AUDIT_INODE_PARENT record when opening an existing
> > > > file, just because O_CREAT was set? We don't emit such a record when
> > > > opening without O_CREAT set.
> > >
> > > I'm not as current on the third-party security requirements as I used
> > > to be, but I do know that oftentimes when a file is created the parent
> > > directory is an important bit of information to have in the audit
> > > log.
> > >
> >
> > Right. We'd still have that here since we have to unlazy to actually
> > create the file.
> >
> > The question here is about the case where O_CREAT is set, but the file
> > already exists. Nothing is being created in that case, so do we need to
> > emit an audit record for the parent?
>
> As long as the full path information is present in the existing file's
> audit record it should be okay.
>

O_CREAT is ignored when the dentry already exists, so doing the same
thing that we do when O_CREAT isn't set seems reasonable. We do call
this in do_open, which would apply in this case:

	if (!(file->f_mode & FMODE_CREATED))
		audit_inode(nd->name, nd->path.dentry, 0);

That should have the necessary path info. If that's the case, then I
think Christian's cleanup series on top of mine should be OK. I think
that the only thing that would be missing is the AUDIT_INODE_PARENT
record for the directory in the case where the dentry already exists,
which should be superfluous.

ISTR that Red Hat has a pretty extensive testsuite for audit.
We might want to get them to run their tests on Christian's changes to
be sure there are no surprises, if they are amenable.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
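
P.S. To make the cases discussed above concrete, here is a toy userspace model in C of which records one open() attempt would produce under each placement of the AUDIT_INODE_PARENT call. It is only a sketch, not kernel code: struct audit_records, enum parent_audit_site and model_open() are hypothetical names made up for illustration. It assumes the relocated call is reached only when the fast lookup does not find an existing dentry, and it ignores trailing slashes, racing creates and error paths.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is a userspace toy, not kernel code. */

/* Where the audit_inode(..., AUDIT_INODE_PARENT) call sits. */
enum parent_audit_site {
	SITE_OPEN_LAST_LOOKUPS,	/* placement before the cleanup */
	SITE_LOOKUP_OPEN,	/* placement proposed in the thread */
};

/* Which records a single open() attempt produces in this model. */
struct audit_records {
	bool parent;	/* AUDIT_INODE_PARENT record for the directory */
	bool file;	/* record for the already-existing file, from do_open */
};

struct audit_records model_open(bool o_creat, bool dentry_exists,
				enum parent_audit_site site)
{
	struct audit_records rec = { .parent = false, .file = false };

	if (o_creat) {
		if (site == SITE_OPEN_LAST_LOOKUPS)
			rec.parent = true;	/* emitted whether or not we create */
		else
			rec.parent = !dentry_exists;	/* assumed: only on a fast-lookup miss */
	}

	/*
	 * Mirrors the do_open check: FMODE_CREATED is clear exactly when the
	 * dentry already existed. A missing file without O_CREAT is assumed
	 * to fail before reaching this point.
	 */
	rec.file = dentry_exists;

	return rec;
}
```

In this model, model_open(true, true, SITE_LOOKUP_OPEN) still sets .file, so the existing file's record (with its path info) is kept, and the only difference from SITE_OPEN_LAST_LOOKUPS is the missing parent record.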