On Thu, Feb 22, 2024 at 12:10:32PM +1100, Dave Chinner wrote: > On Wed, Feb 21, 2024 at 03:25:36PM -0800, Darrick J. Wong wrote: > > On Thu, Feb 22, 2024 at 09:47:23AM +1100, Dave Chinner wrote: > > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > syzbot reported an ext4 panic during a page fault where found a > > > journal handle when it didn't expect to find one. The structure > > > it tripped over had a value of 'TRAN' in the first entry in the > > > structure, and that indicates it tripped over a struct xfs_trans > > > instead of a jbd2 handle. > > > > > > The reason for this is that the page fault was taken during a > > > copy-out to a user buffer from an xfs bulkstat operation. XFS uses > > > an "empty" transaction context for bulkstat to do automated metadata > > > buffer cleanup, and so the transaction context is valid across the > > > copyout of the bulkstat info into the user buffer. > > > > > > We are using empty transaction contexts like this in XFS in more > > > places to reduce the risk of failing to release objects we reference > > > during the operation, especially during error handling. Hence we > > > really need to ensure that we can take page faults from these > > > contexts without leaving landmines for the code processing the page > > > fault to trip over. > > > > > > We really only use current->journal_info for a single warning check > > > in xfs_vm_writepages() to ensure we aren't doing writeback from a > > > transaction context. Writeback might need to do allocation, so it > > > can need to run transactions itself. Hence it's a debug check to > > > warn us that we've done something silly, and largely it is not all > > > that useful. > > > > > > So let's just remove all the use of current->journal_info in XFS and > > > get rid of all the potential issues from nested contexts where > > > current->journal_info might get misused by another filesytsem > > > context. > > > > I wonder if this is too much, though? > > We ran XFS for 15+ years without setting current->journal_info, so I > don't see it as a necessary feature... Indeed, I don't see it as a necessary feature for XFS, merely a "bs coming in from other parts of the kernel" detector. And boy howdy did I ever find bs. > > Conceptually, I'd rather we set current->journal_info to some random > > number whenever we start a !NO_WRITECOUNT (aka a non-empty) transaction > > and clear it during transaction commit/cancel. That way, *we* can catch > > the case where some other filesystem starts a transaction and > > accidentally bounces into an updating write fault inside XFS. > > I could just leave the ASSERT(current->journal_info == NULL); in > xfs_trans_set_context() and we would catch that case. But, really, > having a page fault from some other filesystem bounce into XFS where > we then have to run a transaction isn't a bug at all. > > The problem is purely that we now have two different contexts that > now think they own current->journal_info. IOWs, no filesystem can > allow page faults while current->journal_info is set by the > filesystem because the page fault processing might use > current->journal_info itself. > > If we end up with nested XFS transactions from a page fault whilst > holding an empty transaction, then it isn't an issue as the outer > transaction does not hold a log reservation. The only problem that > might occur is a deadlock if the page fault tries to take the same > locks the upper context holds, but that's not a problem that setting > current->journal_info would solve, anyway... This, however, is a much better justification (IMHO) for removing the places where xfs touches journal_info. Would you mind adding that to the commit message? > Hence if XFS doesn't use current->journal_info, then we just don't > care what context we are running the transaction in, above or below > the fault. > > > That might be outweighed by the weird semantics of ext4 where the > > zeroness of journal_info changes its behavior in ways I don't want to > > understand. Thoughts? > > That's exactly the problem we're trying to avoid. Either we don't > allow faults in (empty) transaction context, or we don't use > current->journal_info. I prefer the latter as it gives us much more > implementation freedom with empty transaction contexts.. Heh. I also found much much more bs and shenanigans of exactly the type you'd expect from a global(ish) void pointer. With the commit message amended, you can add: Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> --D > -Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx >