On Wed, Jun 22, 2022 at 7:45 PM Darrick J. Wong <djwong@xxxxxxxxxx> wrote: > > On Fri, Jun 17, 2022 at 01:06:37PM +0300, Amir Goldstein wrote: > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > commit 5f9b4b0de8dc2fb8eb655463b438001c111570fe upstream. > > > > [backported from CIL scalability series for dependency] > > > > In doing an investigation into AIL push stalls, I was looking at the > > log force code to see if an async CIL push could be done instead. > > This lead me to xfs_log_force_lsn() and looking at how it works. > > > > xfs_log_force_lsn() is only called from inode synchronisation > > contexts such as fsync(), and it takes the ip->i_itemp->ili_last_lsn > > value as the LSN to sync the log to. This gets passed to > > xlog_cil_force_lsn() via xfs_log_force_lsn() to flush the CIL to the > > journal, and then used by xfs_log_force_lsn() to flush the iclogs to > > the journal. > > > > The problem is that ip->i_itemp->ili_last_lsn does not store a > > log sequence number. What it stores is passed to it from the > > ->iop_committing method, which is called by xfs_log_commit_cil(). > > The value this passes to the iop_committing method is the CIL > > context sequence number that the item was committed to. > > > > As it turns out, xlog_cil_force_lsn() converts the sequence to an > > actual commit LSN for the related context and returns that to > > xfs_log_force_lsn(). xfs_log_force_lsn() overwrites it's "lsn" > > variable that contained a sequence with an actual LSN and then uses > > that to sync the iclogs. > > > > This caused me some confusion for a while, even though I originally > > wrote all this code a decade ago. ->iop_committing is only used by > > a couple of log item types, and only inode items use the sequence > > number it is passed. > > > > Let's clean up the API, CIL structures and inode log item to call it > > a sequence number, and make it clear that the high level code is > > using CIL sequence numbers and not on-disk LSNs for integrity > > synchronisation purposes. > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > Reviewed-by: Allison Henderson <allison.henderson@xxxxxxxxxx> > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > This /looks/ right, but given the invasiveness of this patch, I'm > curious how much you had to change from upstream to get here? IIRC, it wasn't so bad, there may have been a lot of conflicts, but most of them were resolved by doing search&replace on the 5.10 code and it was also pretty clean and the compiler will catch errors if something that needed to be replaced wasn't in a conflicting hunk. > > Also, did you look at the other log fixes in 5.14 and decide none of > them were needed/worth the risk, or is that merely pending for the next > round? There are other log fixes pending, which look like they fix scary bugs. I just promoted this one to accommodate the joint 5.10/5.15 series. Here are the fixes pending in xfs-5.10.y-4: * 6e0d4f5c1a0f - (xfs-5.10.y-4) xfs: Enforce attr3 buffer recovery order * 610edc215903 - xfs: logging the on disk inode LSN can make it go backwards * da28279083ee - xfs: reset child dir '..' entry when unlinking child * 35efb2d1e3a7 - xfs: remove dead stale buf unpin handling code * fa7ae3691e6f - xfs: hold buffer across unpin and potential shutdown processing * 0f6d6b3e8da3 - xfs: force the log offline when log intent item recovery fails * 1883c074e795 - xfs: fix log intent recovery ENOSPC shutdowns when inactivating inodes I would be happy to get to post these. If you have time to ACK the 5.13 selection [1], I will start testing xfs-5.10.y-4. Thanks! Amir. [1] https://lore.kernel.org/linux-xfs/20220606160537.689915-1-amir73il@xxxxxxxxx/