On Tue, Mar 29, 2022 at 10:11:09AM +1100, Dave Chinner wrote: > On Mon, Mar 28, 2022 at 03:44:52PM -0700, Darrick J. Wong wrote: > > On Thu, Mar 24, 2022 at 11:20:58AM +1100, Dave Chinner wrote: > > > xfs_iflush_abort( > > > struct xfs_inode *ip) > > > { > > > struct xfs_inode_log_item *iip = ip->i_itemp; > > > - struct xfs_buf *bp = NULL; > > > + struct xfs_buf *bp; > > > > > > - if (iip) { > > > - /* > > > - * Clear the failed bit before removing the item from the AIL so > > > - * xfs_trans_ail_delete() doesn't try to clear and release the > > > - * buffer attached to the log item before we are done with it. > > > - */ > > > - clear_bit(XFS_LI_FAILED, &iip->ili_item.li_flags); > > > - xfs_trans_ail_delete(&iip->ili_item, 0); > > > + if (!iip) { > > > + /* clean inode, nothing to do */ > > > + xfs_iflags_clear(ip, XFS_IFLUSHING); > > > + return; > > > + } > > > + > > > + /* > > > + * Remove the inode item from the AIL before we clear it's internal > > > > Nit: "it's" is a contraction, "its" is possessive. > > > > > + * state. Whilst the inode is in the AIL, it should have a valid buffer > > > + * pointer for push operations to access - it is only safe to remove the > > > + * inode from the buffer once it has been removed from the AIL. > > > + * > > > + * We also clear the failed bit before removing the item from the AIL > > > + * as xfs_trans_ail_delete()->xfs_clear_li_failed() will release buffer > > > + * references the inode item owns and needs to hold until we've fully > > > + * aborted the inode log item and detatched it from the buffer. > > > > Nit: detached > > > > > + */ > > > + clear_bit(XFS_LI_FAILED, &iip->ili_item.li_flags); > > > > I wonder, is there any chance the AIL will stumble onto the inode item > > right here? > > It can, but now it will fail to get the buffer lock because we > currently hold it and so can't do anything writeback related with > the inode item regardless of whether this bit is set or not. i.e. > This patch actually fixes the race you are refering to.... Ok. I was thinking that these changes would eliminate that possibility, but wasn't 100% sure. > > > + xfs_trans_ail_delete(&iip->ili_item, 0); > > > + > > > + /* > > > + * Capture the associated buffer and lock it if the caller didn't > > > + * pass us the locked buffer to begin with. > > > > I agree that we're capturing the buffer here, but this function is not > > locking the buffer since the comment says that the caller has to hold > > the buffer lock already, correct? And AFAICT from looking at all the > > callers, they all hold the buffer locked, like the comment requires. > > I thought I killed that comment - you noted it was incorrect in > the previous iteration. I'll kill it properly when I update the > spulling misteaks above. hehehe. --D > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx