On Fri, May 19, 2017 at 10:22:27AM +1000, Dave Chinner wrote: > On Wed, May 17, 2017 at 12:41:11PM +0200, Carlos Maiolino wrote: > > Hi Dave, > > > > On Wed, May 17, 2017 at 10:57:49AM +1000, Dave Chinner wrote: > > > On Thu, May 11, 2017 at 03:57:33PM +0200, Carlos Maiolino wrote: > > > > +xfs_inode_item_error( > > > > + struct xfs_log_item *lip, > > > > + unsigned int bflags) > > > > +{ > > > > + > > > > + /* > > > > + * The buffer writeback containing this inode has been failed > > > > + * mark it as failed and unlock the flush lock, so it can be retried > > > > + * again > > > > + */ > > > > + if (bflags & XBF_WRITE_FAIL) > > > > + lip->li_flags |= XFS_LI_FAILED; > > > > +} > > > > > > I'm pretty sure we can't modify the log item state like this in > > > IO context because the parent object is not locked by this context. > > > We can modify the state during IO submission because we hold the > > > parent object lock, but we don't hold it here. > > > > > > i.e. we can race with other log item state changes done during a > > > concurrent transaction (e.g. marking the log item dirty in > > > xfs_trans_log_inode()). > > > > > > > + > > > > + if (lip->li_flags & XFS_LI_FAILED) > > > > + lip->li_flags &= XFS_LI_FAILED; > > > > + lip = next; > > > > + } > > > > > > Same race here - we hold the lock for this inode, but not all the > > > other inodes linked to the buffer. > > > > > > This implies that lip->li_flags needs to use atomic bit updates so > > > there are no RMW update races like this. > > > > > > > I believe refers to both cases above, while setting and removing the flag? > > > > I can simply use set_bit() and clear_bit() here, if this is enough, although, to > > use them I'll need to change lip->li_flags size to `unsigned long` if people are > > ok with that. > > Yup, and you need to change every other flag in li_flags to use > atomic bitops, too. That also means using test_bit() for value > checks, test_and_clear_bit() instead of "if (bit) {clear bit}" and > so on. > > i.e. it's no good just making one flag atomic and all the other > updates using RMW accesses to change the state because that doesn't > close the update race conditions. > Am I following this correctly that the only potential race here looks like it is with XFS_LI_ABORTED (the dirty transaction cancel case, specifically)? Note that functions like xfs_trans_log_inode() set dirty state on the log item descriptor, which is associated with the transaction, not the log item itself. The only other log item flag is LI_IN_AIL and that is going to remain set during the lifetime of LI_FAILED by design. Maybe I'm missing something else going on here.. But otherwise given the above, that this is primarily I/O error path code, and that we presumably already have serialization mechanisms in place in the form of parent object locks, why not just grab the inode lock in the appropriate places? Brian > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html