On Sat, Nov 12, 2022 at 07:52:50AM +1100, Dave Chinner wrote:
> On Mon, Nov 07, 2022 at 10:36:48PM +0800, Long Li wrote:
> > The following error occurred during the fsstress test:
> >
> > XFS: Assertion failed: VFS_I(ip)->i_nlink >= 2, file: fs/xfs/xfs_inode.c, line: 2925
> >
> > The problem was that inode race condition causes incorrect i_nlink to be
> > written to disk, and then it is read into memory. Consider the following
> > call graph, inodes that are marked as both XFS_IFLUSHING and
> > XFS_IRECLAIMABLE, i_nlink will be reset to 1 and then restored to original
> > value in xfs_reinit_inode(). Therefore, the i_nlink of directory on disk
> > may be set to 1.
> >
> >   xfsaild
> >     xfs_inode_item_push
> >       xfs_iflush_cluster
> >         xfs_iflush
> >           xfs_inode_to_disk
> >
> >   xfs_iget
> >     xfs_iget_cache_hit
> >       xfs_iget_recycle
> >         xfs_reinit_inode
> >           inode_init_always
> >
> > So skip inodes that are being flushed and marked as XFS_IRECLAIMABLE, prevent
> > concurrent read and write to inodes.
>
> urk.
>
> xfs_reinit_inode() needs to hold the ILOCK_EXCL as it is changing
> internal inode state and can race with other RCU protected inode
> lookups. Have a look at what xfs_iflush_cluster() does - it
> grabs the ILOCK_SHARED while under rcu + ip->i_flags_lock, and so
> xfs_iflush/xfs_inode_to_disk() are protected from racing inode
> updates (during transactions) by that lock.
>
> Hence it looks to me that I_FLUSHING isn't the problem here - it's
> that we have a transient modified inode state in xfs_reinit_inode()
> that is externally visible...

Before xfs_reinit_inode() is called, XFS_IRECLAIM is set in ip->i_flags,
which looks like it should already prevent races with other RCU
protected inode lookups.

Would it be acceptable to instead not modify the on-disk values held in
the VFS inode in xfs_reinit_inode() at all? If so, the lock could be
avoided.

Thanks,
Long Li