On Sat, Sep 19, 2020 at 05:39:23AM -0400, Shijie Luo wrote: > There is a race condition between destroy_inode and writeback_sb_inodes, > thread-1 thread-2 > wb_workfn > writeback_inodes_wb > __writeback_inodes_wb > writeback_sb_inodes > wbc_attach_and_unlock_inode > iget_locked > destroy_inode > inode_detach_wb > inode->i_wb = NULL; > > inode_to_wb_and_lock_list > locked_inode_to_wb_and_lock_list > wb_get > oops > > so destroy inode after adding I_FREEING to inode state and the I_SYNC state > being cleared. > > Reported-by: Tianxiong Lu <lutianxiong@xxxxxxxxxx> > Signed-off-by: Shijie Luo <luoshijie1@xxxxxxxxxx> > Signed-off-by: Haotian Li <lihaotian9@xxxxxxxxxx> > --- > fs/inode.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/fs/inode.c b/fs/inode.c > index 72c4c347afb7..b28a2a9e15d5 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1148,10 +1148,17 @@ struct inode *iget5_locked(struct super_block *sb, unsigned long hashval, > struct inode *new = alloc_inode(sb); > > if (new) { > + spin_lock(&new->i_lock); > new->i_state = 0; > + spin_unlock(&new->i_lock); This part is unnecessary. We just allocated 'new' two lines above; nobody else can see 'new' yet. We make it visible with hlist_add_head_rcu() which uses rcu_assign_pointer() whch contains a memory barrier, so it's impossible for another CPU to see a stale i_state. > inode = inode_insert5(new, hashval, test, set, data); > - if (unlikely(inode != new)) > + if (unlikely(inode != new)) { > + spin_lock(&new->i_lock); > + new->i_state |= I_FREEING; > + spin_unlock(&new->i_lock); > + inode_wait_for_writeback(new); > destroy_inode(new); This doesn't make sense either. If an inode is returned here which is not 'new', then adding 'new' to the hash failed, and new was never visible to another CPU. > @@ -1218,6 +1225,11 @@ struct inode *iget_locked(struct super_block *sb, unsigned long ino) > * allocated. > */ > spin_unlock(&inode_hash_lock); > + > + spin_lock(&inode->i_lock); > + inode->i_state |= I_FREEING; > + spin_unlock(&inode->i_lock); > + inode_wait_for_writeback(inode); > destroy_inode(inode); Again, this doesn't make sense. This is also a codepath which failed to make 'inode' visible to any other thread. I don't understand how this patch could fix anything.