Re: [PATCH] ext4: Fix possible corruption when moving a directory with RENAME_EXCHANGE

Jan Kara <jack@xxxxxxx> · Wed, 24 May 2023 16:18:52 +0200

On Wed 24-05-23 16:11:13, Amir Goldstein wrote:
> On Wed, May 24, 2023 at 2:27 PM Jan Kara <jack@xxxxxxx> wrote:
> >
> > On Tue 23-05-23 13:50:01, David Laight wrote:
> > > From: Jan Kara
> > > > Sent: 23 May 2023 14:14
> > > >
> > > > Commit 0813299c586b ("ext4: Fix possible corruption when moving a
> > > > directory") forgot that handling of RENAME_EXCHANGE renames needs the
> > > > protection of inode lock when changing directory parents for moved
> > > > directories. Add proper locking for that case as well.
> > > >
> > > > CC: stable@xxxxxxxxxxxxxxx
> > > > Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory")
> > > > Reported-by: "Darrick J. Wong" <djwong@xxxxxxxxxx>
> > > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > > > ---
> > > >  fs/ext4/namei.c | 23 +++++++++++++++++++++--
> > > >  1 file changed, 21 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> > > > index 45b579805c95..b91abea1c781 100644
> > > > --- a/fs/ext4/namei.c
> > > > +++ b/fs/ext4/namei.c
> > > > @@ -4083,10 +4083,25 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
> > > >     if (retval)
> > > >             return retval;
> > > >
> > > > +   /*
> > > > +    * We need to protect against old.inode and new.inode directory getting
> > > > +    * converted from inline directory format into a normal one. The lock
> > > > +    * ordering does not matter here as old and new are guaranteed to be
> > > > +    * incomparable in the directory hierarchy.
> > > > +    */
> > > > +   if (S_ISDIR(old.inode->i_mode))
> > > > +           inode_lock(old.inode);
> > > > +   if (S_ISDIR(new.inode->i_mode))
> > > > +           inode_lock_nested(new.inode, I_MUTEX_NONDIR2);
> > > > +
> > >
> > > What happens if there is another concurrent rename from new.inode
> > > to old.inode?
> > > That will try to acquire the locks in the other order.
> >
> > That is not really possible because these two renames cannot happen in
> > parallel due to VFS locking - either old & new share parent which is locked
> > by VFS (so there cannot be another rename in that directory) or they have
> > different parents which are also locked by VFS (so again it is not possible
> > to race with another rename in these two dirs).
> 
> Unless D1/A ; D1/B are hardlinks of D2/B ; D2/A respectively
> and exchange (D1/A, D1/B) is racing with exchange (D2/B, D2/A)

Well, but these are *directories*. So no hardlinks possible ;) I agree with
regular files we'd have to be more careful but then VFS would take care of
the locking anyway. I'm still convinced VFS should be taking care of
locking of directories as well but Al disagreed [1] and wants only filesystems
that need this to handle the directory locking.

> There is a simple solution of course, same as xfs_lock_two_inodes()
> 
> Another possible deadlock (I think) is if D/A ; D/B are subdirs that
> are exchanged and after taking inode_lock of D and A, rename comes
> in D/B/foo => D/A/foo and lock_rename() tries to
> lock_two_directories(B, A).
> 
> So it seems that both lock_two_directories() and to be helper
> lock_two_inodes() need to order the two inodes by address?

Right, so this case indeed looks possible and I didn't think about it.
Thanks for spotting this! Let me try to persuade Al again to do the
necessary locking in VFS as it is getting really hairy and needs VFS
changes anyway.

								Honza

[1] https://lore.kernel.org/all/Y8bTk1CsH9AaAnLt@ZenIV
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR