Locking for RENAME_EXCHANGE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello!

This is again about the problem with directory renames I've already
reported in [1]. To quickly sum it up some filesystems (so far we know at
least about xfs, ext4, udf, reiserfs) need to lock the directory when it is
being renamed into another directory. This is because we need to update the
parent pointer in the directory in that case and if that races with other
operation on the directory, bad things can happen.

So far we've done the locking in the filesystem code but recently Darrick
pointed out [2] that we've missed the RENAME_EXCHANGE case in our ext4 fix.
That one is particularly nasty because RENAME_EXCHANGE can arbitrarily mix
regular files and directories. Couple nasty arising cases:

1) We need to additionally lock two exchanged directories. Suppose a
situation like:

mkdir P; mkdir P/A; mkdir P/B; touch P/B/F

CPU1						CPU2
renameat2("P/A", "P/B", RENAME_EXCHANGE);	renameat2("P/B/F", "P/A", 0);

Both operations need to lock A and B directories which are unrelated in the
tree. This means we must establish stable lock ordering on directory locks
even for the case when they are not in ancestor relationship.

2) We may need to lock a directory and a non-directory and they can be in
parent-child relationship when hardlinks are involved:

mkdir A; mkdir B; touch A/F; ln A/F B/F
renameat2("A/F", "B");

And this is really nasty because we don't have a way to find out whether
"A/F" and "B" are in any relationship - in particular whether B happens to
be another parent of A/F or not.

What I've decided to do is to make sure we always lock directory first in
this mixed case and that *should* avoid all the deadlocks but I'm spelling
this out here just in case people can think of some even more wicked case
before I'll send patches.

Also I wanted to ask (Miklos in particular as RENAME_EXCHANGE author): Why
do we lock non-directories in RENAME_EXCHANGE case? If we didn't have to do
that things would be somewhat simpler...

								Honza

[1] https://lore.kernel.org/all/20230117123735.un7wbamlbdihninm@quack3
[2] https://lore.kernel.org/all/20230517045836.GA11594@frogsfrogsfrogs

-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux