On Wed, Jan 18, 2023 at 10:10:36AM +0100, Jan Kara wrote:
>
> Yes, we can lock the source inode in ->rename() if we need it. The snag is
> that if 'target' exists, it is already locked so when locking 'source' we
> are possibly not following the VFS lock ordering of i_rwsem by inode
> address (I don't think it can cause any real deadlock but still it looks
> suspicious). Also we'll have to lock with I_MUTEX_NONDIR2 lockdep class to
> make lockdep happy but that's just a minor annoyance. Finally, we'll have
> to check for RENAME_EXCHANGE because in that case, both source and target
> will be already locked. Thus if we do the additional locking in the
> filesystem, we will leak quite some details about rename locking into the
> filesystem which seems undesirable to me.

Rules for inode locks are simple:
	* directories before non-directories
	* ancestors before descendants
	* for non-directories the ordering is by in-core inode address

So the instances that need that extra lock would do that when source is
a directory and RENAME_EXCHANGE is not given.  Having the target already
locked is irrelevant - if it exists, it's already checked to be a directory
as well, and had it been a descendant of source, we would have already
found that and failed with -ELOOP.

If A and B are both directories, there's no ordering between them unless
one is an ancestor of the other - unrelated directories can be locked in
any order.  However, one of the following must be true:
	* C is locked and both A and B had been observed to be children of C
	  after the lock on C had been acquired, or
	* ->s_vfs_rename_mutex is held for the filesystem containing both
	  A and B.

Note that ->s_vfs_rename_mutex is there to stabilize the tree topology and
make "is A an ancestor of B?" possible to check for more than "A is locked,
B is a child of A, so A will remain its ancestor until unlocked"...
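
To make the three ordering rules above concrete, here is a minimal userspace
sketch.  It is not kernel code: struct model_inode, enum lock_order and
model_lock_order() are hypothetical names made up for illustration, and the
model assumes a frozen tree, so it says nothing about how the topology is kept
stable (that is what the common locked parent or ->s_vfs_rename_mutex is for).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-core inode; NOT the kernel's struct inode. */
struct model_inode {
	bool is_dir;
	const struct model_inode *parent;	/* NULL for the root */
};

enum lock_order { A_FIRST, B_FIRST, ANY_ORDER };

/* Walk up from 'node'; true if 'anc' is a proper ancestor of it. */
static bool is_ancestor(const struct model_inode *anc,
			const struct model_inode *node)
{
	for (const struct model_inode *p = node->parent; p; p = p->parent)
		if (p == anc)
			return true;
	return false;
}

/*
 * Which of two distinct inodes has to be locked first under the rules
 * quoted above.  ANY_ORDER means the rules impose no ordering (two
 * directories, neither an ancestor of the other); the caller is still
 * assumed to hold whatever keeps the tree topology stable.
 */
enum lock_order model_lock_order(const struct model_inode *a,
				 const struct model_inode *b)
{
	assert(a && b && a != b);

	/* directories before non-directories */
	if (a->is_dir != b->is_dir)
		return a->is_dir ? A_FIRST : B_FIRST;

	if (a->is_dir) {
		/* ancestors before descendants */
		if (is_ancestor(a, b))
			return A_FIRST;
		if (is_ancestor(b, a))
			return B_FIRST;
		return ANY_ORDER;
	}

	/* two non-directories: order by in-core address */
	return (uintptr_t)a < (uintptr_t)b ? A_FIRST : B_FIRST;
}
```

With a root directory, two children d1 and d2, and a subdirectory of d1, the
model puts root before the subdirectory, reports ANY_ORDER for d1 vs. d2, and
puts any directory before any regular file regardless of addresses.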