Hello, On Wed 18-01-23 08:44:57, Dave Chinner wrote: > On Tue, Jan 17, 2023 at 01:37:35PM +0100, Jan Kara wrote: > > I've some across an interesting issue that was spotted by syzbot [1]. The > > report is against UDF but AFAICS the problem exists for ext4 as well and > > possibly other filesystems. The problem is the following: When we are > > renaming directory 'dir' say rename("foo/dir", "bar/") we lock 'foo' and > > 'bar' but 'dir' is unlocked because the locking done by vfs_rename() is > > > > if (!is_dir || (flags & RENAME_EXCHANGE)) > > lock_two_nondirectories(source, target); > > else if (target) > > inode_lock(target); > > > > However some filesystems (e.g. UDF but ext4 as well, I suspect XFS may be > > hurt by this as well because it converts among multiple dir formats) need > > to update parent pointer in 'dir' and nothing protects this update against > > a race with someone else modifying 'dir'. Now this is mostly harmless > > because the parent pointer (".." directory entry) is at the beginning of > > the directory and stable however if for example the directory is converted > > from packed "in-inode" format to "expanded" format as a result of > > concurrent operation on 'dir', the filesystem gets corrupted (or crashes as > > in case of UDF). > > No, xfs_rename() does not have this problem - we pass four inodes to > the function - the source directory, source inode, destination > directory and destination inode. > > In the above case, "dir/" is passed to XFs as the source inode - the > src_dir is "foo/", the target dir is "bar/" and the target inode is > null. src_dir != target_dir, so we set the "new_parent" flag. the > srouce inode is a directory, so we set the src_is_directory flag, > too. > > We lock all three inodes that are passed. We do various things, then > run: > > if (new_parent && src_is_directory) { > /* > * Rewrite the ".." entry to point to the new > * directory. > */ > error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot, > target_dp->i_ino, spaceres); > ASSERT(error != -EEXIST); > if (error) > goto out_trans_cancel; > } > > which replaces the ".." entry in source inode atomically whilst it > is locked. Any directory format changes that occur during the > rename are done while the ILOCK is held, so they appear atomic to > outside observers that are trying to parse the directory structure > (e.g. readdir). Thanks for explanation! I've missed the ILOCK locking in xfs_rename() when I was glancing over the function... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR