On Fri, Feb 18, 2022 at 04:32:49PM +0100, Miklos Szeredi wrote:
> There has been a longstanding race condition between rename(2) and link(2),
> when those operations are done in parallel:
>
> 1. Moving a file to an existing target file (eg. mv file target)
> 2. Creating a link from the target file to a third file (eg. ln target
>    link)
>
> By the time vfs_link() locks the target inode, it might already be unlinked
> by rename. This results in vfs_link() returning -ENOENT in order to
> prevent linking to already unlinked files. This check was introduced in
> v2.6.39 by commit aae8a97d3ec3 ("fs: Don't allow to create hardlink for
> deleted file").
>
> This breaks apparent atomicity of rename(2), which is described in
> standards and the man page:
>
>     "If newpath already exists, it will be atomically replaced, so that
>      there is no point at which another process attempting to access
>      newpath will find it missing."
>
> The simplest fix is to exclude renames for the complete link operation.
>
> This patch introduces a global rw_semaphore that is locked for read in
> rename and for write in link. To prevent excessive contention, do not take
> the lock in link on the first try. If the source of the link was found to
> be unlinked, then retry with the lock held.

AFAICS, that deadlocks if lock_rename() is taken in ecryptfs_rename()
(with lock_rename() already taken by its caller) after another thread
blocks trying to take your link_rwsem exclusive.
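
To make the interleaving concrete, here is a rough single-threaded toy model of it. It assumes that link_rwsem is taken for read inside lock_rename() and that a queued writer holds back new readers, as a fair rwsem would. The toy_* names and nested_read_deadlocks() are made up for illustration; none of this is kernel code or the patch's actual locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded model of a fair reader/writer semaphore.
 * This is NOT the kernel's rwsem; it only captures one rule:
 * once a writer is queued, new read requests wait behind it. */
struct toy_rwsem {
	int  readers;        /* read holders currently inside */
	bool writer_queued;  /* a writer is waiting for readers to drain */
};

/* Returns true if the read lock is granted, false if the caller would block. */
static bool toy_down_read(struct toy_rwsem *sem)
{
	if (sem->writer_queued)
		return false;
	sem->readers++;
	return true;
}

/* Returns true if the write lock is granted, false if the caller would
 * block; a blocked writer is recorded as queued. */
static bool toy_down_write(struct toy_rwsem *sem)
{
	if (sem->readers > 0) {
		sem->writer_queued = true;
		return false;
	}
	return true;
}

/* Replays the interleaving described above. Returns true if both
 * (hypothetical) threads end up waiting on each other. */
static bool nested_read_deadlocks(void)
{
	struct toy_rwsem link_rwsem = { 0, false };

	/* Thread A: outer lock_rename() takes the semaphore for read. */
	bool a_outer = toy_down_read(&link_rwsem);
	assert(a_outer);

	/* Thread B: the link retry path asks for exclusive access and queues. */
	bool b_write = toy_down_write(&link_rwsem);

	/* Thread A: nested lock_rename(), as from ecryptfs_rename(),
	 * asks for read again while still holding the outer read lock. */
	bool a_inner = toy_down_read(&link_rwsem);

	/* A waits behind B; B waits for A's outer read hold to be dropped. */
	return !b_write && !a_inner && link_rwsem.readers == 1;
}
```

In this model nested_read_deadlocks() returns true: the nested read request sits behind the queued writer, and the writer can never proceed because the outer read hold is not released. Without a queued writer, the second toy_down_read() simply succeeds, which is why the problem only shows up under this specific interleaving.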