Re: [PATCH v2] vfs: fix link vs. rename race

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 13 Sep 2022, Al Viro wrote:
> On Tue, Sep 13, 2022 at 06:20:10AM +0100, Al Viro wrote:
> 
> > > Alternately, lock the "from" directory as well as the "to" directory.
> > > That would mean using lock_rename() and generally copying the structure
> > > of do_renameat2() into do_linkat()
> > 
> > Ever done cp -al?  Cross-directory renames are relatively rare; cross-directory
> > links can be fairly heavy on some payloads, and you'll get ->s_vfs_rename_mutex
> > held a _lot_.
> > 
> > > I wonder if you could get a similar race trying to create a file in
> > > (empty directory) /tmp/foo while /tmp/bar was being renamed over it.
> > 
> > 	Neil, no offense, but... if you have plans regarding changes in
> > directory locking, you might consider reading through the file called
> > Documentation/filesystems/directory-locking.rst
> > 
> > 	Occasionally documentation is where one could expect to find it...
> 
> ... and that "..." above should've been ";-)" - it was not intended as
> a dig, especially since locking in that area has subtle and badly
> underdocumented parts (in particular, anything related to fh_to_dentry(),
> rules regarding the stability of ->d_name and ->d_parent, mount vs. dentry
> invalidation and too many other things), but the basic stuff like that
> is actually covered.
> 
> FWIW, the answer to your question is that the victim of overwriting
> rename is held locked by caller of ->rename(); combined with the lock
> held on directory by anyone who modifies it that prevents the race
> you are asking about.
> 
> See
>         if (!is_dir || (flags & RENAME_EXCHANGE))
>                 lock_two_nondirectories(source, target);
>         else if (target)
>                 inode_lock(target);
> in vfs_rename().
> 

I don't think those locks address the race I was thinking of.

Suppose a /tmp/shared-dir is an empty directory and one thread runs
  rename("/tmp/tmp-dir", "/tmp/shared-dir")
while another thread runs
  open("/tmp/shared-dir/Afile", O_CREAT | O_WRONLY)

If the first wins the file is created in what was tmp-dir.
If the second wins, the rename fails because shared-dir isn't empty.

But they can race.  The open completes the lookup of /tmp/share-dir and
holds the dentry, but the rename succeeds with inode_lock(target) in the
code fragment you provided above before the open() can get the lock.
By the time open() does get the lock, the dentry it holds will be marked
S_DEAD and the __lookup_hash() will fail.
So the open() returns -ENOENT - unexpected.

Test code below, based on the test code for the link race.

I wonder if S_DEAD should result in -ESTALE rather than -ENOENT.
That would cause the various retry_estale() loops to retry the whole
operation.  I suspect we'd actually need more subtlety than just that
simple change, but it might be worth exploring.

NeilBrown


/* Linux rename vs. create race condition.
 * Rationale: both (1) moving a directory to an empty target and
 *   (2) creating a file in the target can race.
 * The create should always succeed, but there should always be
 * directory there, either old or new.
 * The rename might fail if the target isn't empty.
 * Sample file courtesy of Xavier Grand at Algolia
 * g++ -pthread createbug.c -o createbug
 */

#include <thread>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <iostream>
#include <string.h>

static const char* producedDir = "/tmp/shared-dir";
static const char* producedTmpDir = "/tmp/new-dir";
static const char* producedThreadFile = "/tmp/shared-dir/Afile";

bool createFile(const char* path)
{
	const int fdOut = open(path,
			       O_WRONLY | O_CREAT | O_TRUNC | O_EXCL | O_CLOEXEC,
			       S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);
	if (fdOut < 0)
		return false;
	assert(write(fdOut, "Foo", 4) == 4);
	assert(close(fdOut) == 0);
	return true;
}

void func()
{
	int nbSuccess = 0;
	// Loop producedThread create file in dir
	while (true) {
		if (createFile(producedThreadFile) != true) {
		std::cout << "Failed after " << nbSuccess << " with " << strerror(errno) << std::endl;
			exit(EXIT_FAILURE);
		} else {
			nbSuccess++;
		}
		unlink(producedThreadFile);
	}
}

int main()
{
	// Setup env
	rmdir(producedTmpDir);
	unlink(producedThreadFile);
	rmdir(producedDir);
	mkdir(producedDir, 0777);

	// Async thread doing a hardlink and moving it
	std::thread t(func);
	// Loop creating a .tmp and moving it
	while (true) {
		assert(mkdir(producedTmpDir, 0777) == 0);
		while (rename(producedTmpDir, producedDir) != 0)
			unlink(producedThreadFile);
	}
	return 0;
}




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux