Re: Is rename(2) atomic on FAT?

Hi Chris!

The first question is what you mean by "atomic". It could mean atomic
at the process level (any process accessing the filesystem sees
consistent data at all times), consistency of the filesystem on the
underlying block device, or atomicity at the disk storage level.

On Monday 21 October 2019 23:44:25 Richard Weinberger wrote:
> Chris,
> 
> [CC'ing fsdevel and Pali]
> 
> On Mon, Oct 21, 2019 at 9:59 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >
> > http://man7.org/linux/man-pages/man2/rename.2.html
> >
> > Use case is atomically updating bootloader configuration on EFI System
> > partitions. Some bootloader implementations have configuration files
> > bigger than 512 bytes, which could possibly be torn on write. But I'm
> > also not sure what write order FAT uses.
> >
> > 1.
> > FAT32 file system is mounted at /boot/efi
> >
> > 2.
> > # echo "hello" > /boot/efi/tmp/test.txt
> > # mv /boot/efi/tmp/test.txt /boot/efi/EFI/fedora/
> >
> > 3.
> > When I strace the above mv command I get these lines:
> > ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
> > renameat2(AT_FDCWD, "/boot/efi/tmp/test.txt", AT_FDCWD,
> > "/boot/efi/EFI/fedora/", RENAME_NOREPLACE) = -1 EEXIST (File exists)
> > stat("/boot/efi/EFI/fedora/", {st_mode=S_IFDIR|0700, st_size=1024, ...}) = 0
> > renameat2(AT_FDCWD, "/boot/efi/tmp/test.txt", AT_FDCWD,
> > "/boot/efi/EFI/fedora/test.txt", RENAME_NOREPLACE) = 0
> > lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
> > close(0)
> >
> > I can't tell from documentation if renameat2() with flag
> > RENAME_NOREPLACE is atomic, assuming the file doesn't exist at
> > destination.

RENAME_NOREPLACE is atomic at the VFS level, independently of the
filesystem used. There is no race condition when multiple processes
access that directory at the same time.
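
As a rough illustration of the call that mv issues in the strace above,
here is a minimal Python sketch that invokes renameat2() through ctypes.
It assumes Linux with a glibc that exports renameat2(); the helper name
rename_noreplace() is made up for this example, and the two constants
are copied from the kernel headers.

```python
import ctypes
import os

# Constants from the Linux headers (<fcntl.h> and <linux/fs.h>).
AT_FDCWD = -100
RENAME_NOREPLACE = 1  # 1 << 0

# Assumption: the C library exports renameat2() (glibc 2.28 or newer).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.renameat2.argtypes = [
    ctypes.c_int, ctypes.c_char_p,
    ctypes.c_int, ctypes.c_char_p,
    ctypes.c_uint,
]
_libc.renameat2.restype = ctypes.c_int


def rename_noreplace(src, dst):
    """Rename src to dst, failing with EEXIST if dst already exists."""
    rc = _libc.renameat2(AT_FDCWD, os.fsencode(src),
                         AT_FDCWD, os.fsencode(dst),
                         RENAME_NOREPLACE)
    if rc != 0:
        err = ctypes.get_errno()
        # OSError maps EEXIST to FileExistsError automatically.
        raise OSError(err, os.strerror(err), dst)
```

If the destination exists, the call raises FileExistsError and leaves
both files untouched, which matches the EEXIST result in the second
strace.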

> > 4.
> > Do it again exactly as before, small change
> > # echo "hello" > /boot/efi/tmp/test.txt
> > # mv /boot/efi/tmp/test.txt /boot/efi/EFI/fedora/
> >
> > 5.
> > The strace shows fallback to rename()
> >
> > ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
> > renameat2(AT_FDCWD, "/boot/efi/tmp/test.txt", AT_FDCWD,
> > "/boot/efi/EFI/fedora/", RENAME_NOREPLACE) = -1 EEXIST (File exists)
> > stat("/boot/efi/EFI/fedora/", {st_mode=S_IFDIR|0700, st_size=1024, ...}) = 0
> > renameat2(AT_FDCWD, "/boot/efi/tmp/test.txt", AT_FDCWD,
> > "/boot/efi/EFI/fedora/test.txt", RENAME_NOREPLACE) = -1 EEXIST (File
> > exists)
> > lstat("/boot/efi/tmp/test.txt", {st_mode=S_IFREG|0700, st_size=7, ...}) = 0
> > newfstatat(AT_FDCWD, "/boot/efi/EFI/fedora/test.txt",
> > {st_mode=S_IFREG|0700, st_size=6, ...}, AT_SYMLINK_NOFOLLOW) = 0
> > geteuid()                               = 0
> > rename("/boot/efi/tmp/test.txt", "/boot/efi/EFI/fedora/test.txt") = 0
> > lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
> > close(0)                                = 0
> >
> >
> > Per documentation that should be atomic. So the questions are, are
> > both atomic, or neither atomic, and if not what should be used to
> > ensure bootloader updates are atomic.

At the VFS level both are atomic, independently of the filesystem.

> According to my understanding of FAT, rename() is not atomic at all.
> It can downgrade to a hardlink. i.e. rename("foo", "bar") can result in having
> both "foo" and "bar."
> ...or worse.

In general, rename() may indeed leave a window during which both
"foo" and "bar" point to the same inode. (But is this really a problem
for your scenario?)

But looking at the vfat source code (file namei_vfat.c), both the
rename and lookup operations are serialized by a mutex, so the directory
cannot be read while a rename is in progress. Therefore there should be
no race condition in which a reader sees an inconsistent directory
during the rename.

If you want to swap two existing files atomically, you can use the
RENAME_EXCHANGE flag (where the filesystem implements it; otherwise
renameat2() fails with EINVAL). It exchanges the two specified files
atomically, so there is no window, as there is with rename(), in which
both "foo" and "bar" point to the same inode.
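
For illustration only, a minimal Python sketch of such an exchange via
ctypes. It assumes Linux with a glibc that exports renameat2(), and the
helper name rename_exchange() is invented for this example. Both paths
must already exist, and a filesystem without support for the flag makes
the call fail with EINVAL.

```python
import ctypes
import os

# Constants from the Linux headers (<fcntl.h> and <linux/fs.h>).
AT_FDCWD = -100
RENAME_EXCHANGE = 2  # 1 << 1

# Assumption: the C library exports renameat2() (glibc 2.28 or newer).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.renameat2.argtypes = [
    ctypes.c_int, ctypes.c_char_p,
    ctypes.c_int, ctypes.c_char_p,
    ctypes.c_uint,
]
_libc.renameat2.restype = ctypes.c_int


def rename_exchange(path_a, path_b):
    """Atomically swap two existing directory entries."""
    rc = _libc.renameat2(AT_FDCWD, os.fsencode(path_a),
                         AT_FDCWD, os.fsencode(path_b),
                         RENAME_EXCHANGE)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path_a)
```

After a successful call the old content is still reachable under the
other name, so the caller decides when to delete it.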


But... if you are asking about consistency and atomicity at the
filesystem level (e.g. the disk or power supply is cut during the rename
operation), then this is not atomic, and it probably cannot be
implemented. When a FAT filesystem is mounted (by either Windows or the
Linux kernel), it is marked with a "dirty" flag, and the flag is cleared
later on unmount.

The flag is there to show whether operations like rename finished and
were not stopped or killed partway. So the next time you read the FAT
filesystem, you know whether it is in a consistent state or not.
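
To make the "dirty" flag concrete, here is a small Python sketch that
reads it from a raw boot sector. The byte offsets (state byte at 37 for
FAT12/16 and at 65 for FAT32, bit 0x01 meaning dirty) follow the layout
of struct fat_boot_sector in the Linux headers as I remember it, so
please treat them as an assumption to verify; the function name
fat_dirty_flag() is made up for this example.

```python
import struct

FAT_STATE_DIRTY = 0x01  # bit 0 of the boot sector "state" byte


def fat_dirty_flag(boot_sector: bytes) -> bool:
    """Return True if the FAT boot sector carries the dirty bit."""
    if len(boot_sector) < 512 or boot_sector[510:512] != b"\x55\xaa":
        raise ValueError("not a 512-byte boot sector with 0x55AA signature")

    # 16-bit FAT length at offset 22; zero on FAT32, which stores a
    # 32-bit FAT length at offset 36 instead.
    (fat_length16,) = struct.unpack_from("<H", boot_sector, 22)
    (fat_length32,) = struct.unpack_from("<I", boot_sector, 36)
    is_fat32 = fat_length16 == 0 and fat_length32 != 0

    # The state byte sits at a different offset in each layout.
    state = boot_sector[65] if is_fat32 else boot_sector[37]
    return bool(state & FAT_STATE_DIRTY)
```

Passing the first 512 bytes of the partition's block device (reading
them usually needs root) tells you whether the last mount was cleanly
unmounted.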

> Pali has probably more input to share. :-)
> 
> > There are plausibly three kinds:
> >
> > A. write a new file with file name that doesn't previously exist
> > B. write a new file with a new file name, then do a rename stomping on
> > the old one
> > C. overwrite an existing file
> >
> > It seems C is risky. It probably isn't atomic and can't be made to be
> > atomic on FAT.

Option C is really risky. Overwriting a file means the following
operations:

1. truncate file to zero size
2. write first N blocks
3. write second N blocks
...
4. write last M blocks

An interruption between any two of these steps leaves a truncated or
partially written file.
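
The steps above can be simulated in a few lines of Python. This is only
a process-level simulation, not a real power cut: the exception
SimulatedCrash, the helper overwrite_in_place() and its
crash_after_blocks parameter are all invented for the example, and the
512-byte block size just echoes the size mentioned earlier in the
thread.

```python
class SimulatedCrash(Exception):
    """Stands in for a power cut or a killed process."""


def overwrite_in_place(path, data, block_size=512, crash_after_blocks=None):
    """Overwrite path block by block, optionally 'crashing' partway."""
    # Step 1: opening with "wb" truncates the file to zero size.
    with open(path, "wb") as f:
        # Steps 2..n: write the new content one block at a time.
        for index, offset in enumerate(range(0, len(data), block_size)):
            if crash_after_blocks is not None and index == crash_after_blocks:
                raise SimulatedCrash(f"stopped after {index} block(s)")
            f.write(data[offset:offset + block_size])
            f.flush()
```

With a 1500-byte file and crash_after_blocks=1, the file is left holding
only the first 512 bytes of the new content: the old version is already
gone and the new one is incomplete.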


Option B is common practice. IIRC config files in KDE are also updated
this way.
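
A minimal Python sketch of option B, assuming a POSIX system: write the
new content to a temporary file in the same directory, flush it to disk,
then rename it over the old name. The function name atomic_replace() and
the ".tmp-" prefix are my own choices for this example. As said above,
this gives atomicity for other processes; on FAT it does not make the
update safe against power loss.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace path with data using write-to-temp plus rename."""
    directory = os.path.dirname(os.path.abspath(path))

    # The temp file must live on the same filesystem as the target,
    # otherwise the rename would turn into a copy.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push file content to the device
        os.replace(tmp_path, path)   # rename() stomping on the old file
    except BaseException:
        # Do not leave the temp file behind if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Flush the directory so the new entry itself reaches the device.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Readers opening the path see either the complete old file or the
complete new one, never a half-written mix.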

> >
> > --
> > Chris Murphy
> 

-- 
Pali Rohár
pali.rohar@xxxxxxxxx


