Re: [PATCH v3] ext4: fix bug for rename with RENAME_WHITEOUT




On Thu, Jan 14, 2021 at 5:53 AM Theodore Ts'o <tytso@xxxxxxx> wrote:
>
> On Tue, Jan 05, 2021 at 02:28:57PM +0800, yangerkun wrote:
> > We got a "deleted inode referenced" warning during our fsstress test. The
> > bug can be reproduced easily with the following steps:
> >
> >   cd /dev/shm
> >   mkdir test/
> >   fallocate -l 128M img
> >   mkfs.ext4 -b 1024 img
> >   mount img test/
> >   dd if=/dev/zero of=test/foo bs=1M count=128
> >   mkdir test/dir/ && cd test/dir/
> >   for ((i=0;i<1000;i++)); do touch file$i; done # consume all blocks
> >   cd ~ && renameat2(AT_FDCWD, /dev/shm/test/dir/file1, AT_FDCWD,
> >     /dev/shm/test/dir/dst_file, RENAME_WHITEOUT) # ext4_add_entry in
> >     ext4_rename will return ENOSPC!!
> >   cd /dev/shm/ && umount test/ && mount img test/ && ls -li test/dir/file1
> >   We will get the output:
> >   "ls: cannot access 'test/dir/file1': Structure needs cleaning"
> >   and dmesg shows:
> >   "EXT4-fs error (device loop0): ext4_lookup:1626: inode #2049: comm ls:
> >   deleted inode referenced: 139"
> >
> > ext4_rename will create a special inode for whiteout and use this 'ino'
> > to replace the source file's dir entry 'ino'. If an error happens
> > later (the error above was the ENOSPC returned from ext4_add_entry in
> > ext4_rename, since all space had been consumed), the cleanup drops the
> > nlink for the whiteout, but forgets to restore 'ino' to that of the
> > source file. This triggers the bug described above.
> >
> > Signed-off-by: yangerkun <yangerkun@xxxxxxxxxx>
>

Apropos RENAME_WHITEOUT, the whiteout case seems to be missing a call to
__ext4_fc_track_link().
I guess fstests does not have much test coverage of RENAME_WHITEOUT.
I have been seeing trickles of bug fixes for RENAME_WHITEOUT for almost
every filesystem that supports it.

But I must say it would have been very hard to catch a missing ext4_fc_track_*
call without a specialized fs fuzzer such as the CrashMonkey-generated tests.

And as long as I am ranting, I'd like to point out that it is a shame
that whiteout was not implemented as a special (constant) inode whose
nlink is irrelevant (or a special dirent with d_ino 0 and d_type DT_WHT
for that matter).
It would have been a rather small RO_COMPAT on-disk change for ext4.
It could also be implemented in a slightly more backward-compatible manner
by maintaining a valid nlink and postponing setting the RO_COMPAT flag until
EXT4_LINK_MAX is reached.

As things stand now, overlayfs makes an effort to maintain a singleton
hardlinked whiteout inode, without being able to use it with RENAME_WHITEOUT,
and filesystems have to take special care to journal the metadata of all
individual whiteout inodes, without any added value to the only user
(overlayfs).
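
For context, what overlayfs treats as a whiteout today is a character device
with device number 0/0, which is why each one costs a real inode. A rough
userspace sketch of that check, with a made-up helper name is_whiteout():

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>  /* major(), minor() */

/*
 * Hypothetical helper: returns 1 if path is a whiteout in the overlayfs
 * sense (char device 0/0), 0 if it is anything else, -1 if lstat() fails.
 */
static int is_whiteout(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return -1;

    return S_ISCHR(st.st_mode) &&
           major(st.st_rdev) == 0 && minor(st.st_rdev) == 0;
}
```

This only covers the classic char-device form of whiteout and is meant as an
illustration, not as what overlayfs itself runs.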

But I guess that train has left the station long ago...

Thanks,
Amir.


