Re: [PATCH/RFC 0/4] Attempt to make progress with btrfs dev number strangeness.

On 8/8/21 11:55 PM, NeilBrown wrote:
I continue to search for a way forward for btrfs so that its behaviour
with respect to device numbers and subvols is somewhat coherent.

This series implements some of the ideas in my "A Third perspective"[1],
though with changes in various details.

I introduce two new mount options, which default to
no-change-in-behaviour.

  -o inumbits=  causes inode numbers to be more unique across a whole btrfs
                filesystem, and in many cases completely unique.  Mounting
                with "-o inumbits=56" will resolve the NFS issues that
                started me tilting at this particular windmill.

  -o numdevs=  can reduce the number of distinct devices reported by
               stat(), either to 2 or to 1.
               Both ease problems for sites that exhaust their supply of
               device numbers.
               '2' allows "du -x" to continue to work, but is otherwise
               rather strange.
               '1' breaks the use of "du -x" and similar to examine a
               single subvol which might have subvol descendants, but
               provides generally sane behaviour.
               "-o numdevs=1" also forces inumbits to have a useful value.
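
To see what numdevs would change on a given mount, it can help to count how
many distinct st_dev values stat() reports under it.  The following is only a
minimal Python sketch using os.lstat() and os.walk(); the function name
distinct_devices is illustrative and not part of the series.

```python
import os

def distinct_devices(root):
    """Return the set of st_dev values seen under root (symlinks not followed)."""
    seen = {os.lstat(root).st_dev}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                seen.add(os.lstat(os.path.join(dirpath, name)).st_dev)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
    return seen
```

On a btrfs mount with several subvols this set would presumably contain one
entry per subvol today; going by the description above, it should shrink to at
most two entries with numdevs=2 and to one with numdevs=1.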

I introduce a "tree id" which can be discovered using statx().  Two
files with the same dev and ino might still be different if the tree-ids
are different.  Connected files with the same tree-id may be usefully
considered to be related.
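
As a rough illustration of how userspace might use such a tree id, the sketch
below compares file identities as (dev, ino, tree id) instead of (dev, ino).
The FileId type, its tree_id field, and same_file() are hypothetical names
invented for this example; how the tree id is actually fetched from statx() is
deliberately left out, since that interface is what the series proposes.

```python
from typing import NamedTuple, Optional

class FileId(NamedTuple):
    dev: int
    ino: int
    tree_id: Optional[int] = None  # hypothetical: would be filled from statx()

def same_file(a, b):
    """Compare two identities; dev and ino must match, then tree id if known."""
    if (a.dev, a.ino) != (b.dev, b.ino):
        return False
    if a.tree_id is None or b.tree_id is None:
        return True  # no tree id available; fall back to classic dev/ino
    return a.tree_id == b.tree_id
```

Falling back to plain dev/ino when no tree id is available is a design choice
of this sketch, so that it behaves as existing tools do on other filesystems.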

I also change various /proc files (only when numdevs=1 is used) to
provide extra information so they are useful with btrfs despite subvols.
/proc/maps /proc/smaps /proc/locks /proc/X/fdinfo/Y are affected.
The inode number becomes "XX:YY" where XX is the subvol number (tree id)
and YY is the inode number.

An alternate might be to report a number which might use up to 128 bits.
Which is less likely to seriously break code?

Note that code which ignores badly formatted lines is safe, because it
will never currently find a match for a btrfs file in these files
anyway.  The device number they report is never returned in st_dev for
stat() on any file.
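
A parser that wants to cope with both the current inode field and the proposed
"XX:YY" form could look roughly like this.  It is only a sketch:
parse_inode_field is a made-up helper, and it assumes both numbers are printed
in decimal, which the description above does not state.

```python
def parse_inode_field(field):
    """Parse 'YY' or 'XX:YY' into (tree_id, ino); return None if malformed."""
    parts = field.split(":")
    try:
        if len(parts) == 1:
            return (None, int(parts[0]))       # plain inode number
        if len(parts) == 2:
            return (int(parts[0]), int(parts[1]))  # tree id, inode number
    except ValueError:
        pass  # non-numeric component
    return None  # unknown format: caller should ignore the line
```

Returning None for anything unexpected matches the "ignore badly formatted
lines" behaviour discussed above.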

The audit subsystem and one or two other places report dev/ino and so
need to be enhanced, but I haven't tried to address those.

Various trace points also report dev/ino.  I haven't tried thinking
about those either.

I think this is a step in the right direction, but I want to figure out a way to accomplish this without magical mount points that users must be aware of.

I think the stat() st_dev ship has sailed; we're stuck with that. However, Christoph does have a valid point that it breaks the various info spit out by /proc. You've done a good job with the treeid here, but it still makes it impossible for somebody to map the st_dev back to the correct mount.

I think we aren't going to solve that problem, at least not with stat(). I think with statx() spitting out treeid we have given userspace a way to differentiate subvolumes, and so we should fix statx() to spit out the super block device; that way new userspace things can do their appropriate lookup if they so choose.

This leaves the problem of nfsd. Can you just integrate this new treeid into nfsd, and use that to either change the ino within nfsd itself, or do something similar to what your first patchset did and generate a fsid based on the treeid?

Mount options are messy, and are just going to lead to distros turning them on without understanding what's going on, and then we have to support them forever. I want to get this fixed in a way that we all hate the least, with as little opportunity as possible for confused users to make bad decisions. Thanks,

Josef



