Re: [PATCH/RFC] NFSD: handle BTRFS subvolumes better.

On Tue, 20 Jul 2021, Christoph Hellwig wrote:
> On Tue, Jul 20, 2021 at 09:54:44AM +1000, NeilBrown wrote:
> > Do you have any pointers to other breakage caused by this particular
> > behaviour of btrfs? It would help to have all requirements clearly on the
> > table while designing a solution.
> 
> A quick google finds:
> 
> https://lore.kernel.org/linux-btrfs/b5e7e64a-741c-baee-bc4d-cd51ca9b3a38@xxxxxxxxx/T/
> https://savannah.gnu.org/bugs/?50859
> https://github.com/coreos/bugs/issues/301
> https://bugs.kde.org/show_bug.cgi?id=317127
> https://github.com/borgbackup/borg/issues/4009
> https://bugs.python.org/issue37339
> http://mail.openjdk.java.net/pipermail/nio-dev/2017-June/004292.html
> 
> and that is just the first two or three pages of trivial search results.
> 


Thanks a lot for these!  Very helpful.

The details vary, but the core problem seems to be that the device
number found in /proc/self/mountinfo is the same for all mounts from a
given btrfs filesystem, no matter which subvol happens to be found at or
beneath that mountpoint.  So 'stat' on a mountpoint can even return a
device number different from the one listed for that mountpoint in
/proc/self/mountinfo.
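
For anyone wanting to check this from user-space, here is a minimal
sketch.  It assumes the usual mountinfo layout (mount ID, parent ID,
major:minor, root, mount point, ...) and ignores mount points containing
escaped characters such as \040.  The function names are mine, purely
for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/*
 * Extract major:minor and the mount point from one mountinfo line.
 * Returns 0 on success, -1 if the line does not parse.
 * Illustrative helper; 'mnt' must have room for 256 bytes.
 */
int parse_mountinfo_line(const char *line, unsigned int *maj,
                         unsigned int *min, char mnt[256])
{
    assert(line && maj && min && mnt);
    /* Fields: mount-id parent-id major:minor root mount-point ... */
    if (sscanf(line, "%*d %*d %u:%u %*s %255s", maj, min, mnt) != 3)
        return -1;
    return 0;
}

/*
 * Compare the device number stat() reports for 'mountpoint' with the
 * one listed for it in /proc/self/mountinfo.
 * Returns 1 if they match, 0 if they differ, -1 on error or if the
 * path is not listed as a mount point.
 */
int mountinfo_matches_stat(const char *mountpoint)
{
    struct stat st;
    char line[4096], mnt[256];
    unsigned int maj, min;
    int result = -1;
    FILE *f;

    if (stat(mountpoint, &st) != 0)
        return -1;
    f = fopen("/proc/self/mountinfo", "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (parse_mountinfo_line(line, &maj, &min, mnt) != 0)
            continue;
        if (strcmp(mnt, mountpoint) != 0)
            continue;
        /* A later entry for the same path shadows earlier ones. */
        result = (maj == major(st.st_dev) && min == minor(st.st_dev));
    }
    fclose(f);
    return result;
}
```

Calling mountinfo_matches_stat() on a mounted btrfs subvol would be
expected to return 0 where the mismatch described above occurs.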

To address these issues we would need to:
1/ make every btrfs subvol which is not already a mountpoint into an
   automount point which mounts the subvol (similar to the use of
   automount in NFS).
2/ either give each subvol a separate 'struct super_block' (which is
   apparently a bad idea) or change show_mountinfo() to allow an
   alternate dev_t to be used. e.g. some new s_op which is given
   mnt->mnt_root and returns a dev_t.  If the new s_op is not
   available, sb->s_dev is used.
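
To make the second option in 2/ concrete, here is a rough user-space
model of the fallback logic.  This is not kernel code: the *_model
structs are cut-down stand-ins, and the op name get_mnt_dev is
hypothetical, invented only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/* Stand-in for a dentry; only carries a subvol id for the demo. */
struct dentry_model {
    unsigned long subvol_id;
};

struct super_operations_model {
    /* Hypothetical new s_op: given mnt_root, return the dev_t to show. */
    dev_t (*get_mnt_dev)(const struct dentry_model *root);
};

struct super_block_model {
    dev_t s_dev;
    const struct super_operations_model *s_op;
};

struct vfsmount_model {
    struct dentry_model *mnt_root;
    struct super_block_model *mnt_sb;
};

/* Pick the dev_t that mountinfo would report for this mount. */
dev_t mountinfo_dev(const struct vfsmount_model *mnt)
{
    const struct super_block_model *sb;

    assert(mnt != NULL && mnt->mnt_sb != NULL);
    sb = mnt->mnt_sb;
    /* Use the per-root op when the filesystem provides one ... */
    if (sb->s_op && sb->s_op->get_mnt_dev)
        return sb->s_op->get_mnt_dev(mnt->mnt_root);
    /* ... otherwise fall back to the superblock-wide device. */
    return sb->s_dev;
}

/* Toy implementation: anonymous device whose minor is the subvol id. */
dev_t toy_subvol_dev(const struct dentry_model *root)
{
    return makedev(0, (unsigned int)root->subvol_id);
}
```

Filesystems that do not set the op keep their current behaviour, which
is the point of the fallback.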

For nfsd to be able to work with this, those automount points need to
have an inode in the parent filesystem with a distinct inode number, and
the mount must be marked in some way that nfsd can tell that it is
"internal".  Possibly a helper function that tests if mnt_parent has the
same mnt.mnt_sb would be sufficient, though it might be nice to export
this fact to user-space somehow.
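
The helper could look roughly like the following, again as a user-space
model with cut-down stand-in structs.  The name mount_is_internal is
hypothetical, and the sketch assumes the root of a mount tree is its own
mnt_parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct super_block_model {
    int id;
};

struct vfsmount_model {
    struct super_block_model *mnt_sb;
};

struct mount_model {
    struct mount_model *mnt_parent;   /* points to itself at the root */
    struct vfsmount_model mnt;
};

/*
 * A mount is "internal" when it sits on a parent mount that shares
 * its superblock, i.e. it is a subvol mounted inside its own
 * filesystem rather than on top of a different one.
 */
bool mount_is_internal(const struct mount_model *m)
{
    assert(m != NULL && m->mnt_parent != NULL);
    if (m->mnt_parent == m)           /* root of the tree: no real parent */
        return false;
    return m->mnt_parent->mnt.mnt_sb == m->mnt.mnt_sb;
}
```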

Also exportfs_decode_fh() needs to be enhanced, probably to return a
'struct path'.

Does anything there seem unreasonable to you?

Thanks,
NeilBrown

 



