Re: [PATCH/RFC 00/11] expose btrfs subvols in mount table correctly

On 2021-07-30 17:48, Josef Bacik wrote:
On 7/30/21 11:17 AM, J. Bruce Fields wrote:
On Fri, Jul 30, 2021 at 02:23:44PM +0800, Qu Wenruo wrote:
OK, forgot it's an opt-in feature, then it's less an impact.

But it can still sometimes be problematic.

E.g. if the user wants to put some git code into one subvolume, while
exporting another subvolume through NFS.

Then the user has to opt in, causing the git subvolume to lose the
ability to determine the subvolume boundary, right?

Totally naive question: would it be possible to treat different subvolumes
differently, and give the user some choice at subvolume creation time
how this new boundary should behave?

It seems like there are some conflicting priorities that can only be
resolved by someone who knows the intended use case.


This is the crux of the problem.  We have no real interfaces or anything to deal with this sort of paradigm.  We do the st_dev thing because that's the most common way tools like find or rsync determine they've wandered into a "different" volume.  This exists specifically because of use cases like Zygo's, where he's taking thousands of snapshots and manually excluding them from find/rsync is just not reasonable.
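To illustrate the st_dev check described above, here is a minimal Python sketch of a directory walk that refuses to cross into anything reporting a different st_dev, roughly what "find -xdev" or "rsync -x" do. The function name walk_one_filesystem is made up for this example, and it is not the actual code of either tool.

```python
import os

def walk_one_filesystem(root):
    """Yield directories under root, skipping any whose st_dev differs from root's.

    Hypothetical helper for illustration only.
    """
    root_dev = os.lstat(root).st_dev
    stack = [root]
    while stack:
        current = stack.pop()
        yield current
        try:
            entries = list(os.scandir(current))
        except OSError:
            continue  # unreadable directory: skip it
        for entry in entries:
            try:
                if not entry.is_dir(follow_symlinks=False):
                    continue
                # A different st_dev means we would wander into another
                # "volume" (on Btrfs today, another subvolume or snapshot).
                if entry.stat(follow_symlinks=False).st_dev != root_dev:
                    continue
            except OSError:
                continue
            stack.append(entry.path)
```

With per-subvolume st_dev values, a walk like this stops at every snapshot directory without the user having to list thousands of exclusions by hand.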

We have no good way to give the user information about what's going on, we just have these old shitty interfaces.  I asked our guys about filling up /proc/self/mountinfo with our subvolumes and they had a heart attack because we have around 2-4k subvolumes on machines, and with monitoring stuff in place we regularly read /proc/self/mountinfo to determine what's mounted and such.
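For a sense of what such monitoring does, here is a rough Python sketch that parses /proc/self/mountinfo and counts entries per filesystem type. The field layout follows the documented mountinfo format (mount ID, parent ID, major:minor, root, mount point, options, optional fields, a "-" separator, then filesystem type, source and super options). The function names are invented for this example, and this is not the monitoring code referred to above.

```python
from collections import Counter

def parse_mountinfo_line(line):
    """Split one mountinfo line into a dict of its main fields."""
    fields = line.split()
    sep = fields.index("-")  # marks the end of the optional fields
    major, minor = fields[2].split(":")
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "dev": (int(major), int(minor)),
        "root": fields[3],
        "mount_point": fields[4],
        "fstype": fields[sep + 1],
        "source": fields[sep + 2],
        "super_options": fields[sep + 3] if len(fields) > sep + 3 else "",
    }

def count_mounts_by_fstype(text):
    """Count mountinfo entries per filesystem type."""
    lines = [l for l in text.splitlines() if l.strip()]
    return Counter(parse_mountinfo_line(l)["fstype"] for l in lines)

def read_mountinfo(path="/proc/self/mountinfo"):
    """Read the live table; every call re-parses every entry."""
    with open(path) as f:
        return count_mounts_by_fstype(f.read())
```

Every poll walks the whole table, so adding a few thousand subvolume entries per machine multiplies the work of each reader accordingly.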

And then there's NFS which needs to know that it's walked into a new inode space.

This is all super shitty, and mostly exists because we don't have a good way to expose to the user wtf is going on.

Personally I would be ok with simply disallowing NFS to wander into subvolumes from an exported fs.  If you want to export subvolumes then export them individually, otherwise if you walk into a subvolume from NFS you simply get an empty directory.

This doesn't solve the mountinfo problem where a user may want to figure out which subvol they're in, but this is where I think we could address the issue with better interfaces.  Or perhaps Neil's idea to have a common major number with a different minor number for every subvol.

Either way this isn't as simple as shoehorning it into automount and being done with it.  We need to take a step back and think about how this should actually look, taking into account we've got 12 years of having Btrfs deployed with existing use cases that expect a certain behavior.  Thanks,

Josef


As a user and sysadmin I really appreciate the way Btrfs currently works.

We use hourly snapshots which are exposed over Samba as "Previous Versions" to Windows users. This amounts to thousands of snapshots, all user serviceable. A great feature!

In Samba world we have a mount option[1] called "noserverino" which lets the client generate unique inode numbers, rather than using the server provided inode numbers. This allows Linux clients to work well against servers exposing subvolumes and snapshots.

NFS has really old roots and had to make choices that we don't really have to make today. Can we not provide something similar to mount.cifs that generates unique inode numbers for the clients? This could be either an nfsd export option (such as /mnt/foo *(rw,uniq_inodes)) or a mount option on the clients.
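Purely as an illustration of what "generate unique inode numbers" could mean, here is a small Python sketch that mixes a subvolume ID with the per-subvolume inode number into one 64-bit value. Everything in it is an assumption on my part: the function name, the choice of hash, and the idea of hashing at all. It does not describe how mount.cifs or nfsd work, and the uniq_inodes option above does not exist.

```python
import hashlib
import struct

def mixed_inode_number(subvol_id, ino):
    """Derive a 64-bit number from (subvolume ID, inode number).

    Hypothetical scheme for illustration: the result is stable for a given
    pair, but it is a hash, so collisions are unlikely rather than impossible.
    """
    packed = struct.pack("<QQ", subvol_id, ino)  # two unsigned 64-bit values
    digest = hashlib.blake2b(packed, digest_size=8).digest()
    return int.from_bytes(digest, "little")
```

The same inode number in two snapshots would then map to two different client-visible numbers, at the cost of no longer matching what the server reports locally.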

One worry I have with making subvolumes automount points is that it might affect the ability to cp --reflink across that boundary.



[1] https://www.samba.org/~ab/output/htmldocs/manpages-3/mount.cifs.8.html





