On Wed, 18 Aug 2021, kreijack@xxxxxxxxx wrote:
> On 8/15/21 11:53 PM, NeilBrown wrote:
> > On Mon, 16 Aug 2021, kreijack@xxxxxxxxx wrote:
> >> On 8/15/21 9:35 PM, Roman Mamedov wrote:
> >>
> >> However looking at the 'exports' man page, it seems that NFS has already an
> >> option to cover these cases: 'crossmnt'.
> >>
> >> If NFSd detects a "child" filesystem (i.e. a filesystem mounted inside an already
> >> exported one) and the "parent" filesystem is marked as 'crossmnt', the client mount
> >> the parent AND the child filesystem with two separate mounts, so there is not problem of inode collision.
> >
> > As you acknowledged, you haven't read the whole back-story. Maybe you
> > should.
> >
> > https://lore.kernel.org/linux-nfs/20210613115313.BC59.409509F4@xxxxxxxxxxxx/
> > https://lore.kernel.org/linux-nfs/162848123483.25823.15844774651164477866.stgit@noble.brown/
> > https://lore.kernel.org/linux-btrfs/162742539595.32498.13687924366155737575.stgit@noble.brown/
> >
> > The flow of conversation does sometimes jump between threads.
> >
> > I'm very happy to respond you questions after you've absorbed all that.
>
> Hi Neil,
>
> I read the other threads. And I still have the opinion that the nfsd
> crossmnt behavior should be a good solution for the btrfs subvolumes.

Thanks for reading it all. Let me join the dots for you.

"crossmnt" doesn't currently work because "subvolumes" aren't mount
points.

We could change btrfs so that subvolumes *are* mountpoints. They would
have to be automounts. I posted patches to do that. They were broadly
rejected because people have many thousands of submounts that are
concurrently active, so /proc/mounts would be multiple megabytes in
size and working with it would become impractical. Also, non-privileged
users can create subvols, and may want the path names to remain private.
But these subvols would appear in the mount table and so would no longer
be private.
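
To make the "subvolumes aren't mount points" distinction concrete, here
is a minimal sketch in Python. It is an illustration only and is not
taken from any of the posted patches; the function names and the sample
paths are hypothetical. It assumes a Linux mountinfo-style table (mount
point in the fifth field) and ignores octal escapes in path names. The
idea is that a directory can report a different st_dev from its parent
without having any entry in the mount table, which is the case that
"crossmnt" does not see.

```python
import os


def parse_mount_points(mountinfo_text):
    """Return the set of mount-point paths in mountinfo-style text."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # The fifth field is the mount point as seen by the process.
            points.add(fields[4])
    return points


def classify_boundary(path, dev, parent_dev, mount_points):
    """Classify a directory by mount-table entry and st_dev change.

    The mount table is checked first, so a bind mount that keeps the
    same st_dev is still reported as a mount point.
    """
    if path in mount_points:
        return "mount point"
    if dev != parent_dev:
        # A device-number change with no mount-table entry.
        return "st_dev change without a mount"
    return "same filesystem"


def classify_path(path):
    """Apply classify_boundary() to a real path on a Linux system."""
    path = os.path.realpath(path)
    with open("/proc/self/mountinfo") as f:
        mount_points = parse_mount_points(f.read())
    dev = os.stat(path).st_dev
    parent_dev = os.stat(os.path.dirname(path)).st_dev
    return classify_boundary(path, dev, parent_dev, mount_points)
```

Calling classify_path() on a directory that is a subvolume but not
separately mounted would be expected to fall into the middle case,
assuming the subvolume reports its own st_dev.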

Alternately we could change the "crossmnt" functionality to treat a
change of st_dev as though it were a mount point. I posted patches to
do this too. This hits the same sort of problems in a different way.
If NFSD reports that it has crossed a "mount" by providing a different
filesystem-id to the client, then the client will create a new mount
point which will appear in /proc/mounts. It might be less likely that
many thousands of subvolumes are accessed over NFS than locally, but it
is still entirely possible. I don't want the NFS client to suffer a
problem that btrfs doesn't impose locally. And 'private' subvolumes
could again appear on a public list if they were accessed via NFS.

Thanks,
NeilBrown