On Thu, 24 Jun 2021, J. Bruce Fields wrote:
> On Thu, Jun 24, 2021 at 08:04:57AM +1000, NeilBrown wrote:
> > On Thu, 24 Jun 2021, J. Bruce Fields wrote:
>
> One other thing I'm not sure about: how do cold cache lookups of
> filehandles for (possibly not-yet-mounted) subvolumes work?

Ahhhh... that's a good point.  Filehandle lookup depends on the target
filesystem being mounted.  NFS exporting filesystems which are
auto-mounted on demand would be ... interesting.

That argues in favour of nfsd treating a btrfs filesystem as a single
filesystem and gaining some knowledge about different subvolumes within
a filesystem.

This has implications for NFS re-export.  If a filehandle is received
for an NFS filesystem that needs to be automounted, I expect it would
fail.

Or do we want to introduce a third level in the filehandle: filesystem,
subvol, inode.  So just the "filesystem" is used to look things up in
/proc/mounts, but "filesystem+subvol" is used to determine the fsid.

Maybe another way to state this is that the filesystem could identify a
number of bytes from the fs-local part of the filehandle that should be
mixed in to the fsid.  That might be a reasonably clean interface.

> > All we really need is:
> > 1/ someone to write the code
> > 2/ someone to review the code
> > 3/ someone to accept the code
>
> Hah.  Still, the special exceptions for btrfs seem to be accumulating.
> I wonder if that's happening outside nfs as well.

I have some colleagues who work on btrfs and based on my occasional
discussions, I think that: yes, btrfs is a bit "special".  There are a
number of corner-cases where it doesn't quite behave how one would hope.
This is probably inevitable given the way it is pushing the boundaries
of functionality.  It can be a challenge to determine if that "hope" is
actually reasonable, and to figure out a good solution that meets the
need cleanly without imposing performance burdens elsewhere.

NeilBrown
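
P.S. To make the "mix some fs-local bytes into the fsid" idea a little
more concrete, here is a minimal userspace sketch.  It is not nfsd code
and not a proposed kernel interface: the struct fh_fsid_hint, the
function names, and the byte layout of the fs-local part are all
hypothetical, and FNV-1a is only a stand-in for whatever mixing function
would really be used.  The point is just that the filesystem reports an
(offset, length) range, and nfsd folds those bytes into the per-filesystem
fsid without needing to understand them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: what a filesystem might report to nfsd.  It names the
 * range of bytes in the fs-local part of the filehandle (e.g. a
 * subvolume id) that should influence the fsid. */
struct fh_fsid_hint {
	size_t offset;	/* start of the range within the fs-local part */
	size_t len;	/* number of bytes to mix in; 0 means "none" */
};

/* FNV-1a over n bytes, seeded with the base fsid.  Used here purely as
 * an illustrative mixing function. */
static uint64_t mix_bytes(uint64_t seed, const unsigned char *p, size_t n)
{
	uint64_t h = seed ^ 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Derive the fsid to present to clients.  The base fsid alone would
 * still be what is used to find the mounted filesystem; only the
 * client-visible fsid depends on the extra bytes. */
static uint64_t effective_fsid(uint64_t fs_fsid,
			       const unsigned char *fh_local, size_t fh_len,
			       const struct fh_fsid_hint *hint)
{
	if (hint->len == 0)
		return fs_fsid;	/* filesystem has no sub-identity */

	assert(hint->offset + hint->len <= fh_len);
	return mix_bytes(fs_fsid, fh_local + hint->offset, hint->len);
}
```

With a hint covering only the (assumed) subvolume bytes, two filehandles
in the same subvolume get the same fsid whatever their inode bytes are,
and filehandles in different subvolumes get different ones.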