On 8/12/21 9:45 PM, NeilBrown wrote:
[[This patch is a minimal patch which addresses the current problems with nfsd and btrfs, in a way which I think is most supportable, least surprising, and least likely to impact any future attempts to more completely fix the btrfs file-identity problem]]

BTRFS does not provide unique inode numbers across a filesystem. It *does* provide unique inode numbers within a subvolume and uses synthetic device numbers for different subvolumes to ensure uniqueness for device+inode.

nfsd cannot use these varying device numbers. If nfsd were to synthesise different stable filesystem ids to give to the client, that would cause subvolumes to appear in the mount table on the client, even though they don't appear in the mount table on the server. Also, NFSv3 doesn't support changing the filesystem id without a new explicit mount on the client (this is partially supported in practice, but violates the protocol specification).

So currently, the roots of all subvolumes report the same inode number in the same filesystem to NFS clients and tools like 'find' notice that a directory has the same identity as an ancestor, and so refuse to enter that directory.

This patch allows btrfs (or any filesystem) to provide a 64-bit number that can be XORed with the inode number to make the number more unique. Rather than the client being certain to see duplicates, with this patch it is possible but extremely rare.

The number that btrfs provides is a swab64() version of the subvolume identifier. This has most entropy in the high bits (the low bits of the subvolume identifier), while the inode has most entropy in the low bits. The result will always be unique within a subvolume, and will almost always be unique across the filesystem.
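As a rough illustration of the mixing described above (a userspace sketch, not the patch's actual code), the following shows a byte-swapped subvolume identifier being XORed into an inode number. The names swab64_sketch() and nfs_visible_ino() are hypothetical stand-ins invented for this example; the kernel provides its own swab64().

```c
#include <stdint.h>

/*
 * Portable stand-in for the kernel's swab64(): reverse the byte order
 * of a 64-bit value. Hypothetical name, for illustration only.
 */
static uint64_t swab64_sketch(uint64_t x)
{
	/* Swap 32-bit halves, then 16-bit pairs, then individual bytes. */
	x = ((x & 0x00000000FFFFFFFFULL) << 32) | (x >> 32);
	x = ((x & 0x0000FFFF0000FFFFULL) << 16) |
	    ((x >> 16) & 0x0000FFFF0000FFFFULL);
	x = ((x & 0x00FF00FF00FF00FFULL) << 8) |
	    ((x >> 8) & 0x00FF00FF00FF00FFULL);
	return x;
}

/*
 * Hypothetical helper: the inode number an NFS client would be shown.
 * The low bytes of subvol_id land in the high bytes of the mask, so
 * they rarely overlap the low-order bits where inode numbers vary.
 */
static uint64_t nfs_visible_ino(uint64_t subvol_id, uint64_t ino)
{
	return ino ^ swab64_sketch(subvol_id);
}
```

With illustrative values, two subvolume roots that share inode number 256 but live in subvolumes 256 and 257 now map to different client-visible numbers. Within one subvolume the XOR mask is constant, so distinct inodes stay distinct. A collision across subvolumes needs two inode numbers that differ by exactly the XOR of the two swapped subvolume identifiers, which is why duplicates remain possible but rare.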
This seems like a reasonable approach to me: it solves the problem without being overly complicated and side-steps the thornier issues around how we deal with subvolumes. I'll leave it up to the maintainers of the other filesystems to weigh in, but from me you can add
Acked-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

Thanks,

Josef