On Wed, Sep 01, 2021 at 08:20:06AM +0100, Christoph Hellwig wrote:
> On Tue, Aug 31, 2021 at 02:59:05PM +1000, NeilBrown wrote:
> > Making the change purely in btrfs is simply not possible.  There is no
> > way for btrfs to provide nfsd with a different inode number.  To move
> > the bulk of the change into btrfs code we would need - at the very least
> > - some way for nfsd to provide the filehandle when requesting stat
> > information.  We would also need to provide a reference filehandle when
> > requesting a dentry->filehandle conversion.  Cluttering the
> > export_operations like that just for btrfs doesn't seem like the right
> > balance.  I agree that cluttering kstat is not ideal, but it was a case
> > of choosing the minimum change for the maximum effect.
>
> So you're papering over a btrfs bug by piling up kludges in the nfsd
> code that has no business even knowing about this btrfs bug, while
> leaving other users of inode numbers and file handles broken?
>
> If you only care about file handles: this is what the export operations
> are for.  If you care about inode numbers: well, it is up to btrfs
> to generate unique inode numbers.  It currently doesn't do that, and
> no amount of papering over that in nfsd is going to fix the issue.
>
> If XORing a little more entropy

It's stronger than "a little more entropy".  We know enough about how
the numbers being XOR'd grow to know that collisions are only going to
happen in some extreme use cases.  (If I understand correctly.)

> into the inode number is a good enough band aid (and I strongly
> disagree with that), do it inside btrfs for every place they report
> the inode number.  There is nothing NFS-specific about that.

Neil tried something like that:

	https://lore.kernel.org/linux-nfs/162761259105.21659.4838403432058511846@xxxxxxxxxxxxxxxxxxxxx/

	"The patch below, which is just a proof-of-concept, changes
	btrfs to report a uniform st_dev, and different (64bit) st_ino
	in different subvols."

(Though actually you're proposing keeping separate st_dev?)

I looked back through a couple threads to try to understand why we
couldn't do that (on new filesystems, with a mkfs option to choose new
or old behavior) and still don't understand.  But the threads are long.

There are objections to a new mount option (which seems obviously wrong;
this should be a persistent feature of the on-disk filesystem).

--b.
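
The point above about how the XOR'd numbers grow can be illustrated
with a small sketch.  This is only an assumption made for illustration,
not the code from Neil's patch: it supposes the subvolume id is
bit-reversed, so that it fills the 64-bit value from the top down, and
is then XORed with the per-subvolume object id, which fills it from the
bottom up.  The names bitrev64, mixed_ino and can_collide are
hypothetical.

```python
MASK64 = (1 << 64) - 1


def bitrev64(value: int) -> int:
    """Reverse the bit order of a 64-bit value (bit 0 becomes bit 63)."""
    result = 0
    for _ in range(64):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def mixed_ino(subvol_id: int, objectid: int) -> int:
    """Hypothetical st_ino: object id XOR bit-reversed subvolume id."""
    return (objectid ^ bitrev64(subvol_id)) & MASK64


def can_collide(max_subvol_id: int, max_objectid: int) -> bool:
    """True if the bits used by the two ids could overlap.

    With the bit-reversed scheme assumed here, the ids occupy disjoint
    bit ranges as long as their bit widths sum to at most 64, and then
    the XOR is a one-to-one mapping.
    """
    return max_subvol_id.bit_length() + max_objectid.bit_length() > 64


if __name__ == "__main__":
    # Small ids: the two bit ranges are disjoint, so no two pairs collide.
    seen = {mixed_ino(s, o) for s in range(256, 300) for o in range(256, 300)}
    print(len(seen) == 44 * 44)

    # Extreme case: a subvolume id using bit 63 lands on bit 0 after
    # reversal and can cancel against an object id.
    print(mixed_ino(1 << 63, 1) == mixed_ino(0, 0))
```

Under these assumptions a collision needs the subvolume count and the
per-subvolume inode count to be large at the same time, which is the
sense in which only extreme use cases are affected.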