On Tue, 03 Aug 2021, J. Bruce Fields wrote:
> On Tue, Aug 03, 2021 at 07:10:44AM +1000, NeilBrown wrote:
> > On Mon, 02 Aug 2021, J. Bruce Fields wrote:
> > > On Mon, Aug 02, 2021 at 02:18:29PM +1000, NeilBrown wrote:
> > > > For btrfs, the "location" is root.objectid ++ file.objectid.  I think
> > > > the inode should become (file.objectid ^ swab64(root.objectid)).  This
> > > > will provide numbers that are unique until you get very large subvols,
> > > > and very many subvols.
> > >
> > > If you snapshot a filesystem, I'd expect, at least by default, that
> > > inodes in the snapshot to stay the same as in the snapshotted
> > > filesystem.
> >
> > As I said: we need to challenge and revise user-space (and meat-space)
> > expectations.
>
> The example that came to mind is people that export a snapshot, then
> replace it with an updated snapshot, and expect that to be transparent
> to clients.
>
> Our client will error out with ESTALE if it notices an inode number
> changed out from under it.

Will it?  If the inode number changed, then the filehandle would change.
Unless the filesystem were exported with subtreecheck, the old filehandle
would continue to work (unless the old snapshot was deleted).  File-name
lookups from the root would find new files...

"replace with an updated snapshot" is no different from "replace with an
updated directory tree".  If you delete the old tree, then
currently-open files will break.  If you don't, you get a reasonably
clean transition.

>
> I don't know if there are other such cases.  It seems like surprising
> behavior to me, though.

If you refuse to risk breaking anything, then you cannot make progress.
Providing people can choose when things break, and have advance
warning, they often cope remarkably well.

Thanks,
NeilBrown

>
> --b.
>
> > In btrfs, you DO NOT snapshot a FILESYSTEM.  Rather, you effectively
> > create a 'reflink' for a subtree (only works on subtrees that have been
> > correctly created with the poorly named "btrfs subvolume" command).
> >
> > As with any reflink, the original has the same inode number that it did
> > before, the new version has a different inode number (though in current
> > BTRFS, half of the inode number is hidden from user-space, so it looks
> > like the inode number hasn't changed).
> >
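
For illustration, the inode-number scheme proposed at the top of the
thread, (file.objectid ^ swab64(root.objectid)), can be sketched in
Python.  This is only a rough sketch, not the btrfs implementation:
swab64() here is a pure-Python stand-in for the kernel helper of the same
name, combined_ino() is a hypothetical name, and the objectid values are
made up for the example.

```python
MASK64 = (1 << 64) - 1


def swab64(x: int) -> int:
    """Reverse the byte order of a 64-bit value (stand-in for the kernel's swab64())."""
    return int.from_bytes((x & MASK64).to_bytes(8, "little"), "big")


def combined_ino(root_objectid: int, file_objectid: int) -> int:
    """Hypothetical helper: fold the subvolume id into the reported inode number."""
    return (file_objectid ^ swab64(root_objectid)) & MASK64


# Illustrative values only: the same file objectid in two different subvolumes.
ino_a = combined_ino(256, 257)
ino_b = combined_ino(257, 257)
print(hex(ino_a), hex(ino_b))

# A collision needs a file objectid large enough to reach the bytes that
# hold the swapped subvolume id.
huge_file = 257 ^ swab64(256) ^ swab64(257)
print(combined_ino(257, huge_file) == ino_a)
```

With small objectids the swapped root bits land in the high bytes and the
file bits stay in the low bytes, so the two results differ.  The numbers
only collide once the two ranges overlap, which is the "very large
subvols, and very many subvols" case.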