On Wed, Oct 25, 2023 at 08:34:21AM -0700, Christoph Hellwig wrote:
> On Wed, Oct 25, 2023 at 04:50:45PM +0300, Amir Goldstein wrote:
> > Jan,
> >
> > This patch set implements your suggestion [1] for handling fanotify
> > events for filesystems with non-uniform f_fsid.
>
> File systems must never report non-uniform fsids (or st_dev) for that
> matter.  btrfs is simply broken here and needs to be fixed.

We keep going around and around on this, so I'd like to get a set of steps
laid out for us to work towards to resolve this once and for all.

HYSTERICAL RAISINS (why we do st_dev)
-------------------------------------

Chris made this decision forever ago because things like rsync would screw up
with snapshots and end up backing up the same thing over and over again.  We
saw it was using st_dev (as were a few other standard tools) to distinguish
between file systems, so we abused this to make userspace happy.

The other nice thing this provided was a solution for the fact that we re-use
inode numbers in the file system, as they're unique for the subvolume only.

PROBLEMS WE WANT TO SOLVE
-------------------------

1) Stop abusing st_dev.  We actually want this as btrfs developers because
   it's kind of annoying to figure out which device is mounted when st_dev
   doesn't map to any of the devices in /proc/mounts.

2) Give user space a way to tell it's on a subvolume, so it is not confused
   by the repeating inode numbers.

POSSIBLE SOLUTIONS
------------------

1) A statx field for subvolume id.  The subvolume ids are unique to the file
   system, so subvolume id + inode number is unique to the file system.  This
   is a u64, so it is nice and easy to export through statx.

2) A statx field for the uuid/fsid of the file system.  I'd like this because
   again, being able to easily stat a couple of files and tell they're on the
   same file system is a valuable thing.  We have a per-fs uuid that we can
   export here.

3) A statx field for the uuid of the subvolume.  Our subvolumes have their
   own unique uuid.  This could be an alternative to the subvolume id option,
   or an addition.

Either 1 or 3 is necessary to give userspace a way to tell it has wandered
into a different subvolume.  I'd like to have all 3, but I recognize that may
be wishful thinking.  2 isn't necessary, but if we're going to go about
messing with statx then I'd like to do it all at once, and I want this for the
reasons stated above.

SEQUENCE OF EVENTS
------------------

We do one of the statx changes, and that rolls into a real kernel.  We run
around and submit patches for rsync and anything else we can think of to take
advantage of the statx feature.

Then we wait, call it 2 kernel releases after the initial release.

Then we go and rip out the dev_t hack.

Does this sound like a reasonable path forward to resolve everybody's
concerns?  I feel like I'm missing some other argument here, but I'm currently
on vacation and can't think of what it is, nor do I have the energy to go look
it up at the moment.  Thanks,

Josef
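
To make problem 2 and solution 1 above concrete, here is a rough Python
sketch of the identity check that tools like rsync effectively perform, and
of how a subvolume id would keep it working once st_dev is uniform again.
Everything in it is an assumption for illustration: the Stat record, its
subvol_id member, and the numbers (device 40, subvolume ids 256 and 300,
inode 257) are made up, and os.stat() does not report a subvolume id.

```python
import os
from collections import namedtuple

# Hypothetical stat-like record. subvol_id stands in for the proposed
# statx field; it is NOT something os.stat() returns.
Stat = namedtuple("Stat", "dev subvol_id ino")


def key_today(st):
    """Identity many tools rely on today: (st_dev, st_ino)."""
    return (st.dev, st.ino)


def key_proposed(st):
    """Identity under option 1: one st_dev per fs, plus subvolume id + inode."""
    return (st.dev, st.subvol_id, st.ino)


# Two files in different subvolumes of the same file system that happen to
# reuse the same inode number. With the dev_t hack removed, both would
# report the same (made-up) device number.
a = Stat(dev=40, subvol_id=256, ino=257)
b = Stat(dev=40, subvol_id=300, ino=257)

# Without a subvolume id the two distinct files look like the same file.
collide_without_subvol = key_today(a) == key_today(b)

# With the subvolume id in the key they are told apart again.
distinct_with_subvol = key_proposed(a) != key_proposed(b)


def real_key(path):
    """The (st_dev, st_ino) pair that a real stat() call provides today."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)
```

The point of the sketch is only that a userspace tool would need to widen its
key from two members to three; how the extra member is actually exported is
exactly what the statx options above are about.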