On Wed 25-10-23 21:02:45, Amir Goldstein wrote: > On Wed, Oct 25, 2023 at 8:17 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > On Wed, Oct 25, 2023 at 4:50 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > Jan, > > > > > > This patch set implements your suggestion [1] for handling fanotify > > > events for filesystems with non-uniform f_fsid. > > > > > > With these changes, events report the fsid as it would be reported > > > by statfs(2) on the same objet, i.e. the sub-volume's fsid for an inode > > > in sub-volume. > > > > > > This creates a small challenge to watching program, which needs to map > > > from fsid in event to a stored mount_fd to use with open_by_handle_at(2). > > > Luckily, for btrfs, fsid[0] is uniform and fsid[1] is per sub-volume. > > > > > > I have adapted fsnotifywatch tool [2] to be able to watch btrfs sb. > > > The adapted tool detects the special case of btrfs (a bit hacky) and > > > indexes the mount_fd to be used for open_by_handle_at(2) by fsid[0]. > > > > > > Note that this hackacry is not needed when the tool is watching a > > > single filesystem (no need for mount_fd lookup table), because btrfs > > > correctly decodes file handles from any sub-volume with mount_fd from > > > any other sub-volume. > > > > Jan, > > > > Now that I've implemented the userspace part of btrfs sb watch, > > I realize that if userspace has to be aware of the fsid oddity of btrfs > > anyway, maybe reporting the accurate fsid of the object in event is > > not that important at all. > > > > Facts: > > 1. file_handle is unique across all sub-volumes and can be resolved > > from any fd on any sub-volume > > 2. fsid[0] can be compared to match an event to a btrfs sb, where any > > fd can be used to resolve file_handle > > 3. userspace needs to be aware of this fsid[0] fact if it watches more > > than a single sb and userspace needs not care about the value of > > fsid in event at all when watching a single sb > > 4. even though fanotify never allowed setting sb mark on a path inside > > btrfs sub-volume, it always reported events on inodes in sub-volumes > > to btrfs sb watch - those events always carried the "wrong" fsid (i.e. > > the btrfs root volume fsid) > > 5. we already agreed that setting up inode marks on inodes inside > > sub-volume should be a no brainer > > > > If we allow reporting either sub-vol fsid or root-vol fsid (exactly as > > we do for inodes in sub-vol in current upstream), > > Another way to put it is that fsid in event describes the object > that was used to setup the mark not the target object. > > If an event is received via an inode/sb/mount mark, the fsid > would always describe the fsid of the inode that was used to setup > the mark and that is always the fsid that userspace would query > statfs(2) at the time of calling the fanotify_mark(2) call. > > Maybe it is non trivial to document, but for a library that returns > an opaque "watch descriptor", the "watch descriptor" can always > be deduced from the event. > > Does this make sense? Yes, it makes sense if we always reported event with fsid of the object used for placing the mark. For filesystems with homogeneous fsid there's no difference, for btrfs it looks like the least surprising choice and works well for inode marks as well. The only catch is in the internal fsnotify implementation AFAICT - if we have multiple marks for the same btrfs superblock, each mark on different subvolume, then we should be reporting one event with different fsids for different marks. So we need to cache the fsid in the mark and not in the connector. But that should be easy to do. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR