On Wed 21-11-18 18:13:17, Amir Goldstein wrote: > > > > > 2. exportfs_encode_fh(dentry,... in fanotify_alloc_fid() > > > Can be converted to exportfs_encode_inode_fh(inode,... > > > > If you have the parent inode then yes. In lot of cases we do have it, not > > sure if in all of them (but likely yes so that we can do proper > > FS_EVENT_ON_CHILD) handling. > > > > We do not need the parent inode, because we are not encoding a > "connectable" file handle. we need: > exportfs_encode_inode_fh(inode, (struct fid *)fid->f_handle, > &dwords, NULL); Ah, you're right. I thought any handle has to be "connectable" as you say but apparently open-by-handle is fine even with not connectable ones. Thanks for enlightening me :) > > > 3. statfs_by_dentry(dentry,... in fanotify_alloc_fid() > > > Can be converted to statfs_by_dentry(inode->i_sb->sb_root,... > > > > This might be problematic e.g. with btrfs subvolumes which have each > > different fsid so the fsid-handle pair might be actually invalid or even > > point to a file with different contents. Maybe we could just store the > > fsid in the fsnotify_mark when it is created and use it when generating > > events? That would also get rid of the overhead of calling statfs for each > > generated event which I don't like... > > > > OK, I was not being accurate when I wrote that these are the only 3 places > we use dentry. Those are the only 3 places in fanotify, but we also use > dentry->d_sb in fsnotify() to get to the sb mark of course, so we will be using > inode->i_sb instead. Sure. > w.r.t btrfs - btrfs has a single sb for multiple subvolumes so by > definition the FAN_MARK_FILESYSTEM feature is only capable of watching > ALL subvolumes. Agreed but then if some event happens on inode A in subvolume S, you have to be sure the handle+fsid you return will allow you to open the file with that content and not inode A on subvolume T which has a different contents... > If we wanted to implement subvolume watch for btrfs, we would need > to support attaching a mark to a btrfs subvolume struct (or fs_view [1]). No, that's not what I meant. > Basically, the purpose, of fid->fsid is: > 1. A key for finding a mount point to use as mount_fd argument to > open_by_handle_at() > 2. Make fid unique across the system Understood, that's what I thought. > Since btrfs file handles encode the subvolume root object id in the file > handle, fid->fsid is only needed to find a btrfs mount of the same sb > (blockdev). Ah good. I didn't realize btrfs will record subvolume root in the file handle. On a second thought how else could NFS mount work, right, but it didn't occur to me yesterday. > Regardless, IIUC, btrfs_statfs() returns an fsid which is associated with > the single super block struct, so all dentries in all subvolumes will > return the same fsid: btrfs_sb(dentry->d_sb)->fsid. This is not true AFAICT. Looking at btrfs_statfs() I can see: buf->f_fsid.val[0] = be32_to_cpu(fsid[0]) ^ be32_to_cpu(fsid[2]); buf->f_fsid.val[1] = be32_to_cpu(fsid[1]) ^ be32_to_cpu(fsid[3]); /* Mask in the root object ID too, to disambiguate subvols */ buf->f_fsid.val[0] ^= BTRFS_I(d_inode(dentry))->root->root_key.objectid >> 32; buf->f_fsid.val[1] ^= BTRFS_I(d_inode(dentry))->root->root_key.objectid; So subvolume root is xored into the FSID. Thus dentries from different subvolumes are going to return different fsids... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR