On Wed, Dec 15, 2021 at 07:29:29PM +0200, Amir Goldstein wrote:
> > >
> > > The mistake in your premise at 1) is to state that "fuse does not
> > > support persistent file handles"
> > > without looking into what that statement means.
> > > What it really means is that user cannot always open_by_handle_at()
> > > from a previously
> > > obtained file handle, which has obvious impact on exporting fuse to NFS (*).
> >
> > Hi Amir,
> >
> > What good is file handle if one can't use it for open_by_handle_at(). I
> > mean, are there other use cases?
>
> commit 44d705b0:
> "...There are several ways that an application can use this information:
>
> 1. When watching a single directory, the name is always relative to
> the watched directory, so application need to fstatat(2) the name
> relative to the watched directory.
>
> 2. When watching a set of directories, the application could keep a map
> of dirfd for all watched directories and hash the map by fid obtained
> with name_to_handle_at(2). When getting a name event, the fid in the
> event info could be used to lookup the base dirfd in the map and then
> call fstatat(2) with that dirfd.

Ok, so case 1 and 2 still might be doable.

>
> 3. When watching a filesystem (FAN_MARK_FILESYSTEM) or a large set of
> directories, the application could use open_by_handle_at(2) with the fid
> in event info to obtain dirfd for the directory where event happened and
> call fstatat(2) with this dirfd.
>
> The last option scales better for a large number of watched directories.
> The first two options may be available in the future also for non
> privileged fanotify watchers, because open_by_handle_at(2) requires
> the CAP_DAC_READ_SEARCH capability.
> "

This one is not possible as it needs open_by_handle_at().

>
> fsnotifywait [1] has an example of use case #2.
> Essentially, when watching inodes, the fanotify file identifier is not very much
> different from the inotify "watch descriptor" - it identifies the watched object
> and the watched object is pinned to cache as long as the inode mark is set
> so file handle would not change also in fuse.

Ok, so if we are maintaining a hash map keyed by file handle, then first
we need to pin down the inode and then call name_to_handle_at() for the
watched object and add to hash table. Something like this.

A. foo_fd = open(foo.txt)
B. name_to_handle_at(.., foo.txt,...)
C. Add info in hash table using foo_handle as key.
D. Add watch on foo.txt (fanotify_mark()).
E. close(foo_fd).

One could probably skip steps A and E, and do this instead.

A. Add watch on foo.txt (fanotify_mark())
B. name_to_handle_at(.., foo.txt,...)
C. Add info in hash table using foo_handle as key.

But this is a little bit racy. You might start getting events with file
handles of foo.txt before you could complete B or C.

>
> [1] https://github.com/inotify-tools/inotify-tools/pull/134
>
> >
> > IIUC, file handle for the same object can change if inode had been flushed
> > out of guest cache and brought back in later. So if application say
> > generated file handle for an object and saved it and later put a watch
> > on that object, by that time file handle of the object might have changed
> > (as seen by fuse). So one can't even use to match it with previous saved
> > file handle.
> >
>
> The argument is not applicable for inode watches.

Fair enough. I could see a very limited use case and thought that's not
enough. But looks like you seem to be ok with that.

> Filesystem and mount watches are not going to be supported with virtiofs
> or any filesystem that does not support persistent file handles.

Ok, so no filesystem and mount watches for virtiofs to begin with.

>
> > So I can't use file handle for open_by_handle_at(). I can't use it to
> > match it with previously saved file handle. So what can I use it for?
> >
> > IOW, I could not imagine supporting fanotify file handles without
> > fixing the file handles properly in fuse. And it needs fixing in
> > virtiofs as well as we can't trust random file handles from guest
> > for regular files.
> >
>
> Partly correct statements, but when looking at the details, they are
> not relevant to the case of fanotify inode watch.
>
> Note that at the moment, fuse does not even support local fanotify
> watch with file handles because of fanotify_test_fsid() - fuse does
> not set f_fsid (not s_uuid), so it's not really about supporting fanotify
> on fuse now.

Hmm..., that means we first will have to look into supporting local
fanotify events with file handles on fuse. Without that we can't even
test our remote fsnotify changes, it looks like. This sounds like another
blocker (or dependency project to complete first) before one can make
progress with remote inotify/fanotify/fsnotify.

> It's about the vfs APIs for remote fsnotify that should not be inotify
> specific.

I understand that part. But at the same time, remote fsnotify API will
probably evolve as you keep on adding more functionality. What if there
is another notification mechanism tomorrow, say newfancynotify()? We
might have to modify remote fsnotify again to accommodate that.

IOW, fsnotify seems to be just underlying plumbing and whatever you add
today might not be enough to support tomorrow's features. That's why
I wanted to start with a minimal set of functionality and add more
to it later.

Thanks
Vivek
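
For reference, the "hash map keyed by file handle" from use case #2 could
look roughly like the sketch below. This is only an illustration, not code
from fsnotifywait or the kernel: the names fid_key, fid_map, fid_map_put()
and fid_map_get() are made up, the table is a fixed-size linear-probing map
with no deletion, and the 128-byte limit is assumed to match MAX_HANDLE_SZ.
In a real watcher the key would be filled from the struct file_handle
returned by name_to_handle_at(2) when the watch is added, and from the fid
carried in the event info when an event arrives.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FID_MAX_BYTES 128  /* assumption: same as MAX_HANDLE_SZ */
#define FID_MAP_SLOTS 256  /* hypothetical capacity, must be a power of two */

/* Key: handle_type plus the opaque f_handle bytes of a struct file_handle. */
struct fid_key {
    int type;
    unsigned int len;
    unsigned char bytes[FID_MAX_BYTES];
};

struct fid_slot {
    int used;
    struct fid_key key;
    int dirfd;             /* value: dirfd (or any cookie) of the watched object */
};

struct fid_map {
    struct fid_slot slots[FID_MAP_SLOTS];
};

static void fid_key_init(struct fid_key *k, int type,
                         const void *bytes, unsigned int len)
{
    assert(len <= FID_MAX_BYTES);
    memset(k, 0, sizeof(*k));
    k->type = type;
    k->len = len;
    memcpy(k->bytes, bytes, len);
}

static int fid_key_equal(const struct fid_key *a, const struct fid_key *b)
{
    return a->type == b->type && a->len == b->len &&
           memcmp(a->bytes, b->bytes, a->len) == 0;
}

/* FNV-1a style hash over the handle type and the handle bytes. */
static uint32_t fid_key_hash(const struct fid_key *k)
{
    uint32_t h = 2166136261u;
    h = (h ^ (uint32_t)k->type) * 16777619u;
    for (unsigned int i = 0; i < k->len; i++)
        h = (h ^ k->bytes[i]) * 16777619u;
    return h;
}

/* Insert or update an entry. Returns 0 on success, -1 if the table is full. */
static int fid_map_put(struct fid_map *m, const struct fid_key *k, int dirfd)
{
    uint32_t start = fid_key_hash(k) & (FID_MAP_SLOTS - 1);
    for (uint32_t i = 0; i < FID_MAP_SLOTS; i++) {
        struct fid_slot *s = &m->slots[(start + i) & (FID_MAP_SLOTS - 1)];
        if (!s->used || fid_key_equal(&s->key, k)) {
            s->used = 1;
            s->key = *k;
            s->dirfd = dirfd;
            return 0;
        }
    }
    return -1;
}

/* Look up the dirfd stored for a handle. Returns -1 if the handle is unknown. */
static int fid_map_get(const struct fid_map *m, const struct fid_key *k)
{
    uint32_t start = fid_key_hash(k) & (FID_MAP_SLOTS - 1);
    for (uint32_t i = 0; i < FID_MAP_SLOTS; i++) {
        const struct fid_slot *s = &m->slots[(start + i) & (FID_MAP_SLOTS - 1)];
        if (!s->used)
            return -1;     /* empty slot ends the probe sequence */
        if (fid_key_equal(&s->key, k))
            return s->dirfd;
    }
    return -1;
}
```

With the first ordering (A to E), fid_map_put() happens before
fanotify_mark(), so every event for the object finds its entry. With the
shorter ordering, an early event can hit the "unknown handle" path in
fid_map_get(), which is the race mentioned above.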