On Tue, Mar 14, 2017 at 4:58 PM, Filip Štědronský <r.lkml@xxxxxxxxxx> wrote:
> Hi,
>
> On Tue, Mar 14, 2017 at 01:18:01PM +0200, Amir Goldstein wrote:
>> I claim that fanotify filters events by mount not because it
>> was a requirement, but because it was an implementation challenge
>> to do otherwise.
>>
>> And I claim that what mount watchers are really interested in is
>> "all the changes that happen in the file system in the area
>> that is visible to me through this mount point".
>>
>> In other words, an indexer needs to know if files were
>> modified/created/deleted if that indexer sits in the container host
>> namespace, regardless of whether those files were modified from
>> within a container namespace.
>>
>> It's not a matter of security/isolation. It's a matter of functionality.
>> I agree that for some events (e.g. permission events) it is possible
>> to argue both ways (i.e. that the namespace context should be used
>> as a filter for events).
>> But for the new proposed events (FS_MODIFY_DIR), I really don't
>> see the point in isolation by mount/namespace.
>
> There are basically two classes of uses for a fanotify-like
> interface:
>
> (1) Keeping an up-to-date representation of the file system.
>     For this, superblock watches are clearly what you want.
>
>     * You are interested to know the current state of the
>       filesystem, so you need to know about every change,
>       regardless of where it came from.
>     * As I mentioned earlier, in the case of remote, distributed
>       and virtual filesystems, the change might come from
>       within the filesystem itself (if the protocol supports
>       reporting such changes). This can probably be
>       implemented only with superblock-scoped watches because
>       the change is fundamentally not related to any mount.
>     * Some filesystems might also support change journalling,
>       and it might be conceivable to extend the API in the
>       future to report "past" events (for example by passing
>       the sequence number of the last seen event or similar).
>     * The argument about containers escaping change notification
>       you mentioned earlier.
>
>     All those factors speak strongly in favour of superblock
>     watches.
>
> (2) Tracking filesystem *activity*. Now you are not building
>     an image of the current filesystem state but rather a log of
>     what happened. Perhaps you are also interested in who
>     (user/process/...) did what. Permission events also fit
>     mostly in this category.
>
>     For those it *might* make sense to have mount-scoped
>     watches, for example if you want to monitor only one
>     container or a subset of processes.
>
> We both concentrate on the first, but we shouldn't forget about
> the second, which was one of the original motivations for
> fanotify.
>
> Thus I conclude that it might be desirable to implement
> mount-scoped filename events in the long run, even though
> I agree that the sb-scoped events are more important because
> they cover more use cases and you can do additional filtering
> (e.g. by pid) if deemed necessary.
>
> This would require one of:
>
> (a) Sprinkling the callers of vfs_* with fanotify calls
>     as I did.
> (b) Creating wrapper functions like vfs_path_unlink & co.
>     that would make the necessary fanotify call (and probably
>     tell the lower function not to generate another
>     notification), as I suggested earlier.
> (c) Giving the vfs_* functions an *optional* vfsmount argument.
>
> In the end I probably find (c) the most elegant, but this
> can be discussed later, even after your changes are merged.
>

Agreed. That is an independent question.

Thanks for the thorough summary.

Amir.