I wanted to get your feedback on an idea I have been playing with. It started as a poor man's alternative to the old subtree watch problem.
I do agree that we should NOT add "subtree filter" functionality to fanotify (or any other filter) and that instead, we should add support for attaching an eBPF program that implements is_subdir(). I found this conversation [1] with Tycho where you had suggested this idea. I wonder if Tycho got to explore this path further?

[1] https://lore.kernel.org/linux-fsdevel/20200828084603.GA7072@xxxxxxxxxxxxxx/
Hi Amir, Hi Jan,

Thanks for pinging me. Indeed I did "explore this path further". In my project https://github.com/tycho-kirchner/shournal the goal is to track the files read or written by a process tree and all its child processes, and to connect this data to a given shell command. In fact, after Amir's and my last correspondence I implemented a kernel module which instruments ftrace and tracepoints to trace fput events (kernel/event_handler.c:event_handler_fput) of specific tasks, which are then further processed in a dedicated kernel thread. I considered eBPF for this task but found no satisfying approach to have dynamic, different filter rules (e.g. include-paths) for each process tree of each user.

Regarding improvement of fanotify, let's distinguish two cases: system-monitoring and tracing.

Regarding system-monitoring: I'm not sure how exactly FAN_MARK_VOLATILE would work (Amir, could you please elaborate?) but what do you think about the following approach, in order to solve the subtree watch problem:

- Store the include/exclude paths of interest as *strings* in a hashset.
- On fsevent, look up the path by calling d_path() only once and cache whether events for the given path are of interest. This can either happen with a reference on the path (clear older paths periodically in a work queue) or with a time limit in which potentially wrong paths are accepted (path pointer freed and address reused). The second approach I use myself in kernel/event_consumer_cache.c. See also kpathtree.c for a somewhat efficient subpath lookup.

Regarding tracing: I think fanotify would really benefit from a FAN_MARK_PID (with optional follow-fork mode). That way one of the first filter steps would be whether events for the given task are of interest, so we have no performance problem for all other tasks.
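To make the system-monitoring approach above (include/exclude paths plus a per-path decision cache) a bit more concrete, here is a rough user-space sketch in C. It is not the code from shournal and not kernel code. All names (is_subpath, struct rules, struct decision_cache, event_of_interest, resolve_fn) are made up for illustration. A linear scan stands in for the hashset or kpathtree.c, the resolve callback stands in for d_path(), and the cache has no expiry, so a reused key address would return a stale answer, as discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if `path` equals `dir` or lies somewhere below it. */
static bool is_subpath(const char *dir, const char *path)
{
    size_t n;

    assert(dir != NULL && path != NULL);
    n = strlen(dir);
    if (n == 0)
        return false;
    if (n > 1 && dir[n - 1] == '/')
        n--;                            /* tolerate "/home/user/" */
    if (strncmp(dir, path, n) != 0)
        return false;
    if (n == 1 && dir[0] == '/')
        return true;                    /* everything is below "/" */
    /* Reject "/home/username" when dir is "/home/user". */
    return path[n] == '\0' || path[n] == '/';
}

/* Hypothetical rule set; a plain array scan replaces the hashset. */
struct rules {
    const char *const *include;
    size_t n_include;
    const char *const *exclude;
    size_t n_exclude;
};

/* Slow path: string comparison against all rules, excludes win. */
static bool path_of_interest(const struct rules *r, const char *path)
{
    size_t i;

    for (i = 0; i < r->n_exclude; i++)
        if (is_subpath(r->exclude[i], path))
            return false;
    for (i = 0; i < r->n_include; i++)
        if (is_subpath(r->include[i], path))
            return true;
    return false;
}

#define CACHE_SLOTS 64

struct cache_entry {
    const void *key;        /* opaque identity, e.g. a path pointer */
    bool interesting;
    bool valid;
};

struct decision_cache {
    struct cache_entry slot[CACHE_SLOTS];
    unsigned long slow_lookups;     /* how often the path was resolved */
};

/* Stand-in for d_path(): turns the opaque key into a path string. */
typedef const char *(*resolve_fn)(const void *key);

/* Resolve and match the path only on a cache miss; reuse the verdict after. */
static bool event_of_interest(struct decision_cache *c, const struct rules *r,
                              resolve_fn resolve, const void *key)
{
    struct cache_entry *e = &c->slot[((uintptr_t)key >> 4) % CACHE_SLOTS];

    if (e->valid && e->key == key)
        return e->interesting;      /* fast path, no string work */

    c->slow_lookups++;
    e->key = key;
    e->interesting = path_of_interest(r, resolve(key));
    e->valid = true;
    return e->interesting;
}

/* Demo resolver: here the key simply is the path string. */
static const char *resolve_identity(const void *key)
{
    return key;
}
```

The cache must be zero-initialized before first use (e.g. with memset). A second event carrying the same key is then answered from the cache without resolving or comparing any strings, which is the point of calling d_path() only once per path.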
The possibility to mark specific processes would also have another substantial benefit: fanotify could be used without root privileges by only allowing the user to mark his/her own processes. That way existing inotify users could finally switch to the cleaner/more powerful fanotify.

Thanks and kind regards
Tycho