Re: [RFC] Volatile fanotify marks

On Tue, Mar 1, 2022 at 2:26 PM Tycho Kirchner <tychokirchner@xxxxxxx> wrote:
>
>
>
> >>> I wanted to get your feedback on an idea I have been playing with.
> >>> It started as a poor man's alternative to the old subtree watch problem.
>
>
> > I do agree that we should NOT add "subtree filter" functionality to fanotify
> > (or any other filter) and that instead, we should add support for attaching an
> > eBPF program that implements is_subdir().
> > I found this [1] conversation with Tycho where you had suggested this idea.
> > I wonder if Tycho got to explore this path further?
> >
> > [1] https://lore.kernel.org/linux-fsdevel/20200828084603.GA7072@xxxxxxxxxxxxxx/
>
> Hi Amir, Hi Jan,
> Thanks for pinging back on me. Indeed I did "explore this path further".
> In my project
> https://github.com/tycho-kirchner/shournal
>
> the goal is to track read/written files of a process tree and all its child processes and connect this data to a given shell command. In fact, after my last correspondence with Amir I implemented a kernel module which instruments ftrace and tracepoints to trace fput events (kernel/event_handler.c:event_handler_fput) of specific tasks, which are then further processed in a dedicated kernel thread. I considered eBPF for this task but found no satisfying approach to have dynamic, different filter rules (e.g. include-paths) for each process tree of each user.
>
>
> Regarding improvement of fanotify, let's distinguish between two cases: system-monitoring and tracing.
> Regarding system-monitoring: I'm not sure how exactly FAN_MARK_VOLATILE would work (Amir, could you please elaborate?)

FAN_MARK_VOLATILE is not a solution for "include" filters.
It is a solution for "exclude" filters implemented in userspace.
If the monitoring program gets an event and decides that its path should
be excluded, it may set a "volatile" exclude mark on that directory, which
will suppress further events from that directory for as long as the
directory inode remains in the inode cache.
After the directory inode has not been accessed for a while and has been
evicted from the inode cache, the monitoring program can get an event in
that directory again, and then it can re-install the volatile ignore mark
if it wants to.

> but what do you think about the following approach, in order to solve the subtree watch problem:
> - Store the include/exclude-paths of interest as *strings* in a hashset.
> - On fsevent, look up the path by calling d_path() only once and cache whether events for the given path are of interest. This
>    can either happen with a reference on the path (clear older paths periodically in a work queue)
>    or with a time limit in which potentially wrong paths are accepted (path pointer freed and address reused).
>    The second approach I use myself in kernel/event_consumer_cache.c. See also kpathtree.c for a somewhat efficient
>    subpath-lookup.
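
For reference, the string-based lookup proposed above could look roughly
like this in userspace C. It is only a sketch: a linear scan stands in
for the hash set, the verdict cache is left out, and the names
(struct path_rule, path_is_under(), classify_path()) are made up here
rather than taken from shournal. Letting the longest matching prefix win
is also an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum verdict { VERDICT_NONE, VERDICT_INCLUDE, VERDICT_EXCLUDE };

/* Hypothetical rule: a directory prefix and what to do with paths below it. */
struct path_rule {
	const char *prefix;
	enum verdict verdict;
};

/* True if 'path' equals 'prefix' or lies below it on a '/' boundary. */
bool path_is_under(const char *prefix, const char *path)
{
	size_t n = strlen(prefix);

	if (strncmp(prefix, path, n) != 0)
		return false;
	if (n > 0 && prefix[n - 1] == '/')
		return true;	/* prefix already ends at a boundary */
	return path[n] == '\0' || path[n] == '/';
}

/* Longest matching prefix decides; VERDICT_NONE if no rule matches. */
enum verdict classify_path(const struct path_rule *rules, size_t nrules,
			   const char *path)
{
	enum verdict best = VERDICT_NONE;
	size_t best_len = 0;

	for (size_t i = 0; i < nrules; i++) {
		size_t len = strlen(rules[i].prefix);

		if (!path_is_under(rules[i].prefix, path))
			continue;
		if (best == VERDICT_NONE || len > best_len) {
			best = rules[i].verdict;
			best_len = len;
		}
	}
	return best;
}
```

The boundary check matters: without it, a rule for /home/user would also
match /home/username.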

I would implement filtering with is_subdir() and not with d_path(),
but there are advantages to either approach.
In any case, I see there is BPF_FUNC_d_path, so why can't your approach be
implemented using an eBPF program?
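
For comparison, ancestry-based filtering boils down to walking parent
pointers until the watched directory or the root is reached. The toy
below shows the idea in plain userspace C. struct toy_dentry and
toy_is_subdir() are made-up stand-ins, not the kernel's struct dentry or
is_subdir(), and the sketch ignores locking and concurrent renames
entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory entry; the root has parent == NULL. */
struct toy_dentry {
	const char *name;
	const struct toy_dentry *parent;
};

/*
 * True if 'd' is 'ancestor' itself or lies anywhere below it.
 * No path string is built; the cost is one step per directory level.
 */
bool toy_is_subdir(const struct toy_dentry *d,
		   const struct toy_dentry *ancestor)
{
	for (; d != NULL; d = d->parent) {
		if (d == ancestor)
			return true;
	}
	return false;
}
```

Compared with a string lookup, this needs no path rendering, but the
work grows with directory depth.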

>
> Regarding tracing I think fanotify would really benefit from a FAN_MARK_PID (with optional follow fork-mode). That way one of the first filter-steps would be whether events for the given task are of interest, so we have no performance problem for all other tasks. The possibility to mark specific processes would also have another substantial benefit: fanotify could be used without root privileges by only allowing the user to mark his/her own processes.
> That way existing inotify-users could finally switch to the cleaner/more powerful fanotify.

We already have partial support for unprivileged fanotify.
Which features are you missing with unprivileged fanotify,
and why do you think that filtering by process tree will allow those
features to be enabled?
A child process may well have more privileges to read directories than
its parent.

Thanks,
Amir.



