Re: [RFC] Volatile fanotify marks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 1, 2022 at 1:07 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Mon 28-02-22 19:40:07, Amir Goldstein wrote:
> > On Mon, Feb 28, 2022 at 4:06 PM Jan Kara <jack@xxxxxxx> wrote:
> > >
> > > Hi Amir!
> > >
> > > On Wed 23-02-22 20:42:37, Amir Goldstein wrote:
> > > > I wanted to get your feedback on an idea I have been playing with.
> > > > It started as a poor man's alternative to the old subtree watch problem.
> > > > For my employer's use case, we are watching the entire filesystem using
> > > > a filesystem mark, but would like to exclude events on a subtree
> > > > (i.e. all files underneath .private/).
> > > >
> > > > At the moment, those events are filtered in userspace.
> > > > I had considered adding directory marks with an ignored mask on every
> > > > event that is received for a directory path under .private/, but that has the
> > > > undesired side effect of pinning those directory inodes to cache.
> > > >
> > > > I have this old fsnotify-volatile branch [1] that I am using for an overlayfs
> > > > kernel internal fsnotify backend. I wonder what are your thoughts on
> > > > exposing this functionally to fanotify UAPI (i.e. FAN_MARK_VOLATILE).
> > >
> > > Interesting idea. I have some reservations wrt to the implementation (e.g.
> > > fsnotify_add_mark_list() convention of returning EEXIST when it updated
> > > mark's mask, or the fact that inode reclaim should now handle freeing of
> > > mark connector and attached marks - which may get interesting locking wise)
> > > but they are all fixable.
> >
> > Can you give me a hint as to how to implement the freeing of marks?
>
> OK, now I can see that fsnotify_inode_delete() gets called from
> __destroy_inode() and thus all marks should be freed even for inodes
> released by inode reclaim. Good.
>
> > > I'm wondering a bit whether this is really useful enough (and consequently
> > > whether we will not get another request to extend fanotify API in some
> > > other way to cater better to some other usecase related to subtree watches
> > > in the near future). I understand ignore marks are mainly a performance
> > > optimization and as such allowing inodes to be reclaimed (which means they
> > > are not used much and hence ignored mark is not very useful anyway) makes
> >
> > The problem is that we do not know in advance which of the many dirs in
> > the subtree are accessed often and which are accessed rarely (and that may
> > change over time), so volatile ignore marks are a way to set up ignore marks
> > on the most accessed dirs dynamically.
>
> Yes, I understand.
>
> > > sense. Thinking about this more, I guess it is useful to improve efficiency
> > > when you want to implement any userspace event-filtering scheme.
> > >
> > > The only remaining pending question I have is whether we should not go
> > > further and allow event filtering to happen using an eBPF program. That
> > > would be even more efficient (both in terms of memory and CPU). What do you
> > > think?
> > >
> >
> > I think that is an unrelated question.
> >
> > I do agree that we should NOT add "subtree filter" functionality to fanotify
> > (or any other filter) and that instead, we should add support for attaching an
> > eBPF program that implements is_subdir().
> > I found this [1] convection with Tycho where you had suggested this idea.
> > I wonder if Tycho got to explore this path further?
> >
> > But I think that it is one thing to recommend users to implement their
> > filters as
> > eBPF programs and another thing to stand in the way of users that prefer to
> > implement userspace event filtering. It could be that the filter
> > cannot be easily
> > described by static rules to an eBPF program (e.g. need to query a database).
> >
> > In my POV, FAN_MARK_VOLATILE does not add any new logic/filtering rule.
> > It adds resource control by stating that the ignore mark is "best effort".
> >
> > Does it make sense?
>
> OK, makes sense. So I agree the functionality is worth it. Will you post
> the patches for review of technical details?

Of course, I need to add a patch for UAPI change and write a test.
I just wanted to get a tentative ACK before I put in more effort.
I will address your comment about -EEXIST return value.

Thanks,
Amir.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux