Re: [RFC] Volatile fanotify marks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 28-02-22 19:40:07, Amir Goldstein wrote:
> On Mon, Feb 28, 2022 at 4:06 PM Jan Kara <jack@xxxxxxx> wrote:
> >
> > Hi Amir!
> >
> > On Wed 23-02-22 20:42:37, Amir Goldstein wrote:
> > > I wanted to get your feedback on an idea I have been playing with.
> > > It started as a poor man's alternative to the old subtree watch problem.
> > > For my employer's use case, we are watching the entire filesystem using
> > > a filesystem mark, but would like to exclude events on a subtree
> > > (i.e. all files underneath .private/).
> > >
> > > At the moment, those events are filtered in userspace.
> > > I had considered adding directory marks with an ignored mask on every
> > > event that is received for a directory path under .private/, but that has the
> > > undesired side effect of pinning those directory inodes to cache.
> > >
> > > I have this old fsnotify-volatile branch [1] that I am using for an overlayfs
> > > kernel internal fsnotify backend. I wonder what are your thoughts on
> > > exposing this functionally to fanotify UAPI (i.e. FAN_MARK_VOLATILE).
> >
> > Interesting idea. I have some reservations wrt to the implementation (e.g.
> > fsnotify_add_mark_list() convention of returning EEXIST when it updated
> > mark's mask, or the fact that inode reclaim should now handle freeing of
> > mark connector and attached marks - which may get interesting locking wise)
> > but they are all fixable.
> 
> Can you give me a hint as to how to implement the freeing of marks?

OK, now I can see that fsnotify_inode_delete() gets called from
__destroy_inode() and thus all marks should be freed even for inodes
released by inode reclaim. Good.

> > I'm wondering a bit whether this is really useful enough (and consequently
> > whether we will not get another request to extend fanotify API in some
> > other way to cater better to some other usecase related to subtree watches
> > in the near future). I understand ignore marks are mainly a performance
> > optimization and as such allowing inodes to be reclaimed (which means they
> > are not used much and hence ignored mark is not very useful anyway) makes
> 
> The problem is that we do not know in advance which of the many dirs in
> the subtree are accessed often and which are accessed rarely (and that may
> change over time), so volatile ignore marks are a way to set up ignore marks
> on the most accessed dirs dynamically.

Yes, I understand.

> > sense. Thinking about this more, I guess it is useful to improve efficiency
> > when you want to implement any userspace event-filtering scheme.
> >
> > The only remaining pending question I have is whether we should not go
> > further and allow event filtering to happen using an eBPF program. That
> > would be even more efficient (both in terms of memory and CPU). What do you
> > think?
> >
> 
> I think that is an unrelated question.
> 
> I do agree that we should NOT add "subtree filter" functionality to fanotify
> (or any other filter) and that instead, we should add support for attaching an
> eBPF program that implements is_subdir().
> I found this [1] convection with Tycho where you had suggested this idea.
> I wonder if Tycho got to explore this path further?
> 
> But I think that it is one thing to recommend users to implement their
> filters as
> eBPF programs and another thing to stand in the way of users that prefer to
> implement userspace event filtering. It could be that the filter
> cannot be easily
> described by static rules to an eBPF program (e.g. need to query a database).
> 
> In my POV, FAN_MARK_VOLATILE does not add any new logic/filtering rule.
> It adds resource control by stating that the ignore mark is "best effort".
> 
> Does it make sense?

OK, makes sense. So I agree the functionality is worth it. Will you post
the patches for review of technical details?

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux