On Wed 02-03-22 20:14:29, Amir Goldstein wrote:
> On Wed, Mar 2, 2022 at 12:04 PM Tycho Kirchner <tychokirchner@xxxxxxx> wrote:
> > >>
> > >> Regarding tracing I think fanotify would really benefit from a
> > >> FAN_MARK_PID (with optional follow fork-mode). That way one of the
> > >> first filter-steps would be whether events for the given task are of
> > >> interest, so we have no performance problem for all other tasks. The
> > >> possibility to mark specific processes would also have another
> > >> substantial benefit: fanotify could be used without root privileges
> > >> by only allowing the user to mark his/her own processes.
> > >> That way existing inotify-users could finally switch to the
> > >> cleaner/more powerful fanotify.
> > >
> > > We already have partial support for unprivileged fanotify.
> > > Which features are you missing with unprivileged fanotify?
> > > and why do you think that filtering by process tree will allow those
> > > features to be enabled?
> >
> >
> > I am missing the ability to filter for (close-)events of large
> > directory trees in a race-free manner, so that no events are lost on
> > newly created dirs. Even without the race, monitoring my home-directory
> > is impossible (without privileges) as I have far more than 8192
> > directories (393941 as of writing (; ). Monitoring mounts solves these
> > problems but introduces two others: First it requires privileges,
> > second a potentially large number of events *not of interest* have to
> > be copied to user-space (except unshared mount namespaces are used).
> > Allowing a user to only monitor his/her own processes would make
> > mark_mount privileges unnecessary (please correct me if I'm wrong).
> > While still events above the directory of interest are reported, at
> > least events from other users are filtered beforehand.
>
> I don't know. Security model is hard.
> What do you mean by "his/her own processes"? processes owned by the same uid?
> With simple look it sounds right, but other security policy may be in
> play (e.g. sepolicy)
> which can grant different processes owned by same user different file access
> permissions and not any process may be allowed to ptrace other processes.
> userns has more clear semantics, so monitoring all processes/mounts inside
> an unprivileged userns may be easier to prove.

I see two problems with limiting events to those generated by a particular
process / user:

1) Fanotify is a filesystem notification system. As such it is primarily
aimed at (more or less efficiently) answering a question: did something in
the filesystem change, was some data from the filesystem used? If you start
to limit visible events to processes / users you are no longer able to
reliably answer this question. As such we would get complaints "but this is
not good enough for our usecase" sooner rather than later. In filesystem
change notification space we have a long history of partial solutions that
then forced us into full redesign and all the pain associated with that.

2) Limiting events to those generated by a particular user may somewhat
reduce the amount of generated events but for a lot of usecases that is not
really significant. So the push for some middle ground between watching a
file / dir and watching the whole fs will still stay.

Regarding the security model for unprivileged watches: IMO a sensible
security model for fs notification could be like:

"If you can read the file, you should be able to watch for changes. If you
own the file, you should be able to watch for accesses."

But the trouble is that for filesystem wide marks, it is not easy to verify
these conditions.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
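
For illustration only, the rule quoted above ("If you can read the file, you
should be able to watch for changes. If you own the file, you should be able
to watch for accesses.") could be sketched as a per-file check. This is not
kernel code and not an existing fanotify interface. The names `watch_class`,
`watcher_cred`, `may_read` and `allowed_watch_classes` are hypothetical, and
the read check is deliberately reduced to the classic owner/group/other mode
bits, ignoring ACLs, capabilities, supplementary groups and LSM policy such
as the sepolicy case mentioned earlier in the thread.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical event classes; these are NOT fanotify's real mask bits. */
enum watch_class {
    WATCH_NONE     = 0,
    WATCH_CHANGES  = 1 << 0,  /* "watch for changes"  */
    WATCH_ACCESSES = 1 << 1,  /* "watch for accesses" */
};

/* Hypothetical, simplified view of the watcher's credentials. */
struct watcher_cred {
    uid_t uid;
    gid_t gid;
};

/*
 * Simplified read-permission check using only the mode bits.
 * Assumption: no ACLs, capabilities, supplementary groups or LSM hooks.
 */
static int may_read(const struct stat *st, const struct watcher_cred *c)
{
    if (st->st_uid == c->uid)
        return (st->st_mode & S_IRUSR) != 0;
    if (st->st_gid == c->gid)
        return (st->st_mode & S_IRGRP) != 0;
    return (st->st_mode & S_IROTH) != 0;
}

/* Apply the two-part rule to a single file. */
static unsigned int allowed_watch_classes(const struct stat *st,
                                          const struct watcher_cred *c)
{
    unsigned int allowed = WATCH_NONE;

    if (may_read(st, c))           /* can read  -> may watch changes  */
        allowed |= WATCH_CHANGES;
    if (st->st_uid == c->uid)      /* owns file -> may watch accesses */
        allowed |= WATCH_ACCESSES;
    return allowed;
}
```

Note that the check takes one file's attributes as input. That is the
difficulty named above for filesystem wide marks: at the time the mark is
set there is no single file to evaluate, so a check of this shape would have
to be applied per file rather than once.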