On Fri, Jun 26, 2020 at 12:49 PM Jan Kara <jack@xxxxxxx> wrote: > > On Thu 25-06-20 13:12:34, Linus Torvalds wrote: > > On Thu, Jun 25, 2020 at 11:19 AM Jan Kara <jack@xxxxxxx> wrote: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify_for_v5.8-rc3 > > > > > > to get a performance improvement to reduce impact of fsnotify for inodes > > > where it isn't used. > > > > Pulled. > > Thanks. > > > I do note that there's some commonality here with commit ef1548adada5 > > ("proc: Use new_inode not new_inode_pseudo") and the discussion there > > around "maybe new_inode_pseudo should disable fsnotify instead". > > > > See > > > > https://lore.kernel.org/lkml/CAHk-=wh7nZNf81QPcgpDh-0jzt2sOF3rdUEB0UcZvYFHDiMNkw@xxxxxxxxxxxxxx/ > > > > and in particular the comment there: > > > > I do wonder if we should have kept the new_inode_pseudo(), > > and instead just make fsnotify say "you can't notify on an inode that > > isn't on the superblock list" > > > > which is kind of similar to this alloc_file_pseudo() change.. > > > > There it wasn't so much about performance, as about an actual bug > > (quoting from that commit): > > > > Recently syzbot reported that unmounting proc when there is an ongoing > > inotify watch on the root directory of proc could result in a use > > after free when the watch is removed after the unmount of proc > > when the watcher exits. > > > > but the fnsotify connection and the "pseudo files/inodes can't be > > notified about" is the same. > > Thanks for notification. I think I've seen the original syzbot report but > then lost track of how Eric solved it. Ideally I'd just forbid fsnotify on > every virtual fs (proc, sysfs, sockets, ...) because it is not very useful > and only causes issues - besides current issues, there were also issues in > the past which resulted in 0b3b094ac9a7 "fanotify: Disallow permission > events for proc filesystem". The only events that get reliably generated > for these virtual filesystems are FS_OPEN / FS_CLOSE ones - and that's why > I didn't yet disable fsnotify on the large scale for the virtual > filesystems. Also I'm slightly concerned that someone might be mistakenly > putting notification marks on virtual inodes where they don't generate any > events but if we started to disallow such marks, the app would break > because it doesn't expect error. But maybe we could try doing this a see > whether someone complains... As I wrote in review for v1 I think that would be preferred. I just wasn't sure what characterizes the inodes that should be blacklisted for setting watches, because I was thinking only of the performance aspect. The sb list sounds like a good criteria to blacklist by. > > WRT "you can't notify on an inode that isn't on the superblock list" - > that's certainly a requirement for fsnotify to work reliably but because we > can add notification marks not only for inodes but also for mountpoint or > superblock, I'd rather go for a superblock flag or file_system_type flag > like I did in above mentioned 0b3b094ac9a7 because that seems more robust > and I don't see a need for finer grained control of whether fsnotify makes > sence or not. > If you want to be able to distinguish between tmpfs and the shm_mnt kern_mount, it would need to be a sb flag, although smem inodes are all on the sb list, so it would just be for performance. Thanks, Amir.