Hi Linus,

Here's a set of patches to add notifications for mount topology events,
such as mounting, unmounting, mount expiry and mount reconfiguration.

The first patch in the series adds a hard limit on the number of watches
that any particular user can add.  The RLIMIT_NOFILE value for the process
adding a watch is used as the limit.  Even if you don't take the rest of
the series, can you at least take this one?

An LSM hook is included for an LSM to rule on whether or not a mount watch
may be set on a particular path.

This series is intended to be taken in conjunction with the fsinfo series
which I'll post a pull request for shortly and which is dependent on it.

Karel Zak[*] has created preliminary patches that add support to libmount
and Ian Kent has started working on making systemd use them.

[*] https://github.com/karelzak/util-linux/commits/topic/fsinfo

Note that there have been some last minute changes to the patchset: you
wanted something adding and Miklós wanted some bits taking out/changing.
I've placed a tag, fsinfo-core-20200724, on the aggregate of these two
patchsets that can be compared to fsinfo-core-20200803.

To summarise the changes: I added the limiter that you wanted; removed an
unused symbol; made the mount ID fields in the notification 64-bit (the
fsinfo patchset has a change to convey the mount uniquifier instead of the
mount ID); removed the event counters from the mount notification and moved
them into the fsinfo patchset.


====
WHY?
====

Why do we want mount notifications?  Whilst /proc/mounts can be polled, it
only tells you that something changed in your namespace.  To find out what,
you have to trawl /proc/mounts or similar and work out what changed in the
mount object attributes and mount topology.  I'm told that the proc file
holding the namespace_sem is a point of contention, especially as the
process of generating the text descriptions of the mounts/superblocks can
be quite involved.

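For illustration, the existing polling approach looks roughly like the
sketch below.  It relies only on the behaviour documented in proc(5), that a
change to the mount table shows up as POLLERR | POLLPRI on an open
/proc/self/mounts descriptor.  The helper names wait_for_mount_change() and
reread_mounts() are made up for this example and are not part of the series.

```c
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for the mount table behind fd to change.
 * fd is expected to be an open descriptor for /proc/self/mounts.
 * Returns 1 if a change was signalled, 0 on timeout, -1 on error.
 * (Hypothetical helper, for illustration only.)
 */
static int wait_for_mount_change(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0 || (pfd.revents & POLLNVAL))
		return -1;
	if (n == 0)
		return 0;
	/* A mount table change is reported as POLLERR | POLLPRI. */
	return (pfd.revents & (POLLERR | POLLPRI)) ? 1 : 0;
}

/*
 * Re-read the table from the start after a change.  Returns the number
 * of bytes read (at most len) or -1 on error.  The caller still has to
 * diff this text against its previous copy to learn what changed.
 * (Hypothetical helper, for illustration only.)
 */
static ssize_t reread_mounts(int fd, char *buf, size_t len)
{
	ssize_t total = 0, n = 0;

	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	while ((size_t)total < len &&
	       (n = read(fd, buf + total, len - total)) > 0)
		total += n;
	return n < 0 ? -1 : total;
}
```

A watcher would loop on wait_for_mount_change() and call reread_mounts()
each time it returns 1.  The poll only says that something changed, so
every wakeup costs a full regeneration and reparse of the text.
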
The notification generated here directly indicates the mounts involved in
any particular event and gives an idea of what the change was.

This is combined with a new fsinfo() system call that allows, amongst other
things, the ability to retrieve in one go an { id, change_counter } tuple
from all the children of a specified mount, allowing buffer overruns to be
dealt with quickly.

This is of use to systemd to improve efficiency:

	https://lore.kernel.org/linux-fsdevel/20200227151421.3u74ijhqt6ekbiss@xxxxxxxxxxx/

And it's not just Red Hat that's potentially interested in this:

	https://lore.kernel.org/linux-fsdevel/293c9bd3-f530-d75e-c353-ddeabac27cf6@xxxxxxxxx/

David
---
The following changes since commit ba47d845d715a010f7b51f6f89bae32845e6acb7:

  Linux 5.8-rc6 (2020-07-19 15:41:18 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/mount-notifications-20200803

for you to fetch changes up to 841a0dfa511364fa9a8d67512e0643669f1f03e3:

  watch_queue: sample: Display mount tree change notifications (2020-08-03 12:15:38 +0100)

----------------------------------------------------------------
Mount notifications

----------------------------------------------------------------
David Howells (5):
      watch_queue: Limit the number of watches a user can hold
      watch_queue: Make watch_sizeof() check record size
      watch_queue: Add security hooks to rule on setting mount watches
      watch_queue: Implement mount topology and attribute change notifications
      watch_queue: sample: Display mount tree change notifications

 Documentation/watch_queue.rst               |  12 +-
 arch/alpha/kernel/syscalls/syscall.tbl      |   1 +
 arch/arm/tools/syscall.tbl                  |   1 +
 arch/arm64/include/asm/unistd.h             |   2 +-
 arch/arm64/include/asm/unistd32.h           |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   1 +
 arch/s390/kernel/syscalls/syscall.tbl       |   1 +
 arch/sh/kernel/syscalls/syscall.tbl         |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   1 +
 fs/Kconfig                                  |   9 ++
 fs/Makefile                                 |   1 +
 fs/mount.h                                  |  18 +++
 fs/mount_notify.c                           | 222 ++++++++++++++++++++++++++++
 fs/namespace.c                              |  22 +++
 include/linux/dcache.h                      |   1 +
 include/linux/lsm_hook_defs.h               |   3 +
 include/linux/lsm_hooks.h                   |   6 +
 include/linux/sched/user.h                  |   3 +
 include/linux/security.h                    |   8 +
 include/linux/syscalls.h                    |   2 +
 include/linux/watch_queue.h                 |   7 +-
 include/uapi/asm-generic/unistd.h           |   4 +-
 include/uapi/linux/watch_queue.h            |  31 +++-
 kernel/sys_ni.c                             |   3 +
 kernel/watch_queue.c                        |   8 +
 samples/watch_queue/watch_test.c            |  41 ++++-
 security/security.c                         |   7 +
 37 files changed, 422 insertions(+), 6 deletions(-)
 create mode 100644 fs/mount_notify.c