On Thu, Jan 19, 2023 at 04:14:55PM -0500, Eric Chanudet wrote:
> From: Alexander Larsson <alexl@xxxxxxxxxx>
>
> Use call_rcu to defer releasing the umount'ed or detached filesystem
> when calling namespace_unlock().
>
> Calling synchronize_rcu_expedited() has a significant cost on RT kernel
> that default to rcupdate.rcu_normal_after_boot=1.
>
> For example, on a 6.2-rt1 kernel:
> perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount mnt
>         0.07464 +- 0.00396 seconds time elapsed ( +- 5.31% )
>
> With this change applied:
> perf stat -r 10 --null --pre 'mount -t tmpfs tmpfs mnt' -- umount mnt
>         0.00162604 +- 0.00000637 seconds time elapsed ( +- 0.39% )
>
> Waiting for the grace period before completing the syscall does not seem
> mandatory. The struct mount umount'ed are queued up for release in a
> separate list and no longer accessible to following syscalls.

Again, NAK. If a filesystem is expected to be shut down by umount(2),
userland expects it to have been already shut down by the time the
syscall returns.

It's not just visibility in namespace; it's "can I pull the disk out?".
Or "can the shutdown get to taking the network down?", for that matter.
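
To make the objection concrete, here is a toy userspace model of the two behaviours. It is only a sketch under stated assumptions, not kernel code: ToyMount, deferred_work, umount_synchronous(), umount_deferred() and run_deferred_work() are all hypothetical names, and the deque merely stands in for callbacks that would run after a later grace period. What it shows is the state a caller observes at the moment each call returns.

```python
from collections import deque


class ToyMount:
    """Hypothetical stand-in for a mounted filesystem that holds a device."""

    def __init__(self, name):
        self.name = name
        self.device_busy = True  # True while the filesystem still uses the device

    def shutdown(self):
        self.device_busy = False


# Hypothetical queue standing in for callbacks run after a grace period.
deferred_work = deque()


def umount_synchronous(mnt):
    """Finish the shutdown before returning to the caller."""
    mnt.shutdown()


def umount_deferred(mnt):
    """Queue the shutdown and return immediately."""
    deferred_work.append(mnt.shutdown)


def run_deferred_work():
    """Run whatever was queued, as a later grace period would."""
    while deferred_work:
        callback = deferred_work.popleft()
        callback()


# Synchronous case: the device is free as soon as the call returns.
sync_mnt = ToyMount("sync")
umount_synchronous(sync_mnt)
busy_after_sync_return = sync_mnt.device_busy

# Deferred case: the call has returned, but the device is still in use.
late_mnt = ToyMount("deferred")
umount_deferred(late_mnt)
busy_after_deferred_return = late_mnt.device_busy

# Only after the queued work runs is the device actually released.
run_deferred_work()
busy_after_grace_period = late_mnt.device_busy

print(busy_after_sync_return, busy_after_deferred_return, busy_after_grace_period)
```

In this model the deferred variant returns while device_busy is still set, which is exactly the window in which "can I pull the disk out?" gets the wrong answer.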