On Wed, 11 Dec 2024 at 11:00, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > On Tue, Dec 10, 2024 at 04:10:45PM +0100, Miklos Szeredi wrote: > > On Sat, 7 Dec 2024 at 22:17, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > I took another look at f{a,s}notify. There's no copy_to_user() happening > > > when adding events via fsnotify(). It happens when the caller retrieves > > > events via read() from the relevant notify file descriptor. We should > > > still move calls to notify_mounts() out of the namespace semaphore > > > whenever we can though. > > > > Doesn't work. After unlocking namespace_sem deref of mnt->prev_ns > > might lead to UAF. > > Hm, a UAF could only be triggered by mounts that were unmounted due to > umount propagation into another mount namespaces. The caller's mount > namespace in mnt_ns->prev_ns cannot go away until all mounts are put. Why? E.g. one does umount -l on a subtree in a private namespace, then destroys the namespace immediately. There's no serialization between the two other than namespace_sem, so if the former releases namespace_sem the namespace destruction can run to completion while the detached subtree's mounts are still being processed. > The simple fix is to take a passive reference count. But I'm not sure > what would be more expensive (holding the lock or the reference counts). Right, that would work, but I think holding namespace_sem for read while calling fsnotify() is both simpler and more efficient. Thanks, Miklos