On Mon, 9 Sept 2024 at 02:18, Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> > Generally, new vfs apis always try hard to call helpers that copy to or
> > from userspace without any locks held as my understanding has been that
> > this is best practice as to avoid risking taking page faults while
> > holding a mutex or semaphore even though that's supposedly safe.

It's indeed "best practices" to strive to do user copies without
locks, but it's not always possible to reasonably avoid.

IOW, accessing user space with a lock held *can* cause some nasty
issues, but is not necessarily wrong.

The worst situation is where that lock then may be needed to *deal*
with user space page faults, and that complicates the write() paths in
particular (generic_perform_write() and friends using
copy_folio_from_iter_atomic() and other magical games).

But that's actually fairly unusual. The much more common situation is
just a random lock, and we have user accesses under them all the time.

You still want to be careful, because if the lock is important enough,
it can cause users to be able to effectively DoS some subsystem and/or
just be a huge nuisance (we used to have that in the tty layer).

And no, the size of the user copy doesn't much matter. A __put_user()
isn't much better than a big copy_from_user() - it may be faster for
the simple case where things are in memory, but it's the "it's paged
out" case that causes issues, and then it's the IO (and possible extra
user-controlled fuse paths in particular) that are an issue, not
whether it's "just one 64-bit word".

Epoll is disgusting. But the real problems with epoll tend to be about
the random file descriptor recursions, not the epoll mutex that only
epoll cares about.

              Linus
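
For illustration only, the "copy without the lock held" practice
discussed above can be sketched as a userspace analogue. This is not
kernel code: struct stats, stats_read() and slow_copy_out() are
hypothetical names, a pthread mutex stands in for a kernel mutex, and
slow_copy_out() stands in for copy_to_user(), i.e. the step that may
fault and sleep. The idea is to snapshot the protected state into a
local buffer under the lock, drop the lock, and only then do the
potentially slow copy. As the message says, copying under the lock is
not necessarily wrong; this just shows the preferred shape when it is
reasonable to do.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lock-protected state (stands in for a kernel object). */
struct stats {
	pthread_mutex_t lock;
	unsigned long events;
	unsigned long errors;
};

/* Plain-data copy of the state that is handed to the caller. */
struct stats_snapshot {
	unsigned long events;
	unsigned long errors;
};

/*
 * Stand-in for copy_to_user(): assumed to be the step that may block
 * for a long time (page fault, IO). Returns the number of bytes NOT
 * copied, so 0 means success.
 */
static unsigned long slow_copy_out(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* Snapshot under the lock, then copy out with no lock held. */
static int stats_read(struct stats *s, struct stats_snapshot *user_buf)
{
	struct stats_snapshot snap;

	pthread_mutex_lock(&s->lock);
	snap.events = s->events;
	snap.errors = s->errors;
	pthread_mutex_unlock(&s->lock);

	/* A stall here no longer blocks other users of s->lock. */
	if (slow_copy_out(user_buf, &snap, sizeof(snap)))
		return -1;
	return 0;
}
```

The cost of this shape is an extra local copy, and the caller may see
a snapshot that is already slightly stale by the time it arrives,
which is usually acceptable for this kind of read.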