On Wed, 21 Aug 2024 at 23:22, Joanne Koong <joannelkoong@xxxxxxxxx> wrote:

> Without a kernel enforced timeout, the only way out of this is to
> abort the connection. A userspace timeout wouldn't help in this case
> with getting the server unstuck. With the kernel timeout, this forces
> the kernel handling of the write request to proceed, which will drop
> the folio lock and resume the server back to a functioning state.
>
> I don't think situations like this are uncommon. For example, it's not
> obvious or clear to developers that fuse_lowlevel_notify_inval_inode()
> shouldn't be called inside of a write handler in their server code.

Documentation is definitely lacking. In fact a simple rule is: never
call a notification function from within a request handling function.
Notifications are async events that should happen independently of
handling regular operations. Anything else is an abuse of the
interface.

> For your concern about potential unintended side effects of timed out
> requests without the server's knowledge, could you elaborate more on
> the VFS locking example? In my mind, a request that times out is the
> same thing as a request that behaves normally and completes with an
> error code, but perhaps not?

- user calls mknod(2) on fuse directory
- VFS takes inode lock on parent directory
- calls into fuse to create the file
- fuse sends request to server
- file creation is slow and times out in the kernel
- fuse returns -ETIMEDOUT
- VFS releases inode lock
- meanwhile the server is still working on creating the file and has
  no idea that something went wrong
- user calls the same mknod(2) again
- same things happen as last time
- server starts to create the file *again* knowing that the VFS takes
  care of concurrency
- server crashes due to corruption

> I think also, having some way for system admins to enforce request
> timeouts across the board might be useful as well - for example, if a
> malignant fuse server doesn't reply to any requests, the requests hog
> memory until the server is killed.

As I said, I'm not against enforcing a response time for fuse servers,
as long as a timeout results in a complete abort and not just an error
on the timed out request.

Thanks,
Miklos
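
The mknod(2) sequence above can be illustrated with a small toy model.
This is only a sketch, not real FUSE or VFS code: ToyServer, ToyKernel
and run() are hypothetical names, the "corruption" is reduced to a flag
that is set when two creations of the same name overlap, and returning
-ENOTCONN after an abort is an assumption of the model.

```python
import errno


class ToyServer:
    """Hypothetical server that trusts the VFS to serialize creations."""

    def __init__(self):
        self.in_progress = set()  # names whose creation has not finished
        self.corrupted = False

    def start_mknod(self, name):
        # The server assumes the parent directory lock prevents overlap,
        # so a second creation of the same name damages its state.
        if name in self.in_progress:
            self.corrupted = True
        self.in_progress.add(name)


class ToyKernel:
    """Hypothetical kernel side; every request here is slow and times out."""

    def __init__(self, server, abort_on_timeout):
        self.server = server
        self.abort_on_timeout = abort_on_timeout
        self.aborted = False

    def mknod_that_times_out(self, name):
        if self.aborted:
            # Assumption: an aborted connection refuses all new requests.
            return -errno.ENOTCONN
        # VFS takes the parent lock, fuse sends the request to the server.
        self.server.start_mknod(name)
        # The server is still busy when the timeout fires.
        if self.abort_on_timeout:
            self.aborted = True
        # VFS releases the parent lock and the caller sees the error.
        return -errno.ETIMEDOUT


def run(abort_on_timeout):
    server = ToyServer()
    kernel = ToyKernel(server, abort_on_timeout)
    results = [kernel.mknod_that_times_out("f") for _ in range(2)]
    return results, server.corrupted


error_only = run(abort_on_timeout=False)  # timeout is just an error
full_abort = run(abort_on_timeout=True)   # timeout aborts the connection
print(error_only)
print(full_abort)
```

In this model the error-only variant lets the second mknod reach a
server that is still working on the first one, while the abort variant
never delivers the second request.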