On Mon, 7 Oct 2024 at 20:43, Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> There are situations where fuse servers can become unresponsive or
> stuck, for example if the server is deadlocked. Currently, there's no
> good way to detect if a server is stuck and needs to be killed manually.
>
> This commit adds an option for enforcing a timeout (in minutes) for
> requests where if the timeout elapses without the server responding to
> the request, the connection will be automatically aborted.
>
> Please note that these timeouts are not 100% precise. The request may
> take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max timeout
> due to how it's internally implemented.

One thing I worry about is adding more roadblocks on the way to making
request queuing more scalable.

Currently there's fc->num_waiting that's touched on all requests and
bg_queue/bg_lock that are touched on background requests. We should
be trying to fix these bottlenecks instead of adding more.

Can't we use the existing lists to scan requests?

It's more complex, obviously, but at least it doesn't introduce yet
another per-fc list to worry about.

Thanks,
Miklos