On Fri, 2024-12-06 at 11:19 -0800, Joanne Koong wrote: > On Thu, Dec 5, 2024 at 4:37 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > > On Thu, 2024-11-14 at 11:13 -0800, Joanne Koong wrote: > > > There are situations where fuse servers can become unresponsive or > > > stuck, for example if the server is deadlocked. Currently, there's no > > > good way to detect if a server is stuck and needs to be killed manually. > > > > > > This commit adds an option for enforcing a timeout (in minutes) for > > > requests where if the timeout elapses without the server responding to > > > the request, the connection will be automatically aborted. > > > > > > > I haven't been keeping up with the earlier series, but I think I agree > > with Sergey that this timeout would be better expressed in seconds. > > > > Most filesystems that deal with timeouts (NFS, CIFS, etc.) specify them > > as a number of seconds, and expressing this in minutes goes against > > that convention. It also seems rather coarse-grained. I could easily > > see a situation where 5 minutes is too short, but 6 minutes is too > > long. > > Sounds good, I'll change the timeout to seconds. The reason it was set > in minutes is because the timeouts have an upper margin of error > (right now, up to 1 minute) and I didn't want to give a misleading > illusion of precision. > Understood. That is a concern, but I think you can mitigate it by documenting that the timeout just makes it eligible to be reaped, rather than a hard timeout guarantee. That would also allow you to change the implementation in the future to reap more (or less) aggressively, without breaking the interface. > It sounds like it'd be useful if the timer is run more frequently (eg > instead of every 1 minute, maybe running this every 15 or 30 secs) as > well. I'll make this change in the next version. > Yeah, 1 minute seems like quite a long time. I don't know FUSE well enough to get a feel for how heavyweight fuse_check_timeout() is, but it might be good to check more often, especially if you don't expect it to take long in the common case. > > > > With that too, you probably wouldn't need patch #1. You could treat it > > as a 32-bit integer and just clamp the value as necessary. > > > > > Please note that these timeouts are not 100% precise. The request may > > > take an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested max > > > timeout due to how it's internally implemented. > > > > > > Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx> > > > Reviewed-by: Bernd Schubert <bschubert@xxxxxxx> > > > --- > > > fs/fuse/dev.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++ > > > fs/fuse/fuse_i.h | 21 +++++++++++++ > > > fs/fuse/inode.c | 21 +++++++++++++ > > > 3 files changed, 122 insertions(+) > > > > ... > > > > -- > > Jeff Layton <jlayton@xxxxxxxxxx> -- Jeff Layton <jlayton@xxxxxxxxxx>